We are very happy to release the first release candidate of what will become dnsdist 1.8.0!
This release contains a significant number of changes since the last major release, 1.7.0, which was released a bit over a year ago. We try to stick to a major release every six months, but this one took a bit longer than expected as we tackled a few challenges:
Low-end devices friendly
We know, based on the feedback we get from the users that interact with us, that dnsdist is used in a lot of different environments, from very large installations dealing with millions of queries per second to very small computers running in a closet somewhere! While we have until now been more focused on the first case, we have been getting a lot of interest from the very low end of the spectrum: low-end devices, like customer premises equipment (CPEs), with very few resources. We realized that while other open-source components do a good job of providing traditional DNS services in that world, there is a need for software providing DNS over TLS and DNS over HTTPS support, to protect the confidentiality and integrity of DNS traffic in the first mile of internet access.
We knew that dnsdist was already successfully used on small devices, like Raspberry Pis, and that our memory and CPU usage was quite low, so we were surprised to learn that people were struggling to meet the very stringent requirements of some devices, and decided to have a look. This was a very interesting journey into flash-based filesystems of a few dozen megabytes, proportional set size memory usage, and low-powered CPUs.
Long story short, we managed to drastically reduce our memory usage and our CPU consumption, especially with very low QPS rates. We developed a new way of doing health-checking for these environments, only doing an actual active health-check after detecting failures from normal traffic. We also introduced a few options to reduce our binary size where it matters, like on OpenWrt builds.
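To illustrate the idea behind that new health-checking mode, here is a minimal Python sketch. It is not dnsdist's implementation: the class name, the size of the observation window and the failure threshold are all assumptions made for the example, and recovery (probing again until the backend answers) is left out for brevity.

```python
from collections import deque


class LazyHealthChecker:
    """Illustrative only: the name, window size and threshold are assumptions."""

    def __init__(self, probe, window=20, max_failure_ratio=0.2):
        self.probe = probe                    # callable returning True if the backend answers
        self.results = deque(maxlen=window)   # outcomes of the most recent regular queries
        self.max_failure_ratio = max_failure_ratio
        self.healthy = True
        self.probes_sent = 0

    def record(self, success):
        """Feed the outcome of a regular query; probe only when failures pile up."""
        self.results.append(success)
        window_full = len(self.results) == self.results.maxlen
        failures = self.results.count(False)
        if window_full and failures / len(self.results) > self.max_failure_ratio:
            self.probes_sent += 1
            self.healthy = self.probe()       # the only place an active check is sent
            self.results.clear()              # start a fresh observation window
        return self.healthy
```

As long as regular traffic is answered correctly, no probe is ever sent, which is what keeps the cost close to zero on a mostly idle device.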
We wrote the necessary code to make dnsdist play nicely with OpenWrt’s native configuration format, Unified Configuration Interface (UCI), so that it is easy to set up dnsdist via the usual interfaces, including the Web UI.
We also provide DHCP integration, so that dnsdist can learn about devices on the local network and provide native DNS resolution for these devices.
This integration is not yet merged into the OpenWrt tree as it requires a feature that will only be available once 1.8.0 final has been released. Stay tuned, or reach out if you want a quick peek!
We also realized that we could no longer rely on the network path between dnsdist and its backend being trusted: while this holds when dnsdist is deployed on the same box, rack or datacenter as the backend, it is no longer the case when it is deployed on a CPE and instructed to forward its queries to a remote recursive resolver like Quad9.
Of course we strongly advise using DNS over TLS and/or DNS over HTTPS to secure that path, but this is unfortunately not always possible. We learned the hard way that in some countries ISPs not only provide DNS over plain UDP only, without even supporting plain TCP, but also still block attempts to connect to an external resolver via a more secure channel.
To work around that issue, we implemented new features to make dnsdist suitable as a proxy with an untrusted network path to the resolver, using well-known methods: random ports and random IDs. These are not enabled by default because they come at a cost, which we don’t want to impose when it is not necessary.
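The reason these two methods help is that an off-path attacker trying to spoof a response has to guess both the 16-bit query ID and the UDP source port of the outstanding query. The Python sketch below illustrates the principle only; the class and function names and the port range are assumptions for the example, not dnsdist settings.

```python
import secrets


def new_query_id():
    """Return an unpredictable 16-bit DNS query ID."""
    return secrets.randbelow(1 << 16)


def new_source_port(low=1024, high=65535):
    """Return an unpredictable UDP source port in [low, high] (range is an assumption)."""
    return low + secrets.randbelow(high - low + 1)


class OutstandingQueries:
    """Tracks in-flight queries by (source port, query ID)."""

    def __init__(self):
        self._pending = {}  # (source port, query ID) -> query name

    def register(self, qname):
        """Pick a fresh random (port, ID) pair for a query and remember it."""
        while True:
            key = (new_source_port(), new_query_id())
            if key not in self._pending:
                self._pending[key] = qname
                return key

    def accept(self, dest_port, response_id):
        """Return the query name if the response matches an in-flight query, else None."""
        return self._pending.pop((dest_port, response_id), None)
```

A response is only accepted when both values match, and each pair can be used once. The cost mentioned above comes from the extra sockets and bookkeeping this requires on every query.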
Discovery of Designated Resolvers
It’s one thing to support DNS over TLS and DNS over HTTPS both inbound and outbound, but it really does not help if the client does not know that you do, or if the configuration does not tell dnsdist that the backend does.
The IETF has been working for quite some time now on a new mechanism that leverages the SVCB record type to actually advertise that a secure, encrypted endpoint is available for use: Discovery of Designated Resolvers (DDR).
Since 1.7.0 dnsdist has been able to advertise DoT and DoH support to the client via SVCB records, but that requires writing a few lines of Lua to configure it. In 1.8.0, we have integrated that process into the OpenWrt configuration, requiring a single click to enable DDR advertisement to all the local clients, allowing Android and iOS devices to automatically upgrade to a secure channel.
We also taught dnsdist how to use DDR to detect whether a given backend can be upgraded from plain Do53 to DoT and DoH, so that we switch to a secure channel as soon as it becomes available, and fall back to Do53 if needed.
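For readers curious about what a DDR lookup looks like on the wire, here is a small Python sketch that builds the discovery query: an SVCB question (record type 64) for the special name _dns.resolver.arpa defined by the DDR specification. The function names are ours, and the sketch only builds the packet; sending it and parsing the SVCB answer are left out.

```python
import secrets
import struct

SVCB = 64                        # SVCB record type
CLASS_IN = 1                     # Internet class
DDR_NAME = "_dns.resolver.arpa"  # special-use name queried for discovery


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending with a zero byte."""
    wire = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError("invalid label length")
        wire += bytes([len(raw)]) + raw
    return wire + b"\x00"


def build_ddr_query(query_id=None):
    """Build a DNS query asking the resolver for its designated encrypted endpoints."""
    if query_id is None:
        query_id = secrets.randbelow(1 << 16)
    # Header: ID, flags (recursion desired), one question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    return header + encode_name(DDR_NAME) + struct.pack("!HH", SVCB, CLASS_IN)
```

The SVCB records in the answer describe the protocols and ports of the encrypted endpoints, which is the information needed to decide whether an upgrade to DoT or DoH is possible.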
To be able to keep pushing for broader adoption of DoT and DoH, it is crucial to reduce the overhead of the encryption compared to plain old Do53. To do so, we have added support for:
- hardware-accelerated TLS via OpenSSL engines, like Intel QuickAssist Technology (QAT)
- Linux’s kernel TLS acceleration, eliminating the need to copy data into user space and back for TLS operations
These technologies are still evolving quickly, and for now are marked as experimental in dnsdist, but they yield very promising results.
The ability to act on a Server Failure, Refused, or any other specific type of response to trigger a second DNS lookup is a feature request that regularly came up. It was not easy to implement given the existing design of dnsdist, but we refactored a fair amount of code in this release to be able to process queries and responses in an asynchronous way, paving the way for external lookups without blocking dnsdist and degrading performance.
This refactoring allowed us to finally implement that second-chance lookup, so that a query can be re-sent to a different pool of servers if the obtained response is not good enough.
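The logic of a second-chance lookup can be sketched in a few lines of Python. This is an illustration of the idea rather than dnsdist's code or configuration: the function name and the way backends are modeled (as coroutines returning a response code and an answer) are assumptions for the example.

```python
import asyncio

NOERROR, SERVFAIL, REFUSED = 0, 2, 5  # standard DNS response codes


async def resolve_with_second_chance(query, primary, fallback,
                                     retry_on=(SERVFAIL, REFUSED)):
    """Ask the primary pool first; re-send the query to the fallback pool
    when the response code is one we consider not good enough."""
    rcode, answer = await primary(query)
    if rcode in retry_on:
        rcode, answer = await fallback(query)
    return rcode, answer


# Two stand-in pools, for demonstration purposes only.
async def failing_pool(query):
    return SERVFAIL, None


async def working_pool(query):
    return NOERROR, "192.0.2.1"
```

Because both lookups are awaited rather than blocking, other queries can keep being processed while the second attempt is in flight, which is what the asynchronous refactoring makes possible.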
It is now possible to define custom counters and gauges that can be manipulated via the Lua API and are exported via the API and Prometheus, like built-in metrics.
New compilation options, Link-Time Optimizations
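As a reminder of the difference between the two metric types, and of what the Prometheus text format looks like, here is a small Python sketch. The registry class and the metric names are hypothetical and do not reflect dnsdist's Lua API.

```python
class MetricsRegistry:
    """Hypothetical registry: counters only go up, gauges can be set freely."""

    def __init__(self):
        self._metrics = {}  # name -> [kind, value]

    def declare(self, name, kind):
        if kind not in ("counter", "gauge"):
            raise ValueError("kind must be 'counter' or 'gauge'")
        self._metrics[name] = [kind, 0]

    def inc(self, name, step=1):
        kind, _ = self._metrics[name]
        if kind == "counter" and step < 0:
            raise ValueError("a counter cannot decrease")
        self._metrics[name][1] += step

    def set(self, name, value):
        kind, _ = self._metrics[name]
        if kind != "gauge":
            raise TypeError("only gauges can be set to an arbitrary value")
        self._metrics[name][1] = value

    def render(self):
        """Render all metrics in the Prometheus text exposition format."""
        lines = []
        for name, (kind, value) in sorted(self._metrics.items()):
            lines.append(f"# TYPE {name} {kind}")
            lines.append(f"{name} {value}")
        return "\n".join(lines) + "\n"
```

A counter suits things like the number of queries matched by a rule, while a gauge suits values that go up and down, such as the current size of a table.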
We introduced several new compile-time options:
- Link-Time Optimizations (LTO): GCC, clang and the associated linkers now support a new mode of building a binary, where information about all the individual components, called compilation units, is made available to the linker so that it can make better optimization decisions. We have now enabled these optimizations in our own packages, via the --enable-lto option.
- For a long time, we have been automatically detecting if the compiler has support for the FORTIFY_SOURCE=2 hardening option, enabling it whenever possible. More recently, GCC and clang added support for a stronger version of that option, FORTIFY_SOURCE=3. This stronger version can be enabled by passing either --enable-fortify-source=3 or --enable-fortify-source=auto to our configure script, with the latter always selecting the best supported version. We have enabled the stronger version in our test suites, but not yet in our production builds, as we are not yet sure of the actual impact.
- C++, as opposed to other languages, does not initialize its variables by default. This has led to a fair number of security issues in the past, ranging from information disclosure to the ability to execute arbitrary code. We now have a new option, --enable-auto-var-init=zero, that can be used to zero-initialize all variables that are allocated on the stack. We have not yet enabled this option in our production builds; in our test suites we have instead enabled a variant that increases the likelihood of detecting bugs by initializing the variables with specific patterns: --enable-auto-var-init=pattern.
Users that can trade a bit of performance for stronger security guarantees are invited to enable both --enable-fortify-source=auto and --enable-auto-var-init=zero.
And many other improvements
- A lot of new features are now accessible via Lua: helpers to interface with the system network configuration, to get the MAC address of a client, and to inspect and edit queries and responses
- The scalability of MaxQPSIPRule has been improved on multi-core setups
- The handling of multiple Carbon servers was lacking, allowing a misbehaving Carbon server to impact other servers: this has now been fixed
- We introduced a new chain of rules, triggered after cache insertion
- Our eBPF and XDP code has been greatly improved by Pierre Grié, Y7n05h and Yogesh Singh
In the second part of 2022 we commissioned a security audit from the Nixu team, to have a close look at the new features we introduced in 1.8.0 in particular (DDR, DNS over HTTPS, OpenWrt integration). This is the second audit of the dnsdist code-base performed by Nixu, and they were able to quickly focus on the new features. They went above and beyond what we expected, as they did last time, and found a potential issue in the way our ACL interacted with the OpenWrt system, in our not yet released UCI integration. In short, we were relying a bit too much on the OpenWrt firewall, which might have opened access to the Do53, DoT and DoH ports from unintended network interfaces in some deployment scenarios where the firewall was not effective. We fixed that by being more restrictive in our default ACL.
We are immensely grateful to the PowerDNS community for reporting bugs and issues, for the feature requests, and especially to those who submitted fixes and implementations of features.