dnsdist 1.3.2 released

We are very happy to announce the 1.3.2 release of dnsdist. This release contains a few new features, but is mostly fixing bugs and documentation issues reported since the release of dnsdist 1.3.0. You might be wondering why this release is not numbered 1.3.1: we discovered a build issue on some platforms right after tagging 1.3.1 and therefore decided to release 1.3.2 right away.

Breaking changes

After discussing with several users, we noticed that quite a lot of them were not aware that enabling dnsdist’s console without a key, even restricted to the local host, could be a security issue: it allows privilege escalation, because an unprivileged user can connect to the console and execute Lua code as the dnsdist user. We therefore decided to refuse any connection to the console until a key has been set, so please check that you do set a key before upgrading if you use the console.

New features

The DNS over TLS feature introduced in 1.3.0 was missing the ability to support both an RSA and an ECDSA certificate at the same time, and it was not possible to switch to a new certificate without restarting dnsdist. This has now been fixed.

The packet cache has also been improved in this release, with the addition of a negative TTL option to be able to specify how long NODATA and NXDOMAIN answers should be cached, as well as a way to dump the content of the cache. We also made the detection of ECS collisions more robust, preventing two queries for the same name, type and class but a different ECS subnet from colliding even if they did hash to the same value.
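The two cache changes above can be illustrated with a small sketch. This is not dnsdist's code (dnsdist is written in C++); the class, method, and parameter names are hypothetical. It assumes the negative TTL acts as a cap on how long NXDOMAIN and NODATA answers are kept. Collisions are detected by storing the full query key next to each entry and comparing it on lookup.

```python
import time
from dataclasses import dataclass

NXDOMAIN = 3  # DNS RCODE meaning "name does not exist"


@dataclass
class Entry:
    key: tuple         # full (qname, qtype, qclass, ecs) key, kept to detect hash collisions
    response: bytes
    expires_at: float


class PacketCache:
    """Toy packet cache; all names here are hypothetical, not dnsdist's API."""

    def __init__(self, negative_ttl=3600, hash_fn=hash):
        self.negative_ttl = negative_ttl
        self.hash_fn = hash_fn
        self.entries = {}

    def insert(self, qname, qtype, qclass, ecs, response, ttl,
               rcode=0, answer_count=1, now=None):
        now = time.time() if now is None else now
        # NXDOMAIN, or NOERROR with an empty answer section (NODATA), is negative.
        negative = rcode == NXDOMAIN or (rcode == 0 and answer_count == 0)
        if negative:
            ttl = min(ttl, self.negative_ttl)  # assumption: the option is a cap
        key = (qname.lower(), qtype, qclass, ecs)
        self.entries[self.hash_fn(key)] = Entry(key, response, now + ttl)

    def lookup(self, qname, qtype, qclass, ecs, now=None):
        now = time.time() if now is None else now
        key = (qname.lower(), qtype, qclass, ecs)
        entry = self.entries.get(self.hash_fn(key))
        if entry is None or entry.expires_at <= now:
            return None
        if entry.key != key:
            # Same hash but a different query (for example another ECS subnet):
            # treat it as a miss instead of returning the wrong answer.
            return None
        return entry.response

    def dump(self):
        """Return (key, expiry) pairs for every cached entry."""
        return [(e.key, e.expires_at) for e in self.entries.values()]
```

Passing a deliberately bad `hash_fn` (one that always returns the same value) is an easy way to check that two queries differing only in their ECS subnet never share an answer.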

This version gained the ability to insert dynamic rules that do nothing, and do not stop the processing of subsequent rules, which is very useful for testing purposes. The optimized DynblockRulesGroup introduced in 1.3.0 also gained the ability to whitelist and blacklist ranges from dynamic rules, for example to prevent some clients from ever being blocked by a rate-limiting rule.
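As a rough illustration of how do-nothing rules and excluded ranges might combine, here is a sketch in Python. It is not dnsdist's implementation, and every name in it is hypothetical. It assumes that the most specific matching range decides whether a client is exempt, and that a "noop" decision is recorded without affecting the query.

```python
import ipaddress


class DynBlockGroup:
    """Toy rate-limit decision maker; names are hypothetical, not dnsdist's API."""

    def __init__(self, max_qps, dry_run=False):
        self.max_qps = max_qps
        self.dry_run = dry_run  # when True, offenders get a do-nothing rule
        self.excluded = []      # ranges that must never be blocked
        self.included = []      # more specific ranges that may be blocked again

    def exclude_range(self, cidr):
        self.excluded.append(ipaddress.ip_network(cidr))

    def include_range(self, cidr):
        self.included.append(ipaddress.ip_network(cidr))

    def _is_exempt(self, addr):
        ip = ipaddress.ip_address(addr)
        best = None  # (prefix length, exempt?) of the most specific match so far
        for nets, exempt in ((self.excluded, True), (self.included, False)):
            for net in nets:
                if ip.version == net.version and ip in net:
                    if best is None or net.prefixlen > best[0]:
                        best = (net.prefixlen, exempt)
        return best is not None and best[1]

    def decide(self, qps_by_client):
        """Map each client address to 'allow', 'noop' or 'block'."""
        decisions = {}
        for addr, qps in qps_by_client.items():
            if qps <= self.max_qps or self._is_exempt(addr):
                decisions[addr] = "allow"
            elif self.dry_run:
                decisions[addr] = "noop"
            else:
                decisions[addr] = "block"
        return decisions
```

Running a new threshold with `dry_run=True` first shows which clients would have been blocked before any traffic is actually dropped.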

Finally, we introduced the new SetECSAction directive to be able to force the ECS value sent to a downstream server for some or all queries.

Bug fixes

In addition to various documentation and cosmetics fixes, a few annoying bugs have been fixed in this release:

  • If the first connection attempt to a given backend failed, dnsdist didn’t properly reconnect even when the backend became available;
  • Dynamic blocks were sometimes created with the wrong duration;
  • The ability to iterate over the results of the Lua exceed*() functions was broken in 1.3.0, preventing manual whitelisting from Lua;
  • Some statistics were displayed with too many decimals in the web interface;
  • A backend outstanding queries counter could become wrong if it dropped a lot of queries for a while.

Please see the dnsdist website for the complete changelog and the current documentation.

Release tarballs are available on the downloads website.

Several packages are also available on our repository.

PowerDNS Authoritative Server 4.1.3 Released

We’re pleased to announce the availability of the PowerDNS Authoritative Server version 4.1.3. This is a maintenance release that addresses a performance issue in the GeoIP backend and fixes several other issues.

The changelog is below; the full changelog can be found in the documentation.

Improvements

  • #6239, #6559: pdnsutil: use new domain in b2bmigrate (Aki Tuomi)
  • #6130: Update copyright years to 2018 (Matt Nordhoff)
  • #6312, #6545: Lower ‘packet too short’ loglevel

Bug Fixes

  • #6441, #6614: Restrict creation of OPT and TSIG RRsets
  • #6228, #6370: Fix handling of user-defined axfr filters return values
  • #6584, #6585, #6608: Prevent the GeoIP backend from copying NetMaskTrees around, fixes slow-downs in certain configurations (Aki Tuomi)
  • #6654, #6659: Ensure alias answers over TCP have correct name

The tarball is on the downloads website (sig), packages for CentOS 6 and 7, Ubuntu Trusty, Xenial, Artful and Bionic, Debian Jessie and Stretch and Raspbian Jessie are available from the repositories.

PowerDNS Recursor 4.1.3 Released

This release improves the stability and resiliency of the RPZ implementation, prevents metrics gathering from slowing down the processing of DNS queries and fixes an issue related to the cleaning of EDNS Client Subnet entries from the cache.

The full changelog looks like this:

Improvements

  • #6550, #6562: Add a subtree option to the API cache flush endpoint.
  • #6566: Use a separate, non-blocking pipe to distribute queries.
  • #6567: Move carbon/webserver/control/stats handling to a separate thread.
  • #6583: Add _raw versions for QName / ComboAddresses to the FFI API.
  • #6611, #6130: Update copyright years to 2018 (Matt Nordhoff).
  • #6474, #6596, #6478: Fix a warning on botan >= 2.5.0.

Bug Fixes

  • #6313: Count a lookup into an internal auth zone as a cache miss.
  • #6467: Don’t increase the DNSSEC validations counters when running with process-no-validate.
  • #6469: Respect the AXFR timeout while connecting to the RPZ server.
  • #6418, #6179: Increase MTasker stacksize to avoid crash in exception unwinding (Chris Hofstaedtler).
  • #6419, #6086: Use the SyncRes time in our unit tests when checking cache validity (Chris Hofstaedtler).
  • #6514, #6630: Add -rdynamic to C{,XX}FLAGS when we build with LuaJIT.
  • #6588, #6237: Delay the loading of RPZ zones until the parsing is done, fixing a race condition.
  • #6595, #6542, #6516, #6358, #6517: Reorder includes to avoid boost L conflict.

The tarball is available on downloads.powerdns.com (signature) and packages for CentOS 6 and 7, Debian Jessie and Stretch, Ubuntu Artful, Bionic, Trusty and Xenial are available from repo.powerdns.com.

Please send us all feedback and issues you might have via the mailing list, or in case of a bug, via GitHub.

Authoritative server 4.1.2 released

This is the third release in the 4.1 train. Besides bug fixes, it contains some performance and usability improvements.

Please find the most important changes below. For full details, visit the changelog.

Improvements

  • API: increase serial after dnssec related updates
  • Dnsreplay: bail out on a too small outgoing buffer
  • lower ‘packet too short’ loglevel
  • Make check-zone error on rows that have content but shouldn’t
  • avoid an insane amount of new backend connections during an axfr
  • Report unparseable data in stoul invalid_argument exception
  • recheck serial when axfr is done
  • add tcp support for alias

Bug Fixes

  • allocate new statements after reconnecting to postgresql
  • bindbackend: only compare ips in ismaster() (Kees Monshouwer)
  • Rather than crash, sheepishly report no file/linenum
  • Document undocumented config vars
  • prevent cname + other data with dnsupdate

Tarball (sig) is available on the downloads website. Packages for Debian, CentOS and Ubuntu are uploaded to our repositories.

dnsdist 1.3.0 released

We are very happy to announce the 1.3.0 release of dnsdist, with a huge emphasis on privacy and scalability.

Privacy

A lot of users were interested in DNS over TLS support in dnsdist, to protect the privacy and integrity of queries and responses in transit between the client and dnsdist. We have been supporting DNSCrypt since 1.0.0, and improved it in this release by adding support for multiple active certificates and the new xchacha20 algorithm, but DNS over TLS is getting more traction and it made complete sense to support it as well in dnsdist. Our implementation can use either OpenSSL or GnuTLS, and we advise enabling both backends during compilation in order to be able to quickly switch from one to the other should a serious vulnerability be found in one of them.

Scalability

As dnsdist is deployed on huge setups, we noticed that it did not scale as well as we expected over a large number of CPU cores. We investigated and found several points of contention, which we addressed by going lock-less whenever possible, and by reducing the granularity of the involved locks when it was not. This led to the optional sharding of the packet cache and our in-memory ring buffers, as well as a new per-pool mutex replacing the global Lua one for non-Lua load-balancing policies.
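The idea behind sharding can be sketched as follows. This is a generic illustration in Python, not dnsdist's C++ implementation, and the class name is hypothetical. Instead of one lock guarding the whole cache, each shard has its own lock, so threads working on different shards do not wait for each other.

```python
import threading


class ShardedCache:
    """Dictionary split into independently locked shards (hypothetical name)."""

    def __init__(self, shards=8):
        # One (dict, lock) pair per shard.
        self._shards = [({}, threading.Lock()) for _ in range(shards)]

    def _shard(self, key):
        # The key's hash selects the shard, so a given key always lands
        # in the same place.
        return self._shards[hash(key) % len(self._shards)]

    def put(self, key, value):
        data, lock = self._shard(key)
        with lock:  # only this shard is locked, not the whole cache
            data[key] = value

    def get(self, key, default=None):
        data, lock = self._shard(key)
        with lock:
            return data.get(key, default)

    def __len__(self):
        total = 0
        for data, lock in self._shards:
            with lock:
                total += len(data)
        return total
```

With a single shard this degenerates into the classic one-lock cache, which makes it easy to compare both setups under load.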

We had known for a while that dnsdist opening a single socket towards each backend was not performing too well in some scenarios, for example in front of a PowerDNS Recursor with multiple threads, reuseport support enabled and pdns-distribute-queries set to no, because the kernel would then not distribute queries evenly over the different threads. A known work-around was to add the same backend several times in the configuration, but it made metrics hard to understand and caused an unnecessary amount of context switching. Starting with 1.3.0, dnsdist supports opening a configurable number of sockets towards a single backend.
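To make the mechanism concrete, here is a small Python sketch of a backend object that opens several UDP sockets towards the same address and uses them in turn. It is only an illustration with hypothetical names, not how dnsdist is implemented. It relies on the fact that each connected UDP socket gets its own source port, which gives the receiving kernel different flows to spread over its reuseport listeners.

```python
import itertools
import socket


class Backend:
    """Several UDP sockets towards one backend (names are hypothetical)."""

    def __init__(self, address, port, sockets=1):
        self.socks = []
        for _ in range(sockets):
            s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            # connect() on UDP sends nothing; it fixes the peer and makes the
            # kernel assign a distinct local source port to this socket.
            s.connect((address, port))
            self.socks.append(s)
        self._next = itertools.cycle(self.socks)

    def pick_socket(self):
        """Return the next socket in round-robin order."""
        return next(self._next)

    def close(self):
        for s in self.socks:
            s.close()
```

Because all sockets belong to one backend object, counters can still be kept per backend, which avoids the confusing metrics of the old work-around.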

Finally, we observed that CPU pinning made a huge difference on some setups, especially on NUMA architectures, so we added the possibility to pin client and backend facing threads to specific CPU cores.

XPF

The solution to pass the client IP on to the backend in dnsdist has always been to add an EDNS Client Subnet option to the query. While it does work nicely, ECS was not designed for this use case and thus lacks some relevant information like the original source and destination ports, as well as the original destination IP. It also makes it impossible to keep any existing ECS information and forward the original source IP.
In coordination with the nice people from ISC, PowerDNS is working on a new solution called XPF, whose current draft is now implemented in dnsdist.

dnstap

In addition to our existing protocol buffer-based solution to export live information on queries and responses processed by dnsdist, Justin Valentini and Chris Hofstaedtler contributed support for exporting queries and responses over the dnstap protocol, which is supported by several other open source DNS servers and can be processed by third party tools.

Older versions

With the release of 1.3.0 today, we are also announcing that the 1.0 and 1.1 branches of dnsdist are now end of life and will not receive any updates, not even security fixes.
Note: Users with a commercial agreement with PowerDNS.COM BV or Open-Xchange can receive extended support for releases which are End Of Life. If you are such a user, these EOL statements do not apply to you.

Other Changes

As a final note, please be aware of three noteworthy changes in this new version:

  • First, we removed the --daemon option, in which we kept finding new bugs. Very few users were actually using it, and since most operating systems provide at least one supervisor we decided to simply remove it;
  • Secondly, we added the possibility to restrict access to the console using an ACL when it’s bound to a non-loopback IP. The default ACL allows connections from 127.0.0.1 and ::1 only, so you might need to update it to keep using the console over the network. Please make sure that you have enabled encryption before doing so;
  • Finally, we removed some functions that were deprecated in 1.2.0 because they were redundant and made it harder to understand how the rules and actions actually work. Please have a look at the documentation to update your configuration.

Please see the dnsdist website for the complete changelog and the current documentation.

Release tarballs are available on the downloads website.

Several packages are also available on our repository.

PowerDNS Recursor 4.1.2 Released

This release improves the stability and resiliency of the RPZ implementation and fixes several issues related to EDNS Client Subnet.

The full changelog looks like this:

New Features

  • #6344: Add FFI version of gettag().

Improvements

  • #6298, #6303, #6268, #6290: Add the option to set the AXFR timeout for RPZs.
  • #6172: IXFR: correct behavior of dealing with DNS Name with multiple records and speed up IXFR transaction (Leon Xu).
  • #6379: Add RPZ statistics endpoint to the API.

Bug Fixes

  • #6336, #6293, #6237: Retry loading RPZ zones from server when they fail initially.
  • #6300: Fix ECS-based cache entry refresh code.
  • #6320: Fix ECS-specific NS AAAA not being returned from the cache.

The tarball is available on downloads.powerdns.com (signature) and packages for CentOS 6 and 7, Debian Jessie and Stretch, Ubuntu Artful, Trusty and Xenial are available from repo.powerdns.com.

Please send us all feedback and issues you might have via the mailing list, or in case of a bug, via GitHub.

“The DNS Camel”, or, the rise in DNS complexity

This week was my first IETF visit. Although I’ve been active in several IETF WGs for nearly twenty years, I had never bothered to show up in person. I now realize this was a very big mistake – I thoroughly enjoyed meeting an extremely high concentration of capable and committed people. While RIPE, various NOG/NOFs and DNS-OARC are great venues as well, nothing is quite the circus of activity that an IETF meeting is. Much recommended!

Before visiting I read up on recent DNS standardization activity, and I noted a ton of stuff was going on. In our development work, I had also been noticing that many of the new DNS features interact in unexpected ways. In fact, there appears to be somewhat of a combinatorial explosion going on in terms of complexity.

As an example, DNAME and DNSSEC are separate features, but it turns out DNAME can only work with DNSSEC with special handling. And every time a new outgoing feature is introduced, like for example DNS cookies, new probing is required to detect authoritative servers that get confused by such newfangled stuff.

This led me to propose a last minute talk (video!) to the DNSOP Working Group, which I tentatively called “The DNS Camel, or, how many features can we add to this protocol before it breaks”. This ended up on the agenda as “The DNS Camel” (with no further explanation) which intrigued everyone greatly. I want to thank DNSOP chairs Suzanne and Tim for accommodating my talk which was submitted at the last moment!

Note: My “DNS is too big” story is far from original! Earlier work includes “DNS Complexity” by Paul Vixie in the ACM Queue and RFC 8324 “DNS Privacy, Authorization, Special Uses, Encoding, Characters, Matching, and Root Structure: Time for Another Look” by John Klensin. Randy Bush presented on this subject in 2000 and even has a slide describing DNS as a camel!

Based on a wonderful chart compiled by ISC, I found that DNS is now described by at least 185 RFCs. Some shell-scripting and HTML scraping later, I found that this adds up to 2781 printed pages, comfortably more than two copies of “The C++ Programming Language (4th edition)”. This book is not known for its brevity.
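The arithmetic behind that comparison can be checked in a few lines. The RFC and page counts come from the text above; the page count of the book is an assumption (1376 pages is the figure commonly listed for the 4th edition) and is marked as such in the code.

```python
rfc_count = 185       # from the text above
total_pages = 2781    # from the text above
book_pages = 1376     # ASSUMPTION: commonly listed page count of the 4th edition

avg_pages_per_rfc = total_pages / rfc_count
book_copies = total_pages / book_pages

print(f"average RFC length: {avg_pages_per_rfc:.1f} pages")
print(f"equivalent copies of the book: {book_copies:.2f}")
```

Under that assumption the total is indeed just over two copies, and the average DNS RFC runs to roughly fifteen pages.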

Artist impression of DNS complexity over time

In graph form, I summarised the rise of DNS complexity as above. My claim is that this rise is not innocent. As DNS becomes more complex, the number of people that “get it” also goes down. Notably, the advent of DNSSEC caused a number of implementations to drop out (MaraDNS, MyDNS, for example).

Also, with the rise in complexity and the decrease in number of capable contributors, the inevitable result is a drop in quality:

Orange = number of people that “get it”. Green is perceived implementation quality. Also lists work in the pipeline.

And in fact, with the advent of DNSSEC this is what we found. For several years, security & stability bugs in popular nameserver implementations were absolutely dominated by DNSSEC and cryptography related issues.

My claim is that we are heading for that territory again.

So how did this happen? We all love DNS and we don’t want to see it harmed in any way. Traditionally, protocol or product evolution is guided by forces pulling and pushing on it.

Actual number of RFC pages over time. Grows at around 2 pages/week. Shutdown of DNSEXT is barely visible

Requirements from operators ‘pull’ DNS in the direction of greater complexity. Implementors meanwhile usually push back on such changes because they fear future bugs, and because they usually have enough to do already. Operators, additionally, are wary of complexity: they are the ones on call 24/7 to fix problems. They don’t want their 3AM remedial work to be any harder than it has to be.

Finally, the standardization community may also find things that need fixing. Standardizers work hard to make the internet better (the new IETF motto I think), and they find lots of things that could be improved – either practically or theoretically.

In the DNS world, we have the unique situation that (resolver) operator feedback is largely absent. Only a few operators manifest themselves in the standardization community (Cloudflare, Comcast, Google, Salesforce being notably present). Specifically, almost no resolver operator (access provider) ever speaks at WG meetings or writes on mailing lists. In reality, large scale resolver operators are exceptionally wary of new DNS features and turn off whatever features they can to preserve their night time rest.

On the developer front, the DNS world is truly blessed with some of the most gifted programmers in the world. The current crop of resolvers and authoritative servers is truly excellent. DNS may well be the best served protocol in existence today. This high level of skill also has a downside however. DNS developers frequently see immense complexity not as a problem but as a welcome challenge to be overcome. We say yes to things we should say no to. Less gifted developer communities would have to say no automatically since they simply would not be able to implement all that new stuff. We do not have this problem. We’re also too proud to say we find something (too) hard.

Finally, the standardization community has its own issues. A ‘show of hands’ made it clear that almost no one in the WG session was actually on call for DNS issues. Standardizers enjoy complexity but do not personally bear the costs of that complexity. Standardizers are not on 24/7 call as there rarely is a need for an emergency 3AM standardization session!

Notably, a few years ago I was informed by RFC authors that ‘NSEC3’ was easy. We in the implementation community meanwhile were pondering that the ‘3’ in NSEC3 probably stood for the number of people that understood this RRTYPE! I can also report that as of 2018, the major DNSSEC validator implementations still encounter NSEC3 corner cases where it is not clear what the intended behaviour is.

Note that our standardizers, like our developers, are extremely smart people. This however is again a mixed blessing – this talent creates at the very least an acceptance of complexity and a desire to conquer really hard problems, possibly in very clever ways.

The net result of the various forces on DNS not being checked is obvious: more and more complex features.

Orthogonality of features

As noted above, adding a lot of features can lead to a combinatorial explosion. DNSSEC has to know about DNAME. CZNic contributed the following gem, discovered during their implementation of ‘aggressive NSEC for NXDOMAIN detection’: it collides with trust-anchor signalling. The TA signalling happens in the form of a query to the root that leads to an NXDOMAIN, with associated NSEC records. These NSEC records then shut up further TA signalling, as no TA related names apparently exist! And here two unrelated features now need to know about each other: aggressive NSEC needs to be disabled for TA signalling.

If even a limited number of features overlap (i.e., are not fully orthogonal), soon the whole codebase consists of features interacting with each other.
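To get a feeling for how quickly this grows, one can count the feature pairs that could, in the worst case, need to know about each other. The sketch below does that in Python; the feature list is just a handful of names taken from this post, and the result is an upper bound on pairwise interactions, not a claim that every pair actually interacts.

```python
from itertools import combinations
from math import comb


def interaction_pairs(features):
    """Every unordered pair of features that might need special handling."""
    return list(combinations(features, 2))


features = ["DNSSEC", "DNAME", "aggressive NSEC", "TA signalling", "DNS cookies"]
pairs = interaction_pairs(features)

# n features give n * (n - 1) / 2 potential pairs.
print(len(pairs), "potential pairs for", len(features), "features")

# Treating each of the 185 RFCs as one feature, purely for scale:
print(comb(185, 2), "potential pairs for 185 features")
```

Both interactions mentioned above, DNSSEC with DNAME and aggressive NSEC with TA signalling, show up as entries in `pairs`, and the count grows quadratically with every feature added.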

We’re well on our way there, and this will lead to a reduction in quality, likely followed by a period of stasis where NO innovation is allowed anymore. And this would be bad. DNS is still not private and there is a lot of work to do.

Suggestions

I rounded off my talk with a few simple suggestions:

[Slide: suggestions]

A 20-person queue quickly formed at the mic. It turns out that while I may have correctly diagnosed a problem, and while there is wide agreement that we are digging a hole for ourselves, I had not given sufficient thought to any solutions.

IETF GROW WG chair Job Snijders noted that the BGP-related WGs have implemented different constituencies (vendors, operators) that all have to agree. In addition, interoperable implementations are a requirement before a draft can progress to standard. This alone would cut back significantly on the flow of new standards.

Other speakers with experience in hardware and commercial software noted that in their world the commercial vendors provided ample feedback to not make life too difficult, or that such complexity would at least come at huge monetary cost. Since in open source features are free, we do not “benefit” from that feedback.

There was enthusiasm for the idea of going through the “200 DNS RFCs” and deprecating stuff we no longer thought was a good idea. This enthusiasm was more in theory than in practice though, as it is known to be soul-crushing work.

The concept however of reducing at least the growth in DNS complexity was very well received. And in fact, in subsequent days, there was frequent discussion about the “DNS Camel”:

[Image: camel]

And in fact, a draft has even been written that simplifies DNS by specifying DNS implementations no longer need to probe for EDNS0 support. The name of the draft? draft-spacek-edns-camel-diet-00!

I’m somewhat frightened of the amount of attention my presentation got, but happy to conclude it apparently struck a nerve that needed to be struck.

Next steps

So what are the next steps? There is a lot to ponder.

I’ve been urged by several very persuasive people to not only rant about the problem but to also contribute to the solution, and I’ve decided these people are right. So please watch this space!