PowerDNS needs your help: What are we missing?

Hi everybody,

As we’re working on PowerDNS 4.x, we are wondering: what are we missing?

The somewhat longer story is that as a software developer, a sort of feature-blindness creeps up on you. We try to make the software better, faster etc, but by focusing so much on the technology, one can lose sight of the actual use cases.

In this way it is possible that a software vendor neglects to implement something, even though many users desperately want it. If so, please speak up! The short version: please mail powerdns.ideas at powerdns.com your ideas!

As concrete examples, PowerDNS took its time to add an API, and once we had it, people immediately started using it, even before we had documented the API. Similarly, for many years, we did not deliver a proper graphing solution, and now that it is there it is highly popular.

But what more are we missing? Should we expand into IPAM and do DHCP and IP address management? Should we make an out of the box NAT64/DNS64 solution? Do we need to improve replication beyond “database native” and “AXFR-based” (so ‘super-duper-slave’)?

Should we start doing versioned databases so people can roll back changes? IXFR?

Should we add a built-in DNS based load balancer where we poll if your IP addresses are up?

Or would it be wise to move on beyond the geographical versatile backends, and simply add ‘US’ and ‘Europe’, ‘Oceania’, ‘Asia’ IP address profiles?

Should the recursor gain cache sharing abilities? Or pre-fetching? Or even TTL-faking in case auths are down?

The list above is just to prime your imagination: if you have any ideas on what you are missing, please reach out to powerdns.ideas at powerdns.com, or use our contact form.

Thanks!

PowerDNS 2.x End of Life Statement

21st of May 2015

PowerDNS Authoritative Server 2.9.22 was released more than 6 years ago, in January 2009. Because of its immense and durable popularity, some patch releases have been provided, the last one of which (2.9.22.6) was made available over three years ago in January 2012.

The 2.9.22.x series contains a number of probable and actual violations of the DNS standards. In addition, some behaviours of 2.9.22.x are standards conforming but cause interoperability problems in 2015. Finally, 2.9.22.4 and earlier are impacted by PowerDNS Security Advisory 2012-01, which means PowerDNS can be used in a Denial of Service attack.

Although we have long been telling users that we can no longer support the use of 2.x, and urging upgrades to 3.x, with this statement we formally declare 2.x end of life.

This means that any 2.x issues will not be addressed. This has been the case for a long time, but with this statement we make it formal.

To upgrade to 3.x, please consult the instructions on how to upgrade the database. If you need help with upgrading, we provide migration services to our supported users. If you are currently running 2.9.22 and need help to tide you over, we can also provide that as part of a support agreement.

But we urge everyone to move on to PowerDNS Authoritative Server 3.4 or later – it is a faster, more standards conforming and more powerful nameserver!

DNS-OARC Spring Workshop 2015

This weekend, PowerDNS attended the DNS-OARC Spring Workshop 2015 in force, with 100% attendance. I shamefully have to admit this is the first time I’ve gone to an OARC workshop, but I was well rewarded. Both the speakers and audience were stellar. Full video is available in four parts: Saturday morning, Saturday afternoon, Sunday morning, Sunday afternoon.

UPDATE: Geoff Huston also did a writeup with more and different details from mine. Very much worth your time!

In this post, I briefly want to summarise the big themes of the meeting. But I want to start off with describing the audience. We had people running the biggest (cc)TLDs on the planet, we had authors from all the big name servers. There were the people that run the root and plan the DNSSEC key transitions. The largest access providers on the planet. Folks instrumenting the whole internet to get statistics on DNS(SEC) performance. For fun. The people that protect our websites from denial of service attacks. In short, everyone was there. This workshop was easily the best DNS event I have ever attended.

The biggest theme of the meeting was the flood of ongoing reflection attacks. In short, bad people send random questions to open resolvers on the internet. These in turn often forward their queries to powerful recursive name servers over at providers. And these reach out frequently and insistently to numerous “authoritative servers” to find answers to these questions.

However, the actual goal is not to get answers to questions. The actual goal is to perform a powerful denial of service attack on these “authoritative servers”, which frequently don’t even run DNS. But they do get bombarded with DNS traffic from all over the world and go offline.

These attacks have been going on for the past year or so, and are the biggest thing in DNS for a long time. All large resolver implementations have had to implement changes to protect themselves from the attacks, and to attempt to no longer take part in this malicious traffic. This has not been easy.

At the workshop, we had presentations describing how BIND and Nominum name servers implement their protection strategies, with ISC implementing various tuneable knobs that attempt to detect unresponsive servers, and Nominum doing (among other things) “threat lists” of domains currently known to be involved in attacks.

In addition, Kazunori FUJIWARA of JPRS presented how NSEC records from DNSSEC could be used to silence such attacks – an NSEC denial of existence range can be used to block many random queries, as long as we have a denying range for them. There was some discussion if this would work for NSEC3 too, and OARC attendees are now pondering that question.
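
To make the NSEC idea concrete, here is a minimal sketch of the range check a resolver could do against a cached NSEC record. It assumes the record has already been DNSSEC-validated and that the query name lies inside the signed zone. The function names are made up for illustration, and it ignores wildcards and delegations, which a real implementation has to handle.

```python
def canonical_key(name: str) -> list[bytes]:
    """Sort key for DNSSEC canonical ordering: lowercase labels, rightmost first."""
    labels = [l for l in name.rstrip(".").lower().split(".") if l]
    return [l.encode("ascii") for l in reversed(labels)]


def covered_by_nsec(qname: str, owner: str, next_name: str) -> bool:
    """True if qname sorts strictly between the NSEC owner and its next name."""
    q = canonical_key(qname)
    lo = canonical_key(owner)
    hi = canonical_key(next_name)
    if lo < hi:
        return lo < q < hi
    # The last NSEC in a zone points back to the apex, so the range wraps.
    return q > lo
```

With a cached range from alpha.example.com to delta.example.com, a random query such as bx7qz.example.com falls inside the range and can be answered from cache without bothering the authoritative server.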

Tangentially related, Microsoft presented research on how well the internet performs negative caching, specifically how long. I was very happy to see Microsoft open up and become a part of the DNS community. Microsoft has long had smart people working on DNS, but up till maybe a year ago, you’d never see anyone from Microsoft present at a workshop or working group. This has now changed, and the internet will surely be a better place with Microsoft at the table. Even if Microsoft legal still insists they carry a ‘Microsoft Confidential’ warning on every slide!

Moving on: dealing with random packet floods requires the best statistics, and John Dickinson presented work on Hedgehog which can present DSC data.

Another major theme of the meeting was ‘dealing with packet floods’, random or not. Part of this is writing smart name server software, but at some level of traffic, packets need to be processed or blocked at the network layer. Various folks presented on this, and I want to specifically thank Cloudflare for sharing their vast DoS-quenching knowledge. It is not common for the DoS protection people to open up on how they do their work, because these are of course the crown jewels. However, not everyone can be a Cloudflare customer, so it is great that we can learn from them.

In short, they had some key insights. Efficient name server software is mostly limited by the UDP stack in Linux and other operating systems, and this stack really hits a wall somewhere around 200kqps. This was repeated in several other presentations, and the reasons for this limitation appear to be pretty fundamental. With careful tuning and specific hardware configuration higher numbers can be achieved, but it is uphill work. But you have to view this presentation: it is full of unexpected insights.

At the end of this post, I get back to both the ‘200kqps’ issue and possible new ways to deal with unrequested packet floods.

Cloudflare separately presented their gross DNSSEC hacks, which while clever don’t fill me with glee. In the words of Filippo “I’ve done stuff I ain’t proud of and the stuff I am proud of is disgusting”. Read all about their NSEC Shotgun in any case.

Verisign also opened up with various eye-popping statistics on denial of service attacks they have to weather all day, and what countermeasures they have in place. During one such attack last year, Verisign filtered a big PowerDNS user, causing mayhem for us. Piet Barber feared I would call him out on that during the presentation, but in fact we had no “hard feelings”. I mostly feel bad we were part of relaying that attack to the GTLD servers!

Measurements

Various people reported interesting measurements. Geoff Huston of APNIC did another one of his incredible presentations, this time on how well ECDSA signed domain names get resolved on the internet. Geoff really is an asset to the world; he truly has his finger on the pulse of the internet. In short, 80% of DNSSEC resolvers can validate ECDSA records. The ones that can’t behave oddly, or in the words of Geoff to the implementors present “you lot write a lot of crap”. I am sure this is true. Another key insight is that around 2000 resolver ‘pairs’ represent 95% of query load on authoritative nameservers. In another presentation, it was reported there are around 150k bona fide resolver IP addresses in the world, and this matches my own observations. In short, if you are under DoS as an auth, you could do worse than block everyone except the resolvers that make up the top 99% of your query load.
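
As a rough sketch of how such an allow-list could be derived from your own query logs: the function below picks the busiest resolvers until they account for a chosen share of the total query load. The function name and the input format (a mapping from source address to query count) are assumptions for illustration, not something that was presented at the workshop.

```python
from collections import Counter


def resolvers_covering(query_counts: dict[str, int], share: float) -> list[str]:
    """Busiest resolvers that together send at least `share` of all queries."""
    total = sum(query_counts.values())
    chosen: list[str] = []
    running = 0
    # Walk resolvers from busiest to quietest until the target share is reached.
    for addr, count in Counter(query_counts).most_common():
        if running >= share * total:
            break
        chosen.append(addr)
        running += count
    return chosen
```

Counts gathered during normal operation are the useful input here; counts gathered during an attack would put the attack sources on the list.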

Shumon Huque of Verisign presented measurements of the privacy enhancing qname minimisation idea, and in short, because of Akamai’s current nameserver implementation, it works very badly. Akamai is aware of the problem and has made a vague promise to do something about it one day.

Ralf Weber of Nominum measured how well several resolvers deal with random query attacks. Unsurprisingly, Nominum came out best, but this was a very good even-handed presentation, which showed that most modern nameservers have been updated to deal well with such attacks. NOTE: the current presentation shows unfavourable numbers for Unbound, but during the meeting Ralf and Wouter Wijngaards found out why this was the case, and Ralf will be redoing the tests. As with Microsoft, I’m very happy to see Nominum join the (public) DNS community and this can only be helpful in improving the state of DNS.

William Sotomayor of OARC presented remotely on how various countries and university networks use AS112 networks, but I have to admit most of this presentation went over my head as I’m not very good with large scale internet routing. Similarly his work on ‘RSSAC-02’ is undoubtedly very important, but outside my expertise.

Joao Luis Silva Damas worked hard to get access to actual customer DNS traffic to do statistics on that, and when he finally got the data he tore it apart and learned a lot. Recommended reading. After this presentation, various DNS trace anonymization strategies were discussed, including the PowerDNS ‘dnswasher’ and the NZ registry’s more sophisticated ‘keyed’ blinding solution. For both these programs however, keep in mind that data can frequently be de-anonymized with sufficient correlation!
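
To illustrate what keyed blinding can look like in general, here is a minimal sketch that replaces each IP address with a pseudonym derived from a secret key. This is an assumption-laden illustration using HMAC-SHA256, not a description of how dnswasher or the NZ registry tool actually work; the function name is hypothetical.

```python
import hashlib
import hmac
import ipaddress


def blind_address(addr: str, key: bytes) -> str:
    """Map an IP address to a stable pseudonym of the same address family."""
    ip = ipaddress.ip_address(addr)
    digest = hmac.new(key, ip.packed, hashlib.sha256).digest()
    # Keep 4 bytes for IPv4 and 16 for IPv6 so tools still see a valid address.
    return str(ipaddress.ip_address(digest[: len(ip.packed)]))
```

The same address always maps to the same pseudonym under one key, which keeps per-client statistics intact. That stability is also exactly what makes correlation attacks possible.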

Bruce Van Nice of Nominum showed statistics on the life of a resolver and popular domain names. I call fake on this one since not a single domain listed was of the ‘XXX’ variety ;-) I assume these were quietly filtered from the graphics so as not to upset anyone. The PowerDNS statistics I presented later had actual adult domain names in them, but we’re Dutch, so we get away with that!

Sebastien Castro of the NZ registry presented work on how to discover popular or important domain names using statistical measures, and showed how these change over the week. Similarly Francisco Cifuentes of CL NIC research presented on how to do realtime DNS analytics with Apache Storm and other technologies.

Root zone related

The root is, of course, at the root of all of DNS and thus the internet. Anything affecting the root affects us all. And there is enough stuff to think about. The root Key Signing Key is getting stale and the batteries on the Hardware Security Modules housing the KSK are similarly showing their age. I understand this is a problem. To change the KSK is a major effort however.

For one, during the change, root answer packets might get (a lot) bigger. Duane Wessels of Verisign presented numbers on what various changeover scenarios would mean for fragmentation and truncation. The good news appears to be that the sky isn’t falling.

Meanwhile, before the root and most important (cc)TLDs got signed, there was the DLV-registry, which made it possible to specify your DNSSEC keys in a parallel registry over at ISC and some other places. It is now high time to sunset this registry, and Jim Martin of ISC set out how they plan to do that. After the presentation a huge line formed at the mike to wholeheartedly support the rapid shutdown of the DLV registry.

Kazunori Fujiwara presented on the changing ratio between JP and ROOT queries on the JPRS infrastructure.

Finally at the end of the day, Edward Lewis (now of ICANN) presented about the process of changing the root KSK. This is fraught with difficulty and I for one doubt it will happen before circumstances force us to. Ed and his very capable friends, including Geoff Huston, are giving it their best however, and surprisingly (to me at least), during and after the presentation, some new ideas were raised to facilitate the transition. This involves having one root-server serve with the old keying material, perhaps giving people a chance to limp on during the transition.

Other presentations

Florian Maury of the French IT government security agency ANSSI presented on the iDNS attack they discovered, which felled PowerDNS, BIND and Unbound last year, but notably not DJBDNS since “1999” Dan Bernstein was smarter than all of us and failed to fall for that one. This provided the brief moment of drama of OARC with one audience member claiming the iDNS stuff was not news, not important, and that by publishing it, Florian had only helped potential attackers. Luckily sanity returned and this member of the audience apologized later in the day. Who says DNS is boring?

Florian also presented on the French government DNS guidelines (which are, of course, in French), which interestingly (and correctly I think) do not propose to implement DNSSEC before a host of other best practices in DNS are implemented, including registry locks.

Patrik Wallström of .SE presented on Zonemaster, a Swedish-French collaborative zone checker, intended to supersede DNSCheck and Zonecheck.

And I did a presentation too on dnsdist, a highly DoS and DNS aware load-balancer, where I asked the audience if there is room for a ‘smart load balancer that has some features of a nameserver’. The feedback I got was overwhelmingly yes. Further discussions afterwards were instrumental in finding the limits of what dnsdist should and should not do.

Followup work

Two things stand out for me from this workshop. For one, Cloudflare and many others feel compelled to implement a ‘Super NXDOMAIN’ answer that allows an authoritative server to send a response to a bona fide resolver: “you can stop sending me queries for x.y.somedomain.com, or in fact anything within somedomain.com. It is not going to happen.”. We jokingly called this the Shut Up Packet. However, it appears this idea has merit and Olafur already wrote some text on it. We will also be working on this and studying the parallels with the (failed) ICMP Source Quench packet, or in fact even a real ‘NXDOMAIN’ response.
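
On the resolver side, honouring such a ‘Super NXDOMAIN’ could be as simple as a negative cache keyed on whole subtrees. The sketch below is purely hypothetical: no such response exists in the protocol today, and the class and method names are made up for illustration.

```python
class SubtreeNegativeCache:
    """Remembers domains below which an authoritative server said nothing exists."""

    def __init__(self) -> None:
        self._dead: dict[tuple[str, ...], float] = {}

    @staticmethod
    def _labels(name: str) -> tuple[str, ...]:
        # Store labels rightmost first so ancestors are prefixes of descendants.
        return tuple(reversed(name.rstrip(".").lower().split(".")))

    def add(self, domain: str, expires_at: float) -> None:
        self._dead[self._labels(domain)] = expires_at

    def suppresses(self, qname: str, now: float) -> bool:
        labels = self._labels(qname)
        # Check the name itself and every ancestor for an unexpired entry.
        for depth in range(len(labels), 0, -1):
            expiry = self._dead.get(labels[:depth])
            if expiry is not None and expiry > now:
                return True
        return False
```

After `add("somedomain.com", expires_at)`, a query for x.y.somedomain.com is suppressed until the entry expires, while names outside that subtree are unaffected.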

Secondly, too many people to mention lamented the suboptimal performance of Linux (and UNIX in general, but specifically Linux) in dealing with UDP packets. Where you can now blast gigabits of TCP/IP, the supposedly more efficient UDP struggles to reach hundreds of thousands of packets per second. Part of this problem is neglect of UDP in the kernel. Part of this is an inefficient ‘one system call per packet’ interface (not usefully addressed by recvmmsg()).

Since we all feel the pain of this and have to buy special hardware to get better (filtering) performance, I feel it is time to liaise with various kernel folks to see what could be done. Individually, everyone I spoke to agreed, but nothing has coalesced yet. I’ll continue to agitate for something to happen, please let me know if you want to join in!

Important Update for Security Advisory 2015-01

Last week, we released Security Advisory 2015-01, with text suggesting that only specific platforms were seriously affected. We must now report that this was incorrect: all platforms are impacted. The advisory has been updated to that effect.

Furthermore, by popular demand, we have released Authoritative Server 3.3.2, an update to version 3.3.1 which includes DNSSEC improvements and of course a patch for the security issue. Click these links: release notes, tarball, debs, RPMs.

Security Advisory 2015-01

UPDATE: please also read the update posted on May 1st.

Hi everybody,

Please be aware of PowerDNS Security Advisory 2015-01
(http://doc.powerdns.com/md/security/powerdns-advisory-2015-01/).

The good news is that as far as we have seen, only
specific builds for RHEL5 are affected, but just to be sure we are doing
full releases of all recent versions of our products.

Packages and distribution tar balls of Recursor 3.6.3, Recursor 3.7.2 and Auth
3.4.4 are available in the usual places, and release announcements have just gone out.

If you prefer a minimal patch, please go to
https://downloads.powerdns.com/patches/2015-01/ and see README.txt there.

If you have problems upgrading, please either contact us on our mailing lists,
or privately via powerdns.support@powerdns.com (should you wish to make use of
our SLA-backed support program).

We want to thank Aki Tuomi for finding this issue, and really digging into it.
We also want to thank Kees Monshouwer for assisting in debugging and fixing
the offending code. Finally we want to thank Kai Storbeck for putting an
earlier, broken version of the patch into production and being understanding
about the names that broke because of it.

Recursor 3.7.2

Hi everybody,

We’re pleased to announce version 3.7.2 of our Recursor.

The most important part of this update is a fix for CVE-2015-1868.
Please see http://doc.powerdns.com/md/security/powerdns-advisory-2015-01/
for more information.

Tar.gz and packages are available on:

* https://downloads.powerdns.com/releases/
* Soon: https://www.monshouwer.eu/download/3rd_party/pdns-recursor/
(RHEL/CentOS, with the usual huge thanks to Kees Monshouwer).

The changelog with clickable links can also be found on
https://doc.powerdns.com/md/changelog/#powerdns-recursor-372

PowerDNS Recursor 3.7.2

Released 23rd of April, 2015

Among other bug fixes and improvements (as listed below), this release
incorporates a fix for CVE-2015-1868, as detailed in PowerDNS
Security Advisory 2015-01

Bug fixes:
* Fix handling of forward references in label compressed packets; fixes CVE-2015-1868
* make sure we never call sendmsg with msg_control!=NULL && msg_controllen>0. Fixes #2227
* Improve robustness of root-nx-trust.
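
For readers wondering what a forward reference in a label compressed packet is: DNS names may end in a pointer to an earlier occurrence of a name in the same packet, and a parser that follows pointers to later or identical offsets can be made to loop. The sketch below is not the PowerDNS patch (the Recursor is written in C++). It is an illustrative Python parser, with a hypothetical function name, that only accepts pointers aimed strictly backwards.

```python
def read_name(packet: bytes, offset: int) -> str:
    """Decode a possibly compressed DNS name, rejecting non-backward pointers."""
    labels = []
    limit = offset  # every pointer must target an offset below this
    while True:
        if offset >= len(packet):
            raise ValueError("name runs past end of packet")
        length = packet[offset]
        if length == 0:
            return ".".join(labels) + "."
        if length & 0xC0 == 0xC0:
            if offset + 1 >= len(packet):
                raise ValueError("truncated compression pointer")
            target = ((length & 0x3F) << 8) | packet[offset + 1]
            if target >= limit:
                raise ValueError("forward or self-referencing pointer")
            # Targets strictly decrease, so the loop must terminate.
            offset = limit = target
            continue
        if length & 0xC0:
            raise ValueError("unsupported label type")
        end = offset + 1 + length
        if end > len(packet):
            raise ValueError("label runs past end of packet")
        labels.append(packet[offset + 1:end].decode("ascii", "replace"))
        offset = end
```

Because each pointer has to land before the start of the segment being read, a chain of pointers can never revisit an offset.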

Improvements:
* Silence warnings that always occur on FreeBSD (Ruben Kerkhof)

Recursor 3.6.3

Hi everybody,

We’re pleased to announce version 3.6.3 of our Recursor.

The most important part of this update is a fix for CVE-2015-1868.
Please see http://doc.powerdns.com/md/security/powerdns-advisory-2015-01/
for more information.

Tar.gz and packages are available on:

* https://downloads.powerdns.com/releases/
* Soon: https://www.monshouwer.eu/download/3rd_party/pdns-recursor/
(RHEL/CentOS, with the usual huge thanks to Kees Monshouwer).

Note that Recursor 3.7.2 is also available, with many improvements beyond
the fix for this CVE.

PowerDNS Recursor 3.6.3

Released 23rd of April, 2015

The only difference between Recursor 3.6.2 and 3.6.3 is a fix for
CVE-2015-1868, as detailed in PowerDNS Security Advisory 2015-01