Recursor 3.6.0 released

This is a performance, feature and bugfix update to 3.5/3.5.3. It contains important fixes for slightly broken domain names, which your users expect to work anyhow. It also brings robust resilience against certain classes of attacks.

Changes between RC1 and release:

 

New features:

  • commit aadceba: Implement minimum-ttl-override config setting, plus runtime configurability via ‘rec_control set-minimum-ttl’.
  • Lots of work on the JSON API, which is exposed via Aki Tuomi’s ‘yahttp’. Massive thanks to Christian Hofstaedtler for delivering this exciting new functionality. Documentation & demo forthcoming, but code to use it is available on GitHub.
  • Lua modules can now use ‘pdnslog(INFO..)’, as described in ticket 1074, implemented in commit 674a305.
  • Adopt the any-to-tcp feature in the recursor. Based on a patch by Winfried Angele. Closes ticket 836; commit 56b4d21 and commit e661a20.
  • commit 2c78bd5: implement built-in statistics dumper using the ‘carbon’ protocol, which is also understood by metronome (our mini-graphite). Use ‘carbon-server’, ‘carbon-ourname’ and ‘carbon-interval’ settings.
  • New setting ‘udp-truncation-threshold’ to configure from how many bytes we should truncate. commit a09a8ce.
  • Proper support for CHaos class for CHAOS TXT queries. commit c86e1f2, addition for lua in commit f94c53d, some warnings in commit 438db54 however.
  • Added support for Lua scripts to drop queries w/o further processing. commit 0478c54.
  • Kevin Holly added qtype statistics to recursor and rec_control (get-qtypelist) (commit 79332bf)
  • Add support for include-files in configuration, also reload ACLs and zones defined in them (commit 829849d, commit 242b90e, commit 302df81).
  • Paulo Anes contributed server-down-max-fails which helps combat Recursive DNS based amplification attacks. Described in this post. Also comes with new metric ‘failed-host-entries’ in commit 406f46f.
  • commit 21e7976: Implement “followCNAMERecords” feature in the Lua hooks.
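
As an illustration of the ‘carbon’ statistics dumper mentioned above, here is a minimal sketch of what a carbon plaintext submission looks like: one "path value timestamp" line per metric, sent over TCP. The namespace "pdns.rec1.recursor", the helper names and the sample metrics are assumptions made for this example, not taken from the recursor source; port 2003 is the usual carbon plaintext port.

```python
import socket
import time


def carbon_lines(namespace, metrics, timestamp=None):
    """Format metrics as carbon plaintext lines: '<path> <value> <timestamp>\\n'."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return "".join(
        f"{namespace}.{name} {value} {ts}\n"
        for name, value in sorted(metrics.items())
    )


def send_to_carbon(host, payload, port=2003):
    """Send preformatted lines to a carbon server (hypothetical helper, not called here)."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(payload.encode("ascii"))


if __name__ == "__main__":
    # Sample values only; a real dumper would read live counters.
    print(carbon_lines("pdns.rec1.recursor", {"questions": 10, "cache-hits": 7}), end="")
```

A dumper along these lines would call something like send_to_carbon() once per reporting interval.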

Improvements:

  • commit 06ea901: make pdns-distributes-queries use a hash so related queries get sent to the same thread. Original idea by Winfried Angele. Astoundingly effective, approximately halves CPU usage!
  • commit b13e737: --help now writes to stdout instead of stderr. Thanks Winfried Angele.
  • To aid in limiting DoS attacks, when truncating a response, we actually truncate all the way so only the question remains. Suggested in ticket 1092, code in commit add935a.
  • No longer experimental, the switch ‘pdns-distributes-queries’ can improve multi-threaded performance on Linux (various cleanup commits).
  • Update to embedded PolarSSL, plus remove previous AES implementation and shift to PolarSSL (commit e22d9b4, commit 990ad9a).
  • commit 92c0733 moves various Lua magic constants into an enum namespace.
  • Set group and supplementary groups before chroot (commit 6ee50ce, ticket 1198).
  • commit 4e9a20e: raise our socket buffer setting so it no longer generates a warning about lowering it.
  • commit 4e9a20e: warn about Linux suboptimal IPv6 settings if we detect them.
  • SIGUSR2 turns on a ‘trace’ of all DNS traffic, a second SIGUSR2 now turns it off again. commit 4f217ce.
  • Various fixes for Lua 5.2.
  • commit 81859ba: No longer attempt to answer questions coming in from port 0, reply would not reach them anyhow. Thanks to Niels Bakker and ‘sid3windr’ for insight & debugging. Closes ticket 844.
  • commit b1a2d6c: now, I’m not one to get OCD over things, but that log message about stats based on 1801 seconds got to me. 1800 now.
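
The idea behind the hashed ‘pdns-distributes-queries’ improvement can be sketched as follows: hash something stable about the query and use it to pick a worker thread, so repeats of the same question land on the same thread and its warm state. This is only a sketch under assumptions: the notes above do not say what exactly gets hashed, so the example uses the lowercased query name, and CRC32 is a stand-in for whatever hash the recursor really uses.

```python
import zlib


def pick_thread(qname: str, num_threads: int) -> int:
    """Map a query name to a worker thread index in a stable way.

    Assumption: the key is the case-folded name without a trailing dot.
    zlib.crc32 is used because it is deterministic across processes,
    unlike Python's built-in hash() for strings.
    """
    if num_threads < 1:
        raise ValueError("need at least one thread")
    key = qname.rstrip(".").lower().encode("ascii")
    return zlib.crc32(key) % num_threads


if __name__ == "__main__":
    for name in ("www.example.com", "WWW.EXAMPLE.COM.", "mail.example.net"):
        print(name, "->", pick_thread(name, 4))
```

The point of the design is that the mapping depends only on the query, never on arrival order or load, which is what lets related queries share per-thread work.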

Fixes:

  • commit 0c9de4fc: stay away from getaddrinfo unless we really can’t help it for ascii ipv6 conversions to binary
  • commit 08f3f63: fix average latency calculation, closing ticket 424.
  • commit 75ba907: Some of our counters were still 32 bits, now 64.
  • commit 2f22827: Fix statistics and stability when running with pdns-distributes-queries.
  • commit 6196f90: avoid merging old and new additional data, fixes an issue caused by weird (but probably legal) Akamai behaviour
  • commit 3a8a4d6: make sure we don’t exceed the number of available filedescriptors for mthreads. Raises performance in case of DoS. See this post for further details.
  • commit 7313fe6: implement indexed packet cache wiping for recursor, orders of magnitude faster. Important when reloading all zones, which causes massive cache cleaning.
  • rec_control get-all would include ‘cache-bytes’ and ‘packetcache-bytes’, which were expensive operations, too expensive for frequent polling. Removed in commit 8e42d27.
  • All old workarounds for supporting Windows of the XP era have been removed.
  • Fix issues on S390X based systems which have unsigned characters (commit 916a0fd)

PowerDNS Jobs: are you available?

Hi everybody,

In short: there is a market for (small) PowerDNS jobs, and if you are available for such work, read on for where we’ll be sending people who need PowerDNS work done!

The longer story:

As PowerDNS use continues to increase, so does the number of inquiries we receive from operators that need (private) help. Such requests range from “can someone help me fix this error” or “can someone upgrade my PowerDNS from 2.9.22” to designing & deploying installations for millions of users or domains.

PowerDNS itself and our certified consultants are able to field some of these requests, but not all of them. Sometimes we don’t have time, but other requests simply do not fit well with us or our consultants.

For example, we will typically not be able to do small scale one off upgrades or installations ourselves.

However, we do frequently get questions for such work. What we have now decided is that whenever we can’t help a user commercially, we will ask them to forward their job to elance.com and odesk.com. We’ll also be providing some guidance on how to identify experienced PowerDNS people.

So: if you have PowerDNS skills, and you are available for doing such jobs, know that we will be sending interested users to elance.com and odesk.com.

And if you have a profile there that mentions PowerDNS, you could be expecting some business.

If you are already doing consulting work, but via another site, please let us know, we can also send people your way there then.

Finally, please be advised that this is all in addition to our ‘regular’ support platforms: community based (mailing list, irc), professional (support agreements) and certified consultants. There is no change to those programs.

Thanks!

Recursor 3.6.0 Release Candidate 1

This is a performance, feature and bugfix update to 3.5/3.5.3. It contains important fixes for slightly broken domain names, which your users expect to work anyhow. It also brings robust resilience against certain classes of attacks.

New features:

  • commit aadceba: Implement minimum-ttl-override config setting, plus runtime configurability via ‘rec_control set-minimum-ttl’.
  • Lots of work on the JSON API, which is exposed via Aki Tuomi’s ‘yahttp’. Massive thanks to Christian Hofstaedtler for delivering this exciting new functionality. Documentation & demo forthcoming, but code to use it is available on GitHub.
  • Lua modules can now use ‘pdnslog(INFO..)’, as described in ticket 1074, implemented in commit 674a305.
  • Adopt the any-to-tcp feature in the recursor. Based on a patch by Winfried Angele. Closes ticket 836; commit 56b4d21 and commit e661a20.
  • commit 2c78bd5: implement built-in statistics dumper using the ‘carbon’ protocol, which is also understood by metronome (our mini-graphite). Use ‘carbon-server’, ‘carbon-ourname’ and ‘carbon-interval’ settings.
  • New setting ‘udp-truncation-threshold’ to configure from how many bytes we should truncate. commit a09a8ce.
  • Proper support for CHaos class for CHAOS TXT queries. commit c86e1f2, addition for lua in commit f94c53d, some warnings in commit 438db54 however.
  • Added support for Lua scripts to drop queries w/o further processing. commit 0478c54.
  • Kevin Holly added qtype statistics to recursor and rec_control (get-qtypelist) (commit 79332bf)
  • Add support for include-files in configuration, also reload ACLs and zones defined in them (commit 829849d, commit 242b90e, commit 302df81).
  • Paulo Anes contributed server-down-max-fails which helps combat Recursive DNS based amplification attacks. Described in this post. Also comes with new metric ‘failed-host-entries’ in commit 406f46f.
  • commit 21e7976: Implement “followCNAMERecords” feature in the Lua hooks.
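
To make the effect of minimum-ttl-override concrete, here is a small sketch of the clamping it implies: any TTL below the configured minimum is raised to that minimum, and anything already above it is left alone. The function name and the sample record list are made up for this example and do not come from the recursor source.

```python
def apply_minimum_ttl(ttl: int, minimum_ttl_override: int) -> int:
    """Raise a record's TTL to the configured minimum if it is lower."""
    return max(ttl, minimum_ttl_override)


if __name__ == "__main__":
    # Hypothetical answers as (name, ttl) pairs.
    answers = [("fast-flux.example", 5), ("www.example", 300)]
    for name, ttl in answers:
        print(name, ttl, "->", apply_minimum_ttl(ttl, 60))
```

With a minimum of 60, the 5 second TTL is cached for 60 seconds while the 300 second TTL is untouched; a minimum of 0 changes nothing.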

Improvements:

  • commit 06ea901: make pdns-distributes-queries use a hash so related queries get sent to the same thread. Original idea by Winfried Angele. Astoundingly effective, approximately halves CPU usage!
  • commit b13e737: --help now writes to stdout instead of stderr. Thanks Winfried Angele.
  • To aid in limiting DoS attacks, when truncating a response, we actually truncate all the way so only the question remains. Suggested in ticket 1092, code in commit add935a.
  • No longer experimental, the switch ‘pdns-distributes-queries’ can improve multi-threaded performance on Linux (various cleanup commits).
  • Update to embedded PolarSSL, plus remove previous AES implementation and shift to PolarSSL (commit e22d9b4, commit 990ad9a).
  • commit 92c0733 moves various Lua magic constants into an enum namespace.
  • Set group and supplementary groups before chroot (commit 6ee50ce, ticket 1198).
  • commit 4e9a20e: raise our socket buffer setting so it no longer generates a warning about lowering it.
  • commit 4e9a20e: warn about Linux suboptimal IPv6 settings if we detect them.
  • SIGUSR2 turns on a ‘trace’ of all DNS traffic, a second SIGUSR2 now turns it off again. commit 4f217ce.
  • Various fixes for Lua 5.2.
  • commit 81859ba: No longer attempt to answer questions coming in from port 0, reply would not reach them anyhow. Thanks to Niels Bakker and ‘sid3windr’ for insight & debugging. Closes ticket 844.
  • commit b1a2d6c: now, I’m not one to get OCD over things, but that log message about stats based on 1801 seconds got to me. 1800 now.
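
What "truncate all the way so only the question remains" means on the wire can be sketched like this: the reply keeps the query's ID and question section, sets the QR and TC bits, and carries no answer, authority or additional records. This is an illustration of the DNS message layout, not the recursor's code; the function names are invented for the example, and name compression in the question is rejected rather than handled.

```python
import struct

HEADER = struct.Struct("!6H")  # id, flags, qdcount, ancount, nscount, arcount
QR = 0x8000           # this message is a response
TC = 0x0200           # truncated: client should retry over TCP
OPCODE_MASK = 0x7800  # copied from the query
RD = 0x0100           # recursion desired, copied from the query


def question_end(packet: bytes) -> int:
    """Return the offset just past the first question (name, qtype, qclass)."""
    pos = HEADER.size
    while True:
        if pos >= len(packet):
            raise ValueError("question runs past end of packet")
        length = packet[pos]
        pos += 1
        if length == 0:
            break
        if length & 0xC0:
            raise ValueError("compressed label in question not handled")
        pos += length
    pos += 4  # qtype and qclass, two bytes each
    if pos > len(packet):
        raise ValueError("question runs past end of packet")
    return pos


def truncated_reply(query: bytes) -> bytes:
    """Build a reply holding only the header and the question, with TC set."""
    qid, flags, qdcount, _, _, _ = HEADER.unpack_from(query)
    if qdcount != 1:
        raise ValueError("expected exactly one question")
    reply_flags = QR | TC | (flags & (OPCODE_MASK | RD))
    header = HEADER.pack(qid, reply_flags, 1, 0, 0, 0)
    return header + query[HEADER.size:question_end(query)]
```

Because such a reply is never larger than the query that triggered it, it offers nothing to an attacker looking for amplification.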

Fixes:

  • commit 0c9de4fc: stay away from getaddrinfo unless we really can’t help it for ascii ipv6 conversions to binary
  • commit 08f3f63: fix average latency calculation, closing ticket 424.
  • commit 75ba907: Some of our counters were still 32 bits, now 64.
  • commit 2f22827: Fix statistics and stability when running with pdns-distributes-queries.
  • commit 6196f90: avoid merging old and new additional data, fixes an issue caused by weird (but probably legal) Akamai behaviour
  • commit 3a8a4d6: make sure we don’t exceed the number of available filedescriptors for mthreads. Raises performance in case of DoS. See this post for further details.
  • commit 7313fe6: implement indexed packet cache wiping for recursor, orders of magnitude faster. Important when reloading all zones, which causes massive cache cleaning.
  • rec_control get-all would include ‘cache-bytes’ and ‘packetcache-bytes’, which were expensive operations, too expensive for frequent polling. Removed in commit 8e42d27.
  • All old workarounds for supporting Windows of the XP era have been removed.
  • Fix issues on S390X based systems which have unsigned characters (commit 916a0fd)

A surprising discovery on converting IPv6 addresses: we no longer prefer getaddrinfo()

Yesterday, we were contacted by PowerDNS user James Baer who noted strange crashes in PowerDNS (on Linux) upon adding thousands and thousands of IP addresses to his system. Notably, PowerDNS did not even use any of those thousands of addresses, but it still crashed. As James noted, this should not even be possible.

Quick investigation by PowerDNS contributors Aki Tuomi and Imre Gergely (thanks!) showed that the crashes emanated from a call to getaddrinfo() to convert an IPv6 address. Now, here over at PowerDNS, we are big fans of IPv6, to the point that we even presented at RIPE66, “Implementing Full IPv6 Support: More than Binding to an AF_INET6 Socket”. In this presentation, we noted “getaddrinfo() is _the only way_ to convert presentation format IPv4 and IPv6 addresses. Ignore anything else”.

In this post, we sadly have to recant this statement. In fact, we’d like to replace it with ‘getaddrinfo() considered slow and potentially dangerous in 2014’.

So, what did we find this morning? We were expecting our calls to getaddrinfo() to be a quick string conversion of things like ‘::1’ to the binary representation ‘1’. On reading the backtraces however, we found that getaddrinfo() was talking to the kernel, and enumerating all local IP addresses (!). I was actually stunned. We also found that Akamai had noted that this was crashing their code when using 64000 IP addresses. Because talking to the kernel this way is expensive, getaddrinfo() maintains a local cache for one second. This however adds the need for locking and memory management, which it predictably gets wrong.

And lo, bug reports about getaddrinfo() not being thread safe, slow, or crash prone abound. We’ve found issues reported for Linux, Solaris, FreeBSD and OSX.

Now, it should be noted that getaddrinfo() offers us quite a few things. It can go out on the wire and do DNS queries. It can parse gai.conf to find how we want to sort our addresses, and which address families we prefer. These days, it can even perform Internationalized Domain Name (IDN) conversions! Also, getaddrinfo() is tasked with making sure ‘local IP addresses’ get returned in preference to remote ones (‘Rule 7’). These are indeed things that may require kernel communication.

However, no such complication is required when wanting to convert “::1” into 1. And in fact, using the flags, we can tell getaddrinfo() not to do anything complicated (by omitting AI_ADDRCONFIG). Sadly, the vast majority of getaddrinfo() implementations currently in production get it wrong. This is probably an artifact of the huge mission statement that has been heaped on this function.

So, as of 2014, we consider getaddrinfo() something to avoid for IPv6 address conversions in all but the least demanding, single-threaded applications. Within PowerDNS, we have reverted to using inet_pton() whenever we can get away with it – which is almost always, except in the case of scoped addresses.

Over time, the most common C library implementations may get their act together, or they may not. But for now, to convert from presentation format, inet_pton() it is.
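
The approach described above, inet_pton() where possible and getaddrinfo() only for scoped addresses, can be sketched as follows. PowerDNS itself is C++; this is a Python rendering of the same idea using the standard socket module, which wraps the same C library calls. The function name and the choice to return a (bytes, scope id) pair are assumptions for this example, and passing AI_NUMERICHOST to rule out DNS lookups in the fallback is our addition, not something stated in the post.

```python
import socket


def ipv6_to_binary(text: str):
    """Convert a presentation-format IPv6 address to (16 raw bytes, scope id)."""
    if "%" not in text:
        # Pure string conversion: no kernel enumeration, no DNS, no cache.
        # Raises OSError if the text is not a valid IPv6 address.
        return socket.inet_pton(socket.AF_INET6, text), 0

    # Scoped address such as 'fe80::1%eth0': inet_pton() cannot express the
    # scope, so fall back to getaddrinfo(), restricted to numeric parsing.
    infos = socket.getaddrinfo(
        text, None, socket.AF_INET6, socket.SOCK_DGRAM, 0, socket.AI_NUMERICHOST
    )
    host, _port, _flowinfo, scope_id = infos[0][4]
    # Some Python versions keep the '%scope' suffix in host; strip it.
    return socket.inet_pton(socket.AF_INET6, host.split("%")[0]), scope_id


if __name__ == "__main__":
    packed, scope = ipv6_to_binary("::1")
    print(packed.hex(), scope)
```

The scoped branch depends on the named interface existing on the local machine, so it is the only path that still needs help from the system.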

How to talk to an open source software project as a large scale or otherwise interesting user

The very short version: if you contact an open source project anonymously, you may not get the best help. Feel free to also reach out privately and share that your post from gmail.com is actually (say) from a very interesting deployment. Now read on for the long version!

Some months ago, the fine people from CloudFlare blogged about their new DNS implementation, and in that post they noted:

“While PowerDNS got us a long way, it started to run into issues as we scaled and dealt with an increasing number of large denial of service attacks … To their credit, the PowerDNS community responded to the first two problems with some efforts at rate limiting and other abuse detection”.

For us, this represented a sadly familiar pattern. We had lost CloudFlare as a user, and we had not been able to work with them sufficiently well to address their needs. How did it happen, and why does it matter?

For an open source project, it is important to have happy ‘lighthouse’ users. For years, we’ve proudly served the geodirection needs of Wikipedia for example. We’ve done a lot of work to keep various big users happy with their PowerDNS deployment. If people know or find out your product is being used by ‘name brand’, large scale, deployments, this helps tremendously with adoption and acceptability.

In fact, next to development resources, a great community and funding, having impressive deployments is one of the most important things for an open source project.

So when we lost CloudFlare, other users contacted us to ask what had gone wrong, and if PowerDNS was still suitable for their needs. We quickly reached out to CloudFlare, and Matthew Prince and co-workers sent us an impressive post-mortem on what had happened. They also graciously gave us permission to share the story, which is much appreciated.

It turns out they actually had shared their issues with the PowerDNS community, and they had gotten some help from us.. but not enough (by our own estimation). Importantly, they had not declared themselves as being a large or impressive deployment.

Now, we try to help everybody of course. We strive to be the friendliest name server community out there, and I think we are succeeding. But we can’t devote infinite resources to everyone. Major users, major customers, interesting deployments need and get more attention. (It should be noted that we also have a soft spot for users who politely ask for help and are willing to run test versions of our software, by the way. Nearly infinite help awaits you in that case!)

But, back to the subject of this post, here’s the problem. Large or interesting deployments generally contact open source projects anonymously, mostly from gmail accounts, often even using fake names. And this is to be understood – in most large places (enterprise, public companies, government), sending out email to a world readable mailing list from a work email account is a sure way to get unwanted attention within your organisation (and get laughed at for your ridiculous multipage disclaimer).

Legal departments are likely to get their panties in a bunch – did you just share proprietary information? Security departments might ask if it was wise to publicly post company implementation details. Even the communication & marketing people might get in on the act. So email from gmail.com is what we get.

But this effectively means you might have a 500 server PowerDNS deployment doing really interesting things, things we’d love to help you with... but we don’t know. Not only is this a matter of scale, we may also not realise the nature of your challenge. If we know what you are doing, our thinking about your problem may run along different lines.

Another reason why employees at big corporations are loath to contact open source projects is that they assume they won’t get help, and need to pay for support. And, while it is true that someone needs to pay the bills here, we realise that for many users, a support agreement is just not going to happen. Corporate IT might not even know the whole company is relying on an open source project. Questions might be asked. For such organisations, we are fine with providing free help on the public mailing lists. But to give proper weight and context to your issue, it helps if we know who you are.

So what about PowerDNS and CloudFlare? From our conversation, we now fully understand why they wrote their own server. But if at the time we had known the scale of their challenge, we would have loved to have tried to meet their needs. We do very much appreciate their help in the ‘post mortem’ of our silent breakup. As Matthew said “[PowerDNS] is a great piece of software that met our needs well for almost 4 years. Today our needs are extremely unique and it wouldn’t make a lot of sense for you to try and design for them.”

Concluding, on behalf of PowerDNS, and I expect a lot of open source projects: If you are a large or interesting deployment, please do post to the mailing lists. We do understand you will not be doing so from your corporate email account. But please also feel free to contact the project privately in parallel (IRC channels are typically a great way), where you can share more details of who you are and what you are doing.

Both you and the open source project will probably end up better that way.

On PowerDNS.COM and Express PowerDNS.NET Hosting by Trilab

Hi everybody,

This post serves to clarify why we (‘PowerDNS.COM’) can’t help you with issues with Express PowerDNS.NET hosting by Trilab.

The short version is: they are a fully different company, and we have no access to their servers and hardware, so even though we’d love to help you, we can’t. Here’s the long version.

A long time ago, around 1999, when the .COM boom was still in full force, we launched PowerDNS, initially as a software company, to which we later added DNS hosting. PowerDNS and a company called Trilab then resided in the same building and had substantial shared ownership and management.

Over the years, PowerDNS and Trilab have drifted apart, no longer share a building, have no shared management and are fully separate companies. Trilab kept PowerDNS.NET, we (PowerDNS.COM) took the nameserver software.

We, PowerDNS.COM, are not involved in the operation of PowerDNS.NET. However, because we parted as friends, we were fine with Trilab continuing to use the name ‘PowerDNS’ and our logo. In fact, we’re still friends.

Because the internet is a harsh place, Express PowerDNS.NET Hosting by Trilab is often the victim of Denial of Service attacks. The PowerDNS.NET low cost model means that there is no 24 hour support number for their services. However, we (PowerDNS.COM) do operate such 24/7 support, and we frequently get called at all hours of the day to inform us of issues over at Trilab, which we can’t solve.


Q: Why does Express PowerDNS.NET have such frequent issues?

A: Please ask them, but if we had to guess, and going by what we observe, substantial Denial of Service attacks are probably involved. It is not easy to defend against such attacks.


Q: Is PowerDNS.NET down because of PowerDNS software problems?

A: We don’t think so. PowerDNS can deal with over a million packets/s on well-tuned hardware.


Q: If PowerDNS.NET is such a separate company, why do they still use your logo?

A: We parted as friends, and we allowed them the use of the logo and name. UPDATE: They will drop the name & logo.


Q: I don’t believe you, PowerDNS.NET must be your service provider, you can’t just blame them!

A: Check for yourself, PowerDNS.COM runs on entirely different servers.


Q: I paid for this service!

A: Yes, but check your credit card statement, you did not pay us. You paid Trilab. Please ask them to help you.


Q: So if you can’t help me, who can?

A: The company that invoices you is Trilab.COM BV. When their servers are working, find out more about them on http://trilab.com/


Q: Should I call the PowerDNS.COM 24/7 support hotline if Express PowerDNS.NET by Trilab is down?

A: Please no, that 24/7 hotline eventually wakes up actual people who, in fact, can neither help you nor turn off their phones.

 

Authoritative Server Database Schema Changes

A heads-up for everybody running (git) snapshots of the PowerDNS Authoritative Server: earlier today we merged Pull #1327 from Kees Monshouwer.

This merge deprecates the ‘old’ (pre-DNSSEC) schema for the gsql backends. In other words, from this point on, the new schema is mandatory. This simplifies documentation and debugging, and also makes development easier on our end.

Please read the upgrade notes carefully to see what changes need to be applied to your database!