On PowerDNS.COM and Express PowerDNS.NET Hosting by Trilab

Hi everybody,

This post serves to clarify why we (‘PowerDNS.COM’) can’t help you with issues with Express PowerDNS.NET hosting by Trilab.

The short version is: they are a fully different company, and we have no access to their servers and hardware, so even though we’d love to help you, we can’t. Here’s the long version.

A long time ago, around 1999, when the .COM boom was still in full force, we launched PowerDNS, initially as a software company, to which we later added DNS hosting. PowerDNS and a company called Trilab then resided in the same building and had substantial shared ownership and management.

Over the years, PowerDNS and Trilab have drifted apart: we no longer share a building, have no shared management and are fully separate companies. Trilab kept PowerDNS.NET; we (PowerDNS.COM) took the nameserver software.

We, PowerDNS.COM, are not involved in the operation of PowerDNS.NET. However, because we parted as friends, we were fine with Trilab continuing to use the name ‘PowerDNS’ and our logo. In fact, we’re still friends.

Because the internet is a harsh place, Express PowerDNS.NET Hosting by Trilab is often the victim of Denial of Service attacks. The PowerDNS.NET low cost model means that there is no 24 hour support number for their services. However, we (PowerDNS.COM) do operate such 24/7 support, and we frequently get called at all hours of the day to inform us of issues over at Trilab, which we can’t solve.


Q: Why does Express PowerDNS.NET have such frequent issues?

A: Please ask them, but if we had to guess from what we observe, substantial Denial of Service attacks are probably involved. It is not easy to defend against such attacks.


Q: Is PowerDNS.NET down because of PowerDNS software problems?

A: We don’t think so. PowerDNS can deal with over a million packets/s on well-tuned hardware.


Q: If PowerDNS.NET is such a separate company, why do they still use your logo?

A: We parted as friends, and we allowed them the use of the logo and name. UPDATE: They will drop the name & logo.


Q: I don’t believe you, PowerDNS.NET must be your service provider, you can’t just blame them!

A: Check for yourself: PowerDNS.COM runs on entirely different servers.


Q: I paid for this service!

A: Yes, but check your credit card statement: you did not pay us. You paid Trilab. Please ask them to help you.


Q: So if you can’t help me, who can?

A: The company that invoices you is Trilab.COM BV. When their servers are working, find out more about them on http://trilab.com/


Q: Should I call the PowerDNS.COM 24/7 support hotline if Express PowerDNS.NET by Trilab is down?

A: Please don’t. That 24/7 hotline eventually wakes up actual people who, in fact, can neither help you nor turn off their phones.

 

Authoritative Server Database Schema Changes

A heads-up for everybody running (git) snapshots of the PowerDNS Authoritative Server: earlier today we merged Pull #1327 from Kees Monshouwer.

This merge deprecates the ‘old’ (pre-DNSSEC) schema for the gsql backends. In other words, from this point on, the new schema is mandatory. This simplifies documentation and debugging, and also makes development easier on our end.

Please read the upgrade notes carefully to see what changes need to be applied to your database!

Further DoS guidance, packages and patches available

Hi everybody,

Sadly, further DoS attacks are plaguing the world of DNS, which is bad for the targets of those attacks, but also for us DNS operators, whose resolvers help originate them. This post has guidance on how to make sure your PowerDNS Recursor mitigates the current attacks.

If you are under attack and need help, we’re here for you: feel free to contact us via powerdns.support at powerdns.com.

* The attack

A current pattern of attacks is to register a domain with lots of ‘nameservers’, and then get botnets to create queries for that new domain.  Resolvers around the world then barrage these ‘nameservers’ with queries. And of course, these nameservers aren’t nameservers, they simply are targets of a DoS attack.

By crafting things just so, this creates a powerful packet amplifier, with one botnet packet turning into many many packets to the targets of the attack.

Some further details are on http://dnsamplificationattacks.blogspot.nl/2014/02/authoritative-name-server-attack.html

* What we can do about it

Although PowerDNS already checks if a server might be down, and will limit how often it queries it, this limitation does not stand up to the random nature of these attacks.

DoS victim Paulo Anes has contributed a filter that aggressively filters out queries to dead servers; it is described in https://github.com/PowerDNS/pdns/pull/1300. Thank you, Paulo!

We’ve since deployed this filter in many places, and it works VERY well against the current attacks. We recommend setting server-down-max-fails=32 for most servers, and server-down-max-fails=16 if under heavy attack.
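The general idea of such a filter can be illustrated with a small sketch. This is not the code from the pull request: the class, its method names and the reset-on-success policy are assumptions made purely for illustration. Only the threshold mirrors the role of the server-down-max-fails setting.

```python
class DownServerFilter:
    """Hypothetical illustration of a 'server down' filter, not PowerDNS code."""

    def __init__(self, max_fails=32):
        # max_fails plays the role of server-down-max-fails.
        self.max_fails = max_fails
        self.fails = {}

    def report_failure(self, server):
        # Called when a query to this server went unanswered.
        self.fails[server] = self.fails.get(server, 0) + 1

    def report_success(self, server):
        # Assumption: any answer clears the failure count.
        self.fails.pop(server, None)

    def should_query(self, server):
        # Stop querying servers that have failed max_fails times or more.
        return self.fails.get(server, 0) < self.max_fails


f = DownServerFilter(max_fails=16)
for _ in range(16):
    f.report_failure("192.0.2.1")
print(f.should_query("192.0.2.1"))     # False
print(f.should_query("198.51.100.1"))  # True
```

A lower threshold gives up on a dead ‘nameserver’ sooner, which is why a smaller value suits a server under heavy attack.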

Additionally, please make sure to read http://blog.powerdns.com/2014/02/06/related-to-recent-dos-attacks-recursor-configuration-file-guidance/ for how to properly size your operating system for dealing with incoming attacks.

* How can I get this filter?

The filter is not yet part of a released PowerDNS Recursor version. However, the following PowerDNS Recursor packages/sources are of production quality, and will in any case fare better than 3.5.3:

64 bit Linux:
https://autotest.powerdns.com/job/recursor-git-semistatic-pkgs-amd64/789/artifact/pdns-recursor_0.0-git-20140402-1151-cc08b5a-1_amd64.deb
https://autotest.powerdns.com/job/recursor-git-semistatic-pkgs-amd64/789/artifact/pdns-recursor-0.0.20140402_1151_cc08b5a-1.x86_64.rpm

32 bit Linux:
https://autotest.powerdns.com/job/recursor-git-semistatic-pkgs-i386/789/artifact/pdns-recursor_0.0-git-20140402-1151-cc08b5a-1_i386.deb
https://autotest.powerdns.com/job/recursor-git-semistatic-pkgs-i386/789/artifact/pdns-recursor-0.0.20140402_1151_cc08b5a-1.i386.rpm

Source:
https://autotest.powerdns.com/job/recursor-git/1151/artifact/pdns/pdns-recursor-git-20140402-1151-cc08b5a.tar.bz2

https://www.monshouwer.eu/download/3rd_party/pdns-recursor/git/ has RHEL/CentOS native packages as well.

* When will PowerDNS do a release version with this filter?

Soon.

* Further questions

If you have further questions, or are currently under attack and need help, please feel free to contact powerdns.support at powerdns.com.

Related to recent DoS attacks: Recursor configuration file guidance

Hi everybody,

Over the past week we’ve been contacted by a few users reporting their PowerDNS Recursor became unresponsive under a moderate denial of service attack, one which PowerDNS should be expected to weather without issues.

In the course of investigating this issue, we’ve found that many PowerDNS installations on Linux are configured to consume (far) more file descriptors than are actually available, wasting resources and potentially leading to unresponsiveness.

To check if this is the case for you, multiply the ‘max-mthreads’ setting by the ‘threads’ setting. The default values are 2048 and 2, leading to a theoretical FD consumption of 4096. Many Linux distributions default to a limit of 1024 open file descriptors per process. So, our defaults exceed the Linux defaults by a large margin!

(FreeBSD defaults are far higher, and should not pose an issue).
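The check described above can be sketched in a few lines of Python. The function name is made up for this example, and the script reports the limit of the shell it is run from, which is an assumption about how your Recursor is started; it does not inspect a running Recursor.

```python
import resource


def theoretical_fd_use(max_mthreads, threads):
    # The product of the two settings is the theoretical FD consumption.
    return max_mthreads * threads


# Default values quoted in the text above.
needed = theoretical_fd_use(max_mthreads=2048, threads=2)

# Soft limit on open files for this process (what 'ulimit -n' reports).
soft_limit, _hard_limit = resource.getrlimit(resource.RLIMIT_NOFILE)

print("theoretical FD consumption:", needed)
print("current soft limit:", soft_limit)
if soft_limit != resource.RLIM_INFINITY and needed > soft_limit:
    print("settings exceed the file descriptor limit")
```

Run it from the same environment that starts the Recursor (for example the init script’s shell) for the comparison to be meaningful.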

To fix, there are four options:

  1. Reduce max-mthreads to 512 (or threads to 1 and max-mthreads to 1024) (max-mthreads was introduced in Recursor 3.2; but if you are running a version that old, please upgrade it!)
  2. Run ‘ulimit -n 32768’ before starting (perhaps put this in the /etc/init.d/ script). There’s little reason to skimp on this number.
  3. Investigate defaults in /etc/security/limits.conf
  4. Apply the patch in https://github.com/Habbie/pdns/commit/e24b124a4c7b49f38ff8bcf6926cd69077d16ad8

The patch automates 1 and 2, either raising the limit if possible, or  reducing max-mthreads until “it fits”.
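As a rough illustration of that ‘raise if possible, otherwise shrink’ logic, here is a sketch. It is not the patch itself: the function name is hypothetical, and it ignores the descriptors a real process needs for anything other than mthreads.

```python
def fit_settings(max_mthreads, threads, soft_limit, hard_limit):
    """Return (new_soft_limit, new_max_mthreads). Hypothetical helper."""
    wanted = max_mthreads * threads
    if wanted <= soft_limit:
        # Everything already fits; change nothing.
        return soft_limit, max_mthreads
    if wanted <= hard_limit:
        # The soft limit may be raised up to the hard limit.
        return wanted, max_mthreads
    # Cannot raise far enough: take what is allowed and shrink max-mthreads.
    return hard_limit, hard_limit // threads


print(fit_settings(2048, 2, 1024, 1024))   # (1024, 512)
print(fit_settings(2048, 2, 1024, 32768))  # (4096, 2048)
```

With the defaults and a hard limit of 1024, the sketch arrives at the same max-mthreads value of 512 that option 1 suggests.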

Thank you for your attention, and if you have results to report to us on previous or current DoS attacks, please contact us privately.

PowerDNS Authoritative Server version 3.3.1

Warning: Version 3.3.1 of the PowerDNS Authoritative Server is a major upgrade if you are coming from 2.9.x. There are also some important changes if you are coming from 3.0, 3.1 or 3.2. Please refer to Section 1, “From PowerDNS Authoritative Server 2.9.x to 3.0”, Section 2, “From PowerDNS Authoritative Server 3.0 to 3.1”, Section 3, “From PowerDNS Authoritative Server 3.1 to 3.2”, Section 4, “From PowerDNS Authoritative Server 3.2 to 3.3” and Section 5, “From PowerDNS Authoritative Server 3.3 to 3.3.1” for important information on correct and stable operation, as well as notes on performance and memory use.

Note: Released December 17th, 2013

Downloads:

 

This is a bugfix update to 3.3.

Changes since 3.3:

On Ragel and char types

In which I spend a lot of time proving a platform difference is not actually a platform difference, and eventually end up proven wrong.

PowerDNS make check/testrunner output on Debian 7.0/s390x:

test-dnsrecords_cc.cc(199): error in "test_record_types": Failed to verify TXT: Unable to parse DNS TXT '"ÅLAND ISLANDS"'

We don’t observe this failure on other systems, suggesting this is an s390x-specific issue. I find this very unlikely, so I’m trying to figure out what environmental difference is causing this. So far, I have come up short (but keep reading).

There are clues, of course. The output of ragel dnslabeltext.rl is different on s390x. Diffing the output directly did not yield anything useful for me, but Ragel has a graphviz output mode (-V), and diffing a (sorted) version of that output does help:

$ diff -u <(sort amd64.dot) <(sort s390x.dot)
--- /dev/fd/63     2013-10-24 18:19:41.000000000 +0200
+++ /dev/fd/62     2013-10-24 18:19:41.000000000 +0200
@@ -1,12 +1,12 @@
      1 -> 2 [ label = "34 / segmentBegin" ];
-     2 -> 2 [ label = "DEF / reportPlain" ];
+     2 -> 2 [ label = "-128..33, 35..91, 93..127 / reportPlain" ];
      2 -> 3 [ label = "92" ];
      2 -> 7 [ label = "34" ];
      3 -> 2 [ label = "DEF / reportEscaped" ];
      3 -> 4 [ label = "48..57 / reportEscapedNumber" ];
      4 -> 5 [ label = "48..57 / reportEscapedNumber" ];
      5 -> 6 [ label = "48..57 / reportEscapedNumber" ];
-     6 -> 2 [ label = "DEF / doneEscapedNumber, reportPlain" ];
+     6 -> 2 [ label = "-128..33, 35..91, 93..127 / doneEscapedNumber, reportPlain" ];
      6 -> 3 [ label = "92 / doneEscapedNumber" ];
      6 -> 7 [ label = "34 / doneEscapedNumber" ];
      7 -> 2 [ label = "34 / segmentEnd, segmentBegin" ];

In short, on s390x, instead of the DEFault transition, a limited set is used.

The relevant bits of Ragel input:

escaped = '\\' (([^0-9]@reportEscaped) | ([0-9]{3}$reportEscapedNumber%doneEscapedNumber));
plain = ((extend-'\\'-'"')|'\n'|'\t') $ reportPlain;
txtElement = escaped | plain;

main := (('"' txtElement* '"' space?) >segmentBegin %segmentEnd)+;

The whole graphviz plot is too wide to include here, so I’ll show some snippets with explanation:

[Image: ragel-1]

Parsing starts in state 1. From there, we demand 34 (the ASCII code for a double quote, ") to go into state 2.

[Image: ragel-2]

In state 2, there are a few options:

  1. We see 92 (a backslash), and we go into state 3 to handle an escaped character.
  2. We see something else. We pick the character up and drop right back into state 2.
  3. We get (unescaped!) 34 (") and go into state 7 (EOF / segmentEnd) (not shown).

States 3-6 deal with escaped characters; there does not appear to be an issue there.
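To make the transitions concrete, here is a hand-written Python sketch of the state machine as it appears in the amd64 graphviz output. It is not what Ragel generates: the actions (reportPlain and friends) are left out, only the transitions listed in the diff are modelled, and treating state 7 as the only accepting state is an assumption.

```python
QUOTE, BACKSLASH = 34, 92
DIGITS = range(48, 58)  # ASCII '0'..'9'


def step(state, byte):
    """Return the next state, or None when no transition matches."""
    if state == 1:
        return 2 if byte == QUOTE else None
    if state in (2, 6):
        if byte == BACKSLASH:
            return 3
        if byte == QUOTE:
            return 7
        return 2  # the DEF(ault) transition
    if state == 3:
        # A digit starts a three-digit escape; anything else is a plain escape.
        return 4 if byte in DIGITS else 2
    if state in (4, 5):
        return state + 1 if byte in DIGITS else None
    if state == 7:
        return 2 if byte == QUOTE else None
    return None


def accepts(data):
    state = 1
    for byte in data:
        state = step(state, byte)
        if state is None:
            return False
    return state == 7  # assumption: state 7 is the only accepting state


print(accepts(b'"abc"'))                                    # True
print(accepts(b'"a\\065b"'))                                # True
print(accepts('"\u00c5LAND ISLANDS"'.encode("utf-8")))      # True
print(accepts(b'"abc'))                                     # False
```

Because the default transition takes any byte value, the non-ASCII bytes of the failing test string pass through state 2 without trouble in this model.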

Where the pictures say DEF / reportPlain, on s390x instead we get a set of ranges (-128..33, 35..91, 93..127), which comes down to ‘everything from -128 to 127 inclusive, except for 34 and 92’. At first glance, this seems equivalent to the amd64 version, which says ‘34 and 92 are special, the rest is not’. But obviously this is not true: the parser is rejecting our non-ASCII input on s390x.

From the Ragel docs:

extend: Ascii extended characters. This is the range -128..127 for signed alphabets and the range 0..255 for unsigned alphabets.

In other words, extend should cover the full 8-bit range. Then why is Ragel on s390x refusing to cooperate? Could the signed/unsigned difference be a clue?

No: as a test, I added -'x' to the definition of plain (on amd64), changing the parser so that it no longer covers all of ASCII:

plain = ((extend-'\\'-'"'-'x')|'\n'|'\t') $ reportPlain;

If I do this, DEF turns into -128..33, 35..91, 93..119, 121..127 – similar to the s390x version, with one more character (the ASCII code for x is 120) excluded, of course. With this change, the tests still pass (except for one that actually contains an x).

This means that on both platforms, Ragel considers our character set to be signed. Yet, on s390x it is rejecting our non-ASCII characters (that would, as far as I can see, be covered under -128..-1).

In a test on amd64, I can see that the two bytes in Å (UTF-8), 0xc3 0x85, are mapped as -61 and -123 respectively, matching our understanding that Ragel considers our characters to be signed.

HOWEVER, if I stick the same character/bytes into dnslabeltext.rl on s390x, I see that they are mapped to 195 and 133 respectively. This means that although Ragel is generating a parser for signed characters on both platforms, it is considering the input to be unsigned on s390x.

Debian has kindly provided a table of how various types operate on different architectures. Indeed, this shows that char is unsigned on s390x but signed on amd64.
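The mismatch can be reproduced with plain arithmetic. The sketch below reinterprets the two UTF-8 bytes of Å both ways and tests them against the ranges from the s390x graphviz output; the helper names are made up for this example.

```python
def as_signed_char(byte):
    # Reinterpret an 8-bit value 0..255 as two's complement -128..127.
    return byte - 256 if byte >= 128 else byte


def in_generated_ranges(value):
    # Ranges from the s390x graphviz output: -128..33, 35..91, 93..127.
    return (-128 <= value <= 33) or (35 <= value <= 91) or (93 <= value <= 127)


raw = "\u00c5".encode("utf-8")               # b'\xc3\x85'
unsigned = list(raw)                         # what an unsigned char sees
signed = [as_signed_char(b) for b in raw]    # what a signed char sees

print(unsigned, [in_generated_ranges(v) for v in unsigned])
# [195, 133] [False, False]
print(signed, [in_generated_ranges(v) for v in signed])
# [-61, -123] [True, True]
```

Values read through an unsigned char land above 127, outside every generated range, so a parser built for -128..127 has no transition for them.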

While trying to find more information on the signed/unsigned distinction in Ragel, I ran into the doc section about alphtype, which allows one to specify the type of the parsed alphabet. Setting it to signed is not allowed (!), suggesting that Ragel was never meant to cope with signed char alphabets. Setting it to unsigned on s390x fixed my issue, without breaking anything on amd64.

A workaround is now in our source tree.

For me, this closes the case. I emailed ragel-users but no definitive reply has come in.

Recursor 3.5.3 released

This is a bugfix and performance update to 3.5.2. It brings serious performance improvements for dual stack users.

Changes since 3.5.2:

  • 3.5 replaced our ANY query with A+AAAA for users with IPv6 enabled. Extensive measurements by Darren Gamble showed that this change had a non-trivial performance impact. We now do the ANY query like before, but fall back to the individual A+AAAA queries when necessary. Change in commit 1147a8b.
  • The IPv6 address for d.root-servers.net was added in commit 66cf384, thanks Ralf van der Enden.
  • We now drop packets with a non-zero opcode (i.e. special packets like DNS UPDATE) earlier on. If the experimental pdns-distributes-queries flag is enabled, this fix avoids a crash. Normal setups were never susceptible to this crash. Code in commit 35bc40d, closes ticket 945.
  • TXT handling was somewhat improved in commit 4b57460, closing ticket 795.