Two days ago, I wrote about how to profile traffic to recognize DNS over HTTPS (DoH). Being recognizable is a problem for DoH: if you can see it, you may be able to block it. On Twitter, a few people chimed in with feedback about recognizing DoH. I checked a couple of other clients but didn't have a ton of time, so this is still very preliminary:
- Firefox seems to be the most solid DoH implementation. Firefox DoH queries look like any other Firefox HTTP2 connection, except for the packet sizes I observed.
- Standalone DoH gateways, like "cloudflared," are easy to spot. They use somewhat unusual TLS Client Hellos. For example, they do not use SNI (Server Name Indication), which is a default feature in most TLS clients. I guess they are trying to avoid signatures that look for well-known DoH endpoints, but in some ways, "no SNI" is a signature too.
- Chrome: I couldn't get DoH to work in Chrome 79 on macOS. Sorry. DoH is an experimental feature in Chrome, but maybe I just did something stupid. Needs more work.
But to come back to the initial observation: the DoH traffic preferred specific packet sizes. I looked into this because the sizes didn't seem random, which suggests they leak information.
I found a couple of interesting issues. First of all, the DNS query and response are each sent as two packets. The first packet contains the HTTP2 headers; the second carries the payload, which is our DNS message. For requests and responses, the HTTP2 headers are always the same. The AES encryption, as performed by TLS 1.3, does not change the length of the data. It just adds headers and integrity checks, but the size of the encrypted data matches the size of the decrypted data. This makes it easy to identify the packets that contain the HTTP2 headers for requests and responses:
% tshark -nr doh.pcap -T fields -e tcp.len -Y 'tcp.len>0' | sort | uniq -c | sort -k2 -n > doh-tcplen.txt
gnuplot> set xrange [0:500]
gnuplot> plot "doh-tcplen.txt" using 2:1 with lines
The two most common packet sizes correspond to the responses (smaller packets, 38 bytes) and requests (larger packets, 45-50 bytes in this example). The HTTP2 HEADERS record for requests is slightly larger than the one for responses, as it includes the URL, the "authority" (hostname), and other details. The size of these requests can be used to deduce which DoH provider is being used.
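The same "most common sizes" can be pulled out of the histogram file without plotting it. This is a minimal sketch, assuming the input has the `<count> <tcp.len>` layout produced by the `sort | uniq -c` pipeline above; the function names are hypothetical and the sample numbers are made up for illustration, not taken from my capture.

```python
from typing import List, Tuple


def parse_histogram(text: str) -> List[Tuple[int, int]]:
    """Parse `uniq -c` style lines ("<count> <tcp.len>") into (size, count) pairs."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        count, size = int(fields[0]), int(fields[1])
        pairs.append((size, count))
    return pairs


def most_common_sizes(pairs: List[Tuple[int, int]], n: int = 2) -> List[int]:
    """Return the n payload sizes seen most often, most frequent first."""
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    return [size for size, _count in ranked[:n]]


# Made-up sample in the same format as doh-tcplen.txt (NOT real capture data).
SAMPLE = """\
  12 31
 140 38
  97 47
   8 120
"""

print(most_common_sizes(parse_histogram(SAMPLE)))
```

To run it against a real capture, read the file instead of `SAMPLE`, for example `parse_histogram(open("doh-tcplen.txt").read())`.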
The second packet, the content of the query or the response, is interesting as the packet length is related to the size of the query. The diagram below illustrates that all other fields in the TCP payload have a fixed size, and only the query string is variable.
- 5 bytes of TLS header
- encrypted data (which breaks down further):
  - HTTP2 Header (9 Bytes for the DATA segment)
  - DNS header (12 Bytes)
  - query name (variable, starting with a length byte for the first label and ending with 0)
  - query type (2 Bytes)
  - query class (2 Bytes)
  - DNS options (19 Bytes)
  - 0 byte (end of data)
- MAC (16 Bytes)
Together, the fixed-size fields listed above add 66 bytes (5 + 9 + 12 + 2 + 2 + 19 + 1 + 16) to the TCP length. The rest is our DNS query name.
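To make the leak concrete, here is a rough sketch of how an observer could turn a TCP payload length back into the length of the query name. It uses the field sizes exactly as listed above and assumes the name is counted in DNS wire format (one length byte per label plus the terminating 0). The function names are hypothetical, and the 19-byte options value only holds for the Firefox queries I looked at.

```python
# Fixed-size fields from the breakdown above, in bytes.
FIXED_FIELDS = {
    "tls_record_header": 5,
    "http2_data_frame_header": 9,
    "dns_header": 12,
    "query_type": 2,
    "query_class": 2,
    "dns_options": 19,   # as observed for Firefox; other clients may differ
    "end_of_data_byte": 1,
    "mac": 16,
}
FIXED_OVERHEAD = sum(FIXED_FIELDS.values())


def wire_name_length(hostname: str) -> int:
    """Length of a hostname in DNS wire format (length bytes + labels + final 0)."""
    labels = [label for label in hostname.rstrip(".").split(".") if label]
    return sum(1 + len(label) for label in labels) + 1


def expected_tcp_len(hostname: str) -> int:
    """TCP payload length an unpadded query for this hostname would produce."""
    return FIXED_OVERHEAD + wire_name_length(hostname)


def estimate_name_length(tcp_len: int) -> int:
    """What an observer does: subtract the fixed overhead from the observed length."""
    return tcp_len - FIXED_OVERHEAD


observed = expected_tcp_len("www.example.com")
print(observed, estimate_name_length(observed))
```

The point of the sketch is only that the subtraction is trivial: without padding, every hostname length maps to exactly one packet size.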
So what is the solution? One was defined specifically for this problem. At first, I considered this a bug in Firefox and reported it, but I was pointed to an already open bug about this issue. One of our readers left a detailed comment pointing out this issue as well. The solution is, as so often with crypto, padding. DNS has a special option for it, and this option was introduced specifically for this use case. The idea is that the client appends some data to the end to obscure the size of the query. The RFCs suggest always padding to a specific size, but random padding would probably work too. The server should also pad its response to prevent the same attack on the response. At least that is what the RFC suggests.
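"Pad to a specific size" boils down to a small calculation. The sketch below assumes block-length padding with a 128-byte block for queries, which is the value RFC 8467 recommends for clients; treat the constant and the function name as assumptions, not as what any particular client does. The input is the length of the DNS message before the padding option is added, and the 4 bytes account for the option's own code and length fields.

```python
PADDING_OPTION_HEADER = 4  # 2-byte option code + 2-byte option length
QUERY_BLOCK_SIZE = 128     # assumption: block size recommended for queries in RFC 8467


def padding_length(unpadded_len: int, block: int = QUERY_BLOCK_SIZE) -> int:
    """Number of padding bytes needed so the final message is a multiple of `block`.

    `unpadded_len` is the size of the DNS message (including the OPT record)
    before the padding option is appended.
    """
    return (-(unpadded_len + PADDING_OPTION_HEADER)) % block


def padded_length(unpadded_len: int, block: int = QUERY_BLOCK_SIZE) -> int:
    """Total message size after appending the padding option."""
    return unpadded_len + PADDING_OPTION_HEADER + padding_length(unpadded_len, block)


for size in (40, 52, 124, 125):
    print(size, padding_length(size), padded_length(size))
```

With this policy, every short query ends up the same size on the wire, so the name length no longer shows in the packet length.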
I used a sample of DoH providers and crafted an HTTP2 request with and without padding to see how they respond. To craft the DNS payload, I used scapy. I could probably have used scapy for the HTTP2 part as well, but found it simpler to just send the DNS payload over HTTP2 using curl. Scapy does have a contributed HTTP2 extension if anybody wants to make this prettier. The reader commenting on the earlier post used sdig, which also looks interesting for these tests. It works like 'dig', but supports DoH. sdig is part of the PowerDNS project.
Here is the scapy part without padding:
For the "no padding" test, I still included the standard/default EDNS0 option.
and with padding:
The difference is just the final 8 bytes:
Padding option code: 000C (2 bytes), padding length: 0004 (2 bytes), and four zero bytes of padding. I picked a short padding length to save myself some typing.
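For readers without scapy at hand, the two payloads can also be assembled with nothing but the Python standard library. This is a minimal sketch, not the scapy code I used: the helper names are hypothetical, and the header values (ID 0, RD flag set, an NS query for www.example.com, a 4096-byte UDP size, and an empty client-subnet option) are simply read off the hex string used in the curl test. Note that this sketch computes the OPT record's RDLENGTH from the option bytes it is given.

```python
import struct


def encode_name(hostname: str) -> bytes:
    """Encode a hostname in DNS wire format: length-prefixed labels, then a 0 byte."""
    out = b""
    for label in hostname.rstrip(".").split("."):
        out += bytes([len(label)]) + label.encode("ascii")
    return out + b"\x00"


def edns_option(code: int, data: bytes) -> bytes:
    """One EDNS option: 2-byte code, 2-byte length, then the option data."""
    return struct.pack("!HH", code, len(data)) + data


def build_query(hostname: str, qtype: int, options: bytes) -> bytes:
    """DNS query with one question and one OPT record carrying `options`."""
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 1)  # ID 0, RD set, 1 question, 1 additional
    question = encode_name(hostname) + struct.pack("!HH", qtype, 1)  # class IN
    # OPT record: root name, type 41, UDP size 4096, extended flags 0, RDLENGTH
    opt_rr = b"\x00" + struct.pack("!HHIH", 41, 4096, 0, len(options)) + options
    return header + question + opt_rr


CLIENT_SUBNET = edns_option(8, bytes.fromhex("00010000"))  # family 1, prefix lengths 0
PADDING = edns_option(12, b"\x00" * 4)                      # option 12 = padding

no_pad = build_query("www.example.com", 2, CLIENT_SUBNET)
padded = build_query("www.example.com", 2, CLIENT_SUBNET + PADDING)

print(len(no_pad), no_pad.hex(" "))
print(len(padded), padded.hex(" "))
```

The printed hex can be piped through `xxd -p -r` into curl the same way as a string copied out of scapy.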
To verify that I got a reasonable valid packet, I used Wireshark from scapy:
Next, using scapy to extract the DNS payload in hexadecimal:
And sending the string with curl (simple copy-paste for the string and using xxd to convert it to binary for curl):
% echo -n '00 00 01 00 00 01 00 00 00 00 00 01 03 77 77 77 07 65 78 61 6D 70 6C 65 03 63 6F 6D 00 00 02 00 01 00 00 29 10 00 00 00 00 00 00 0F 00 08 00 04 00 01 00 00 00 0C 00 04 00 00 00 00' | xxd -p -r | curl -H 'content-type: application/dns-message' --data-binary @- https://doh.cleanbrowsing.org/doh/family-filter/ -o - | xxd
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 160 100 100 100 60 170 102 --:--:-- --:--:-- --:--:-- 272
00000000: 0000 8180 0001 0000 0001 0001 0377 7777 .............www
00000010: 0765 7861 6d70 6c65 0363 6f6d 0000 0200 .example.com....
00000020: 01c0 1000 0600 0100 000e 1000 2c02 6e73 ............,.ns
00000030: 0569 6361 6e6e 036f 7267 0003 6e6f 6303 .icann.org..noc.
00000040: 646e 73c0 3078 5958 9500 001c 2000 000e dns.0xYX.... ...
00000050: 1000 1275 0000 000e 1000 0029 0200 0000 ...u.......)....
00000060: 0000 0000 ....
The xxd at the end is used to inspect the result. Here are some of the results I found using a sample of public DoH endpoints:
|Provider|Result|
|---|---|
|Cloudflare|No response with padding, but adds padding to all responses if query is not padded.|
|AdGuard|Error: "overflow unpacking opt"|
|Cisco Umbrella|Error: "Malformed DNS Query"|
|Cleanbrowsing|Valid response, but no padding in response|
The fact that I got responses from Cleanbrowsing and Quad9 should indicate that my request was valid. But these are the only two providers that responded. Some sent errors back, so I want to go back and double-check my work above.
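One way to double-check the padded query is to walk the bytes and compare the RDLENGTH declared in the OPT record with the number of option bytes that actually follow it. The sketch below does that for the exact hex string passed to curl above. The function names are hypothetical, and it assumes an uncompressed query whose only additional record is the OPT record.

```python
import struct

# The padded query exactly as it was passed to curl above.
QUERY_HEX = (
    "00 00 01 00 00 01 00 00 00 00 00 01 03 77 77 77 07 65 78 61 6D 70 6C 65 "
    "03 63 6F 6D 00 00 02 00 01 00 00 29 10 00 00 00 00 00 00 0F 00 08 00 04 "
    "00 01 00 00 00 0C 00 04 00 00 00 00"
)


def skip_name(msg: bytes, offset: int) -> int:
    """Return the offset just past an uncompressed DNS name."""
    while msg[offset] != 0:
        offset += 1 + msg[offset]
    return offset + 1


def check_opt(msg: bytes):
    """Return (declared RDLENGTH, option bytes actually present) for the OPT record."""
    qdcount = struct.unpack_from("!H", msg, 4)[0]
    offset = 12  # skip the DNS header
    for _ in range(qdcount):
        offset = skip_name(msg, offset) + 4  # name + query type + query class
    # Assumption: the OPT record is the first record after the question section.
    offset = skip_name(msg, offset)
    rtype, _udp_size, _ext_flags, rdlength = struct.unpack_from("!HHIH", msg, offset)
    if rtype != 41:
        raise ValueError("first record after the question is not an OPT record")
    offset += 10
    return rdlength, len(msg) - offset


declared, actual = check_opt(bytes.fromhex(QUERY_HEX))
print("declared RDLENGTH:", declared, "actual option bytes:", actual)
```

For the string above, this reports a declared RDLENGTH of 15 against 16 bytes of option data (8 for the client-subnet option plus 8 for the padding option). That mismatch might account for some of the parse errors, but that is a guess I have not verified against the providers.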
Lots of work left. I will publish pcaps shortly (but it is always a pain to get clean pcaps without much additional stuff, so give me a few hours).
Johannes B. Ullrich, Ph.D., Dean of Research, SANS Technology Institute