Introduction

Probably the last time any person or entity had a complete list of all hostnames on the Internet was in the mid-1980s, when the Domain Name System (DNS) replaced the old, centralized DoD Internet Host Table. Some domain registries, such as Verisign, run zone file access programs that let you download a complete zone file, but many country-code top-level domains (ccTLDs) offer no such program. Various companies have collected a huge number of DNS records, usually by crawling large parts of the web, but none of them has a complete list of all domains.

Even those incomplete lists are in the hands of only a few companies such as Google, Microsoft (Bing) and DomainTools LLC (whois.sc), presumably most major ISPs, and some research organizations like DNS-OARC, where they are treated as closely guarded secrets. Other than the zone files provided by Verisign and some other registries, few datasets are freely available for research or data mining.

The DNS Census 2013 is an attempt to provide a public dataset of registered domains and DNS records. It was inspired by the Internet Census 2012, which showed that releasing data anonymously via BitTorrent is a good thing to do. The dataset contains about 2.5 billion DNS records gathered in 2012 and 2013.

The data

All data is compressed using xz/LZMA2.
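
Because the files are xz-compressed, they can be read as a stream without unpacking them to disk first. The following is a minimal sketch using Python's standard lzma module; the function name iter_xz_lines and the UTF-8 decoding with replacement of invalid bytes are assumptions made for illustration, not something the dataset specifies.

```python
import lzma
from typing import Iterator


def iter_xz_lines(path: str) -> Iterator[str]:
    """Yield the lines of an xz-compressed text file, one at a time."""
    # ASSUMPTION: the content is decoded as UTF-8; undecodable bytes are
    # replaced rather than raising, since the encoding is not documented.
    with lzma.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            # Strip only the trailing newline, keep everything else as-is.
            yield line.rstrip("\n")
```

Reading line by line keeps memory use flat, which matters for files that decompress to many gigabytes.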

DNS records

DNS records are stored in CSV files, one file per DNS record type (A, AAAA, CNAME, DNAME, MX, NS, SOA, TXT). The records are sorted lexicographically by hostname and by time.
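
Since the records are sorted by hostname, all rows for one hostname sit next to each other and can be grouped in a single pass. The sketch below illustrates this with itertools.groupby. The column layout is not described here, so the code assumes that the hostname is the first CSV column; the function name group_by_hostname is likewise hypothetical, and the real files should be checked before relying on it.

```python
import csv
from itertools import groupby
from typing import Iterable, Iterator, List, Tuple


def group_by_hostname(lines: Iterable[str]) -> Iterator[Tuple[str, List[List[str]]]]:
    """Group consecutive CSV rows that share the same first column."""
    # ASSUMPTION: column 0 holds the hostname. The remaining columns are
    # passed through untouched because their meaning is not documented here.
    rows = (row for row in csv.reader(lines) if row)  # skip empty lines
    for hostname, group in groupby(rows, key=lambda row: row[0]):
        yield hostname, [row[1:] for row in group]
```

groupby only merges adjacent rows, so this relies on the sort order; on unsorted input the same hostname would show up in several groups.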

Registered domains lists

If you only care about currently registered domains (e.g. example.com, nominet.org.uk, …) but not subdomains (e.g. www.example.com, nom-ns1.nominet.org.uk, …), the registered domains lists are for you. The lists contain only those hostnames for which a DNS record has been observed on or after January 1st, 2013. Each file contains the hostnames registered under one top-level domain. The lists are sorted lexicographically.
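
Because the lists are sorted, a lookup does not need a linear scan. The sketch below uses binary search on a list that has already been loaded into memory. It assumes one hostname per line, lowercase names without a trailing dot, and a sort order that matches Python's plain string comparison; none of this is confirmed by the description above, and is_listed is a hypothetical helper name.

```python
from bisect import bisect_left
from typing import Sequence


def is_listed(sorted_domains: Sequence[str], domain: str) -> bool:
    """Return True if domain occurs in an already sorted sequence of names."""
    # ASSUMPTION: entries are lowercase and carry no trailing dot, so the
    # query is normalized the same way before searching.
    needle = domain.rstrip(".").lower()
    index = bisect_left(sorted_domains, needle)
    return index < len(sorted_domains) and sorted_domains[index] == needle


# Made-up sample entries, for illustration only.
domains = ["alpha.com", "example.com", "zeta.com"]
print(is_listed(domains, "Example.com."))
print(is_listed(domains, "www.example.com"))
```

If the files turn out to use a different collation, sorting the loaded list once with sorted() restores the precondition for bisect_left.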