Understanding DNS

The DNS in a nutshell

The Internet Identifiers - names & numbers

The Internet, a global, interconnected network of networks, uses the Internet Protocol (IP) and packet-switching technology to transfer data between different participants on the network. These participants are typically machines with or without human interfaces.

The Internet Protocol uses unique numerical identifiers to identify the participants on the network. These identifiers, or IP addresses, allow packets to be sent to and from the many devices connected to the Internet.

To accommodate the growth of the Internet, IP addresses evolved from 32-bit integer values (IPv4) to 128-bit integer values (IPv6).
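
To make the difference in size concrete, the following minimal Python sketch uses the standard ipaddress module to show that an IPv4 address is a 32-bit integer and an IPv6 address is a 128-bit integer. The function name describe and the two addresses are assumptions made for this illustration; the addresses are reserved documentation examples, not real hosts.

```python
import ipaddress

def describe(address: str) -> tuple[int, int, int]:
    """Return (IP version, address width in bits, address as an integer)."""
    ip = ipaddress.ip_address(address)
    return ip.version, ip.max_prefixlen, int(ip)

# 192.0.2.1 and 2001:db8::1 are documentation addresses (assumed examples).
for text in ("192.0.2.1", "2001:db8::1"):
    version, bits, value = describe(text)
    print(f"IPv{version}: {text} is a {bits}-bit integer with value {value}")
```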

Regardless of the size of these addresses, a secondary alphanumeric identification system was needed to make the Internet usable for humans.

The original Requests for Comments on the Domain Name System (DNS), RFC 882 and RFC 883, were published in November 1983 by Paul Mockapetris. These technical standards evolved into the DNS we know today.

The DNS, at its core, is a high-performance, distributed, hierarchical, resilient lookup service that allows systems on an IP network to translate an alphanumeric identifier into a numeric identifier, the IP address. These alphanumeric identifiers are structured hierarchically: they consist of so-called labels separated by dots, and the hierarchy is read from right to left.

Example:

  • The system that serves this website uses the following alphanumeric identifier: www.tld-isac.eu.
  • When a user on a laptop connected to the Internet types this alphanumeric identifier into the browser, a background process translates it into a numerical identifier, the IP address 85.132.152.91. Once the translation is complete, the browser connects to this IP address.
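
The translation step in this example can be sketched in Python with the standard socket module, which hands the name to the operating system's resolver. This is only an illustration of the lookup from an application's point of view, not a description of what any particular browser does; the function name resolve is an assumption.

```python
import socket

def resolve(name: str) -> list[str]:
    """Ask the operating system's resolver for the IP addresses of a name."""
    results = socket.getaddrinfo(name, None, type=socket.SOCK_STREAM)
    # Each result is (family, type, proto, canonname, sockaddr);
    # the first element of sockaddr is the address as text.
    addresses = [sockaddr[0] for *_, sockaddr in results]
    # Remove duplicates while keeping the order returned by the resolver.
    return list(dict.fromkeys(addresses))
```

Calling resolve("www.tld-isac.eu") on a machine with Internet access returns the addresses published for that name at the time of the lookup, which may differ from the address quoted above.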

The full syntax of the alphanumeric identifier, also known as the fully qualified domain name (FQDN), is in this example www [dot] tld-isac [dot] eu [dot]. The dots indicate the hierarchy of the domain name system and should be read from right to left. In practice, the rightmost dot is omitted.

From right to left, we identify the root (omitted in practical usage), the top-level domain (eu), the domain name (tld-isac), and the hostname (www).
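
The right-to-left reading can be sketched in a few lines of Python. The function name split_fqdn is hypothetical, and the sketch assumes a name with exactly three labels, as in the example.

```python
def split_fqdn(fqdn: str) -> list[str]:
    """Return the labels of a domain name, ordered from the top of the hierarchy down."""
    # Drop the trailing dot that stands for the root, if it is written out.
    labels = fqdn.rstrip(".").split(".")
    # Reverse the labels so that the top-level domain comes first.
    return list(reversed(labels))

tld, domain, host = split_fqdn("www.tld-isac.eu.")
print(f"root: '.', top-level domain: {tld}, domain name: {domain}, hostname: {host}")
```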

DNS hierarchy

The illustration below shows the hierarchy that exists within the domain name system. From top to bottom, every layer is authoritative for the next layer.

  • The root nameservers are authoritative for the top-level domains
  • The top-level domain nameservers are authoritative for the (second-level) domains in their namespace
  • The (second-level) domain nameservers are authoritative for optional subdomains and for the hosts (systems and services) existing in that domain name.
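
As a rough sketch of these layers, the following Python function lists the names that are passed on the way from the root down to a host. Each name in the list is served by nameservers to which the level above it points. The function name delegation_chain is an assumption, and the sketch ignores the fact that a zone may span more than one label.

```python
def delegation_chain(fqdn: str) -> list[str]:
    """List the names from the root down to the full name, one per hierarchy level."""
    labels = fqdn.rstrip(".").split(".")
    chain = ["."]  # the root
    # Add one label at a time, starting from the rightmost label.
    for start in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[start:]) + ".")
    return chain

for level in delegation_chain("www.tld-isac.eu"):
    print(level)
```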

Every layer is managed by different entities:

The root

The authoritative name servers that serve the DNS root zone, commonly known as the "root servers", are a network of hundreds of servers in many countries worldwide. They are configured in the DNS root zone as 13 named authorities and are managed by twelve different organisations.

Different parties monitor their status continuously.

Top-level domains

Top-level domains, or TLDs for short, are represented by the first (and sometimes second) labels in the DNS hierarchy. Because the DNS processes FQDNs from right to left, the first label encountered after the root represents the TLD.

Top-level domains are managed technically, operationally, and commercially by different entities called registry operators (ROs). Some are commercial, some are not-for-profit, and some are governmental or linked to official institutions such as universities. A registry operator is the organisation that maintains the master database (registry) of all domain names registered in a particular top-level domain (TLD). ROs receive requests from registrars to add, delete, or modify domain names, and they make the requested changes in the registry. An RO also operates the TLD's authoritative name servers and generates the zone file.

IANA maintains a reference database of all TLDs at https://www.iana.org/domains/root/db. Its overview lists each TLD, its type, and the TLD Manager.

The generic TLD or gTLD

This is the class of top-level domains that includes general-purpose domains such as .com, .net, .edu, and .org. This class also includes domains associated with the New Generic Top-Level Domain Program (New gTLD Program - 2011), which includes names such as .futbol, .istanbul, and .pizza, and names in other alphabets and languages.

ICANN coordinates the development of the rules and policies that govern the registration of domain names within gTLDs.

Examples: .com, .net, .xyz, .audi, .公司, .paris, etc.

The sponsored TLD or sTLD

Some gTLDs, known as sponsored gTLDs, represent a specific community of Internet users. In these cases, the community's sponsor develops the rules and policies specific to the gTLD. Examples include .aero, .coop, and .museum.

The proposed sTLD must address the needs and interests of a clearly defined community (the Sponsored TLD Community), which can benefit from the establishment of a TLD operating in a policy formulation environment in which the community would participate.

Applicants must demonstrate that the Sponsored TLD Community is:

  • Precisely defined, so it can readily be determined which persons or entities make up that community; and
  • Comprised of persons that have needs and interests in common but which are differentiated from those of the general global Internet community.

There are only 14 sTLDs: .aero, .asia, .cat, .coop, .edu, .gov, .int, .jobs, .mil, .museum, .post, .tel, .travel, .xxx

The country code TLD or ccTLD

This class of top-level domains is reserved for use by countries, territories, and geographical locations identified in the ISO 3166-1 Country Codes list.

ccTLDs base their names on the two-letter country codes defined by the ISO 3166-1 standard (e.g., .jp for Japan, .fr for France, .ke for Kenya), or they can represent a country or territory name in a script other than US-ASCII characters.

Because ccTLDs are managed locally, the rules and policies for registering domain names vary across ccTLDs.

Internationalized Domain Names - IDN

Although DNS labels for hostnames traditionally only support the letters "a" to "z" (case is not significant), the digits "0" to "9" and the hyphen "-", with a limit of 63 characters per label (a label is the text string between the dots in a domain name), it is possible to represent other scripts using the Punycode notation.
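
The character and length rules for a label can be checked with a short Python sketch. The name is_valid_label is hypothetical. The sketch accepts letters in either case, because DNS names are compared without regard to case, and it adds the usual hostname rule that a label may not start or end with a hyphen, which is an assumption beyond the text above.

```python
import re

# Letters, digits and hyphens only, 1 to 63 characters.
# Assumption: no hyphen at the start or end of the label.
LABEL_PATTERN = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")

def is_valid_label(label: str) -> bool:
    """Check a single label (the text between two dots) against the rules above."""
    return LABEL_PATTERN.fullmatch(label) is not None

for candidate in ("tld-isac", "xn--belgi-rsa", "belgië", "a" * 64):
    print(candidate, is_valid_label(candidate))
```

A label such as belgië fails this check, which is why it needs an ASCII representation before it can be used in the DNS.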

Punycode is an encoding technique that is easily recognizable by the "xn--" prefix at the start of the label. The standard for transforming a Unicode string into an ASCII string is specified in RFC 3492.

These internationalized domain names (IDNs) can be used at all levels of the DNS hierarchy, hence the existence of the .eu TLD, the .ею TLD (Cyrillic, represented by .xn--e1a4c), and the .ευ TLD (Greek, represented by .xn--qxa6a). It is up to the application layer to detect and interpret the Punycode correctly and render it accordingly.

Example: www.belgië.be is a valid FQDN, but its Punycode representation (as used by the DNS) is www.xn--belgi-rsa.be; browsers will translate the notations in the background.
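
The translation between the two notations can be tried out with Python's built-in "idna" codec. This is a minimal sketch; the function names to_ascii and to_unicode are assumptions, and the built-in codec follows the older IDNA 2003 rules, so it may treat some characters differently from current browsers.

```python
def to_ascii(name: str) -> str:
    """Encode each label of a domain name into its ASCII (Punycode) form."""
    # The "idna" codec adds the xn-- prefix to every label that needs encoding.
    return name.encode("idna").decode("ascii")

def to_unicode(name: str) -> str:
    """Decode xn-- labels back into their Unicode form."""
    return name.encode("ascii").decode("idna")

print(to_ascii("www.belgië.be"))
print(to_unicode("www.xn--belgi-rsa.be"))
```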

Although IDNs don't exist from a technical DNS perspective, as they are just normal domain names, this encoding scheme does allow the use of all languages and scripts, and even non-languages, in the domain name system.

IDNs are part of the Universal Acceptance initiative.

DNSSEC

The Domain Name System Security Extensions (DNSSEC) is a suite of extension specifications by the Internet Engineering Task Force (IETF) for securing data exchanged in the Domain Name System (DNS) in Internet Protocol (IP) networks. The protocol provides cryptographic authentication of data, authenticated denial of existence, and data integrity, but not availability or confidentiality.

It is a technology that helps secure domain name lookups by incorporating a chain of digital signatures on the resource records into the lookup process. Using DNSSEC, resolvers can determine whether the query responses they receive can be authenticated and verified. By accepting only authenticated query results, resolvers can prevent attackers from hijacking the lookup process and directing Internet users to deceptive websites. Full deployment of DNSSEC ensures that users are connected to the Internet Protocol (IP) address that genuinely corresponds to the domain name specified in a uniform resource locator (URL).

The actual implementation and functioning of DNSSEC are beyond the scope of this document.

Putting it all to work

Now that we have a rudimentary understanding of the components of the domain name system, how does it actually work when browsing the internet?