What Happens When You Type google.com in Your Browser and Press Enter
This article reviews the frontend and backend processes that occur whenever a user tries to access a service over the internet from a web browser.
Have you ever wondered what happens behind the scenes, seemingly like magic, whenever a page appears after you type google.com or send some other request to a website via your browser? Then digging into this article will help satisfy that curiosity and help you begin thinking like an expert. The aim of this article is to be as exhaustive as possible, and it is open to criticism.
This article will try to unravel some of the critical processes that happen between your computer and the computer where google.com resides. So that we are on the same page, and to avoid sounding too technical, I have included a glossary at the end of this article that explains some technical terms commonly used in networking articles.
INTRODUCTION
Whenever you communicate over the internet, the entire process is guided by a set of protocols that define how each piece of infrastructure along the way will interpret your message. In both the Open Systems Interconnection (OSI) seven-layer model and the TCP/IP four-layer model, the topmost layer, the application layer, is where communication begins in a web browser. The protocols found at this layer include the HyperText Transfer Protocol (HTTP), Secure Shell (SSH), File Transfer Protocol (FTP), Simple Mail Transfer Protocol (SMTP), and many more. We will discuss HTTP and its secure version HTTPS (HTTP Secure) later in this article.
Using the TCP/IP four-layer model as a guide to break down the entire communication process, this article is subdivided into sections that each handle part of the communication between a client and a server. We will start from the client side (the frontend) and end the journey at the server side (the backend), where the client's request is processed whenever the client requests https://google.com or some other service and the respective server responds.
THE DOMAIN NAME SYSTEM (DNS)
Every host on the internet has a numeric address, the IP address, that uniquely identifies it, and under normal circumstances no two hosts within the same network should have the same IP address. However, these numbers are difficult for humans to memorize, hence the need for domain names like google.com, dee-light.tech, amazon.com, and many more that reference these IP addresses.
The DNS thus serves as the link between these names and their respective IP addresses. In order to resolve the IP address of a domain name, the web browser will typically check its cached values for that domain name first. If the domain name is not found in its cache, the browser will turn to a DNS resolver.
Usually, the DNS resolver is a service run by your Internet Service Provider (ISP). The resolver will first check its own cached values, and if the domain name is present, it will return the respective IP address. If the domain name is not present, the resolver will then reach for a root DNS server.
A root DNS server contains a list of known Top Level Domain (TLD) DNS servers. The resolver then follows through to the TLD DNS server, while also caching the address of the TLD DNS server for future use. The respective TLD DNS server contains the list of authoritative name servers for the domain names under it. For instance, the .com TLD server will contain the authoritative name servers of all websites that use the .com TLD, including google.com. The authoritative name server of a domain is the server attached to a given domain name that holds information (e.g. the IP address) about that domain name. The IP address returned by the authoritative name server to the resolver is then returned to the browser, and is cached by both the resolver and the browser for future use.
The TLD DNS servers are generally updated by the domain registry, which is the organization that controls the policies and regulations of domain purchase. The authoritative name server(s) attached to a domain name are managed by the company that runs the domain name, in conjunction with the company through which the domain name was purchased, typically called a domain registrar.
Note that this entire procedure is not followed every time you request the IP address of a domain name via your browser. Cached values from previous resolutions are reused whenever they are available. Each cached entry also has a Time To Live (TTL), which is the amount of time the cached value for a domain can stay in the cache before a new request must be made to the DNS.
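The lookup order described in this section (cache first, then root, TLD, and authoritative servers, with a TTL on each cached answer) can be sketched roughly as follows. This is a toy model, not a real DNS client: the server tables, the class name CachingResolver, and the address 192.0.2.10 (a documentation-only placeholder, not Google's real address) are all assumptions made for illustration.

```python
import time

# Hypothetical, hard-coded stand-ins for real DNS servers (illustration only).
ROOT_SERVER = {"com": "tld-com"}                                    # TLD -> TLD server
TLD_SERVERS = {"tld-com": {"google.com": "ns-google"}}              # domain -> authoritative server
AUTHORITATIVE = {"ns-google": {"google.com": ("192.0.2.10", 300)}}  # domain -> (IP, TTL in seconds)


class CachingResolver:
    """Toy resolver: answer from cache if fresh, else walk root -> TLD -> authoritative."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._cache = {}            # domain -> (ip, expiry time)
        self.upstream_lookups = 0   # how many times the full hierarchy was walked

    def resolve(self, domain):
        cached = self._cache.get(domain)
        if cached and cached[1] > self._clock():
            return cached[0]                            # still fresh: no upstream query
        self.upstream_lookups += 1
        tld = domain.rsplit(".", 1)[-1]
        tld_server = ROOT_SERVER[tld]                   # 1. root points to the TLD server
        name_server = TLD_SERVERS[tld_server][domain]   # 2. TLD points to the authoritative server
        ip, ttl = AUTHORITATIVE[name_server][domain]    # 3. authoritative server answers
        self._cache[domain] = (ip, self._clock() + ttl)
        return ip


resolver = CachingResolver()
first = resolver.resolve("google.com")    # walks the hierarchy
second = resolver.resolve("google.com")   # served from the cache
```

The second call never touches the simulated servers, which is the saving that caching buys; once the TTL passes, the next call walks the hierarchy again.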
HTTP / HTTPS
HyperText Transfer Protocol / HyperText Transfer Protocol Secure is an application layer protocol that facilitates communication between a client and a server over the internet.
Whatever service you need from a server, for instance accessing Google services by visiting google.com, is made possible because there is a common language that both the server and the client understand. This language is HTTP.
A client will typically send a request to a server, and the server will send back a response. HTTP works by encapsulating the bare message (request or response) in an HTTP message. An HTTP message contains a header and an optional body. The header starts with a line carrying the command (for a request) or the status (for a response), followed by key-value pairs that describe the message and the host that is sending it. The optional body carries data content whenever it is needed.
As an example, when we visit google.com, an HTTP request message is sent from our browser (the client) with a command in the header that says, in effect, GET the google.com home page for me. This request message will not have a body, since there is no data you are transferring to google.com in this case. The responding server will send back an HTTP response message whose header includes a response code and some metadata about the returned message, while the body contains the HTML document of the google.com homepage. The browser then fetches the other assets that the HTML references (CSS files, JS files, images, and other resources) with further HTTP requests in order to render the page.
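To make the header/body split concrete, here is a minimal sketch that builds the text of a GET request and parses a response. The function names and the sample response are made up for illustration; a real browser sends many more headers.

```python
def build_get_request(host, path="/"):
    """Return the raw bytes of a minimal HTTP/1.1 GET request (header only, no body)."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, path, protocol version
        f"Host: {host}",          # the Host header is required in HTTP/1.1
        "Connection: close",
        "",                       # a blank line ends the header
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw):
    """Split a raw HTTP response into (status code, headers dict, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    status_code = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


# A hand-written sample response, standing in for what a server might send back.
SAMPLE_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=UTF-8\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<html></html>"
)

request_bytes = build_get_request("google.com")
status, headers, body = parse_response(SAMPLE_RESPONSE)
```

Here status is the response code, headers holds the metadata, and body holds the HTML that the browser would go on to render.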
HTTPS (HTTP Secure) is a secure version of HTTP. It adds an extra layer of security on top of normal HTTP. This layer is provided by the TLS (Transport Layer Security) cryptographic protocol, the successor to the now-deprecated SSL (Secure Sockets Layer).
HOW THE TLS HANDSHAKE HAPPENS (CHATGPT): The following describes the RSA key exchange used in TLS 1.2 and earlier; TLS 1.3 replaces it with an ephemeral Diffie-Hellman exchange, but the goal of agreeing on shared symmetric keys is the same. When a browser connects to a website server, the server sends a TLS certificate that contains the server's asymmetric public key and other information. The browser generates a random pre-master secret and uses the server's public key (from the TLS certificate) to encrypt it. This encrypted message is sent to the server. The server, using its private key, decrypts the message to obtain the pre-master secret. Both the client and the server independently use the pre-master secret, along with additional random values exchanged during the handshake, to compute the master secret. The master secret is then used as the basis for generating symmetric keys. Because both sides derive the same symmetric keys, these keys can be used for encrypting and decrypting all further communication between the two parties.
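The key point of the handshake, that both sides independently arrive at the same symmetric key from the same inputs, can be illustrated with a toy derivation. Note the assumptions: derive_session_key is a hypothetical helper that uses a single HMAC-SHA256 call, which is not the real TLS key schedule, and the public-key encryption of the pre-master secret is left out entirely.

```python
import hashlib
import hmac
import secrets


def derive_session_key(pre_master_secret, client_random, server_random):
    """Toy derivation: HMAC-SHA256 over both random values, keyed by the shared secret.

    This is NOT the real TLS key schedule; it only shows that identical inputs
    give an identical key on both sides.
    """
    return hmac.new(pre_master_secret, client_random + server_random, hashlib.sha256).digest()


# Random values that each side sends in the clear during the handshake.
client_random = secrets.token_bytes(32)
server_random = secrets.token_bytes(32)

# The client picks the pre-master secret. In the flow described above it would
# travel to the server encrypted with the server's public key (omitted here).
pre_master_secret = secrets.token_bytes(48)

# Each side runs the same computation on its own machine.
client_key = derive_session_key(pre_master_secret, client_random, server_random)
server_key = derive_session_key(pre_master_secret, client_random, server_random)
```

An eavesdropper who sees only the two random values cannot repeat the computation, because the pre-master secret never crosses the network unencrypted.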
TCP / IP
Transmission Control Protocol / Internet Protocol is a suite of protocols that ensures that data is delivered from one point to another on a network. For instance, after our browser encapsulates our request as an HTTP request message, this request needs to get to the server, and the server also needs to send back an HTTP response. The way this transportation (back and forth) occurs from one end to the other is based on the TCP/IP suite. In its most basic form, this suite can be divided into its two namesake protocols, i.e. the Transmission Control Protocol and the Internet Protocol.
The Internet Protocol is concerned with addressing: every host on a network has an assigned IP address. For a host to be fully functional on a network and able to transmit and receive data, it must have an IP address that uniquely identifies it on the network. Usually a hierarchy of organizations such as ICANN (Internet Corporation for Assigned Names and Numbers), the RIRs (Regional Internet Registries), and Internet Service Providers (ISPs) is responsible for managing, monitoring, and assigning IP addresses to different hosts on the internet.
While the IP ensures that every host on a network can be identified and located, the transmission control protocol ensures that data is properly transmitted from one host (with an IP address) to another host (with an IP address) on a given network. The TCP ensures the following:
a. Establish connection between two hosts for data transfer.
b. Break down data into smaller pieces (called data segments) for easier and faster transmission over the network.
c. Checksums and sequence numbering, so that corrupted or lost data segments are detected and resent, and the data segments are reassembled in the correct order at the receiving end.
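Points b and c can be sketched in a few lines. This is only an illustration under simplifying assumptions: the helper names are invented, segments are numbered 0, 1, 2, ... rather than by byte offset as real TCP does, and CRC32 stands in for the TCP checksum.

```python
import zlib


def segment(data, size):
    """Split data into numbered segments: (sequence number, checksum, payload)."""
    return [
        (seq, zlib.crc32(data[start:start + size]), data[start:start + size])
        for seq, start in enumerate(range(0, len(data), size))
    ]


def reassemble(segments):
    """Verify each checksum, then rebuild the data in sequence-number order."""
    for seq, checksum, payload in segments:
        if zlib.crc32(payload) != checksum:
            raise ValueError(f"segment {seq} is corrupted and must be resent")
    ordered = sorted(segments, key=lambda item: item[0])
    return b"".join(payload for _, _, payload in ordered)


message = b"GET / HTTP/1.1\r\nHost: google.com\r\n\r\n"
parts = segment(message, 8)
# Segments may arrive in any order; the sequence numbers restore the original order.
rebuilt = reassemble(list(reversed(parts)))
```

Reversing the list simulates out-of-order arrival; a payload that no longer matches its checksum raises an error, which is where a real sender would retransmit.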
The transport layer also offers an alternative to TCP called UDP (User Datagram Protocol), which likewise carries data between hosts but without the connection setup, ordering, and retransmission overhead. This means that UDP does not guarantee the delivery of all the data segments: when segments are lost or damaged during transmission, the protocol itself does nothing to recover them. There is also no established connection between hosts using UDP. In other words, UDP skips the handshake that TCP performs to establish a connection between hosts before sending data. UDP is suitable for applications that can tolerate some data loss or errors, such as online gaming, online streaming, or voice over IP.
In the OSI model, UDP and TCP are classified under the Transport Layer. The TCP/IP model keeps a Transport Layer as well; what the OSI model calls the Network Layer corresponds to the Internet Layer of the TCP/IP model, which is where IP operates.
Multiplexing on the transport layer: TCP (or UDP) allows for multiplexing. Multiplexing uses different ports on a host to deliver each incoming data segment to the right application among those running on the same device. Ports here do not refer to physical ports on devices; they are logical identifiers that distinguish different processes running on the same host, in order to avoid confusion between data segments sent over the same transport layer. For servers, this can mean having different listening ports for the different services running on the same server, and for simple personal devices this can mean using different receiving ports for data coming from different applications over the same transport layer protocol.
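As a rough picture of how ports keep traffic for different applications apart, the sketch below models a host as a table from port number to application inbox. The Host class and its methods are invented for this example; a real operating system does this inside its network stack.

```python
class Host:
    """Toy host that hands each incoming segment to the application bound to its port."""

    def __init__(self):
        self._listeners = {}   # port number -> list acting as that application's inbox

    def bind(self, port):
        """Reserve a port for one application and return its inbox."""
        if port in self._listeners:
            raise OSError(f"port {port} is already in use")
        inbox = []
        self._listeners[port] = inbox
        return inbox

    def deliver(self, dest_port, payload):
        """Return True if an application is listening on dest_port, else False."""
        inbox = self._listeners.get(dest_port)
        if inbox is None:
            return False       # nothing is listening on this port
        inbox.append(payload)
        return True


server = Host()
web_inbox = server.bind(80)    # e.g. a web service
ssh_inbox = server.bind(22)    # e.g. a remote shell service
server.deliver(80, b"GET / HTTP/1.1")
server.deliver(22, b"ssh handshake")
```

Both segments arrive at the same host over the same transport protocol, yet each ends up only in the inbox of the application that owns the destination port.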
FIREWALLS
Firewalls are software or hardware (or both) that serve as a division between a private network and a public network (usually the internet). Firewalls allow, limit, or block network traffic going to or coming from the private network based on a set of preconfigured rules, often organized as an Access Control List (ACL). ACL rules can be based on IP addresses, port numbers, applications, or even sophisticated machine learning predictions on network traffic.
Firewalls can be used to restrict access to certain resources and services on a private network by unauthorized individuals. They can either be host-based (as a software) to protect a single device or they can be network based as a hardware installed between the internet and the private network to protect a network of devices.
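A rule list of this kind can be sketched as a first-match-wins check with a default of deny. The rules, address ranges, and the decide function below are hypothetical and chosen only to illustrate the idea; real firewalls match on many more fields.

```python
import ipaddress

# Hypothetical rules: (action, source network, destination port or None for any port).
RULES = [
    ("deny",  ipaddress.ip_network("203.0.113.0/24"), None),  # block a whole address range
    ("allow", ipaddress.ip_network("0.0.0.0/0"), 443),        # HTTPS from anywhere
    ("allow", ipaddress.ip_network("10.0.0.0/8"), 22),        # SSH from the private network only
]


def decide(source_ip, dest_port, rules=RULES):
    """Return the action of the first matching rule; deny when nothing matches."""
    source = ipaddress.ip_address(source_ip)
    for action, network, port in rules:
        if source in network and port in (None, dest_port):
            return action
    return "deny"


https_from_outside = decide("198.51.100.7", 443)   # matches the second rule
ssh_from_outside = decide("198.51.100.7", 22)      # matches no rule, so it is denied
```

Because the first matching rule wins, the order of the list matters: the blocked range is denied even on port 443, since its rule comes first.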
At this point, you should have a good grasp of what happens on the client side when you visit any website or try to access some service on the internet. The next sections build on the knowledge we have gathered up until now. Going forward, we will address what goes on at the server side of things, considering the major pieces of infrastructure that are set up at the backend to respond to users' requests.
WEB SERVER vs. APPLICATION SERVER
The term server is defined in the glossary at the end of this article.
There are different types of servers on the internet, and they are typically classified by the type and kind of service that they host. For instance, while considering the Domain Name System, we looked at how the domain name resolver hopped from one type of domain name server to another (namely the root DNS server, TLD DNS server, and finally the name server).
In this section, we will consider two important types of servers that are often confused with one another. They are the kind of servers that you will interact with on a daily basis, and the difference between them can be blurry: the web server vs. the application server.
A web server (also called an HTTP server) handles HTTP requests from the client and responds with static content (such as HTML pages, videos, images, and other kinds of static assets), a redirection, or a dynamic response generated by a server-side application.
The main job of the web server is to handle HTTP requests and respond with static content, which is basically pre-stored content or information. Take, for instance, a request for the Google home page. The look of the Google home page comes from static HTML content and assets that are pre-stored on their side and can easily be retrieved.
The web server can perform some logic, but such logic is usually restricted. Where a web server serves dynamic content, that content is usually generated by a server-side application, and the web server simply provides the environment for executing that application. The logic implemented in this case is often integrated on the client side or in the database server that the web server interacts with. The combination of a web server with a database server is usually called a two-tier web infrastructure design.
On the other hand, the application server does more than just accept an HTTP request and return static content. An application server exposes the business logic of a service and also generates dynamic content based on that business logic. Let's say, for instance, that we have an e-commerce site where purchases and transactions occur as a service. The application server will be the server that processes the request from the client and generates dynamic responses based on the business logic implemented. The application server will also interact with the database server, retrieving and storing information according to the business logic. So here, the database is not responsible for handling any of the logic.
The application server can handle more protocols than the common HTTP. It can also serve up static content like a web server.
It is, however, common practice to implement both a web server and an application server for the same website in a three-tier web infrastructure design that also includes the database server. In the three-tier web infrastructure, the web server handles HTTP requests and acts as a proxy server, forwarding requests that require business logic to the application server and then returning the generated content from the application server to the client.
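The division of labour in the three-tier design can be sketched with two plain functions. Everything here is made up for illustration: the paths, the price table standing in for the database server, and the function names. Real web and application servers are separate programs that talk over the network.

```python
# Hypothetical pre-stored files that the web server can return as-is.
STATIC_FILES = {"/": "<h1>Welcome</h1>", "/about.html": "<h1>About us</h1>"}

# Hypothetical data that would normally live on the database server.
PRICES = {"book": 12.50, "pen": 1.20}


def application_server(path, query):
    """Business logic: compute an order total from the stored prices."""
    if path == "/api/total":
        total = sum(PRICES[item] * quantity for item, quantity in query.items())
        return 200, f"total={total:.2f}"
    return 404, "unknown endpoint"


def web_server(path, query=None):
    """Serve static content directly; forward paths under /api/ to the application server."""
    if path in STATIC_FILES:
        return 200, STATIC_FILES[path]
    if path.startswith("/api/"):
        return application_server(path, query or {})
    return 404, "not found"


home_page = web_server("/")                                     # static: answered by the web server
order_total = web_server("/api/total", {"book": 2, "pen": 5})   # dynamic: forwarded onward
```

The client only ever talks to web_server; whether the answer came from a stored file or from business logic is invisible to it.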
DATABASE
A database is a collection of related information. The two major types of databases in use today are relational and non-relational databases. In a relational database, the information is stored in tables, where each record is stored as a row and each attribute of the record is stored as a column. Non-relational databases don't follow this pattern.
In a typical relational database, the database management system lets you interact with the database using Structured Query Language (SQL), whose syntax nuances may vary from one management system to another.
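As a small taste of rows, columns, and SQL, the sketch below uses SQLite through Python's built-in sqlite3 module with a throwaway in-memory database. The users table and its contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")   # throwaway database that lives only in memory

# Each column is an attribute of a record; each inserted tuple becomes a row.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO users (name, country) VALUES (?, ?)",
    [("Ada", "UK"), ("Grace", "US"), ("Linus", "FI")],
)
conn.commit()

# Ask for the names of all users from one country.
rows = conn.execute(
    "SELECT name FROM users WHERE country = ? ORDER BY name", ("US",)
).fetchall()
user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

The ? placeholders let the database driver insert the values safely instead of pasting them into the SQL text by hand.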
One of the important characteristics of a database is that it can be accessed by multiple users at the same time who can read and write to it.
The topic of databases is an interesting one, with different nuances depending on the specific application, and it becomes especially important when dealing with big data.
LOAD BALANCER
The large number of users and queries that big companies like Google, Meta, and LinkedIn handle per minute means that it would be difficult for one server stack (two-tier or three-tier) to be responsible for all the responses. As of 2011, a rough estimate put the total number of servers that Google used at 900K.
The way that companies efficiently use multiple servers to handle multiple queries is to use a load balancer. A load balancer is software or hardware that distributes queries from different users across different servers, thereby ensuring efficiency and high availability of the service.
Take for instance that we are using three servers to respond to the queries of multiple users. A load balancer will sit between the clients and the servers, such that queries from users arrive at the load balancer, which then decides which of the servers to send each request to.
A load balancer can be a software program that sits on a server cluster where the servers are connected to one another in a virtual network. It can also be a hardware load balancer: a dedicated piece of hardware that has load balancing software installed on it and is connected individually to all the servers.
Load balancer software usually implements one or more load balancing algorithms. These algorithms, or combinations of them, determine the behavior of the load balancer. Examples include round robin scheduling, weighted round robin scheduling, least connection first scheduling, and even approaches as complex as predictive scheduling that uses machine learning techniques.
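Two of the algorithms named above, round robin and least connections, are simple enough to sketch. The class names and the server labels s1, s2, s3 are assumptions for illustration; production load balancers add health checks, weights, and much more.

```python
import itertools


class RoundRobinBalancer:
    """Hand out servers in a fixed rotation: s1, s2, s3, s1, ..."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Send each new request to the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # On a tie, min() returns the server that was listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Call when a request finishes so the server's count drops again."""
        self.active[server] -= 1


round_robin = RoundRobinBalancer(["s1", "s2", "s3"])
rotation = [round_robin.pick() for _ in range(4)]

least_conn = LeastConnectionsBalancer(["s1", "s2", "s3"])
first_three = [least_conn.pick() for _ in range(3)]
least_conn.release("s2")          # s2 finishes its request first
next_choice = least_conn.pick()   # so s2 is now the least busy server
```

Round robin ignores how busy each server is, while least connections adapts to it, which is why the latter tends to suit requests of uneven duration.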
ADVANCED: Difference between Layer 4 and Layer 7 load balancing. (Reviewed later)
Load balancing is a very important topic in software architecture, and the configuration of a load balancer can shape how reliable a software service can be. With load balancing, software architecture and delivery can begin to avoid Single Point Of Failure (SPOF).
A SPOF is a component of a software architecture or design whose failure stops the delivery of the entire software service. Let us imagine a hypothetical hgoogle service that uses only one server. When this server is overwhelmed or fails for any reason, the service offered by hgoogle will instantly stop, and our users will not like this. Using a load balancer with multiple servers can avoid a SPOF on the server side. And of course, a single load balancer is itself a SPOF that will need to be addressed.
This article will likely be updated and further referenced, and it is open to contribution and criticism.
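The difference between one server and several can be shown with a small failover sketch. The function pick_healthy and the health check passed into it are hypothetical; real load balancers probe their servers over the network.

```python
def pick_healthy(servers, is_healthy, start=0):
    """Return the first healthy server at or after index start, wrapping around.

    Returns None when every server is down, i.e. the service is unavailable.
    """
    for offset in range(len(servers)):
        candidate = servers[(start + offset) % len(servers)]
        if is_healthy(candidate):
            return candidate
    return None


down = {"s1"}                      # pretend that server s1 has failed


def is_up(server):
    return server not in down


single_server_result = pick_healthy(["s1"], is_up)              # one server: nothing left to use
multi_server_result = pick_healthy(["s1", "s2", "s3"], is_up)   # traffic moves to s2
```

With a single server the failure of s1 takes the whole hypothetical service down, while with three servers the same failure is absorbed.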
GLOSSARY
- Network: Imagine that every time you want to speak to a friend in a far place or send a word document to her, you had to travel that far to deliver your package. Well, thankfully, we don’t have to do this. Instead of us travelling, a typical network and its infrastructure does all of this travelling. A network is a set of interconnected devices whose primary aim is to aid communication and data transfer from one end to another. There are different types and examples of networks, but the most common and perhaps most complex is the Internet.
- Host: The first time I heard the word Host, I thought it referred to my host computer, i.e. the device that I was using. It turns out that in Networking, a host is any device on a Network that is capable of sending and receiving traffic over that Network. When we say traffic here, we mean data transfer over a network.
- Protocols: In the early days of networking, typically the pre-internet era, different operating systems and different manufacturers adopted different rules that guided how similar data was transferred from one device to another. Basically, the rules that guided an individual network depended on the computers on that network, such that devices different from those computers could not participate meaningfully on the network.
These rules are called protocols, and since the advent of the internet, the protocols that guide different aspects of data transfer (on a network) from one device to another have become more standardized and unified, to maintain ease of communication and networking.
- Servers: The internet has become indispensable for the human race, largely because most of what we do and most of our interactions are hosted on the internet. All of the services that we have access to via the internet are hosted physically on a highly available computer called a server.
A server is usually a resource intensive computer where the services that we want to access are housed.
Servers, or at least access to servers, are not always physically based. Cloud servers, for instance, are virtual instances of servers that are accessed via cloud service providers. Think of it like this: a traditional physical server (dedicated server) is like a big house designed with only a single room and for a single operation. This means you design an entire big house just to use it as a single bedroom (without the toilet and the other operations of a house). Creating virtual instances of servers then involves taking this dedicated server and splitting its single room into different rooms with different operations, where each room is independent of the other rooms. This is done through a process called virtualization.
Thus, what cloud service providers do is provide these virtual server instances to their customers, eliminating the need to purchase and manage a physical server before you can deploy a service or resource.
- Client: A client is simply any host that sends a request to a server. In a client-server model, where the server only responds to the requests of clients, the client can be the web browser that sends a request to a server and receives the response from it.
- OSI model: The Open Systems Interconnection (OSI) model is an abstract model for how hosts on a network communicate with each other. The OSI model does not pay much attention to the hardware implementation or the underlying software mechanisms of how systems actually communicate. Instead, the OSI model breaks down the entire communication system between hosts into seven layers, where each layer comprises protocols and infrastructure that encapsulate or de-encapsulate the data being exchanged.
(If the above is long already, kindly skip the remaining part of OSI model…)
The seven layers are, respectively: Physical Layer, Data Link Layer, Network Layer, Transport Layer, Session Layer, Presentation Layer, and Application Layer.
The physical layer is concerned with how raw bits of data are moved around over the wire or some other physical medium.
The data link layer is concerned with how the data in the physical layer are properly assembled in respect to their destinations.
The Network layer concerns itself with logical addressing on a network.
The Transport layer concerns itself with how data sent over the network is properly transmitted to its respective destination.
The Session Layer, Presentation Layer, and Application Layer concern themselves with session management, representation of data and how the respective data is displayed by the application using it.
While there are distinctions between the layers, it is common to see protocols that combine the infrastructure of different layers. For instance, the Address Resolution Protocol (ARP) works with both the Media Access Control (MAC) address and the IP address, mapping an IP address to the MAC address of the host that should receive the data. The MAC address belongs to layer 2, while the IP address belongs to layer 3.
Apart from the OSI model, there have been other models, but possibly the most noteworthy and widely adopted is the TCP/IP model, which uses four layers, namely the Network Access Layer, Internet Layer, Transport Layer, and Application Layer.
- IP address: The IP (Internet Protocol) address is a unique number that identifies each host on a given network. Currently there are two types of IP address in use: IPv4 and IPv6 (version 4 and version 6 respectively). Examples of IPv4 addresses are 8.8.8.8 and 84.86.55.77, and an IPv6 address can look like 3001:0da8:75a3:0000:0000:8a2e:0370:7334.
- Domain Name Anatomy: A domain name such as `www.google.com` contains a top level domain, which tells us what kind of service a particular website offers. In the example above, .com is the top level domain and it stands for commercial. Other examples of top level domains are .org, which is for NGOs, .gov, which is for government agencies, and many more. A domain name will usually also contain a subdomain. In our example above, www is the subdomain. The subdomain is used to subclassify a given domain name. For instance, bard.google.com is a subdomain of google.com which offers another kind of service (called bard) under the umbrella domain name google.com. The google in our example is referred to as the second level domain, and the combination of the second level domain and the top level domain gives us the domain name.
- URLs: A Uniform Resource Locator (URL) will typically look like this: http://www.domain.name:1234/path/to/destination. The http at the beginning refers to the application layer protocol used to communicate with the server you are sending the request to, and www.domain.name is the domain name as explained above. The 1234 after the : is the port number on which the server is listening for connections. Services and applications usually have standard port numbers, for instance 80 for HTTP requests. If the port number is inaccurately specified, the connection to the requested service will fail, and an error will likely come up. Specifying the port number is like specifying the door to use to enter the server. If the server is not listening for connections on that port, the port will not be accessible to outsiders, so it will be as if the door is locked against whoever is accessing the resource. The /path/to/destination is the path to the resource you are looking for on the server.
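The parts of a URL described in the glossary can be pulled apart with Python's standard urllib.parse module. The sketch below uses the example URL from the glossary; the DEFAULT_PORTS table and the effective_port helper are additions made for illustration.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://www.domain.name:1234/path/to/destination")
scheme = parts.scheme      # the application layer protocol
host = parts.hostname      # the domain name
port = parts.port          # the port the server is expected to listen on
path = parts.path          # the path to the resource on the server

# When a URL omits the port, the client falls back to the protocol's standard port.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the explicit port of a URL, or the standard port for its scheme."""
    split = urlsplit(url)
    return split.port or DEFAULT_PORTS[split.scheme]


google_port = effective_port("https://google.com")
```

This is why you rarely type a port number in the browser: for https://google.com the standard HTTPS port is assumed.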