What happens under the hood when you enter a URL in your web browser (Chrome, Safari, Firefox)?
I’ll divide the answer into 4 parts: 1) what does the URL you enter mean? 2) what is the OSI model? 3) what happens in your web browser (Google Chrome, Safari…) when you press Enter? 4) the flow from your web browser to the servers in a data center.
In this article I’ll use https://www.holbertonschool.com as the example, but it works the same way with your favorite website.
What’s the meaning of the URL you just entered in your favorite web browser?
The URL you enter (e.g. https://www.holbertonschool.com) carries several pieces of information. To make it easier to understand, I’ll split the URL into pieces:
The protocol tells the browser which protocol to use to communicate with the website’s servers (we’ll see more later). HTTPS is simply an extension of HTTP: instead of exchanging data between the web browser and the servers in plain text, HTTPS encrypts the communication using SSL/TLS.
The domain name is like a shortcut for the IP address (an IP address is a number that every device connected to the internet has in order to communicate with the others, e.g. 172.16.254.1). Without domain names, we would have to use IP addresses to communicate with servers and get what we want. Remembering the IP address of every website we use is much harder than remembering a readable name, which is why domain names were created (we’ll see more about domain names and DNS later).
The subdomain is like an extension of the domain name, and it always comes before the domain name. It is used to split a website’s traffic or content according to some criterion, such as location or language. “www” is simply the conventional subdomain for the main public website, while a subdomain such as fr.holberton.com would typically serve a version of the site aimed at French visitors.
This is (almost) the most basic URL you can enter into your web browser. A URL can carry more detail than that: for example, you can specify which part of the website you want to see…
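To make the pieces above concrete, here is a minimal Python sketch that splits a URL into protocol, subdomain and domain name. The `split_url` helper is a name I made up for illustration, and it assumes the domain name is always the last two labels of the host, which is not true for every website (for example addresses ending in ".co.uk").

```python
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into the pieces discussed above (illustrative helper)."""
    parts = urlsplit(url)
    labels = (parts.hostname or "").split(".")
    # Naive assumption: the last two labels are the domain name and
    # everything before them is the subdomain. Real code would need a
    # list of public suffixes to handle hosts such as "example.co.uk".
    return {
        "protocol": parts.scheme,
        "subdomain": ".".join(labels[:-2]),
        "domain": ".".join(labels[-2:]),
        "path": parts.path,
    }


print(split_url("https://www.holbertonschool.com"))
```

The `path` entry is the "more detail" mentioned above: it stays empty for the basic URL and would hold something like `/about` if you asked for a specific page.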
What’s the OSI model?
OSI stands for “Open Systems Interconnection”. The OSI model is a conceptual model used to describe the functions of a networking system. It was created so that everyone shares the same reference for how network systems communicate, and it can be used both to understand a network and to debug it. The OSI model divides the networking system into 7 layers:
- Application
The web browser communicates with the application layer. This is the only layer of the OSI model that interacts directly with the user’s data. It is responsible for protocols such as HTTP/HTTPS and SMTP, and it handles the data that the web browser makes available to the user.
- Presentation
This layer is responsible for encryption, compression and translation. It takes the data from the application layer and, for example, compresses it before handing it to the session layer. In the other direction, it decompresses the data to make it available to the application layer. In short, it makes the data presentable for the next layer.
- Session
This layer is responsible for the connection between the two parties. It handles synchronization, makes sure the connection stays open long enough to send all the data, and closes it once everything has been sent. For example, if a 100-megabyte file is being sent, the session layer can set a checkpoint every 5 megabytes. If the connection drops, the transfer resumes from the last checkpoint. Without checkpoints, every interrupted transfer would have to restart from zero and lose everything done before.
- Transport
This layer is responsible for end-to-end communication. It takes data from the session layer and splits it into segments to pass to the network layer. The two main protocols are TCP and UDP. TCP (used for email, for example) is more reliable: it checks that every segment arrives and retransmits any that are lost, but it is slower. UDP is faster but less reliable, since lost packets are not retransmitted. UDP is used in video games, where losing a few packets is not dramatic (the player just lags a little).
- Network
This layer handles the transfer of data between different networks (if both devices are on the same network, this layer is not needed). It takes segments from the transport layer and breaks them into smaller pieces called packets, and it reassembles packets into segments in the other direction. It also finds the best path for the packets to reach their destination, which is called routing.
- Data link
This layer is similar to the network layer, but it handles the transfer of data within the same network rather than between different networks. It takes the packets from the network layer and breaks them into smaller pieces called frames. It is also responsible for flow control and error control during the transfer.
- Physical
This layer is the end (or “the start”) of the OSI model: all the data is converted into bits and sent over a physical medium such as fiber or coaxial cable…
A final example of the OSI model:
- Mister X wants to send an email to Mister Y. Mister X writes his email on hotmail.com and clicks Send. Hotmail passes the message to the application layer using the SMTP protocol, which hands the data to the presentation layer. The presentation layer compresses the data (the email…) and passes it to the session layer, where a communication session is opened. The data is then sent to the transport layer, where it is split into segments, and then turned into packets at the network layer. After that, the data link layer turns the packets into frames and passes them to the physical layer, which converts the frames into bits and sends them over physical media (cables, fiber…). The response then goes through the same process in the opposite direction.
I hope that, now that we have looked at the OSI model in detail, you have an overview of how the network system works.
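The trip through the layers in this example can be sketched in a few lines of Python. This is a toy model, not how a real network stack is written: the `send` and `receive` functions, the `NET|` and `LNK|` headers and the segment size are all made up for illustration, the session layer is skipped, and each segment simply becomes one packet.

```python
import zlib

# Made-up headers standing in for the real network and data link headers.
NETWORK_HEADER = b"NET|"
LINK_HEADER = b"LNK|"


def send(message, segment_size=8):
    """Walk a message down the layers and return one bit string per frame."""
    data = message.encode("utf-8")   # application: the user's data
    data = zlib.compress(data)       # presentation: compression
    # (session: opening and closing the connection is skipped in this toy)
    segments = [data[i:i + segment_size]             # transport: segments
                for i in range(0, len(data), segment_size)]
    packets = [NETWORK_HEADER + s for s in segments]  # network: packets
    frames = [LINK_HEADER + p for p in packets]       # data link: frames
    # physical: every byte of every frame becomes 8 bits
    return ["".join(f"{byte:08b}" for byte in frame) for frame in frames]


def receive(bit_strings):
    """Walk the bit strings back up the layers to recover the message."""
    frames = [bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
              for bits in bit_strings]
    packets = [frame[len(LINK_HEADER):] for frame in frames]
    segments = [packet[len(NETWORK_HEADER):] for packet in packets]
    data = zlib.decompress(b"".join(segments))
    return data.decode("utf-8")


wire = send("Hello Mister Y")
print(len(wire), "frames on the wire")
print(receive(wire))
```

Each step in `receive` undoes exactly one step of `send`, which is the "same thing but in the other direction" idea from the example.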
What happens in your web browser when you press Enter?
Before that, what is the purpose of a web browser? It is software used to locate, retrieve and display content on the World Wide Web, including images, videos, web pages… The web browser interacts with a web server and requests from it the information for the specific website entered by the user. (A web server is software such as Apache or Nginx, running on a physical or virtual server, that accepts HTTP/HTTPS requests and serves web pages. Servers are commonly located in a data center, which is a place that houses a lot of servers.)
The web browser is made up of several components that work together to display a page to the user:
- The user interface
This part, as you can imagine, is the part you can see right now: it includes everything visible, such as the address bar, bookmarks… It’s the visible part of the web browser.
- The browser engine
The browser engine acts as a bridge between the user interface and the rendering engine, and also with data persistence. Depending on what the user has requested, it calls on the component needed to perform the task.
- The rendering engine
This is where all the magic happens. It works together with 3 other components (networking, JavaScript interpreter, UI backend). Its main role is to render the page the user wants in the user interface.
- Networking
This is where all the communication with the website’s web server takes place (following the OSI model). It handles all aspects of internet communication and security. It may also keep a cache of retrieved documents to reduce network traffic.
- JavaScript interpreter
This part interprets JavaScript code. The results are sent to the rendering engine and then to the user interface. If the script is external, the resource is first fetched by the networking component and then interpreted.
- UI backend
This part is used to draw the interface that you can see right now: borders, boxes, windows… It interacts with your computer’s operating system in order to perform this task.
- Data persistence/storage
This is a small database stored on the user’s computer, where the web browser is installed. It manages features such as cookies, cache, bookmarks and user preferences.
A web browser such as Chrome uses a separate process for each tab you open, so there are multiple instances of the rendering engine.
- The flow of the rendering engine
The networking layer retrieves the requested website’s files from the web server over HTTP. It hands the retrieved HTML files to the rendering engine, which starts a specific process. First, it parses the HTML and CSS code, building a tree called the DOM tree from the HTML and a set of style rules from the CSS. Each element of the HTML code becomes a node in this tree.
Example of a DOM tree:
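Here is a small Python sketch of the parsing step: it reads a tiny HTML page and prints the resulting tree of nodes. It uses Python's built-in `html.parser`, which is far simpler than a real browser parser, and the `Node`, `DomBuilder`, `build_dom` and `render` names as well as the sample page are made up for illustration.

```python
from html.parser import HTMLParser

# Tags that never have children or a closing tag (partial list).
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class Node:
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []


class DomBuilder(HTMLParser):
    """Build a very simplified DOM tree while the HTML is parsed."""

    def __init__(self):
        super().__init__()
        self.root = Node("document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node  # following content goes inside this node

    def handle_endtag(self, tag):
        # Go back up only when the closing tag matches the open node.
        if self.current.tag == tag and self.current.parent is not None:
            self.current = self.current.parent

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.current.children.append(Node(f'"{text}"', self.current))


def build_dom(html):
    builder = DomBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def render(node, depth=0):
    """Return the tree as indented lines, one node per line."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


PAGE = ("<html><head><title>Hi</title></head>"
        "<body><h1>Hello</h1><p>World</p></body></html>")
print("\n".join(render(build_dom(PAGE))))
```

The indentation in the printed result shows the parent/child relationships: `html` contains `head` and `body`, `body` contains `h1` and `p`, and the text ends up as leaf nodes.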
After that, the rendering engine creates another tree called the render tree. This tree represents the order in which the elements will be displayed on the screen, and it also carries the style information taken from the CSS style rules. Its purpose is to make it possible to “paint” the content in the correct order.
Once the render tree is created, another process comes in: layout. This process gives each node of the tree its coordinates and size, so that it can be drawn at a specific position on the screen.
The last step is painting: the render tree is traversed, and the UI backend calls operating system methods to draw everything on the screen.
The flow from your web browser to the servers in a data center
We saw an overview of how the network works and how the web browser retrieves and renders a page, but we haven’t yet seen all the specific steps it takes to reach the web server, or all the infrastructure (DNS, firewall, load balancer) a request might encounter…
Before explaining, I just want to say that I’ll use a basic infrastructure for the demonstration. There are many types of infrastructure (distributed, scalable, secured…), and of course when you look up facebook.com, the infrastructure is not the same as the one I’ll show: it is far more sophisticated!
- How is the infrastructure structured?
First, you have your local network, called a LAN, to which all your devices are connected. You have a router from your provider. Then you have the DNS server given by your provider. On the other side, you might have firewalls, a load balancer and multiple servers.
So now I have just entered the URL of the website I want to access. The first stop in my infrastructure is my web browser’s cache: as we saw, the web browser stores previously visited pages in its cache in order to reduce network traffic. So if you have visited this website recently and the cache hasn’t been cleared, the web browser may not need to communicate with the website’s server at all and can simply display the page from the cache.
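The idea of "look in the cache first, go to the network only if needed" can be sketched like this. It is a simplification: the `PageCache` class and its fixed maximum age are made up for illustration, whereas real browsers decide what to keep and for how long based on the HTTP headers sent by the server.

```python
import time


class PageCache:
    """Toy cache: keep each page for a fixed number of seconds."""

    def __init__(self, max_age_seconds, clock=time.monotonic):
        self.max_age = max_age_seconds
        self.clock = clock      # injectable so the cache is easy to test
        self.entries = {}       # url -> (time stored, page body)

    def get(self, url, fetch):
        """Return (body, source) where source is "cache" or "network"."""
        entry = self.entries.get(url)
        if entry is not None and self.clock() - entry[0] < self.max_age:
            return entry[1], "cache"
        body = fetch(url)       # only reached on a miss or a stale entry
        self.entries[url] = (self.clock(), body)
        return body, "network"


def fake_fetch(url):
    # Stand-in for the real network request.
    return f"<html>content of {url}</html>"


cache = PageCache(max_age_seconds=60)
print(cache.get("https://www.holbertonschool.com", fake_fetch)[1])
print(cache.get("https://www.holbertonschool.com", fake_fetch)[1])
```

The first call has to go to the "network", and the second one is answered from the cache without calling `fake_fetch` again.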
Otherwise, if you have never been to this website, the first thing to do is query the provider’s DNS to see whether the domain name is registered and get back an IP address to communicate with. DNS has different types of records: A records (which map a domain name to an IP address), CNAME records (aliases that map one name, such as a subdomain, to another name), MX records (each MX record points to the name of an email server and holds a preference number for that server)…
- The process of the DNS
The web browser contacts the recursive resolver (the DNS server of your provider), which checks whether the website you want to access is already in its cache (if it is, the query ends here).
If not, the resolver queries three kinds of DNS servers in turn.
The first one is the root server, which answers with the address of the TLD server responsible for the extension (.com, .fr, .net). For example, if I ask for the domain name “holberton.com”, the root server returns the address of the DNS server for “.com”.
Then the recursive resolver asks the second server, the TLD server for .com, which answers with the address of the authoritative name server for holberton.com.
At this step we almost have the IP address. The resolver asks the third server, the authoritative name server for holberton.com, for the IP address of this specific domain name, and the name server sends back the IP address of holberton.com.
Finally, the resolver gives the web browser the IP address of holberton.com. BINGO.
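The lookup chain above can be simulated with a few dictionaries. Nothing here talks to real DNS servers: the server names, the `resolve` function and the IP address (taken from a range reserved for documentation) are all made up, and the point is only to show the order of the questions.

```python
# Made-up data standing in for the three kinds of DNS servers.
ROOT_SERVER = {"com": "tld-server-for-com"}
TLD_SERVERS = {"tld-server-for-com": {"holberton.com": "ns-for-holberton"}}
AUTHORITATIVE_SERVERS = {"ns-for-holberton": {"holberton.com": "203.0.113.10"}}


def resolve(domain, resolver_cache):
    """Return (ip, steps) where steps lists which servers were asked."""
    if domain in resolver_cache:
        return resolver_cache[domain], ["resolver cache"]

    steps = []
    extension = domain.rsplit(".", 1)[-1]        # "com" for holberton.com
    tld_server = ROOT_SERVER[extension]          # 1) ask the root server
    steps.append("root")
    name_server = TLD_SERVERS[tld_server][domain]        # 2) ask the TLD server
    steps.append("tld")
    ip = AUTHORITATIVE_SERVERS[name_server][domain]      # 3) ask the authoritative server
    steps.append("authoritative")

    resolver_cache[domain] = ip                  # remember it for next time
    return ip, steps


cache = {}
print(resolve("holberton.com", cache))
print(resolve("holberton.com", cache))
```

The second call stops at the resolver's cache, which is the "the query ends here" shortcut described in the first step.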
We now have the first part of our infrastructure:
Now we have the IP address we need to communicate with the website’s server. To communicate, we’ll use the HTTPS protocol, which encrypts all our communication with SSL/TLS. We’ll also rely on a certificate stored on the website’s server, which contains important information such as the website’s public key and its identity, in order to establish the connection between the web browser and the server.
Certificates are digital passports that provide authentication to protect the confidentiality and integrity of a website’s communication with web browsers.
A web browser attempts to connect to a website (a web server) secured with SSL. The web browser requests that the web server identify itself. The web server sends the web browser a copy of its SSL certificate. The web browser checks to see whether or not it trusts the SSL certificate.
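As a rough illustration of these steps from the client side, here is a sketch using Python's standard `ssl` module. The `make_context` and `fetch_certificate_subject` helpers are names I chose for this example, and a browser does much more than this, but the idea of "ask for the certificate and refuse to continue if it is not trusted" is the same.

```python
import socket
import ssl


def make_context():
    # The default context loads the system's trusted certificate
    # authorities and enables certificate and hostname verification.
    return ssl.create_default_context()


def fetch_certificate_subject(host, port=443, timeout=5.0):
    """Connect to host, verify its certificate and return (subject, TLS version).

    wrap_socket raises ssl.SSLCertVerificationError if the certificate
    is not trusted or does not match the host name.
    """
    context = make_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            certificate = tls.getpeercert()
            return certificate.get("subject"), tls.version()


context = make_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Calling `fetch_certificate_subject("www.holbertonschool.com")` needs network access, so it is left out of the snippet above. The values it returns depend on the certificate the server presents at that moment.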
Our request then passes through a firewall. The purpose of a firewall is to filter incoming and outgoing network traffic according to the security policies in place, so that not everybody can send requests to the servers. The firewall’s rules are set by the administrator of the infrastructure. A firewall can be physical or virtual.
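A firewall policy can be pictured as an ordered list of rules where the first matching rule wins. The sketch below is only an illustration of that idea: the `RULES` list, the `is_allowed` function and the addresses (from ranges reserved for documentation) are made up, and real firewalls have many more criteria.

```python
import ipaddress

# Each rule: (action, source network, destination port or None for any port).
# Rules are checked from top to bottom and the first match decides.
RULES = [
    ("deny", ipaddress.ip_network("198.51.100.0/24"), None),  # blocked range
    ("allow", ipaddress.ip_network("0.0.0.0/0"), 443),        # HTTPS for everyone
    ("allow", ipaddress.ip_network("0.0.0.0/0"), 80),         # HTTP for everyone
]


def is_allowed(source_ip, destination_port):
    """Return True if the first matching rule allows the request."""
    address = ipaddress.ip_address(source_ip)
    for action, network, port in RULES:
        if address in network and (port is None or port == destination_port):
            return action == "allow"
    return False  # nothing matched: deny by default


print(is_allowed("203.0.113.7", 443))   # web traffic from an ordinary address
print(is_allowed("203.0.113.7", 22))    # SSH is not opened by any rule
print(is_allowed("198.51.100.9", 443))  # address inside the blocked range
```

Denying by default when no rule matches is a common choice, because a forgotten rule then blocks traffic instead of letting it through.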
Our request has now reached the load balancer, which is software running on a physical or virtual server. Its main role is to distribute incoming traffic across the servers properly and efficiently. There are multiple strategies for distributing incoming requests, and the right one depends on your infrastructure. The most basic setup is two servers with a load balancer in front of them using the round-robin algorithm, which sends request 1 to server A, request 2 to server B, and so on.
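Round robin is simple enough to sketch in a few lines. The `RoundRobinBalancer` class and the server names "A" and "B" are just for illustration, and a real load balancer would also check that the servers are still alive.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out servers one after the other, then start again."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = cycle(servers)  # endless A, B, A, B, ...

    def route(self, request):
        """Return the server that should handle this request."""
        return next(self._servers)


balancer = RoundRobinBalancer(["A", "B"])
for request_number in range(1, 5):
    print(f"request {request_number} -> server {balancer.route(request_number)}")
```

Requests 1 and 3 go to server A and requests 2 and 4 go to server B, no matter how busy each server is, which is both the strength (simplicity) and the weakness of this strategy.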
The load balancer can also terminate the SSL encryption, in which case every request sent past the load balancer travels over plain HTTP. (It is also possible to keep using HTTPS inside the LAN of your server cluster in the data center.)
Our request has now reached one of the servers in my cluster. A server is composed of a web server (Nginx, Apache) that delivers static web content (HTML, videos, images…), the code base (static files, HTML…), a database (PostgreSQL, MySQL…), and an “application server” that connects the code base to the database, runs the business logic and generates dynamic content. Dynamic content is mostly powered by the application server and scripts that run on the server. When a user makes a request, these applications work in tandem with the web server to parse the request, generate content based on the request, and deliver the content to the user as though it were static content.
Finally, the web server sends a response back to the web browser with the files needed to display the website.
Wow, what a journey to reach your favorite website.
I hope this article was clear and that you now have an overview of how it all works under the hood :) Thank you so much for reading. If you have any suggestions, don’t hesitate to contact me.
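To show the difference between static and dynamic content, here is a very small application written against WSGI, the standard interface between Python web servers and applications. Everything in it is made up for illustration: the `STATIC_FILES` and `DATABASE` dictionaries stand in for real files and a real database, and the `/hello/<id>` route is hypothetical.

```python
# Stand-ins for the files on disk and for the database.
STATIC_FILES = {"/index.html": b"<h1>Welcome</h1>"}
DATABASE = {"1": "Mister X", "2": "Mister Y"}


def application(environ, start_response):
    """WSGI application: serve a static file or build a page on the fly."""
    path = environ.get("PATH_INFO", "/")
    user_id = path[len("/hello/"):] if path.startswith("/hello/") else None

    if path in STATIC_FILES:
        # Static content: the same bytes for every visitor.
        status, body = "200 OK", STATIC_FILES[path]
    elif user_id in DATABASE:
        # Dynamic content: generated from the request and the "database".
        page = f"<h1>Hello {DATABASE[user_id]}</h1>"
        status, body = "200 OK", page.encode("utf-8")
    else:
        status, body = "404 Not Found", b"<h1>Not found</h1>"

    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]
```

To try it in a browser, the standard library can serve it with `wsgiref.simple_server.make_server("", 8000, application).serve_forever()`. From the browser's point of view both kinds of pages arrive the same way, as an HTTP response containing HTML.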
Thanks to:
https://medium.com/@monica1109/how-does-web-browsers-work-c95ad628a509