Esoteric Topics

How is HTML read by the browser?


[Is this too pedantic? Is there even a need for this?]

When the browser reads HTML, the tokenizer goes to work. Each time it reads a start tag, it emits a start tag token; for example, when it encounters the opening head tag, it emits a startTagHead token. Similarly, it emits an endTagHead token when it encounters the matching end tag.
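To make the idea concrete, here is a minimal sketch of a tokenizer. It is a toy, not the algorithm a real browser uses: it assumes simple tags with no attributes, comments, or doctype, and the function name tokenize and the token shape are made up for this illustration.

```javascript
// Toy tokenizer: a simplified sketch, not the real HTML tokenization algorithm.
// Assumption: input contains only plain tags like <p> and </p> plus text.
function tokenize(html) {
  const tokens = [];
  // Alternative 1: end tag, alternative 2: start tag, alternative 3: text.
  const pattern = /<\/([a-zA-Z][a-zA-Z0-9]*)\s*>|<([a-zA-Z][a-zA-Z0-9]*)\s*>|([^<]+)/g;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    if (match[1] !== undefined) {
      tokens.push({ type: "endTag", name: match[1].toLowerCase() });
    } else if (match[2] !== undefined) {
      tokens.push({ type: "startTag", name: match[2].toLowerCase() });
    } else if (match[3].trim() !== "") {
      // Whitespace-only runs between tags are skipped in this sketch.
      tokens.push({ type: "text", data: match[3].trim() });
    }
  }
  return tokens;
}

const tokenTypes = tokenize("<p>Node Objects</p>").map((t) => t.type);
// tokenTypes is ["startTag", "text", "endTag"]
```

A real tokenizer is a state machine that also handles attributes, comments, and malformed markup; the regular expression here only stands in for that machinery.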

A second process, tree construction, consumes these tokens and converts them into node objects. For example, the startTagHTML token is converted into the HTML node.
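The token-to-node step can be sketched with a stack of open elements. This is only an illustration under stated assumptions: the token shape ({ type, name, data }), the buildTree function, and the plain-object nodes are all hypothetical, and real browsers follow far more detailed rules.

```javascript
// Toy tree builder: a simplified sketch of turning tokens into nested nodes.
// Assumption: tokens are well formed and every start tag has a matching end tag.
function buildTree(tokens) {
  const root = { name: "#document", children: [] };
  const openElements = [root]; // elements still waiting for their end tag
  for (const token of tokens) {
    const parent = openElements[openElements.length - 1];
    if (token.type === "startTag") {
      const node = { name: token.name, children: [] };
      parent.children.push(node); // attach to the innermost open element
      openElements.push(node);
    } else if (token.type === "endTag") {
      // Close the current element, but never pop the document root.
      if (openElements.length > 1) openElements.pop();
    } else if (token.type === "text") {
      parent.children.push({ name: "#text", data: token.data });
    }
  }
  return root;
}

const sampleTokens = [
  { type: "startTag", name: "html" },
  { type: "startTag", name: "head" },
  { type: "endTag", name: "head" },
  { type: "startTag", name: "body" },
  { type: "startTag", name: "p" },
  { type: "text", data: "Node Objects" },
  { type: "endTag", name: "p" },
  { type: "endTag", name: "body" },
  { type: "endTag", name: "html" },
];
const sampleDocument = buildTree(sampleTokens);
```

Because each new node is attached to whichever element is currently open, the nesting of the tags becomes the parent-child structure of the nodes.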

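Together, these node objects make up the DOM. But why is it called a tree? Because elements nest inside one another, each node has a parent and may have children, so the nodes form a hierarchy. Here’s an example: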

<!DOCTYPE html>
<html>
  <head>
    <title>DOM Tree</title>
  </head>
  <body>
    <h2>Document Object Model</h2>
    <p>Node Objects</p>
  </body>
</html>
How is the DOM tree created?
The tokenizer creates tokens from the HTML start tags, content, and end tags.
These tokens are then consumed, and node objects are created from them.

The tree holds these node objects, along with their content and properties.

The process is as follows:
Whichever backend language is used, the server responds with HTML.
The browser parses this HTML.
As part of parsing, the tokenizer turns the markup into tokens, such as:

start tag
end tag

It is important to understand that the DOM is not part of the JavaScript language, and neither is the document object. Both are provided to JavaScript by the browser.
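One way to see this is to run the same JavaScript in a host that has no DOM, such as plain Node.js. The sketch below only checks whether a document object exists; the function name describeHost and the wording of its messages are made up for illustration.

```javascript
// The language itself does not define `document`; the host environment does.
function describeHost() {
  // `typeof` is safe on identifiers that were never declared, so this never throws.
  return typeof document === "undefined"
    ? "no document object: this host does not provide one (for example, plain Node.js)"
    : "document object available: provided by the host (for example, a browser)";
}

console.log(describeHost());
```

The code is identical in both environments; only what the host makes available changes.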

Parsing: The parser identifies each string inside angle brackets as a tag and applies a set of rules specific to it. For example, a token for a heading tag has different properties from a token that represents an image tag.

CSSOM: The CSSOM is created in the same way as the DOM, through parsing and tokenization. The cascade is an important aspect of CSS, which is why the CSSOM’s tree structure matters.
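One reason the tree shape matters is that some styles flow down from parent to child. The sketch below illustrates only that one aspect (inheritance), not the full cascade with specificity and source order. The function computeStyle and the small INHERITED set are assumptions for this example; color and font-size are inherited in CSS, while margin is not.

```javascript
// Toy style resolution: a child starts from the inheritable part of its
// parent's style and then applies its own declarations on top.
const INHERITED = new Set(["color", "font-size"]); // small illustrative subset

function computeStyle(ownDeclarations, parentStyle) {
  const style = {};
  for (const [property, value] of Object.entries(parentStyle)) {
    if (INHERITED.has(property)) style[property] = value;
  }
  // Declarations on the element itself win over inherited values.
  return Object.assign(style, ownDeclarations);
}

const bodyStyle = computeStyle({ "font-size": "16px", margin: "8px" }, {});
const paragraphStyle = computeStyle({ color: "red" }, bodyStyle);
// paragraphStyle has font-size 16px (inherited) and color red, but no margin
```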

Render Tree: The browser combines the DOM and the CSSOM to create the render tree.
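As a rough sketch of this combination, the function below walks a DOM-like tree, attaches a style to each node, and leaves out anything styled with display: none, since such elements are not rendered. Looking styles up by tag name, the node shape, and the name buildRenderTree are all simplifying assumptions for illustration.

```javascript
// Toy render tree: pair each node with its style and skip non-rendered nodes.
// Assumption: `styles` maps a tag name directly to its resolved style object.
function buildRenderTree(node, styles) {
  const style = styles[node.name] || {};
  if (style.display === "none") return null; // excluded together with its subtree
  const children = (node.children || [])
    .map((child) => buildRenderTree(child, styles))
    .filter((child) => child !== null);
  return { name: node.name, style, children };
}

const domSketch = {
  name: "body",
  children: [{ name: "h2" }, { name: "p" }, { name: "script" }],
};
const styleSketch = {
  h2: { "font-size": "24px" },
  script: { display: "none" },
};
const renderTree = buildRenderTree(domSketch, styleSketch);
// renderTree keeps h2 and p under body; script is left out
```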

Layout step: This step calculates the exact size and position of all the elements.
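To give a feel for what the layout step computes, here is a deliberately tiny sketch that stacks block boxes vertically. It assumes each box already knows its height and that every block spans the full viewport width; real layout also deals with margins, padding, inline content, and much more. The function layoutBlocks and the box shape are hypothetical.

```javascript
// Toy block layout: stack boxes top to bottom and record position and size.
function layoutBlocks(boxes, viewportWidth) {
  let y = 0; // running vertical offset from the top of the page
  return boxes.map((box) => {
    const placed = { name: box.name, x: 0, y, width: viewportWidth, height: box.height };
    y += box.height; // the next box starts where this one ends
    return placed;
  });
}

const placedBoxes = layoutBlocks(
  [
    { name: "h2", height: 30 },
    { name: "p", height: 20 },
  ],
  800
);
// placedBoxes[1] is the p box at y = 30 with width 800
```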

Critical Rendering Path