Basics of HTTP

An overview of HTTP

HTTP (Hyper Text Transfer Protocol) is a request-response protocol (where requests are initiated by the recipient, usually the Web browser) that is used to fetch resources such as HTTP documents (that may include text, layout description, images, videos, scripts, and more).

Source: developer.mozilla.org

Client: the user-agent

The user-agent is any tool that acts on the behalf of the user, usually a Web browser. Other options are programs used by engineers (such as requests module in Python) to do certain tasks such as debugging or crawling apps using a command-line tool.

Proxies

Between the Web browser and the server, numerous computers and machines relay the HTTP messages. These operating at the application layers are generally called proxies. These can be transparent, forwarding on the requests they receive without altering them in any way, or non-transparent, in which case they will change the request in some way before passing it along to the server.

Proxies may perform numerous functions:

  • caching (the cache can be public or private, like the browser cache)

  • filtering (like an antivirus scan or parental controls)

  • load balancing (to allow multiple servers to serve the different requests)

  • authentication (to control access to different resources)

  • logging (allowing the storage of historical information)

(Source: https://developer.mozilla.org/en-US/docs/Web/HTTP/Overview)

Some important aspects of HTTP

The core of HTTP is stateless meaning there is no link between two requests being successively carried out on the same connection.

HTTP cookies allow the use of stateful sessions allowing session creation on each HTTP request to share the same context, or the same state.

HTTP Flow

  1. Open a TCP connection: The TCP connection is used to send a request, or several, and receive an answer

  2. Send an HTTP message: HTTP messages (before HTTP/2) are human-readable. With HTTP/2, these simple messages are encapsulated in frames, making them impossible to read directly, but the principle remains the same. For example:

GET / HTTP/1.1
Host: developer.mozilla.org
Accept-Language: fr

3. Read the response sent by the server, such as

HTTP/1.1 200 OK
Date: Sat, 09 Oct 2010 14:28:02 GMT
Server: Apache
Last-Modified: Tue, 01 Dec 2009 20:18:22 GMT
ETag: "51142bc1-7449-479b075b2891b"
Accept-Ranges: bytes
Content-Length: 29769
Content-Type: text/html
<!DOCTYPE html... (here comes the 29769 bytes of the requested web page)

4. Close or reuse the connection for further requests.

HTTP/1.0

  • New TCP connection with each request

  • Slow

  • Buffering

HTTP 1.0

HTTP/1.1

  • Persisted TCP Connection (instead of creating a new connection for each request)

  • Low Latency

  • Streaming with chunked transfer

  • Pipelining (disabled by default)

HTTP/1.1

HTTP/2

  • Compression

  • Multiplexing (is a way of sending multiple signals or streams of information over a communications link at the same time in the form of a single, complex signal)

  • Server Push

  • SPDY

  • Secure by default

  • Protocol Negotiation during TLS (NPN/ALPN)

HTTP/2 over QUIC (HTTP/3.0)

  • Replaces TCP with QUIC (UDP with congestion control)

  • All HTTP/2 features

  • Still experimental

HTTP Request/Response

An example of HTTP request:

Source: developer.mozilla.org

Requests consists of the following elements:

  • An HTTP method, usually a verb like GET, POST or a noun like OPTIONS or HEAD that defines the operation the client wants to perform. Typically, a client wants to fetch a resource (using GET) or post the value of an HTML form (using POST), though more operations may be needed in other cases.

  • The path of the resource to fetch; the URL of the resource stripped from elements that are obvious from the context, for example without the protocol (http://), the domain (here, developer.mozilla.org), or the TCP port (here, 80).

  • The version of the HTTP protocol.

  • Optional headers that convey additional information for the servers.

  • Or a body, for some methods like POST, similar to those in responses, which contain the resource sent.

An example response:

Source: developer.mozilla.org

Responses consist of the following elements:

  • The version of the HTTP protocol they follow.

  • A status code, indicating if the request was successful, or not, and why.

  • A status message, a non-authoritative short description of the status code.

  • HTTP headers, like those for requests.

  • Optionally, a body containing the fetched resource.

HTTP request methods:

HTTP defines a set of request methods to indicate the desired action to be performed for a given resource.

  • GET: Used to retrieve data

  • HEAD: Identical to a GET request but without the response body

  • POST: Used to submit an entity to a specified resource

  • PUT: Update a target resource

  • DELETE: Deletes a target resource

  • PATCH: Used to apply partial modification to a resource

and more...such as TRACE, OPTIONS, CONNECT, etc.

Understanding Status Code

A basic breakdown of the status codes is:

  • 100-199: Information

  • 200-299: Successes (200 OK is the "normal" response for a GET)

  • 300-399: Redirects (the information you want is elsewhere)

  • 400-499: Client errors (You did something wrong, like asking for something that doesn't exist)

  • 500-599: Server errors (The server tried, but something went wrong on their side)

Resources:

HTTPS Request Methods: https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods

HTTP response status codes: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status

Hyper Text Transfer Protocol Crash Course - HTTP 1.0, 1.1, HTTP/2, HTTP/3: