CDNs

🌐 Content Delivery Networks (CDNs) Explained

A CDN caches data close to end users so that requests do not have to travel long distances, potentially halfway around the world. It does this by placing many CDN servers around the globe, while a single primary origin server holds the original data.

💡 Core Benefits of CDNs

Decreased Latency: By placing content on servers geographically closer to the user, the distance the data must travel is much shorter, resulting in faster load times.
Reduced Load on Origin Server: Distributing static content to CDN servers offloads requests that the origin server would otherwise have to handle, allowing the origin server to focus on dynamic tasks.
Increased Availability and Reliability: If one CDN server fails, clients can simply be routed to the next closest healthy CDN server, improving the system’s overall availability and reliability.
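
As a rough illustration of the latency and availability points above, the sketch below picks the geographically closest healthy server for a user and falls back to the next closest one when a server is marked unhealthy. The names `EdgeServer`, `distance_km`, and `pick_server` are hypothetical, and straight-line distance is only a stand-in: real CDNs usually route with DNS or anycast and measure network conditions rather than geography.

```python
import math
from dataclasses import dataclass


@dataclass
class EdgeServer:
    # Hypothetical record for one CDN server location.
    name: str
    lat: float
    lon: float
    healthy: bool = True


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def pick_server(servers, user_lat, user_lon):
    """Return the closest healthy server, or None if none is healthy."""
    healthy = [s for s in servers if s.healthy]
    if not healthy:
        return None  # assumption: the caller would fall back to the origin
    return min(
        healthy,
        key=lambda s: distance_km(user_lat, user_lon, s.lat, s.lon),
    )


servers = [
    EdgeServer("paris", 48.86, 2.35),
    EdgeServer("new-york", 40.71, -74.00),
    EdgeServer("tokyo", 35.68, 139.69),
]

# A user in London is routed to Paris while it is healthy.
first_choice = pick_server(servers, 51.50, -0.13)

# If Paris fails, the same user is routed to the next closest healthy server.
servers[0].healthy = False
fallback_choice = pick_server(servers, 51.50, -0.13)
```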

💾 Static Content Restriction

A key limitation of traditional CDNs is that they are generally designed to serve only static content: files that are stored once and returned unchanged, rather than being generated separately for each request.

🖼️ Types of Static Content for CDNs

Static content suitable for CDNs includes:

JavaScript files: Code that is the same for every user, such as the Apple authentication script seen in the example.
Images: Profile pictures, product photos, or general images like a picture of the Eiffel Tower.
Videos: Pre-recorded video files that remain unchanged.

📝 Note: Newer “edge servers” are a developing technology designed to overcome this restriction by allowing application code to run on distributed servers, but this is a complex and evolving topic beyond the scope of a basic CDN overview.

🔄 Types of CDNs

CDNs primarily operate in one of two ways: Push or Pull. The choice depends on the application’s needs, specifically how data is expected to be used across different regions.

1️⃣ Push CDNs

How it Works: When new static content is added to the origin server (e.g., a user uploads a new profile picture), the data is immediately pushed (uploaded) to every single CDN server around the world.
Best For: Content that is expected to be accessed globally and immediately by a large percentage of users.
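
The push model can be sketched in a few lines. This is a minimal in-memory simulation, not how any particular CDN provider implements it; the class `PushCDN` and its methods are hypothetical, and plain dictionaries stand in for the origin and edge storage.

```python
class PushCDN:
    """Toy push CDN: every upload is copied to all edge servers at once."""

    def __init__(self, edge_names):
        self.origin = {}                              # path -> data
        self.edges = {name: {} for name in edge_names}  # one store per edge

    def upload(self, path, data):
        # The origin stores the content, then pushes it to every edge.
        self.origin[path] = data
        for store in self.edges.values():
            store[path] = data

    def get(self, edge_name, path):
        # Edges answer from their own copy; None means the path is unknown.
        return self.edges[edge_name].get(path)


push_cdn = PushCDN(["us-east", "eu-west", "ap-south"])
push_cdn.upload("/img/profile.png", b"image-bytes")
```

Note the cost this makes visible: every edge stores every file, whether or not anyone in that region ever requests it.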

2️⃣ Pull CDNs (On-Demand Caching)

How it Works: The content is initially only on the origin server. A user’s first request to a local CDN server results in a cache miss. The CDN server then acts as a proxy, fetches the data from the origin server, caches it locally, and then returns it to the user. Subsequent requests to that same CDN server will result in a cache hit.
Best For: Content where user interest is regionally distributed and you want to avoid storing unnecessary data on servers where it will not be accessed. This conserves storage and bandwidth for the servers in regions that don’t need the data.
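
The miss-then-hit flow described above can be sketched as follows. This is a simplified in-memory model under stated assumptions: `PullEdge` is a hypothetical class, a dictionary stands in for the origin server, and cache expiry and eviction are left out.

```python
class PullEdge:
    """Toy pull CDN server: fetches from the origin only on a cache miss."""

    def __init__(self, origin):
        self.origin = origin      # dict standing in for the origin server
        self.cache = {}           # this edge's local cache
        self.origin_fetches = 0   # how often this edge went back to the origin

    def get(self, path):
        if path in self.cache:
            return self.cache[path], "HIT"
        # Cache miss: act as a proxy, fetch from the origin, keep a copy.
        data = self.origin[path]
        self.origin_fetches += 1
        self.cache[path] = data
        return data, "MISS"


origin = {"/video/intro.mp4": b"video-bytes"}
eu_edge = PullEdge(origin)
asia_edge = PullEdge(origin)

first = eu_edge.get("/video/intro.mp4")    # (b"video-bytes", "MISS")
second = eu_edge.get("/video/intro.mp4")   # (b"video-bytes", "HIT")
```

Because `asia_edge` has received no requests, its cache stays empty, which is the storage saving the pull model is chosen for.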

🏷️ The `Cache-Control: public` Header

The HTTP Cache-Control header can include a public directive, which is crucial for CDN operation, especially with pull CDNs.

public: This value explicitly specifies that the data (like the Apple JavaScript file) is allowed to be cached not only by the user’s browser but also by intermediate servers, such as a CDN.
private: This value marks the response as intended for a single user. The user's browser may still cache it, but shared caches such as a CDN server must not store it, so every request for that content is forwarded to the origin server.

In the case of a pull CDN, seeing Cache-Control: public tells the CDN that it’s safe to cache the fetched content, enabling it to serve that data as a cache hit for subsequent requests from nearby users.
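
A pull CDN server's caching decision could look roughly like the sketch below. The function names are hypothetical, and the policy is deliberately conservative: it stores a response only when `public` is present and neither `private` nor `no-store` appears. That is an assumption made for illustration; real shared caches follow the full HTTP caching rules, which also allow storing some responses that do not carry `public`.

```python
def parse_cache_control(header):
    """Split a Cache-Control value into a {directive: value-or-None} dict."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def shared_cache_may_store(header):
    """Simplified policy for a shared cache such as a CDN server."""
    directives = parse_cache_control(header)
    if "private" in directives or "no-store" in directives:
        return False
    # Assumption: only cache when the origin opts in explicitly.
    return "public" in directives


allowed = shared_cache_may_store("public, max-age=3600")   # True
blocked = shared_cache_may_store("private, max-age=3600")  # False
```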