Introduction to CDNs
Otto, an Idaho small business owner, maintains an e-commerce website selling used bikes. They host their website on a server in their hometown of Boise, Idaho. Their business has recently attracted a lot of customers in New York City. However, these new customers are complaining about the website’s performance. The images of recently added bikes take too long to load, and sometimes the website times out with a 408 status code (indicating that a request is taking too long).
After a few hours of investigating, Otto determines that because their server is in Boise, Idaho, requests and responses take too long to travel between the server and New York City. How can Otto make sure their New York City customers can see all their beautiful pictures of used bikes? Otto can enable a CDN!
A content delivery network (CDN) is a geographically distributed fleet of servers that help cache and improve the delivery of data to users based on their location. CDNs can help speed up the delivery of various data such as HTML documents, CSS stylesheets, static assets (e.g., images), and much more! CDNs are considered a layer of the internet ecosystem and a common caching solution.
So, how exactly does a CDN work? Well, let’s return to Otto and examine how they could implement a CDN to support their growing customer base in New York City. Here is what Otto’s setup would look like:
Let’s explore Otto’s CDN setup in detail:
- On the leftmost side of the map, we have our origin server (marked in purple), which represents Otto’s server in Idaho.
- From the origin server, we can reach various user locations (marked in green) very quickly due to the short distance.
- Due to the variety of other locations that Otto’s store services, we have added CDN servers (marked in blue) to better reach various geographical locations.
- Note that instead of customers in New York City (marked in red) communicating with the origin server, we can serve specific content from a much closer CDN server.
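To make the idea of a "much closer CDN server" concrete, here is a minimal sketch that picks the closest server by great-circle distance. The server names and coordinates are made up for illustration, and real CDNs usually route users with DNS or network-level measurements rather than pure geography, so treat this as a rough model only.

```python
import math

# Hypothetical server locations as (latitude, longitude) in degrees.
SERVERS = {
    "origin-boise": (43.615, -116.202),
    "cdn-chicago": (41.878, -87.630),
    "cdn-newark": (40.736, -74.172),
}

def great_circle_km(a, b):
    """Haversine distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def nearest_server(user_location, servers=SERVERS):
    """Return the name of the server geographically closest to the user."""
    return min(servers, key=lambda name: great_circle_km(user_location, servers[name]))

new_york = (40.713, -74.006)
print(nearest_server(new_york))  # cdn-newark
```

A customer in New York City is matched with the nearby CDN server instead of the origin server thousands of kilometers away in Boise.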
In Otto’s setup, each CDN server could cache a copy of all of the website’s bicycle images. When users request to see the bikes, the application no longer has to send the request all the way to the origin server. Rather, customers can pull up the cached images from the CDN server closest to them.
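The caching behavior described above can be sketched in a few lines. This is a simplified model, not how any particular CDN is implemented: `OriginServer`, `EdgeServer`, and the image path are hypothetical names, and the cache is just an in-memory dictionary that never expires.

```python
class OriginServer:
    """Stand-in for Otto's server in Boise; counts how often it is contacted."""

    def __init__(self, files):
        self.files = files
        self.requests = 0

    def fetch(self, path):
        self.requests += 1
        return self.files[path]


class EdgeServer:
    """A CDN server that keeps a local copy of anything it has fetched."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, path):
        if path not in self.cache:
            # Cache miss: make one trip to the origin and store the result.
            self.cache[path] = self.origin.fetch(path)
        # Cache hit (or freshly stored copy): served from the edge.
        return self.cache[path]


origin = OriginServer({"/img/bike-1.jpg": b"...image bytes..."})
edge = EdgeServer(origin)
for _ in range(3):
    edge.get("/img/bike-1.jpg")
print(origin.requests)  # 1
```

Three customers asked for the same photo, but only the first request traveled to the origin server. The other two were answered from the CDN server's copy.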
While Otto’s bike store illustrates one advantage to using CDNs, let’s explore some other benefits of using a CDN in a software system as well as the challenges that CDNs pose.
Benefits and challenges
Adding a CDN to an application has several benefits:
- Faster content delivery: Just as we saw with Otto’s online bike store, server response time is typically faster since application content may be closer to a user.
- Increased availability: Even if an origin server becomes unavailable (e.g., offline, under maintenance), a CDN that already holds the relevant data can keep serving it, so users can continue using the application. Some CDNs even store entire copies of websites!
- Increased security: Since CDNs become the first layer that users communicate with (rather than the origin server), they also serve as the first layer of defense from malicious activity. This means the origin server is slightly more protected if an associated CDN server catches (and sometimes deals with) malicious activity first.
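To illustrate the availability benefit, here is a small sketch of a CDN server that falls back to its stored copy when the origin cannot be reached. All names here (`make_origin`, `OriginDown`, `EdgeServer`) are hypothetical. For simplicity, this edge server always tries the origin first, as if every cached copy had already expired.

```python
class OriginDown(Exception):
    """Raised when the (simulated) origin server cannot be reached."""


def make_origin(pages):
    """Build a fake origin: a fetch function plus a switch to take it offline."""
    state = {"online": True}

    def fetch(path):
        if not state["online"]:
            raise OriginDown(path)
        return pages[path]

    return fetch, state


class EdgeServer:
    def __init__(self, fetch_from_origin):
        self.fetch_from_origin = fetch_from_origin
        self.cache = {}

    def get(self, path):
        try:
            body = self.fetch_from_origin(path)
            self.cache[path] = body      # remember the latest good copy
            return body
        except OriginDown:
            if path in self.cache:
                return self.cache[path]  # origin is down: serve the stored copy
            raise                        # nothing stored, so the error surfaces


fetch, origin_state = make_origin({"/bikes": "<html>Used bikes</html>"})
edge = EdgeServer(fetch)
edge.get("/bikes")               # origin is up: page is fetched and stored
origin_state["online"] = False   # origin goes offline
print(edge.get("/bikes"))        # still served, from the edge's stored copy
```

Pages the CDN server has never seen still fail while the origin is offline, which is why the benefit depends on the CDN already hosting the relevant data.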
However, here are some challenges to be aware of when using a CDN:
- Out-of-date content: Since CDNs host content from an origin server, if anything is updated on the origin server, there needs to be a way for CDN servers to also get the updated data. Otherwise, users may be receiving outdated content! One way to deal with this challenge is to use cache-control HTTP headers.
- Increased cost: CDNs are typically either physical servers that an organization runs itself or servers hosted by a third-party cloud provider. Either way, the more CDN servers an application needs, the more the system costs.
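As a rough illustration of how cache-control headers help with outdated content, the sketch below checks whether a cached copy may still be served based on the `max-age` directive of a `Cache-Control` header. The helper names are hypothetical and the logic is deliberately simplified: it ignores directives such as `s-maxage`, and it treats a missing `max-age` as "not fresh" rather than estimating a lifetime.

```python
def max_age_seconds(cache_control):
    """Return the max-age value from a Cache-Control header, or None if absent."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def is_fresh(cache_control, age_seconds):
    """True if a cached copy of this age may be served without asking the origin."""
    directives = [d.strip().lower() for d in cache_control.split(",")]
    if "no-store" in directives or "no-cache" in directives:
        return False
    limit = max_age_seconds(cache_control)
    return limit is not None and age_seconds < limit


header = "public, max-age=3600"   # origin says: reuse for up to one hour
print(is_fresh(header, 600))      # True: copy is 10 minutes old
print(is_fresh(header, 7200))     # False: copy is 2 hours old, ask the origin again
```

With a header like this, Otto could let CDN servers reuse a bike photo for an hour before checking back with the origin server for a newer version.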
In addition, some applications may not benefit from a CDN. Instances where a CDN may not be helpful include the following:
- The CDN itself faces a cybersecurity threat, so routing traffic through it could expose the application to an attack.
- A webpage consistently attracts so little traffic that caching offers no real benefit.
- An organization or country restricts access to popular CDNs.
We’ve now been introduced to the concept of CDNs. We’ve learned that CDNs do the following:
- They help reduce the time to download content by making it physically closer to end-users.
- They host (or cache) a variety of data, including HTML pages, style sheets, images, documents, client-side scripts, and files.
- They provide benefits such as faster content delivery, greater availability, and improved security.
- They have challenges like increased cost and dealing with outdated data.
- They may not be needed if an application has a smaller audience or if there is a government-mandated restriction on access.
Several third-party CDN providers are available on the market. Although they vary in price, these providers offer customizable features to help businesses quickly set up CDNs for their applications. Next time you build an application, take it around the world with CDNs!