Your submission was sent successfully! Close

Thank you for contacting us. A member of our team will be in touch shortly. Close

You have successfully unsubscribed! Close

Thank you for signing up for our newsletter!
In these regular emails you will find the latest updates about Ubuntu and upcoming events where you can meet our team.Close

Saving ubuntu.com on download day: Caching location specific pages

This article was last updated 5 years ago.


On release day we can get up to 8,000 requests a second to ubuntu.com from people trying to download the new release. In fact, last October (13.10) was the first release day in a long time that the site didn’t crash under the load at some point during the day (huge credit to the infrastructure team).

Ubuntu.com has been running on Drupal, but we’ve been gradually migrating it to a more bespoke Django based system. In March we started work on migrating the download section in time for the release of Trusty Tahr. This was a prime opportunity to look for ways to reduce some of the load on the servers.

Choosing geolocated download mirrors is hard work for an application

When someone downloads Ubuntu from ubuntu.com (on a thank-you page), they are actually sent to one of the 300 or so mirror sites that’s nearby.

To pick a mirror for the user, the application has to:

  1. Decide from the client’s IP address what country they’re in
  2. Get the list of mirrors and find the ones that are in their country
  3. Randomly pick them a mirror, while sending more people to mirrors with higher bandwidth

This process is by far the most intensive operation on the whole site, not because these tasks are particularly complicated in themselves, but because this needs to be done for each and every user – potentially 8,000 a second while every other page on the site can be aggressively cached to prevent most requests from hitting the application itself.

For the site to be able to handle this load, we’d need to load-balance requests across perhaps 40 VMs.

Can everything be done client-side?

Our first thought was to embed the entire mirror list in the thank-you page and use JavaScript in the users’ browsers to select an appropriate mirror. This would drastically reduce the load on the application, because the download page would then be effectively static and cache-able like every other page.

The only way to reliably get the user’s location client-side is with the geolocation API, which is only supported by 85% of users’ browsers. Another slight issue is that the user has to give permission before they could be assigned a mirror, which would slightly hinder their experience.

This solution would inconvenience users just a bit too much. So we found a trade-off:

A mixed solution – Apache geolocation

mod_geoip2 for Apache can apply server rules based on a user’s location and is much faster than doing geolocation at the application level. This means that we can use Apache to send users to a country-specific version of the download page (e.g. the German desktop thank-you page) by adding &country=GB to the end of the URL.

These country specific pages contain the list of mirrors for that country, and each one can now be cached, vastly reducing the load on the server. Client-side JavaScript randomly selects a mirror for the user, weighted by the bandwidth of each mirror, and kicks off their download, without the need for client-side geolocation support.

This solution was successfully implemented shortly before the release of Trusty Tahr.

(This article was also posted on robinwinslow.co.uk)

Talk to us today

Interested in running Ubuntu in your organisation?

Newsletter signup

Get the latest Ubuntu news and updates in your inbox.

By submitting this form, I confirm that I have read and agree to Canonical's Privacy Policy.

Related posts

6 facts for CentOS users who are holding on

Considering migrating to Ubuntu from other Linux platforms, such as CentOS? Find six useful facts to get started!

Why is Ubuntu Linux the leading choice to replace CentOS for financial services?

Financial services are powered by technology. The customer experience is increasingly driven by data, with tailoring of products and services to reflect...

Profile-guided optimization: A case study

Software developers spend a huge amount of effort working on optimization – extracting more speed and better performance from their algorithms and programs....