Everyone knows what a website is. Virtually every company has one. But surprisingly few businesses know what to do when things suddenly get busy.
Today, I want to talk about scaling. That can mean scaling up or scaling down. When a website gets a lot of traffic, it may be necessary to make the computer that runs the website faster or more powerful. We call such a computer a server. And servers can get overwhelmed. That’s just a fact.
Most website owners don’t notice much of this. The server they rent is often powerful enough for the number of visitors the site gets. But sometimes, a website suddenly becomes very popular, or it's a service where demand can spike unexpectedly. Think of ticket sales, or video-on-demand. In those cases, the server might not be able to keep up. So what do you do then?
Is it really too busy?
The first step is to determine whether the problems are really caused by high traffic. If a website is slow or shows error messages, that doesn’t always mean there are too many visitors. But if that is the cause, you can often see it in the so-called “load average.” That’s a series of three numbers showing how many processes were, on average, running or waiting their turn: over the past minute, the past five minutes, and the past fifteen minutes. As a rule of thumb, numbers that stay above the server’s count of CPU cores mean it has more work than it can handle.
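To make the load average a little more concrete, here is a minimal Python sketch that reads the three numbers and compares them with the number of CPU cores. The function names and the threshold of 1.0 per core are assumptions chosen for illustration, not a fixed rule, and os.getloadavg() is only available on Unix-like systems.

```python
import os


def load_per_core(load_averages, cpu_count):
    """Divide each load average by the number of CPU cores."""
    return tuple(load / cpu_count for load in load_averages)


def looks_overloaded(load_averages, cpu_count, threshold=1.0):
    """Return True if the 15-minute load per core exceeds the threshold.

    The default threshold of 1.0 is a rule-of-thumb assumption.
    """
    fifteen_minute_load = load_per_core(load_averages, cpu_count)[2]
    return fifteen_minute_load > threshold


if __name__ == "__main__":
    if hasattr(os, "getloadavg"):  # not available on Windows
        loads = os.getloadavg()    # (1 min, 5 min, 15 min)
        cores = os.cpu_count() or 1
        print("Load averages:", loads)
        print("Per core:", load_per_core(loads, cores))
        print("Looks overloaded:", looks_overloaded(loads, cores))
```

The sketch looks at the fifteen-minute number because a short burst is usually less worrying than a load that stays high.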
Even better is to store this data and display it in a graph. That way, you can see exactly when it's busy, for example every evening or during office hours. If it turns out that the server is structurally overloaded or regularly experiencing spikes, it’s time to take action so that all visitors continue to have a good experience.
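As a rough illustration of what stored measurements can tell you, the sketch below groups recorded load values by hour of the day and picks out the busiest hour. It assumes the samples are already available as (timestamp, load) pairs. The function names and the sample data are made up for this example.

```python
from collections import defaultdict
from datetime import datetime


def average_load_by_hour(samples):
    """Return {hour of day: mean load} for (datetime, load) pairs."""
    totals = defaultdict(lambda: [0.0, 0])  # hour -> [sum of loads, count]
    for timestamp, load in samples:
        bucket = totals[timestamp.hour]
        bucket[0] += load
        bucket[1] += 1
    return {hour: total / count for hour, (total, count) in sorted(totals.items())}


def busiest_hour(samples):
    """Return the hour of day with the highest mean load."""
    averages = average_load_by_hour(samples)
    return max(averages, key=averages.get)


if __name__ == "__main__":
    # Hypothetical measurements taken on two different days.
    recorded = [
        (datetime(2024, 1, 1, 9, 0), 1.0),
        (datetime(2024, 1, 1, 20, 0), 4.0),
        (datetime(2024, 1, 2, 9, 15), 2.0),
        (datetime(2024, 1, 2, 20, 30), 6.0),
    ]
    print(average_load_by_hour(recorded))
    print("Busiest hour:", busiest_hour(recorded))
```

In practice a monitoring tool would usually collect and graph this for you. The point here is only that a pattern such as "busy every evening" becomes visible once the numbers are kept.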
Vertical scaling
So what do you do then? The most obvious solution is to make the server faster. In technical terms, this is called “vertical scaling.” You could, for example, add more RAM, choose a faster processor (CPU), or switch to faster storage (such as an SSD instead of a traditional hard drive). Sometimes, that’s all it takes.
But for larger services, that’s not always enough. And then what? That’s where many people’s knowledge ends. Yet, there is a solution. Big, well-known platforms that serve millions of users every day manage just fine. So how do they do it?
Horizontal scaling
That brings us to what we call “horizontal scaling.” This means using multiple servers. A simple example is having one server in the US and another in Europe. With DNS routing, it’s possible to determine which server a visitor is sent to — usually based on location.
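The idea of sending a visitor to a server based on location can be sketched in a few lines of Python. This is a simplified illustration, not how a DNS provider actually implements it: the hostnames, the country-to-region table, and the default region are all hypothetical.

```python
# Hypothetical servers, one per region.
SERVERS = {
    "US": "us.example.com",
    "EU": "eu.example.com",
}

# Hypothetical, deliberately incomplete mapping of country codes to regions.
REGION_BY_COUNTRY = {
    "US": "US",
    "CA": "US",
    "NL": "EU",
    "DE": "EU",
    "FR": "EU",
}

DEFAULT_REGION = "US"  # assumption: unknown countries go to the US server


def pick_server(country_code):
    """Return the hostname of the server that should handle this visitor."""
    region = REGION_BY_COUNTRY.get(country_code.upper(), DEFAULT_REGION)
    return SERVERS[region]


if __name__ == "__main__":
    print(pick_server("nl"))  # a visitor from the Netherlands
    print(pick_server("BR"))  # a country missing from the table
```

A real setup would determine the visitor's location from the IP address and would cover every country, but the decision itself comes down to a lookup like this one.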
You can expand this further, with multiple servers per region. When traffic gets really heavy, you quickly end up using something called a “CDN” — a Content Delivery Network. That’s a network of servers all over the world, each holding a copy of the same data. Visitors then receive the content from the server closest to them.
Challenges of scaling
But this is where it gets interesting, because horizontal scaling brings its own challenges. Think, for example, of user accounts. If you create an account on the server in Europe, a server in South America won’t know about it yet. So the servers need a way to share or synchronize that data, to ensure you always have access to your account, no matter which server is handling your request.
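The account problem can be shown with a small Python sketch. It first gives each server its own account data, then lets both servers use one shared store. The class names are hypothetical, and the in-memory dictionary merely stands in for a real database. A shared store is also only one possible approach, used here because it is the simplest to show.

```python
class AccountStore:
    """Stand-in for a database that holds user accounts."""

    def __init__(self):
        self._accounts = {}

    def create(self, username, email):
        if username in self._accounts:
            raise ValueError(f"account {username!r} already exists")
        self._accounts[username] = email

    def exists(self, username):
        return username in self._accounts


class WebServer:
    """A web server in one region that reads and writes an account store."""

    def __init__(self, region, store):
        self.region = region
        self.store = store

    def sign_up(self, username, email):
        self.store.create(username, email)

    def can_log_in(self, username):
        return self.store.exists(username)


if __name__ == "__main__":
    # Each server keeps its own data: the account stays in Europe.
    europe = WebServer("europe", AccountStore())
    south_america = WebServer("south-america", AccountStore())
    europe.sign_up("alice", "alice@example.com")
    print("Separate stores:", south_america.can_log_in("alice"))

    # Both servers use the same store: the account is known everywhere.
    shared = AccountStore()
    europe = WebServer("europe", shared)
    south_america = WebServer("south-america", shared)
    europe.sign_up("alice", "alice@example.com")
    print("Shared store:", south_america.can_log_in("alice"))
```

With separate stores the login in South America fails, and with the shared store it succeeds. Real systems have to make the same choice at a much larger scale, often by copying data between regions, which is part of what makes this a specialized field.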
In short, horizontal scaling is a specialized field. You could write entire books on the topic. But it’s possible, and it’s used every day by major companies. It’s even essential if you're truly successful. So if you're seeing a sustained increase in visitors and don’t know how to keep up, consider diving into horizontal scaling, or get some professional advice.
Success should be celebrated, not slowed down.