Scaling expert and former CTO of Twitpic teaches us how to scale our web apps

Interviewed by Christophe Limpalair on 03/07/2015

This is the interview I wish I could have watched 5 years ago. Steve shares his incredible story from college dropout to scaling Twitpic to 50 million visitors, 20 billion HTTP requests, and petabytes of data. He tells us how he did it, and how we can scale our own applications from a few users to thousands of users per second.

He also talks about his upcoming Nginx book, and we cover a lot of topics from his book Scaling PHP Applications. Honestly, it's one of the best technical books I've ever read.

Similar Interviews

Servers for Hackers author Chris Fidao on securing servers, deploying to them, and how Nginx + PHP-FPM really works

Phil Sturgeon on building APIs the right way in any language and PHP 7 news


Interview Snippets

I wanted to start this interview out by sharing Steve's story. Even though he's a really smart guy, he ended up failing out of college. Why? Because college isn't for everyone, and Steve shares his wisdom and thoughts on why it wasn't a good fit for him.

After he told me this, I had to ask: "If you have someone with a college degree and someone without, both applying for a job, why would you hire the one without?"

"I look for a natural curiosity" is what Steve replied.



A Practical Book - Scaling PHP Applications
This book drastically changed how I understand the way different services work with one another. As a result, I understand scaling much better, and I don't have to scramble all night to get my server back online when something goes wrong.

From his experience scaling Twitpic to such a massive size, Steve ended up using the LHNMPRR (Linux, HAProxy, Nginx, MySQL, PHP, Redis, & Resque) stack instead of LAMP (Linux, Apache, MySQL, PHP).

In the interview, he explains why he ended up choosing these and how they all work together. The end result? A stack that can actually scale just by throwing more hardware at it.

(Steve actually told me he would use Nginx as a load balancer instead of HAProxy if he could go back in time. His reasoning: HAProxy is only marginally better than Nginx at load balancing, so you might as well stick with what you already know.)

Nginx

Even if you don't know Nginx, you'll understand it better after the interview, and especially after reading his book. Oh, and he's in the process of writing a book on Nginx for O'Reilly.

While I haven't written an Nginx book, I've also done testing and research into installing and configuring Nginx and wrote a tutorial about it. This would be a good place to start learning.

Redis
Should you use Redis as your primary data storage? What about caching? Is it better than Memcached?

Steve explains when you would want to use Memcached over Redis (hint: it's multithreaded) and when you wouldn't (hint: most of the time).

As an example use case for Redis, he mentioned using it to store view counts. Funnily enough, I wrote a tutorial on the exact same topic because it's what ScaleYourCode uses to store view counts. This is a common use case because it saves your database from thousands of reads and writes.
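To make the view count idea concrete, here is a minimal sketch in Python. It is not the implementation from the interview or from my tutorial; the ViewCounter class, the "views:" key prefix, and the InMemoryClient stand-in are all hypothetical names chosen for illustration. The only Redis commands it relies on are INCR and GET, and the stand-in mimics them so the example runs without a Redis server.

```python
class InMemoryClient:
    """Hypothetical stand-in that mimics the two Redis commands used here.

    INCR creates a missing key at 0 before adding to it, and GET returns
    None for a missing key, which is how Redis itself behaves.
    """

    def __init__(self):
        self._data = {}

    def incr(self, name, amount=1):
        self._data[name] = self._data.get(name, 0) + amount
        return self._data[name]

    def get(self, name):
        value = self._data.get(name)
        # Real Redis clients typically hand values back as bytes.
        return None if value is None else str(value).encode()


class ViewCounter:
    """Counts page views in a key-value store instead of the main database."""

    def __init__(self, client, prefix="views:"):
        self.client = client    # any object exposing incr() and get()
        self.prefix = prefix    # assumed key naming scheme

    def _key(self, page_id):
        return f"{self.prefix}{page_id}"

    def record_view(self, page_id):
        # One atomic increment per page view; no database write needed.
        return int(self.client.incr(self._key(page_id)))

    def views(self, page_id):
        raw = self.client.get(self._key(page_id))
        return int(raw) if raw is not None else 0


if __name__ == "__main__":
    counter = ViewCounter(InMemoryClient())
    for _ in range(3):
        counter.record_view("steve-corona")
    print(counter.views("steve-corona"))  # prints 3
```

With the redis-py package installed and a Redis server reachable, passing redis.Redis() in place of InMemoryClient() should work the same way, since that client also exposes incr() and get(). If you want the totals in your database too, you can copy them over in batches on a schedule rather than writing on every page view.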

Not sure how to even get started with Redis? Check out my quick guide to installing it and configuring it on your server.

I also have a Redis screencast series to show different use cases and how to implement them in Laravel.

Scaling with AWS
I threw Steve a curveball with an actual scaling scenario.

Starting out with a t2.micro instance on AWS hosting our entire stack, I asked him:


  • What would fail first?

  • What would be the next step to scale for a few more users?

  • A lot more users?

  • At which point do you add a load balancer?



Monitor your server stack and applications 24/7
"Scaling isn't just performance, monitoring is also a big piece of it" - Steve Corona

This quote inspired me to write a blog post about monitoring. I was amazed at how simple (and free) monitoring can be to set up, and the amount of information it gives you is incredible. You can break usage down by process to see which ones are using the most CPU and memory, along with plenty of other metrics.

If you don't already use some kind of monitoring, I highly recommend that you give it a second thought.

And a lot more...
There's so much info in this interview that there's no way you won't learn something valuable.

When you do get something valuable from this interview, we'd both love to hear about it in the comments below!

As always, thanks for watching :)

