Millions of requests per hour and processing rich video analytics at Wistia

Interviewed by Christophe Limpalair on 11/16/2015

What if you could watch a video while it was uploading? What about seeing in-depth reports of how users interact with your videos? Max Schnur peels away Wistia's layers so we can see how they process and stream videos, and how they collect video analytics. He also gives us tips on creating embeds and scaling ease of use for better customer support.



Similar Interviews

Building an API for many devices and choosing native vs. web apps at

How Shopify handles 300M uniques a month running Rails and Docker

Interview Snippets

What is Wistia? Why not just use YouTube?


They serve different niches.

YouTube exists primarily to serve ads, and Wistia exists primarily to serve marketers and individual websites. This influences their decisions in different ways, because their sources of income are different. For example, with Wistia, people can monetize videos without serving ads, and they have more control over who can watch these videos for a monthly fee.

Wistia might also focus on tools around video like stats and marketing that YouTube wouldn't need to prioritize.

How did you get started at Wistia?


In college, Max majored in Computer Science. After that, he did general web contracting work for about three years, then decided to move to Boston, where some of his friends were. They said Wistia was a good place to work, so he just kind of showed up.

There weren't many people back then, and there were no official job interviews. That was March 2011.

What do you do on a day-to-day basis at Wistia?

Depends on the day.

Some days, he's just heads down coding. Other days he helps teammates work through tough problems and figure out the codebase.

He also spends some days putting out fires and brainstorming for changes in the product.

What scale are you running at?

Wistia has three main parts to it:
  1. The Wistia App
  2. The Distillery (stats processing)
  3. The Bakery (transcoding and serving)

These stats include numbers for all of these parts:
  • 1.5 million player loads per hour (loading a page with one player on it counts as one load; two players count as two, etc.)
  • 18.8 million peak requests per hour to their Fastly CDN
  • 740,000 app requests per hour
  • 12,500 videos transcoded per hour
  • 150,000 plays per hour
  • 8 million stats pings per hour

They are running on Rackspace and have been for a while. They actually started out on Slicehost, which was bought out by Rackspace, and that's how Wistia ended up there.

Funny, I usually get an "it's about the cost" response to this question.

They've looked at alternatives, but haven't had a compelling reason to move yet. Rackspace has good support, according to Max.

Did you say you've moved parts to AWS? Why?

They've tried moving a few parts to AWS just to test things out, like their stats processing database for example.

They also use S3 for long term storage. It would be nice if they didn't have to send files between providers.

What exactly is the Wistia App?

It's Wistia's Hub, where users log into their accounts and interact with the application.

It's also their API origin, embeds origin, and JavaScript origin.

What challenges come from receiving videos, processing them, then serving them?

There are a ton of things.

1) They want to balance quality and deliverability, which has two sides to it:
  1. Encoding derivatives (of the original video)
  2. Knowing when to play which derivative

Derivatives, in this context, are different versions of a video. Having different quality versions reduces file size, and a lower-quality version may be required for playback when a user doesn't have enough bandwidth; otherwise, they would have to constantly buffer.

Having these different versions and knowing when to play which version is crucial for a good user experience.
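The "knowing when to play which derivative" part can be sketched as a simple selection function. This is purely illustrative (the interview doesn't show Wistia's actual logic), and the bitrate ladder and headroom factor are assumed values:

```javascript
// Illustrative sketch, not Wistia's actual code: pick the highest-bitrate
// derivative that fits the measured bandwidth, with a headroom factor so
// playback isn't sitting right at the limit.
const derivatives = [
  { name: '224p', bitrateKbps: 400 },
  { name: '360p', bitrateKbps: 800 },
  { name: '540p', bitrateKbps: 1500 },
  { name: '720p', bitrateKbps: 3000 },
];

function pickDerivative(bandwidthKbps, headroom = 0.8) {
  const usable = bandwidthKbps * headroom;
  let best = derivatives[0]; // fall back to the lowest rung if nothing fits
  for (const d of derivatives) {
    if (d.bitrateKbps <= usable) best = d;
  }
  return best;
}
```

A viewer measured at 2,500 kbps would get the 540p derivative here, since 720p's bitrate exceeds the usable budget after headroom.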

2) Lots of I/O
When you have users uploading a lot of videos at the same time, you end up having to move a lot of heavy files across clusters. A lot of things can go wrong here.

3) Volatility
Demand for requests and for video processing swings up and down, so they need the ability to sustain those swings.

4) Of course, serving videos is also a major challenge
Thankfully, CDNs have different kinds of configurations to help with this.

5) On the client side, there are a ton of different browsers, device sizes, etc...
If you've ever had to deal with making websites responsive or work on older IE versions, you know exactly what we're talking about here.

How do you handle big changes in the number of video uploads?

They have boxes which they call Primes. Prime boxes receive files from user uploads.

If uploads start eating up all the available space, they can just spin up new Primes using Chef recipes. It's a bit of a manual job right now, but they don't usually get anywhere close to their limit. They have lots of space.

Example of a Chef cookbook using a recipe

What transcoding system do you use?

This is the part they call the Bakery.

The Bakery is made up of the Primes we just talked about, which receive and serve files. They also have a cluster of workers that process tasks and create derivative files from uploaded videos.

This part needs to have beefy systems, because it's very resource intensive. How beefy?

They're running several hundred workers. Each worker usually performs two tasks at one time. They all have 8 GB of RAM and a pretty powerful CPU.

What do these workers do? They encode video primarily with x264, a fast H.264 encoder. Videos can usually be encoded in about half to a third of the video's length.

Videos must also be resized and they need different bitrate versions.

There are also different encoding profiles for different devices, like HLS for iPhones. These encodings are effectively doubled, because Flash derivatives (also H.264-encoded) are needed as well.
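Putting the resizing, bitrate rungs, and per-device profiles together, the job list a worker cluster queues per upload might look like the following sketch. The rung heights and the MP4/HLS pairing are assumptions for illustration, not Wistia's actual encoding profiles:

```javascript
// Illustrative sketch: given a source height, list the derivative encoding
// jobs to queue. Each rung is produced twice, mirroring the doubling of
// profiles described above (assumed rungs, not Wistia's real ladder).
const RUNGS = [224, 360, 540, 720, 1080];

function derivativeJobs(sourceHeight) {
  const jobs = [];
  for (const h of RUNGS) {
    if (h > sourceHeight) break; // never upscale past the original
    jobs.push({ height: h, container: 'mp4' }); // H.264 MP4 derivative
    jobs.push({ height: h, container: 'hls' }); // same rung for HLS (iPhone)
  }
  return jobs;
}
```

A 1080p upload would yield ten jobs under this ladder, which hints at why several hundred beefy workers are needed.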

What does that whole process of uploading a video and then transcoding it look like?

Once a user uploads a video, it is queued and sent to workers for transcoding, and then slowly transferred over to S3.

Instead of sending a video to S3 right away, it is pushed to Primes in the clusters so that customers can serve videos right away. Then, over a matter of a few hours, the file is pushed to S3 for permanent storage and cleared from Prime to make space.

How do you serve the video to a customer after it is processed?

When you're hitting a video hosted on Wistia, you make a request that actually hits a CDN (they use a few different ones; I just tried it and mine went through Akamai). That CDN goes back to origin, which is the Primes we just talked about.

Primes run HAProxy with a special balancer behind it called Breadroute. Breadroute is a routing layer that sits in front of the Primes and balances traffic.

Wistia might have the file stored locally in the cluster (which is faster to serve). Breadroute is smart enough to know that, and Nginx will serve the file directly from the file system. Otherwise, they proxy the request to S3.

How do you decide which version of the video to serve depending on the user's bandwidth?

That's primarily decided on the client side.

Wistia will receive data about the user's bandwidth immediately after they hit play. Before the user even hits play, Wistia receives data about the device, the size of the video embed, and other heuristics to determine the best asset for that particular request.

The decision of whether to go HD or not is only made when the user goes into full screen. That gives Wistia a chance not to interrupt playback at the start.

Netflix also has an interesting way of figuring out which video file to serve. More on that here.

How do you test a user's bandwidth?

When a user clicks play, Wistia tries to download a 1 MB image over XHR while the video starts playing, and uses how long that takes to estimate bandwidth. If playback starts buffering, they bail out of the download.

That image, by the way, is of one of his coworkers' dogs. If you've ever watched a Wistia video, you've had a cute puppy on your computer.

Meet Lenny (Source: Twitter)
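The math behind the probe is straightforward: a payload of known size, timed while playback starts, converts directly to throughput. The XHR part is browser-only; the estimate itself is plain arithmetic, sketched here:

```javascript
// Sketch of the bandwidth math behind the 1 MB probe: download a payload of
// known size, time it, and convert to megabits per second.
function estimateMbps(bytes, elapsedSeconds) {
  const bits = bytes * 8;
  return bits / elapsedSeconds / 1e6;
}

// e.g. a 1 MB (1,000,000-byte) image fetched in 0.5 s works out to 16 Mbps
```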

What CDNs do you use?


CDNs are crucial for video streaming

Oh yeah. Not only because it gets the video content closer to the user, but also because Wistia's origin would get hammered.

Even if you have access to a lot of bandwidth, reducing latency is extremely important. By having nodes closer to users, you reduce that latency.

If you're interested in learning more about this, check out Ilya Grigorik's book High Performance Browser Networking. It's free.

What other video streaming optimizations have we not covered?

Smart preloading.

It turns out that Chrome and other browsers have limits on how many sockets can be open at the same time.

Ideally, you'd want a video to start playing immediately after a user hits play. That's what preload is for.

But, if you have too many videos on one page, you will max out your open sockets and nothing will play. To get around this, Wistia does some clever things. For example, they use postMessage within iframes, detecting how many videos are on a page, and only adding the preload attribute under certain circumstances.
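The gating idea can be sketched as follows. The cap is an assumed value chosen for illustration; the interview only says Wistia counts the videos on a page and adds preload "under certain circumstances":

```javascript
// Hedged sketch: count the embeds on the page and only mark the first few
// for preloading, so a page full of videos doesn't exhaust the browser's
// socket limit. MAX_PRELOADS is an assumed threshold, not Wistia's value.
const MAX_PRELOADS = 3;

function shouldPreload(embedIndex, totalEmbedsOnPage) {
  if (totalEmbedsOnPage <= MAX_PRELOADS) return true; // few videos: preload all
  return embedIndex < MAX_PRELOADS; // many videos: only the first few preload
}
```

In practice the count has to be coordinated across iframes, which is where the postMessage communication mentioned above comes in.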

Sponsored: Get a free trial from Rollbar to track errors in your apps (any language, any platform)

Monitoring and error tracking is crucial when it comes to scaling apps. Frankly, even if it's a small app it's still a necessity.

Why? How else would you know when users run into errors when they're interacting with your website? Unless they email you, you'd never know.

Using error tracking like Rollbar will alert you when errors happen. These errors are practically real-time, and the alerting can be integrated with any tool of your choice (Slack, PagerDuty, email, etc...)

I use Rollbar to track my errors for ScaleYourCode and any other app I build. Why? Because it works extremely well, takes no time to set up, and can scale no matter how big of an app I have or how big of a team we have.

Rollbar error tracking

Get the free trial they have going for ScaleYourCode fans, and thank me later :).

How will that change with HTTP/2?

It should really simplify a lot of things.

Even then, there's going to be a restriction on how much bandwidth you can use at a given time.

How do you know when videos don't load? How do you monitor that?

They have a service they call pipedream, which they use within their app and within embed codes to constantly send data back.

If a user clicks play, they get information about the size of the window, where the player was on the page, and whether it buffered (i.e., the player is in a playing state but the current time hasn't advanced after a few seconds).

A known problem with videos is slow load times. Wistia wanted to know when a video was slow to load, so they tried tracking metrics for that. Unfortunately, they only knew if it was slow if the user actually waited around for the load to finish. When users have really slow connections, they might just leave before that happens.

The answer? Heartbeats. If Wistia receives these heartbeats but never a play, then the user probably bailed while waiting for the video to load.
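The inference can be sketched as a small classifier over a session's events. The event names here are assumptions made for illustration; the interview only describes the heartbeats-without-a-play signal:

```javascript
// Sketch of the heartbeat inference (assumed event names): heartbeats with
// no play and no completed load suggest the viewer gave up waiting.
function classifySession(events) {
  const has = (type) => events.some((e) => e.type === type);
  if (has('play')) return 'played';
  if (has('loaded')) return 'loaded-no-play';
  if (has('heartbeat')) return 'bailed-during-load';
  return 'no-data';
}
```

The key property is that the slow-connection case is detectable even though the "video finished loading" event never arrives.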

What other analytics do you collect?

They also collect analytics for customers.

Things like play, pause, seeks, conversions (entering an email, or clicking a call to action) which are shown in heatmaps. They also aggregate those heatmaps into graphs that show engagement, like the areas that people watched and rewatched.

How do you get such detailed data?

When the video is playing, they have a video tracker which is an object bound to plays, pauses, seeks. It collects those events into a data structure. Once every 10-20 seconds, it pings back to the Distillery which in turn figures out how to turn the data into heatmaps.
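The Distillery's aggregation step can be sketched as turning watched segments into per-second view counts. This is an illustrative reconstruction of the idea, not the Distillery's actual code:

```javascript
// Illustrative sketch: turn watched segments, reported as
// [startSecond, endSecond) pairs, into per-second view counts. Rewatched
// seconds count more than once, which is what the heatmap visualizes.
function heatmap(segments, durationSeconds) {
  const counts = new Array(durationSeconds).fill(0);
  for (const [start, end] of segments) {
    for (let s = start; s < Math.min(end, durationSeconds); s++) {
      counts[s] += 1;
    }
  }
  return counts;
}
```

Summing or averaging these counts across all viewers of a video gives the engagement graph described above.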

Why doesn't everyone do this? I asked him that, because I don't even get this kind of data from YouTube. The reason, Max says, is because it's pretty intensive. The Distillery processes a ton of stuff and has a huge database.


What kind of database do you use?

A sharded MySQL database. 4-5 very large shards.

They've used MySQL extensively for a while now. They are looking at Riak for certain systems moving forward. But for systems like the Distillery, MySQL makes a lot of sense because the data is relational. Even though they have a lot of data, sharding it makes it manageable.
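One common way to route rows across a small fixed set of shards is modulo hashing on a stable key. This is an assumption for illustration; the interview doesn't say how Wistia picks a shard:

```javascript
// Minimal sketch of one common sharding scheme (assumed, not confirmed by
// the interview): route each account's stats rows to a shard by its ID, so
// related rows stay together and the mapping is stable.
function shardFor(accountId, shardCount) {
  return accountId % shardCount; // same account always lands on the same shard
}
```

Keeping all of one account's data on a single shard preserves the relational queries that make MySQL a good fit here.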

What does your stack look like?

  • HAProxy
  • Nginx
  • MySQL
  • Ruby on Rails
  • Unicorn and some services run on Puma
  • nsq (they wrote the Ruby gem)
  • Redis (caching, Sidekiq for async jobs)

What are you able to cache?

In the Wistia app, there's a lot of user data they don't consider important enough to persist forever, unlike what they store in MySQL, so it's a good fit for the cache.

For example, if you choose a different kind of embed, they save that preference.

They also store information about throttling IP addresses when they have to do that.

What do you use to track various services?

  • Scout (primary systems-level monitoring tool)
  • New Relic (on a lot of their boxes)
  • HoneyBadger (error reporting)
  • TrackJS (in the Wistia app)

They use Scout in the Distillery because they can write custom plugins.

New Relic, in the Wistia app, tracks database performance and request performance. They also use it for synthetic tests that run healthchecks against certain points in their origin; if those fail, they get notified via PagerDuty.

What is scaling ease of use?

Wistia has about 60 employees. Once you start scaling up your customer base, it can be really hard to keep providing good customer support.

Wistia's highest touch points are playback and embedding. These have a lot of factors that are out of their control, so they must make a choice:

1) Don't help customers
2) Help customers, but it takes a lot of time

These options weren't good enough for them. Instead, they went after the biggest source of customer support issues: embeds.

Their embeds have gone through multiple iterations. Why? Because they broke often. Whether it was WordPress doing something weird to an embed, or Markdown breaking the embed, Wistia has had a lot of issues with this over time. To solve this problem, they ended up dramatically simplifying embed codes.

The simpler embed codes that customers put on their web pages ran into fewer problems. But it meant more complexity behind the scenes.

This is what Max means by scaling ease of use. Make it easier for customers, so that they don't have to contact customer support as often. Even if that means more engineering brainstorming, it's worth it to them.

Another example of this is playback. That's why Max is really interested in implementing a kind of client-side CDN balancing which determines the lowest-latency server to serve content from (similar to what Netflix does).

What kind of advice can you give to developers working with embed codes?

Respect the website that you live on.

Keep things fast, asynchronous, and keep things simple.

For example, a lot of times people have to configure embed codes, and so they'll add data attributes to the markup, or even embedded JSON. Unfortunately, that stuff gets stripped out by all kinds of things. The easiest thing that never gets stripped out is class attributes. It may sound weird and look a little weird, but technically it is all valid.
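Reading configuration out of class attributes can look like the sketch below. The "key=value" token format is illustrative rather than Wistia's documented syntax; the point is that class attributes survive sanitizers that strip data attributes and inline JSON:

```javascript
// Hedged sketch: parse embed options out of a class attribute. The
// key=value token format is an assumption for illustration.
function parseEmbedClasses(className) {
  const options = {};
  for (const token of className.trim().split(/\s+/)) {
    const eq = token.indexOf('=');
    if (eq > 0) options[token.slice(0, eq)] = token.slice(eq + 1);
  }
  return options;
}
```

It looks a little odd in markup, but as the interview notes, it's all technically valid HTML.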

Another issue is embeds loading a lot of assets. What tip do you have for that?

Using Async is probably the biggest thing here.

An interesting note Max added: they don't version their player scripts, but they update them all the time. This means they can't give the scripts a year-long cache expiration, so there needs to be a balance between giving Wistia control and caching effectively.

What they'll do is cache for about an hour, and they always load async.

Have you had any security issues?

They haven't had any major breaches, but they've had a few issues. Max says most of these came from being shortsighted.

One example from back in the day: they allowed JavaScript in media descriptions, and it would run in people's browsers. Nice XSS feature there!

They also had another feature that would let you CNAME your domain to a Wistia domain. The problem is that it didn't support HTTPS, so you'd be logging in without HTTPS.

Other issues have come from people trying to brute-force SSH into their servers, locking everyone out.

You have a feature that lets people sitemap videos for SEO. What's that, and how does it work?

The sitemap feature is actually going away. Google recently changed what it looks for; the (relatively) new approach is called JSON-LD.

It's much better that way for Wistia, because they can just inject that data dynamically in the page and it gets crawled.

What is the development process like at Wistia?

It's definitely changed a lot over the years. Right now, they have larger projects that they call rallies.

Each rally is assigned to a team, and teams are about 2-4 people right now. Each of the teams self organizes. There are, of course, some business deadlines, but they're not super strict about deadlines otherwise.

When it comes to deployments, engineers can deploy whenever they want. There are no rules besides not deploying while someone else is deploying--and they built disco lights that go off when someone is deploying.

What does your deployment process look like?

They have a tool called SkyCrank.

People SSH onto a box and run a command like crank wistia deploy. Within a couple of minutes, the code is deployed.

It's a custom-built tool, a combination of Chef and knife ssh, that pulls code from GitHub.

Do you run any tests? Continuous Integration?

Yes. When they push to master, it automatically runs through Solano.

What projects are you working on right now? Are they top secret?

Something Max is planning on starting soon is what they're calling the upload server. This upload server is going to give them the ability to do a lot of cool stuff around uploads.

As we talked about, and with most upload services, you have to wait for the file to get to the service before you can start transcoding and doing things with it.

The upload server will make it possible to transcode while uploading. This means customers could start playing their video before it's even completely uploaded to Wistia's system. They could also get the thumbnail still and embed almost immediately.

How do you do that?

It's not done yet, but the idea is to have a server that serves the data in chunks to the workers, which then process the video.

FFMPEG can let you do range requests, so they could serve a range of data from servers and do partial encodes.
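The chunking side of that plan can be sketched as splitting a file of known length into byte ranges that separate workers request and encode in parallel. This is an assumed sketch, since the system wasn't built yet at the time of the interview:

```javascript
// Assumed sketch of the chunking idea: split a file into byte ranges that
// workers can fetch independently (inclusive [start, end] pairs, matching
// the convention of HTTP Range headers) and partially encode.
function byteRanges(totalBytes, chunkBytes) {
  const ranges = [];
  for (let start = 0; start < totalBytes; start += chunkBytes) {
    ranges.push([start, Math.min(start + chunkBytes, totalBytes) - 1]);
  }
  return ranges;
}
```

Each worker would then run a partial encode over its range, with the results stitched back together.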

"It's a lot of rearchitecturing involved, but I think it's worth it."

Getting in touch & Related links

Wistia Blog
Wistia GitHub
Wistia's CTO's blog (interesting post on Wistia's technical details)

How did this interview help you?

If you learned anything from this interview, please thank our guest for their time.

Oh, and help your followers learn by clicking the Tweet button below :)