Speed up your web apps with Google web performance engineer, Ilya Grigorik

Interviewed by Christophe Limpalair on 12/14/2015

Understanding performance is not an easy task. There are a lot of conflicting arguments sprinkled throughout the Internet, and it's hard to find detailed explanations that answer your questions. To make sense of the basics, this episode focuses on useful things developers should know to speed up web apps. We also talk about HTTP/2 and how it's different from HTTP/1.


Interview Snippets

You're a Web Performance Engineer at Google and the co-chair of the W3C Web Performance Working Group. What does that mean?


Ilya Grigorik's day-to-day job involves being a developer advocate and performance engineer. He works with both internal Google teams and external developers. This helps him understand what pieces are missing in today's web platform, and if they're not missing, how he can help make them better. The same goes for browsers—

"How can we make them better for a wider range of use cases?"

Mobile is getting more and more important, so a part of his job is also figuring out how to make pages and browsers work better for mobile users.

Ilya pointed something out that we easily forget: Some of these mobile devices and networks are slower than the dial-up we had a decade ago. That's remarkable and sad at the same time...but it's something we need to keep in mind.

Why do we need to gather performance data?


As Ilya points out, many developers will build web apps on localhost and then deploy them to the Internet. This results in (sometimes) big differences in speed, because bytes now have to travel across the network. Even if it's not slow on your machine, how can you tell whether it's slow on your users' networks?

Just because it's fast on your network and device, does not mean it is for all of your users! So how can you tell?

One of the main goals of the W3C Web Performance Working Group is to define APIs that allow you to gather performance data from out in the wild. This is great, because instead of running synthetic tests, you're getting real data from real people.

While the data you can get from this (like page load times) is not as rich as, say, Chrome's DevTools, it gives you a great idea of how your web app performs in different conditions and when real people interact with it.

We cover these tools in just a little bit.

So part of the W3C group's responsibility is to find, create, or refine APIs that help people with performance?


They don't build the tools themselves, they define APIs. For example, they will define standards to measure frames per second. This standard enables you to go and measure the performance of animations on your pages without just shooting in the dark.

They want to define APIs that are easy to use and do not affect performance in a negative way. That way, you can send your data back to your own tooling and gather analytics on it.

Is there one thing people do that really slows down their pages?


There are many, which is why you need analytics. However, here are two things everyone should be doing:

1) Compression

Many sites deliver their HTML, CSS, and JS without enabling gzip. Gzip yields big savings in the number of bytes that need to be transferred, which is very important.

2) Optimizing images

It's common to grab a photo off of a phone and upload it to a website. Even though it's being resized in CSS, you're still transferring a massive image across the wire. (Could be 1MB+!)

If you look at web pages today, images contribute 50-60% or more of a page's total weight. Yet, a lot of the time, they're not optimized at all.

The episode before this one is all about optimizing images. I highly recommend you check it out if you haven't already.

How does PageSpeed Insights work under the hood?


Under the hood, the page is fetched with both a mobile and a desktop browser. When the page is loaded, the system checks for compression, caching, and the size at which an image is displayed vs. the size of the image that was transferred. There are other checks too, and all of them are specific to your website instead of being "general" tips.

If you've ever used the tool, you might have noticed that they also try to compress images and other files on their end to see if they could get further savings.

How can you minimize the number of bytes being transferred?


1) Don't ship things you don't need!

Look at scripts and CSS files on your pages and ask yourself whether those files are actually needed for that page or not. You'll probably find some things that were added there a while back and never removed.

I go on quite a few WordPress sites that are really bad about this. You enable a plugin and that plugin loads a CSS and JavaScript file (or more than one) on every single page. WordPress sites aren't the only ones that do this. Shoot, I just checked and I'm loading an extra script on my /interviews page!

2) Fetch the most optimized version of the asset

This goes back to compression, minification, and using the right image format. For example, the same graphic can often be served as a much smaller file in SVG format.

We've also seen a resurgence of GIF files on the Internet. These GIFs are funny, but poorly optimized (i.e., heavy). Because of the compression algorithms used, they can be 10-20x bigger than equivalent video files (like MP4).

Why are GIFs so much bigger? GIF compresses each frame individually, frame by frame. If you think about it, an animation is just a bunch of back-to-back frames that are very similar. Video codecs exploit that fact: they share information between frames so they don't have to encode each frame from scratch. Thanks to that, video files are much smaller than GIFs.
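
Here's a toy sketch of that idea: store only the pixels that changed since the previous frame instead of every pixel of every frame. Real codecs are far more sophisticated (motion compensation, transforms, entropy coding), but the intuition is the same.

```javascript
// Toy inter-frame ("delta") encoding: record only the pixels that changed.
function deltaEncode(prevFrame, frame) {
  const changes = [];
  for (let i = 0; i < frame.length; i++) {
    if (frame[i] !== prevFrame[i]) changes.push([i, frame[i]]);
  }
  return changes; // much smaller than the full frame when frames are similar
}

const frame1 = new Array(1000).fill(0); // 1000 "pixels"
const frame2 = frame1.slice();
frame2[10] = 255; // only two pixels changed between frames
frame2[11] = 255;

const delta = deltaEncode(frame1, frame2);
console.log(`full frame: ${frame2.length} values, delta: ${delta.length} changes`);
```

A GIF has to store all 1000 values again for frame 2; the delta approach stores just the 2 that changed.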

In fact, Twitter actually takes user uploaded GIFs and transforms them into MP4s. You can check this by inspecting a GIF on your Twitter timeline next time you see one.

Twitter changes GIF to MP4 for optimization

Is Time to First Byte still an important metric?


It is, absolutely.

You really want the browser to receive HTML bytes as soon as possible. So, for example, a request comes in and (ideally) your data is cached. That way, the server can stream data back very quickly and the browser can start parsing it and displaying something on the screen.

When people say that a page should load quickly, it doesn't have to display the entire page at once. In fact, all browsers are built around progressive rendering.

Progressive Rendering explained

As soon as we have some HTML, we should start parsing it and displaying something to the user while everything is continuing to load.

That's where the Time to First Byte (TTFB) comes from. There's a huge difference between the browser sending a request and receiving it 5 seconds later, versus displaying something as soon as you receive that first byte. That's also why it's important to have the right data in those first bytes.

Then the main question is, how much of a relative impact will it have on my page compared to other optimizations? Measure, measure, measure.

What does Waiting (TTFB) in Chrome's DevTools really represent? Is it how long the browser waits for a response from the server?


Network Timeline showing Waiting (TTFB)

When the browser sends a request for a page, it sits there waiting for a response. The Waiting (TTFB) represents how long it took to receive the very first response byte.

This first response will contain the HTTP headers and part of the HTML document.

In the previous image, Content Download represents how long it took to receive the bytes. This part is very quick (in this example) because the HTML file is small enough to fit in one round-trip.

So is the Waiting time just time waiting for my server?


No, it also includes the network time. It's the server response time plus the round-trip time over the Internet.

Waiting TTFB explanation illustrated

Keep in mind that your initial request has to travel across a number of hops before reaching your server. Then, your server starts generating a response and sending it back. It travels all the way back, and that's what represents the Waiting (TTFB).
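
As a rough sketch, here's how you could split that out from a Navigation Timing entry. In a browser you'd read the entry from `performance.getEntriesByType("navigation")`; since that only exists during a real page load, this example uses a mock entry with made-up millisecond values.

```javascript
// Split Waiting (TTFB) out of Navigation Timing-style fields.
function ttfbBreakdown(entry) {
  return {
    // time from sending the request until the first response byte arrives
    // (this includes the network round-trip PLUS server processing time)
    waiting: entry.responseStart - entry.requestStart,
    // time to download the rest of the response body
    contentDownload: entry.responseEnd - entry.responseStart,
  };
}

// Mock entry (illustrative values, in milliseconds).
const mockEntry = { requestStart: 120, responseStart: 370, responseEnd: 395 };
console.log(ttfbBreakdown(mockEntry));
```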

Check out this article for more detail on measuring the different parts that make up the TTFB.

Then we must shorten the network trip time and speed up the server response


Yes. Ideally, we want to have a cached response at the CDN so we don't even have to go to the server.

Can we measure the network trip time separate from the server response time?


Yes, you can, using the Navigation Timing API. There's also a Resource Timing API.

These expose performance data to scripts, which makes it easy to use a data analytics tool to chart and track different performance metrics. All you have to do is drop in the script on your pages.
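
For example, here's a hypothetical sketch of turning a Resource Timing entry into a small payload for your analytics backend. The entry here is mocked; in a browser you'd collect real entries with `performance.getEntriesByType("resource")` and ship the payload with `navigator.sendBeacon`.

```javascript
// Reduce a Resource Timing-style entry to a compact analytics metric.
function toMetric(entry) {
  return {
    url: entry.name,
    dns: entry.domainLookupEnd - entry.domainLookupStart,
    connect: entry.connectEnd - entry.connectStart,
    ttfb: entry.responseStart - entry.requestStart,
    total: entry.responseEnd - entry.startTime,
  };
}

// Mock entry with illustrative millisecond values.
const mock = {
  name: "https://example.com/app.js",
  startTime: 0, domainLookupStart: 5, domainLookupEnd: 25,
  connectStart: 25, connectEnd: 60,
  requestStart: 60, responseStart: 180, responseEnd: 210,
};
console.log(JSON.stringify(toMetric(mock)));
```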

Having a better idea of what's slowing things down for your websites can help you optimize. Is it that the majority of your users are on slower networks? If so, you can reduce the number of images and assets on your pages. Or maybe your server is slow to respond. Then, you can look into that.

Again, I cover this in a little bit more detail here.

How fast should content display?


The working group recommends displaying content to the user within 1 second. That makes it seem almost instantaneous.

Once the page is interactive, it should respond within 100ms.

These are not easy targets to hit, because many parts of the world (especially on mobile) have very little bandwidth available. Ilya points out that on older 2G and 3G networks (which are still very much prevalent in some regions), users see hundreds of kilobits per second of bandwidth. At those speeds, downloading a 100KB file can take over a second.

How can you test on slower networks if you have fast Internet?


Try the network emulation in DevTools. You can change the network speed and then try to load a few different pages to see how slow pages are to load.

At Google, where Ilya works, they've added 2G and 3G Wi-Fi access points for people to connect to. They encourage developers to connect to them and try to use the web on them. That way, they understand the pain that many still have.

Would you recommend a speed-first approach when developing apps?


Ilya adds that you shouldn't wait to optimize performance until you're ready to ship your app. It's like security, you can't just sprinkle it on at the end. You have to bake it in from the beginning.

Different teams have different methods of accomplishing this goal. At Google, they use the slower networks. Ilya recently read an article about how Facebook instituted 2G Tuesdays. Since this is something repeated every week, developers understand that it's important and you can't just ignore it.

If you're not already using some kind of network emulator, and testing your apps on slower networks, this is definitely something you could benefit from.

It's hard for developers to think about performance when they're crunched to complete deadlines


It is, absolutely!

But, performance is a feature.

Performance, just like any other feature, will affect your engagement, conversion rate, and other things.

One of the benefits of the APIs we mentioned earlier (Navigation Timing API and Resource API) is that you can grab the information and correlate it to your other conversion data.

If you see your app running slower, take a look at your conversion rates. Study after study shows that they are related. Of course, you should always validate it for your own case.

Having real data becomes a much more compelling argument, because you see just how important it is. Then, it's easier to decide what to work on if you're debating between adding a fancy new feature or making your app faster.

How can you convince your team to spend time and resources on performance?


Ilya brought this excellent point up and I really felt like I needed to emphasize it:

If you have a slow app, it can be very frustrating when your team or boss tells you to focus on other tasks instead of speeding it up. Instead of just telling your team/boss how important it is, show them real data, collected from real users, that demonstrates how performance affects the bottom line! If your data doesn't show a correlation, then maybe you should work on that feature :).

What kinds of things block the loading of additional assets (or block rendering)?


If you look at your timeline, sometimes there is a gap between one file being loaded and another being fetched. While there could be a few different reasons for that, it's probably because we're parsing the HTML, doing background work, or being blocked by another request.

This is where we talk about optimizing the Critical Render Path.

Chrome DevTools network timeline gap

The Critical Rendering Path is the sequence of steps your browser has to take to get the HTML, construct the DOM, get the style information, execute critical JavaScript, and then paint the content.

CSS is obviously critical for all of this, but some of the JavaScript may not be. And yet, because of the way it's put on the page, it is treated as a critical resource and is blocking the rendering of the page.

You can see these various steps in the Timeline tab of DevTools.

What does the Critical Render Path look like?


How can we make sure JavaScript doesn't block this render path?


Since the browser has no idea whether the JavaScript code is going to modify the DOM in any way, it has to assume that it will. That's why it loads it before going further.

If we're talking about an external script, that means rendering has to wait for your browser to download that script, parse it, and execute it. Only then can the browser keep rendering the page.

See how this is getting ugly?

One solution to this problem is to make scripts async. Async is probably not new to you, but you might have been using it without really understanding the reason.

Async promises the browser that the script isn't going to modify the document while it's being parsed (e.g., with document.write), so the browser can start fetching it right away and execute it when it's ready--without blocking HTML parsing and rendering.

It is ideal for scripts that don't depend on anything else, and don't affect the page at all (like analytics). If your script does have dependencies, there are other ways of loading them without blocking.
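
For reference, the difference is a single attribute (the file paths here are just examples):

```html
<!-- Blocks parsing: the browser must fetch, parse, and execute this first -->
<script src="/js/app.js"></script>

<!-- Non-blocking: fetched in parallel, executed whenever it's ready -->
<script async src="/js/analytics.js"></script>
```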

What if your scripts have dependencies? Can you use async?


It depends. This is really when you need to dig into the application and see what the dependency is being used for.

If your script is constructing the page and has behavior that is absolutely critical to have in order to display the page, then you may need to block the rendering.

Often, though, you can async scripts that modify the page and it will still be perfectly functional from a user's perspective. They can see the content and they can start reading/using it. Sure, some of the actions may not be available yet, but that can wait a little bit longer.

It doesn't make sense to have scripts at the bottom of the page


Right. Async scripts don't block rendering, and the browser can start fetching the files sooner if they're near the top of the HTML page.

What's the difference between async and defer?


Defer is the older mechanism and, in Ilya's view, effectively superseded. You should be using async, unless you need to target browsers like IE6 & 7, which don't support async (it came later).

How can you measure if using async is beneficial to your app or hurting it?


This is where we get into the discussion of visual progress: get something visible on the screen ASAP.

Historically, we have not had a great set of tools to actually measure that. For a long time, the best tool we had was the WebPageTest filmstrip, which shows visual progress. You can set up a test and compare the page with async and without it.

More recently, DevTools has gained a filmstrip view as well.

Go to Timeline -> Enable Screenshots -> Reload your page

You'll see screenshots of your page load. It's pretty nice.

Here's what you can do:
1. Toggle your network like we talked about earlier
2. Take a look at the film strip
3. Change your scripts to async/move them around
4. Compare film strips

DevTools can be overwhelming. How can we get a better understanding of how to use it?


"Good question. It's definitely something that's on the radar for the Developer Tools team."

Check out videos from Paul Lewis and others who walk through how to use DevTools. These are quite informative.

What's really cool about these videos is that they can speed up websites on the screen without having access to the website's servers. They can optimize things like jank on window scroll, or janky touch events, etc...

Of course, spend some time poking around with it as well.

Why are there (sometimes) empty blocks on DevTools' waterfall right before a file is loaded?


It is queuing. In this particular picture, HTTP1 is used, which most likely means the request is blocked by the connection limit. HTTP1 allows only a maximum number of open connections per host, so we can only fetch so many files in parallel (usually 6 in Chrome). Since all 6 connections are in use, the 7th request sits there and waits for one to open up.

If you've ever loaded a page that fetches more than 6 files from one host, you've already seen this: the first 6 start downloading immediately, but the 7th sits there and waits.
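
As a toy illustration, here's a little scheduler that mimics that queueing behavior with fake download durations (all values are made up for the example):

```javascript
// Simulate HTTP/1.x connection queueing: with at most 6 parallel
// connections, the 7th request waits until one finishes.
function schedule(durations, maxConnections = 6) {
  const connections = []; // finish times of the currently open connections
  return durations.map((duration) => {
    let start = 0;
    if (connections.length >= maxConnections) {
      // wait for the earliest connection to free up
      start = Math.min(...connections);
      connections.splice(connections.indexOf(start), 1);
    }
    const finish = start + duration;
    connections.push(finish);
    return { start, finish };
  });
}

const timings = schedule(new Array(7).fill(100)); // seven 100 ms downloads
console.log(timings[5].start, timings[6].start);  // → 0 100
```

The first six requests start at 0 ms; the seventh is queued until 100 ms, when a connection frees up.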

HTTP2 does things a bit differently than that...

What is different about HTTP2?


The main differences between HTTP1 and 2 are in the framing. You can think of framing as how the request is laid out on the wire. The semantics are exactly the same, so you don't have to modify your website if you want to upgrade. The only difference is how the request is sent.

With HTTP1, you negotiate a connection to the server. If we send a request over that connection, it becomes occupied. No other request can be sent over it until we receive a response. The way around this is to open multiple connections (6 with Chrome, as we just discussed). That means we can fetch at most 6 resources at a time.

With today's websites, this bottleneck is not very nice.

With HTTP2, we can use one connection for multiple requests. How? By slicing requests and responses into smaller chunks called frames. These frames are tagged with an ID that connects the data to the request/response. That way, we can send everything over the same connection.
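
Here's a toy sketch of that framing idea — not the real binary protocol, just the concept of tagging chunks with a stream ID so two responses can share one connection (the payloads and frame size are invented for the example):

```javascript
// Slice a response into frames tagged with a stream ID.
function toFrames(streamId, data, frameSize = 4) {
  const frames = [];
  for (let i = 0; i < data.length; i += frameSize) {
    frames.push({ streamId, payload: data.slice(i, i + frameSize) });
  }
  return frames;
}

// Interleave two streams' frames over one "connection".
function interleave(a, b) {
  const wire = [];
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    if (a[i]) wire.push(a[i]);
    if (b[i]) wire.push(b[i]);
  }
  return wire;
}

// The receiver regroups frames by stream ID to rebuild each response.
function reassemble(wire) {
  const streams = {};
  for (const { streamId, payload } of wire) {
    streams[streamId] = (streams[streamId] || "") + payload;
  }
  return streams;
}

const wire = interleave(toFrames(1, "<html>...</html>"), toFrames(3, "body{color:red}"));
console.log(reassemble(wire)); // both responses arrive intact over one connection
```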

This also opens the door to other optimizations like prioritization. Prioritization sends hints to the server to tell it which assets are more important than others. Like main.css is more important than image1.jpg, for example, because CSS blocks rendering but images do not.

There is also Server Push. Server Push allows the server to send multiple responses. Example: you request an HTML page and the JavaScript and CSS files are needed to render that HTML page, so the server will send those files before the browser even asks for them. Nice, huh?

In addition, HTTP2 has compression for HTTP headers. We've talked about compressing the response body with gzip, but headers (like user-agent, cookies, etc...) in HTTP1 aren't compressed. That doesn't seem like a big deal until you account for cookies which can get quite big.

HTTP2 introduced HPACK which compresses headers and reduces that overhead.

Here's an example of how it works:

Whereas HTTP1 sends the full user-agent string with every request (even though it never changes), HTTP2 sends it once on the first request and then references it by an index on subsequent ones. That really reduces the amount of data transferred.
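
Here's a toy sketch of that indexing idea (real HPACK also uses a predefined static table and Huffman coding, which this ignores):

```javascript
// Toy HPACK-style dynamic table: send a header literally once,
// then refer to it by index on later requests.
class HeaderTable {
  constructor() { this.table = []; }
  encode(name, value) {
    const index = this.table.findIndex(([n, v]) => n === name && v === value);
    if (index !== -1) return { indexed: index };  // tiny reference, not the full string
    this.table.push([name, value]);               // remember it for next time
    return { literal: [name, value] };            // full header, sent only once
  }
}

const encoder = new HeaderTable();
console.log(encoder.encode("user-agent", "Mozilla/5.0")); // first request: full literal
console.log(encoder.encode("user-agent", "Mozilla/5.0")); // next request: just an index
```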

You get the benefit of:
1) Smaller requests
2) Multiplexing (explanation here)
3) Prioritized requests
4) A server that can be smarter about how it responds with data
5) More flexibility in how the server sends the response

Do we still need to concatenate and minify with HTTP/2?


Yes and no.

Minification, yes. Reducing the number of bytes we transfer still helps.

Concatenation, mostly not. Since we can send potentially hundreds of requests over one connection, we don't have to bundle files together anymore. Now, that doesn't mean it's a good idea to split your code into hundreds of separate scripts (a nightmare to maintain, and what would be the point?), but concatenation is no longer the win it was under HTTP1.

What's one thing you're working on that's really exciting to you?


"The hard part is picking just one!"

There are a number of APIs they are working through that are very exciting. One is the Intersection Observer API.

There are many times when developers would like to know when an element is entering the viewport, is visible within the viewport, or is about to become visible (infinite scrolling, analytics, etc...). This is a very common requirement for many applications, but it's very expensive to implement well: doing it by hand adds overhead to scrolling.

Intersection Observer API allows you to subscribe to notifications. For example: "I want to know when this element comes into the viewport."
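
Under the hood, the browser computes how much of an element's box overlaps the viewport. Here's a rough sketch of that computation; in a browser you'd just call `new IntersectionObserver(callback).observe(element)` instead of doing this yourself on every scroll event (the rectangles below are made-up values).

```javascript
// Compute how much of a rectangle overlaps the viewport (0 to 1) —
// the value IntersectionObserver reports as intersectionRatio.
function intersectionRatio(rect, viewport) {
  const overlapW = Math.max(0, Math.min(rect.right, viewport.right) - Math.max(rect.left, viewport.left));
  const overlapH = Math.max(0, Math.min(rect.bottom, viewport.bottom) - Math.max(rect.top, viewport.top));
  const area = (rect.right - rect.left) * (rect.bottom - rect.top);
  return area === 0 ? 0 : (overlapW * overlapH) / area;
}

const viewport = { left: 0, top: 0, right: 400, bottom: 600 };
const halfVisible = { left: 0, top: 500, right: 400, bottom: 700 };
console.log(intersectionRatio(halfVisible, viewport)); // → 0.5
```

The API does this for you, off the main thread's hot path, and only calls you back when the ratio crosses thresholds you ask for.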

How to get in touch?

Ilya's website is where he posts a lot of interesting information. His excellent posts will keep you occupied for a while!
He's also on Twitter.

How did this interview help you?

If you learned anything from this interview, please thank our guest for their time.
