An Image Server for the Global Web
19 July 2018
At Delivery Hero we serve images of food to millions of customers every day. Our centralized image service handles over a billion requests per month for hungry customers in over 20 countries. We see improvements to our image service as a way to lower page weight and, ultimately, to deliver a better customer experience. While building the service, we researched ways to minimize transmission time, with a particular focus on image formats.
A digital image is a collection of pixels on a screen. To the computer displaying the image, each pixel is a few bytes (e.g. red, green and blue), and the whole image is a matrix of those bytes. The specific way an image is represented as bytes is called an image format. Some of the more common image formats are JPEG, GIF, and PNG. We also focused on two modern formats: JPEG 2000 and WebP. The image format determines what features are available (such as a transparent background or animation), the visual quality of the image and the image’s file size.
Most image formats support compression to minimize file size. Every compression technique is either lossy or lossless. Lossy compression discards some of the original content in order to improve compression; for example, low-quality JPEGs have obvious compression artifacts. Lossless compression preserves the original content exactly, but generally takes up more space. For example, JPEG¹ is a lossy format and PNG is a lossless format. GIF compresses its pixel data losslessly, but its 256-color palette means most photographs lose color information when converted to it. Below are two pictures: the left is losslessly compressed and the right is lossy.
Different image formats also provide different features and compatibility. JPEG, for example, does not support transparent backgrounds, and WebP is not compatible with Safari browsers. The table below summarizes some of the most important features for Delivery Hero’s brands.
Image format feature support. Here is a more extensive table.
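Because support differs between browsers, a server can pick the format per request. The following is a minimal sketch, assuming that clients able to decode WebP advertise image/webp in their HTTP Accept header (as Chrome does). The function name pick_format and its has_alpha flag are hypothetical and used only for illustration.

```python
def pick_format(accept_header: str, has_alpha: bool = False) -> str:
    """Return 'WEBP', 'PNG', or 'JPEG' for one request (hypothetical helper)."""
    # Collect the media types the client lists, ignoring ";q=" parameters.
    offered = {part.split(";")[0].strip().lower()
               for part in accept_header.split(",")}
    if "image/webp" in offered:
        return "WEBP"
    # JPEG has no transparency, so images with alpha fall back to PNG.
    return "PNG" if has_alpha else "JPEG"


print(pick_format("image/webp,image/apng,image/*,*/*;q=0.8"))
print(pick_format("image/png,image/*;q=0.8,*/*;q=0.5", has_alpha=True))
```

This sketch ignores quality weights, so a client sending "image/webp;q=0" would still be served WebP; a production server would need to parse the weights as well.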
An Optimal Image Archive
In the ideal case, the image uploaded to the server is in the only format that will ever be requested, and any cropping or resizing is done well in advance. The ideal request is then for the exact image content in the archive — and the server’s job is to simply pass the bytes along.
A more realistic use case would be some high-resolution lossless image (e.g. a PNG) uploaded to the server, and then clients requesting some transcoded and possibly downsampled (e.g. JPEG at 70% quality, and scaled to 60% size) variant depending on the end-user’s display. Another client may request the same image at the same time, but with different parameters. Transcoding takes time, and it is heavily dependent on the source and target formats.
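As a rough illustration of such a request, the sketch below decodes an archived image, scales it, and re-encodes it with Pillow. The function name render_variant and its defaults (JPEG, quality 70, scale 0.6) are assumptions taken from the example in the paragraph above, not a description of our production code.

```python
import io

from PIL import Image


def render_variant(source_bytes: bytes, fmt: str = "JPEG",
                   quality: int = 70, scale: float = 0.6) -> bytes:
    """Decode an archived image, downsample it, and re-encode it."""
    with Image.open(io.BytesIO(source_bytes)) as img:
        width = max(1, round(img.width * scale))
        height = max(1, round(img.height * scale))
        resized = img.resize((width, height), Image.LANCZOS)
        if fmt.upper() == "JPEG" and resized.mode != "RGB":
            # JPEG cannot store alpha or palette data.
            resized = resized.convert("RGB")
        out = io.BytesIO()
        resized.save(out, format=fmt, quality=quality)
        return out.getvalue()
```

Calling render_variant(png_bytes) would yield the "JPEG at 70% quality, scaled to 60%" variant; a second client could call it at the same time with different parameters against the same archived bytes.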
We chose to introduce some processing at the upload step to convert all incoming images into lossless WebP. To reach this conclusion, we evaluated the time it takes to transcode from one format to another. We established two requirements to limit the formats we considered:
- The image archive must at least contain the original image data in the maximum size.
- The preferred archive format must minimize transcode time into likely served formats.
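The upload-time conversion can be sketched with Pillow as follows. This is a minimal example under stated assumptions: the name archive_as_lossless_webp is hypothetical, the installed Pillow build must include WebP support, and the alpha handling shown is one simple choice rather than the only one.

```python
import io

from PIL import Image


def archive_as_lossless_webp(upload_bytes: bytes) -> bytes:
    """Re-encode an uploaded image as lossless WebP at its full size."""
    with Image.open(io.BytesIO(upload_bytes)) as img:
        # Keep an alpha channel only when the upload carries transparency.
        has_alpha = img.mode in ("RGBA", "LA") or "transparency" in img.info
        pixels = img.convert("RGBA" if has_alpha else "RGB")
        out = io.BytesIO()
        pixels.save(out, format="WEBP", lossless=True)
        return out.getvalue()
```

No resizing happens here, so the archive still contains the original image data at its maximum size, which is the first requirement above.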
The table below shows transcoding bitrate from archival format candidates (left column) to likely served formats (top row) in Mb per second.
Transcoding Bitrates², higher is faster.
From the table, it’s clear that if we disregard the first requirement of a lossless archive, then JPEG at 95% quality is the fastest archive format. On average, transcoding an image from JPEG-Q95 to JPEG-Q70 is nearly thirty times faster than transcoding the same image from JPEG2000 to JPEG-Q70. It is also clear that WebP is the fastest lossless format to transcode from, yet the slowest format to encode into: PNG to JPEG-Q70 is almost five times faster than PNG to WEBP-Q70. Even so, the total transcode time from a WebP archive is about 11% lower than from a PNG archive.
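A measurement of this kind could be taken with a small timing harness like the one below. It is a sketch, not the benchmark we ran: the name transcode_rate is hypothetical, and it assumes the rate is counted in megabytes of source file data per second, which is only one possible way to define the figure.

```python
import io
import time

from PIL import Image


def transcode_rate(source_bytes: bytes, target_format: str,
                   repeats: int = 5, **save_options) -> float:
    """Return megabytes of source data transcoded per second."""
    start = time.perf_counter()
    for _ in range(repeats):
        with Image.open(io.BytesIO(source_bytes)) as img:
            # Converting to RGB forces a full decode and drops any alpha,
            # so every target format receives the same pixel data.
            rgb = img.convert("RGB")
            rgb.save(io.BytesIO(), format=target_format, **save_options)
    elapsed = time.perf_counter() - start
    return repeats * len(source_bytes) / 1e6 / elapsed
```

For example, transcode_rate(archive_bytes, "JPEG", quality=70) would time the path from one archive candidate to JPEG-Q70; averaging over many images gives a figure comparable across candidates.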
The target formats we evaluated are a small subset of the possible format and quality parameter combinations. In all requests, the target image format determines how much data needs to be delivered to the end user, and ultimately the perceived load speed.
Data over the Wire
Independent of how quickly we can produce the desired image, apply the necessary transformations, transcode into the desired format, or leverage the cache to prepare our payload, the most important step is to actually send the image data to the end-user. No matter how clever the caching, if the image is 20 MB, a customer on a spotty cellular connection will have to wait far too long to see it. The primary benefit of lossy compression is space savings, and this is most important when trying to minimize the data going over the wire.
Lossy image formats visually degrade as the file size decreases. Formats like WebP and JPEG provide a “quality parameter” from 0 to 100 which lets you control file size at the cost of perceived quality. To evaluate the compression characteristics of various image formats, we collected hundreds of random images and encoded them in all of the different configurations. The result is visualized in the plot below.
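The effect of the quality parameter on file size can be explored with a short Pillow loop. The sketch below is illustrative only; size_by_quality is a hypothetical name, and the default quality values are an arbitrary sample rather than the exact configurations we tested.

```python
import io

from PIL import Image


def size_by_quality(img: Image.Image, fmt: str = "JPEG",
                    qualities=(95, 90, 75, 70, 50, 25, 5)) -> dict:
    """Encode one RGB image at several quality settings.

    Returns a mapping of quality parameter to encoded size in bytes.
    """
    sizes = {}
    for quality in qualities:
        out = io.BytesIO()
        img.save(out, format=fmt, quality=quality)
        sizes[quality] = len(out.getvalue())
    return sizes
```

Running this over a set of images for both "JPEG" and "WEBP" and averaging the sizes per quality setting would produce the kind of data summarized in a plot like the one described here.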
Simply put, WebP is the most space-efficient format we evaluated. For an average 3.3 megapixel image (10 MB uncompressed), we would expect its PNG version to be about 3.5 MB, its lossless WebP to be about 2.7 MB, its JPEG-Q95 to be about 1 MB, and its WEBP-Q90 to be about 500 KB. One other notable conclusion to draw from the above plot is that the PNG version of an image takes more storage space than its lossless WebP, WEBP-Q95, and WEBP-Q75 versions combined. These space savings let us keep encoded WebP variants in storage, so we pay the time cost of encoding WebP only once.
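The arithmetic behind these figures is easy to reproduce. The snippet below assumes 3 bytes per pixel (8-bit red, green and blue, no alpha) and decimal megabytes; the sizes are the approximate averages quoted above.

```python
def uncompressed_megabytes(megapixels: float, bytes_per_pixel: int = 3) -> float:
    """Raw size in MB: one million pixels at N bytes each is N megabytes."""
    return megapixels * bytes_per_pixel


raw_mb = uncompressed_megabytes(3.3)  # close to the 10 MB quoted above

# Approximate average sizes quoted in the text, in MB.
sizes_mb = {"PNG": 3.5, "lossless WebP": 2.7, "JPEG-Q95": 1.0, "WEBP-Q90": 0.5}

for name, size in sizes_mb.items():
    share_of_raw = size / raw_mb
    share_of_png = size / sizes_mb["PNG"]
    print(f"{name}: {share_of_raw:.0%} of raw, {share_of_png:.0%} of PNG")
```

By these numbers, lossless WebP is a little over three quarters of the PNG size, and WEBP-Q90 is half the size of JPEG-Q95.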
It also helps to visualize the loss of quality from 100% to 5%, because not all formats degrade equally. Below is a grid of images that shows just this. Generally, WebP degrades in image fidelity more gracefully than JPEG.
For rendered images of text, JPEG and WebP compression are particularly bad. A good text image is characterized by crisp edges, but neither lossy format preserves “crispness.” This is because both JPEG and WebP apply a Discrete Cosine Transform and then discard high-frequency information (i.e. crisp edges). WebP preserves the image quality better, but text images are generally both smaller and higher quality when formatted as vector graphics. An example of how a text image degrades is below.
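The loss of edges can be seen in one dimension with a few lines of plain Python. This is a simplified illustration, not how either codec is implemented: real encoders work on two-dimensional blocks and quantize coefficients rather than zeroing them outright, and the eight-sample "edge" and the choice to keep three coefficients are arbitrary.

```python
import math


def dct(signal):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(signal)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(signal)))
    return out


def idct(coeffs):
    """Inverse of dct() above (an orthonormal DCT-III)."""
    n = len(coeffs)
    return [sum(math.sqrt((1 if k == 0 else 2) / n) * c
                * math.cos(math.pi * (i + 0.5) * k / n)
                for k, c in enumerate(coeffs))
            for i in range(n)]


edge = [0, 0, 0, 0, 255, 255, 255, 255]  # a crisp dark-to-light edge
coeffs = dct(edge)
low_pass = coeffs[:3] + [0.0] * 5        # drop the five highest frequencies
blurred = idct(low_pass)
print([round(v) for v in blurred])
```

With every coefficient kept, idct(dct(edge)) returns the original samples. With the high frequencies removed, the abrupt jump between the fourth and fifth samples becomes a gradual ramp, and the values swing past the original 0 to 255 range, which is the ringing seen around compressed text.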
This post is by no means a comprehensive analysis of all our options. It’s also important to note that there are guides to selecting a target format, such as the one from 99 Designs. There were additional experiments we could’ve run to compare vector graphics (e.g. SVG) with their rasterized counterparts. Additionally, we observed that the user’s physical display is one of the most important factors in perceived quality. In particular, Apple Retina displays generally demand images of about double the resolution to be perceived as equal to other screens. Regardless, we collected the results of our experiments into the following best practices for a high-performance image server:
- The ideal image format to archive images is WebP. It supports lossless compression, and among the lossless formats we evaluated it has the highest compression ratio and the highest bitrate for transcoding to other formats (e.g. JPEG).
- The preferred image format to serve photographs is WebP. In its lossy mode, image fidelity is kept even with a low quality parameter; it has a more efficient quality/file size ratio than JPEG. Unfortunately, native support is limited to Google Chrome and Android.
- 70% quality is reasonable for WebP, and JPEG should be higher. Most images (especially photographs) have imperceptible artifacts at this quality, and the payload savings are major: a WebP at 70% quality is approximately 7% of the file size of the same image as a PNG.
- Extra care should be taken for images of text. When possible, text should be a vector graphic (e.g. SVG) to completely eliminate artifacts and minimize file size. Even minor lossy compression can be detrimental to rasterized image quality.
¹ Strictly speaking, JPEG is not a “format” but a standard. There are multiple ways to compress an image as JPEG (e.g. libjpeg, MozJpeg, Guetzli), and they’re technically different formats. Here that distinction is less important.
² Averages calculated by transcoding 100 random images with Python Pillow 5.1.0 and libjpeg libraries. JPEG2000, PNG, and WEBP are the three lossless formats we evaluated.