2018, Cache and Latency - Technology Performance Pulse

Foundation Model for Personalized Recommendation

The Netflix TechBlog

MARCH 28, 2025

Yet, many are confined to a brief temporal window due to constraints in serving latency or training costs. In recommendation systems, context windows during inference are often limited to hundreds of events, not due to model capability but because these services typically require millisecond-level latency. Kang and J. 2018.00035.

Tuning

Tuning Efficiency Latency Strategy

Self-Host Your Static Assets

CSS Wizardry

MAY 31, 2019

Users might already have the file cached. If website-a.com links to [link], and a user goes from there to website-b.com, which also links to [link], then the user will already have that file in their cache. On a slower, higher-latency connection, the story is much, much worse. Penalty: Caching. Risk: Service Shutdowns.

Cache

Cache Latency Infrastructure Website

Stuff The Internet Says On Scalability For July 20th, 2018

High Scalability

JULY 20, 2018

That means multiple data indirections mean multiple cache misses. Mark LaPedus: MRAM, a next-generation memory type, is being touted as a replacement for embedded flash and cache applications. Cliff Click: The JVM is very good at eliminating the cost of code abstraction, but not the cost of data abstraction. They are very expensive.

Internet

Internet Internet Scalability Automotive

KeyCDN Launches New POPs in 2021

KeyCDN

MARCH 10, 2021

The image below shows a significant drop in latency once we launched the new point of presence in Israel. In fact, latency has been reduced by almost 50%! With a total of 5 POPs in Oceania, this continent benefits from lower latency with every POP added. With a population of 2.5 What's next?

Latency

Latency Internet Internet Speed

USENIX SREcon APAC 2022: Computing Performance: What's on the Horizon

Brendan Gregg

FEBRUARY 28, 2023

My personal opinion is that I don't see a widespread need for more capacity given horizontal scaling and servers that can already exceed 1 Tbyte of DRAM; bandwidth is also helpful, but I'd be concerned about the increased latency for adding a hop to more memory. Ford, et al., “TCP

Performance

Performance Latency Cache Virtualization

Expanding the Cloud – An AWS Region is coming to Hong Kong

All Things Distributed

JUNE 20, 2017

The new AWS Asia Pacific (Hong Kong) Region will have three Availability Zones and be ready for customers for use in 2018. This enables customers to serve content to their end users with low latency, giving them the best application experience. Over the past decade, we have seen tremendous growth at AWS.

AWS

AWS Logistics Cloud Social Media

Fixing a slow site iteratively

CSS-Tricks

APRIL 1, 2021

Google’s industry benchmarks from 2018 also provide a striking breakdown of how each second of loading affects bounce rates. Source: Google/SOASTA Research, 2018. I’m going to update my referenced URL to the new site to help decrease latency that adds drag to the initial page load. Compressing, minifying and caching assets.

Cache

Cache Social Media Media Website

5 data integration trends that will define the future of ETL in 2018

Abhishek Tiwari

DECEMBER 27, 2017

There are several emerging data trends that will define the future of ETL in 2018. In 2018, we anticipate that ETL will either lose relevance or the ETL process will disintegrate and be consumed by new data architectures. In contrast, Alluxio is a middleware for data access: think of the Alluxio storage layer as a fast cache.

Big Data

Big Data Artificial Intelligence Storage Hardware

Analyzing a High Rate of Paging

Brendan Gregg

AUGUST 29, 2021

1072-aws (xxx) 12/18/2018 _x86_64_ (16 CPU) avg-cpu: %user %nice %system %iowait %steal %idle 5.03 Reads usually have apps waiting on them; writes may not (write-back caching). biolatency From [bcc], this eBPF tool shows a latency histogram of disk I/O. This is a 64-Gbyte memory system, and 48 Gbytes is in the page cache.

Cache

Cache C++ AWS Java

DBLog: A Generic Change-Data-Capture Framework

The Netflix TechBlog

DECEMBER 17, 2019

This way, log event processing can resume event-by-event afterwards, eventually discovering the watermarks, without ever needing to cache log event entries. Passive instances across regions are also possible, though it is recommended to operate in the same region as the database host in order to keep the change capture latencies low.

Database

Database Traffic Transportation Open Source

How to use Server Timing to get backend transparency from your CDN

SpeedCurve

FEBRUARY 5, 2024

Charlie Vazac introduced server timing in a Performance Calendar post circa 2018. Caching the base page/HTML is common, and it should have a positive impact on backend times. Key things to understand from your CDN Cache Hit/Cache Miss – Was the resource served from the edge, or did the request have to go to origin?

Servers

Servers Cache Retail Benchmarking
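As a rough illustration of the cache hit/miss question raised in the excerpt above, the sketch below parses a `Server-Timing` response header value into per-metric durations and descriptions. The metric names (`cdn-cache`, `origin`) and the function name are assumptions made for this example; each CDN chooses its own names. It is a minimal sketch that does not handle quoted descriptions containing commas or semicolons.

```python
def parse_server_timing(header_value: str) -> dict:
    """Parse a Server-Timing header value into {metric: {"dur": ..., "desc": ...}}.

    Simplified: assumes no commas or semicolons inside quoted descriptions.
    """
    metrics = {}
    for entry in header_value.split(","):
        parts = [p.strip() for p in entry.split(";")]
        name = parts[0]
        if not name:
            continue
        params = {}
        for param in parts[1:]:
            key, _, value = param.partition("=")
            params[key.strip().lower()] = value.strip().strip('"')
        metrics[name] = {
            # dur is a duration in milliseconds; it is optional
            "dur": float(params["dur"]) if "dur" in params else None,
            # desc is a free-form description, e.g. HIT or MISS
            "desc": params.get("desc"),
        }
    return metrics


if __name__ == "__main__":
    # Hypothetical header value; real CDNs use their own metric names.
    value = "cdn-cache; desc=HIT, origin; dur=123.4"
    for name, metric in parse_server_timing(value).items():
        print(name, metric)
```

A monitoring script could apply this to the response headers of the base page to separate edge hits from requests that went to origin.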

A thorough introduction to bpftrace

Brendan Gregg

AUGUST 18, 2019

For example, iostat(1), or a monitoring agent, may tell you your average disk latency, but not the distribution of this latency. For smaller environments, it can be of more use helping eliminate latency outliers. Block I/O latency as a histogram. 6 * 7 * Copyright 2018 Netflix, Inc. 1 /* 2 * biolatency.bt

Latency

Latency C++ Cache Programming
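To show why a distribution can reveal what an average hides, as the excerpt above argues, here is a minimal sketch of a power-of-two latency histogram. It is not bpftrace code; the function name and the sample values are hypothetical and chosen only for illustration.

```python
from collections import Counter


def log2_histogram(latencies_us):
    """Count samples per power-of-two bucket.

    Bucket k covers [2**k, 2**(k+1)) microseconds; values below 1 fall in bucket 0.
    """
    buckets = Counter()
    for value in latencies_us:
        k = max(int(value), 1).bit_length() - 1
        buckets[k] += 1
    return dict(sorted(buckets.items()))


if __name__ == "__main__":
    # Hypothetical samples: mostly fast I/O plus one slow outlier.
    samples = [120, 130, 110, 140, 125, 9000]
    print("mean (us):", sum(samples) / len(samples))
    for k, count in log2_histogram(samples).items():
        print(f"[{2**k}, {2**(k + 1)}) us: {'#' * count}")
```

The mean is pulled upward by the single outlier, while the histogram shows the outlier as its own bucket, separate from the bulk of the samples.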

The Performance Inequality Gap, 2021

Alex Russell

MARCH 6, 2021

A then-representative $200USD device had 4-8 slow (in-order, low-cache) cores, ~2GiB of RAM, and relatively slow MLC NAND flash storage. India became a 4G-centric market sometime in 2018. Sadly, data on latency is harder to get, even from Google's perch, so progress there is somewhat more difficult to judge.

Performance

Performance Network Mobile Metrics

Three Other Models of Computer System Performance: Part 1

ACM Sigarch

MARCH 18, 2019

How many buffers are needed to track pending requests as a function of needed bandwidth and expected latency? Can one both minimize latency and maximize throughput for unscheduled work? These models are useful for insight regarding the basic computer system performance metrics of latency and throughput (bandwidth). Little’s Law.

Systems

Systems Latency Performance Analytics
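The buffer question in the excerpt above can be worked through with Little's Law, which states that the mean number of items in a system equals the arrival rate multiplied by the mean time each item spends in it. The sketch below applies that relationship; the function name, the fixed request size, and the sample numbers are assumptions for illustration and do not come from the article.

```python
import math


def buffers_needed(bandwidth_bytes_per_s: float, latency_s: float, request_bytes: int) -> int:
    """Little's Law: requests in flight = request rate * latency per request."""
    requests_per_s = bandwidth_bytes_per_s / request_bytes
    in_flight = requests_per_s * latency_s
    # Round first to absorb floating-point noise, then round up to whole buffers.
    return math.ceil(round(in_flight, 9))


if __name__ == "__main__":
    # Hypothetical memory system: 64-byte requests, 100 ns latency, 32 GB/s target.
    print(buffers_needed(32e9, 100e-9, 64))
```

Doubling either the target bandwidth or the expected latency doubles the number of pending requests that must be tracked.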

Three Other Models of Computer System Performance: Part 2

ACM Sigarch

MARCH 25, 2019

How many buffers are needed to track pending requests as a function of needed bandwidth and expected latency? Can one both minimize latency and maximize throughput for unscheduled work? The M/M/1 queue will show us a required trade-off among (a) allowing unscheduled task arrivals, (b) minimizing latency, and (c) maximizing throughput.

Systems

Systems Latency Performance C++
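The trade-off named in the excerpt above can be seen in the standard M/M/1 result for mean time in system, W = 1 / (μ − λ), where λ is the arrival rate and μ is the service rate. The following is a minimal sketch with a hypothetical function name and rates, not code from the article.

```python
def mm1_mean_latency(arrival_rate: float, service_rate: float) -> float:
    """Mean time in system for an M/M/1 queue: W = 1 / (mu - lambda)."""
    if arrival_rate < 0 or service_rate <= 0:
        raise ValueError("rates must be non-negative, with a positive service rate")
    if arrival_rate >= service_rate:
        raise ValueError("queue is unstable when arrival rate >= service rate")
    return 1.0 / (service_rate - arrival_rate)


if __name__ == "__main__":
    service_rate = 100.0  # hypothetical: tasks per second the server can complete
    for utilization in (0.1, 0.5, 0.9, 0.99):
        latency = mm1_mean_latency(utilization * service_rate, service_rate)
        print(f"utilization {utilization:.2f}: mean latency {latency * 1000:.1f} ms")
```

With unscheduled (random) arrivals, pushing utilization toward 1 to maximize throughput makes the mean latency grow without bound, which is the tension among the three goals.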

Optimize Images for Web

KeyCDN

SEPTEMBER 12, 2019

KeyCDN’s Cache Enabler plugin is fully compatible with the HTML attributes that make images responsive. The main reason is that it decreases the latency to the user where they are located by serving your images from the POP physically closest to them. The Cache Enabler plugin then delivers WebP images to supported browsers.

Social Media

Social Media Media Google Website

Answering Common Questions About Interpreting Page Speed Reports

Smashing Magazine

OCTOBER 31, 2023

The truth is that the two tools were fairly distinct until PSI was updated in 2018 to use Lighthouse reporting. INP is a measure of the latency for all interactions on a given page, where the highest latency — or close to it — informs the final score. It’s right there in the name!

Speed

Speed Google Website Metrics

Tuning Autovacuum in PostgreSQL and Autovacuum Internals

Percona

AUGUST 10, 2018

sec < 2018-08-06 07:22:35.199 EDT > LOG: automatic analyze of table "vactest.scott.employee" system usage: CPU 0.00s/0.02u sec elapsed 0.15 sec < 2018-08-06 07:22:35.199 EDT > LOG: automatic analyze of table "vactest.scott.employee" system usage: CPU 0.00s/0.02u sec elapsed 0.15 1 second = 1000 milliseconds).

Tuning

Tuning Cache Database Storage

Progress Delayed Is Progress Denied

Alex Russell

APRIL 29, 2021

After years of standards discussion and the first delivered to other platforms in 2018, iOS 14.5 April 2018 , but not usable until several releases later). For heavily latency-sensitive use-cases like WebXR, this is a critical component in delivering a good experience. finally shipped Audio Worklets this week. Pointer Lock.

Media

Media Games Education Engineering

KeyCDN Launches POP in Helsinki

KeyCDN

OCTOBER 24, 2018

Although both countries are relatively close to one another, they are separated by a distance of approximately 500km, which adds up in terms of latency. As of August 2018, Helsinki alone had a population of almost 650,000 people. As for Finland, it covers an area of over 338,000 square kilometers and has a population of over 5.5

Internet

Internet Internet Speed Network

Analyzing a High Rate of Paging

Brendan Gregg

AUGUST 29, 2021

1072-aws (xxx) 12/18/2018 _x86_64_ (16 CPU) avg-cpu: %user %nice %system %iowait %steal %idle 5.03 Reads usually have apps waiting on them; writes may not (write-back caching). This is "cache busting." 17 48011 [.]

Cache

Cache C++ AWS Systems

Can You Afford It?: Real-world Web Performance Budgets

Alex Russell

OCTOBER 22, 2017

It simulates a link with a 400ms RTT and 400-600Kbps of throughput (plus latency variability and simulated packet loss). Simulated packet loss and variable latency, however, can make benchmarking extremely difficult and slow. Our baseline, then, should probably trade lower throughput/higher-latency for packet loss.

Performance

Performance Network Benchmarking Mobile

Service Workers can save the environment!

Dean Hume

APRIL 24, 2018

Without effective caching on the client, the server will see an increase in workload, more CPU usage and ultimately increased latency for the end user. They allow you to cache resources on the user's device when they visit your site for the first time. CPU Utilization and Power Consumption (Source: Blackburn 2008).

Energy

Energy Cache Traffic Website

HTTP/3: Practical Deployment Options (Part 3)

Smashing Magazine

SEPTEMBER 6, 2021

This approach was touted to be better for fine-grained caching because each subresource could be cached individually and the full bundle didn’t need to be redownloaded if one of them changed. Finally, not inlining resources has an added latency cost because the file needs to be requested. What Does It All Mean?

Network

Network Servers Cache Traffic

How to improve Redo, Transaction Log and WAL throughput for HammerDB benchmarks

HammerDB

NOVEMBER 5, 2018

Additionally for the log disk component it is latency for an individual write that is crucial rather than the total I/O bandwidth. To illustrate the data reads on Oracle we can flush the buffer cache. Consequently we now need to increase the buffer cache in size if we are to see more CPU activity. so what are your options?

Benchmarking

Benchmarking Database C++ Virtualization

Front-End Performance Checklist 2021

Smashing Magazine

JANUARY 11, 2021

Build Optimizations JavaScript modules, module/nomodule pattern, tree-shaking, code-splitting, scope-hoisting, Webpack, differential serving, web worker, WebAssembly, JavaScript bundles, React, SPA, partial hydration, import on interaction, 3rd-parties, cache. This improves page load time and caching during navigations.

Performance

Performance Cache Metrics Media

Front-End Performance Checklist 2020 [PDF, Apple Pages, MS Word]

Smashing Magazine

JANUARY 6, 2020

Globally in 2018–2019, according to the IDC, 87% of all shipped mobile phones are Android devices. Estimated Input Latency tells us if we are hitting that threshold, and ideally, it should be below 50ms. So additionally conducting research on common devices in your target group might be a good idea.

Performance

Performance Cache Servers Network

Front-End Performance Checklist 2019 [PDF, Apple Pages, MS Word]

Smashing Magazine

JANUARY 7, 2019

Estimated Input Latency tells us if we are hitting that threshold, and ideally, it should be below 50ms. PRPL stands for Pushing critical resources, Rendering the initial route, Pre-caching remaining routes and Lazy-loading remaining routes on demand. In 2018, the Alliance for Open Media released a promising new video format called AV1.

Performance

Performance Cache Network Metrics
