Caching them at the other end: How long should we cache files on a user’s device? Plotted on the same horizontal axis of 1.6s, the waterfalls speak for themselves: 201ms of cumulative latency and 109ms of cumulative download in one case, versus 4,362ms of cumulative latency and 240ms of cumulative download in the other. Read the complete test methodology.
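To make the device-side caching policy concrete, here is a minimal sketch assuming fingerprinted asset filenames; the handler, max-age values, and file extensions are illustrative assumptions, not details from the article:

```python
# Sketch: long-lived caching for fingerprinted assets, revalidation for HTML.
# Assumes asset filenames change when their content changes (e.g. app.3f2a1b.js).
from http.server import HTTPServer, SimpleHTTPRequestHandler

class CachingHandler(SimpleHTTPRequestHandler):
    def end_headers(self):
        if self.path.endswith((".css", ".js", ".woff2")):
            # Cache "forever": safe only because the URL is content-addressed.
            self.send_header("Cache-Control", "public, max-age=31536000, immutable")
        else:
            # HTML shell: always revalidate so users never see a stale page.
            self.send_header("Cache-Control", "no-cache")
        super().end_headers()

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), CachingHandler).serve_forever()
```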
By Rajiv Shringi, Oleksii Tkachuk, and Kartik Sathyanarayanan. In our previous blog post, we introduced Netflix’s TimeSeries Abstraction, a distributed service designed to store and query large volumes of temporal event data with low millisecond latencies. Today, we’re excited to present the Distributed Counter Abstraction.
Last week, I posted a short update on LinkedIn about CrUX’s new RTT data. Chrome have recently begun adding Round-Trip-Time (RTT) data to the Chrome User Experience Report (CrUX). This gives fascinating insights into the network topography of our visitors, and how much we might be impacted by high latency regions. What is RTT?
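As a rough illustration of pulling this new field yourself, here is a hedged sketch against the CrUX API; the metric key "round_trip_time" and the response shape are assumptions modeled on how other CrUX metrics are exposed, and the API key is a placeholder:

```python
# Sketch: query p75 round-trip time for an origin from the CrUX API.
import requests

API_KEY = "YOUR_API_KEY"  # hypothetical placeholder
ENDPOINT = f"https://chromeuxreport.googleapis.com/v1/records:queryRecord?key={API_KEY}"

resp = requests.post(ENDPOINT, json={
    "origin": "https://example.com",
    "metrics": ["round_trip_time"],  # assumed metric key; check the API docs
})
resp.raise_for_status()
metric = resp.json()["record"]["metrics"]["round_trip_time"]
print("p75 RTT (ms):", metric["percentiles"]["p75"])
```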
We introduce a caching mechanism in the API gateway layer, allowing us to offload processing from singleton, leader-elected controllers without giving up the strict data-consistency guarantees that clients observe. Active data includes jobs and tasks that are currently running. Titus Gateway handles user requests.
Caching is the process of storing frequently accessed data or resources in a temporary storage location, such as memory or disk, to improve retrieval speed and reduce the need for repetitive processing.
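A minimal in-process example of that definition, using Python's standard-library memoization; the slow lookup below is simulated:

```python
# Sketch: keep the result of an expensive lookup in memory so repeat
# requests skip the slow path entirely.
import functools, time

@functools.lru_cache(maxsize=1024)
def fetch_profile(user_id: int) -> dict:
    time.sleep(0.5)  # stand-in for a slow database or network call
    return {"id": user_id, "name": f"user-{user_id}"}

fetch_profile(42)   # slow: does the real work
fetch_profile(42)   # fast: served from the in-memory cache
```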
In my previous post, I reviewed historical data on single-core/single-thread memory bandwidth in multicore processors from Intel and AMD from 2010 to the present. “Concurrency” is the amount of data that must be “in flight” between the core and the memory in order to maintain a steady-state system.
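In other words, this is Little's Law applied to the memory hierarchy: concurrency = bandwidth * latency. A back-of-the-envelope calculation with illustrative numbers, not figures from the post:

```python
# Sketch: how much data must be "in flight" to sustain a target bandwidth.
bandwidth = 50e9      # bytes/second a single core tries to sustain (assumed)
latency = 80e-9       # seconds per memory access, core to DRAM and back (assumed)
cache_line = 64       # bytes per outstanding request

bytes_in_flight = bandwidth * latency           # 4000 bytes
lines_in_flight = bytes_in_flight / cache_line  # ~62.5 cache lines
print(f"{bytes_in_flight:.0f} bytes in flight, ~{lines_in_flight:.0f} outstanding cache lines")
```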
Second, developers had to constantly re-learn new data modeling practices and common yet critical data access patterns. These include challenges with tail latency and idempotency, managing “wide” partitions with many rows, handling single large “fat” columns, and slow response pagination.
Rajiv Shringi Vinay Chella Kaidan Fullerton Oleksii Tkachuk Joey Lynch Introduction As Netflix continues to expand and diversify into various sectors like Video on Demand and Gaming , the ability to ingest and store vast amounts of temporal data — often reaching petabytes — with millisecond access latency has become increasingly vital.
These media-focused machine learning algorithms, as well as other teams, generate a lot of data from the media files, which, as we described in our previous blog, are stored as annotations in Marken. But we cannot serve searches or low-latency retrievals straight from the media files themselves; that is obviously very expensive.
Caching is a critical technique for optimizing application performance by temporarily storing frequently accessed data, allowing for faster retrieval during subsequent requests. Multi-layered caching involves using multiple levels of cache to store and retrieve data.
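A minimal sketch of that multi-level idea, with a small in-process L1 in front of a larger shared L2 and a fallback to the source of truth on a double miss; the dict-backed L2 stands in for something like Redis, and all names are illustrative:

```python
# Sketch: two-level read-through cache with promotion on L2 hits.
class MultiLayerCache:
    def __init__(self, loader):
        self.l1 = {}      # per-process tier: fastest, smallest
        self.l2 = {}      # shared tier (e.g. Redis) in a real deployment
        self.loader = loader

    def get(self, key):
        if key in self.l1:
            return self.l1[key]
        if key in self.l2:
            self.l1[key] = self.l2[key]   # promote on L2 hit
            return self.l2[key]
        value = self.loader(key)          # double miss: hit the origin
        self.l2[key] = value
        self.l1[key] = value
        return value

cache = MultiLayerCache(loader=lambda k: f"value-for-{k}")
print(cache.get("a"))  # loaded from origin
print(cache.get("a"))  # served from L1
```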
Furthermore, it was difficult to transfer innovations from one model to another, given that most are independently trained despite using common data sources. Yet, many are confined to a brief temporal window due to constraints in serving latency or training costs.
This is a guest post by Sachin Sinha, who is passionate about data, analytics, and machine learning at scale. The load stage loads the data, and then in the run stage we run the test. The load phase is consistent across all databases for all tests, as expected, since this phase only loads the data. Again, Yugabyte's latency is quite high.
Caches are very useful software components that all engineers must know. A cache is a cross-cutting component that applies to all tech areas and architecture layers, such as operating systems, data platforms, backend, frontend, and others. What Is a Cache?
While data lakes and data warehousing architectures are commonly used modes for storing and analyzing data, a data lakehouse is an efficient third way to store and analyze data that unifies the two architectures while preserving the benefits of both. What is a data lakehouse? How does a data lakehouse work?
The GraphQL shim enabled client engineers to move quickly onto GraphQL, figure out client-side concerns like cache normalization, experiment with different GraphQL clients, and investigate client performance without being blocked by server-side migrations. In such cases, we were not testing for response data but overall behavior.
Because microprocessors are so fast, computer architecture design has evolved towards adding various levels of caching between compute units and the main memory, in order to hide the latency of bringing the bits to the brains. This avoids thrashing caches too much for B and evens out the pressure on the L3 caches of the machine.
When the service receives an action from a client, it performs two parallel operations: i) persisting the action in the data store, and ii) publishing the action to a streaming data store for a pub-sub model. Downstream services (e.g., User Feed Service, Media Counter Service) read the actions from the streaming data store and perform their specific tasks.
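A toy sketch of that dual write, assuming an in-memory queue as a stand-in for the streaming data store and a thread per operation; service and field names are illustrative:

```python
# Sketch: persist an action and publish it for pub-sub consumers in parallel.
import queue, threading

data_store = []              # stand-in for the durable data store
stream = queue.Queue()       # stand-in for the streaming data store

def handle_action(action: dict):
    t1 = threading.Thread(target=data_store.append, args=(action,))
    t2 = threading.Thread(target=stream.put, args=(action,))
    t1.start(); t2.start()
    t1.join(); t2.join()

def user_feed_service():
    action = stream.get()    # e.g. User Feed Service consuming the stream
    print("feed update for:", action["user"])

handle_action({"user": "alice", "type": "like"})
user_feed_service()
```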
One of these solutions is Micrometer, which provides 17+ pre-instrumented JVM-based frameworks for data collection and enables instrumentation code with a vendor-neutral API. That's a large amount of data to handle, and it creates a lot of complexity given the different data sources, components, and tools.
This blog post explores how AI observability enables organizations to predict and control costs, performance, and data reliability. It also shows how data observability relates to business outcomes as organizations embrace generative AI. GenAI is prone to erratic behavior due to unforeseen data scenarios or underlying system issues.
What Network Latency Means for Time to First Byte: Let's add up all the network round trips in the example above. 2 server connections: 6 round trips. That means that before we even get the first response byte for our page, we actually have to send data back and forth between the browser and a server eight times!
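The arithmetic, spelled out: each new connection costs DNS + TCP + TLS (three round trips), and the total of eight presumably adds a redirect and the request itself; the RTT value below is an illustrative assumption:

```python
# Sketch: round trips before the first byte, at an assumed RTT.
rtt_ms = 150                             # illustrative round-trip time
connections = 2                          # e.g. redirect origin + page origin
round_trips = connections * 3 + 1 + 1    # DNS+TCP+TLS per connection, + redirect, + request
print(f"{round_trips} round trips -> {round_trips * rtt_ms} ms before the first byte")
```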
This allowed Android engineers to have much more control and observability over how we get our data. Background The Netflix Android app uses the falcor data model and query protocol. For example, the artwork service is separate from the video metadata service, but we need the data from both in the detail key.
It can happen on an edge API system servicing customer devices, between the edge and mid-tier services, or from mid-tiers to data stores. It provides a good read on the availability and latency ranges under different production conditions. For instance, envision a response payload that delivers media streams for a playback session.
As described by the white paper Apple ProRes (link), the target data rate of Apple ProRes HQ for 1920x1080 at 29.97 fps is 220 Mbps. Uploading and downloading data always come with a penalty, namely latency. The problematic pattern of packagers is that they do not always generate data linearly.
The first—and often most surprising for people to learn—thing that I want to draw your attention to is that TTFB counts one whole round trip of latency. TTFB isn’t just time spent on the server, it is also the time spent getting from our device to the server and back again (carrying, that’s right, the first byte of data!).
From there, you can dive deeper into infrastructure metrics (cluster, datacenter, racks, and nodes) and data metrics (keyspaces and tables). In addition to the built-in views, Dynatrace provides data analysis that enhances your ability to query and chart metrics. You can also analyze table metrics, such as cache hits and misses.
This allows data to be sent to the device from backend services on demand, without the need for continually polling requests from the device. In our case, we value low latency — the faster we can read from KeyValue, the faster these messages can get delivered. As Pushy’s portfolio grew, we experienced some pain points with Dynomite.
Atlas is an in-memory time-series database that ingests multiple billions of time-series per day and retains the last two weeks of data. Moreover, common database optimizations like caching recently queried data don’t really work for alerting queries because, generally speaking, the last received datapoint is required for correctness.
By tracking these KPIs and similar, organizations can gain valuable insights into the performance of their mobile apps and make data-driven decisions to improve the user experience and drive growth. Here are some ways observability data is important to mobile app performance monitoring. Load time and network latency metrics.
By Andreas Andreakis and Ioannis Papapanagiotou. Change-Data-Capture (CDC) allows capturing committed changes from a database in real-time and propagating those changes to downstream consumers [1][2]. In a previous blog post, we discussed Delta, a data enrichment and synchronization platform.
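A generic sketch of the CDC loop described here, not Netflix's actual implementation; the change-event shape, the poll API, and the checkpointing scheme are assumptions for illustration:

```python
# Sketch: tail a database change stream and propagate committed changes
# to downstream consumers, checkpointing the offset as we go.
def read_change_events(since_offset: int):
    # Stand-in for reading a transaction log / change stream.
    yield {"offset": since_offset + 1, "op": "UPDATE",
           "table": "movies", "row": {"id": 7, "title": "Example"}}

def propagate(event: dict):
    print(f"publishing {event['op']} on {event['table']} to downstream consumers")

offset = 0
for event in read_change_events(offset):
    propagate(event)
    offset = event["offset"]   # checkpoint so a restart resumes, not replays
```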
Uber’s interactive analytics team shares how they integrated Alluxio’s data caching into Presto, the SQL query engine powering thousands of daily active users at petabyte scale at Uber, to dramatically reduce data scan latencies by leveraging Presto on local disks.
Amazon ElastiCache is a fully managed, in-memory caching service for customers to optimize the latency, performance, and cost of their read workloads. As the amount of data stored increases constantly, the amount of memory needed also goes up. For features that need data structures like sorted sets…
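Sorted sets keep members ordered by score, the classic fit for a leaderboard. Below is a hedged sketch using redis-py against a hypothetical ElastiCache endpoint; the hostname and key names are assumptions:

```python
# Sketch: a leaderboard on a Redis sorted set via redis-py.
import redis

r = redis.Redis(host="my-cache.abc123.use1.cache.amazonaws.com", port=6379)

r.zadd("leaderboard", {"alice": 3100, "bob": 2800, "carol": 3350})
top = r.zrevrange("leaderboard", 0, 2, withscores=True)
print(top)  # [(b'carol', 3350.0), (b'alice', 3100.0), (b'bob', 2800.0)]
```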
Berg, Romain Cledat, Kayla Seeley, Shashank Srikanth, Chaoying Wang, Darin Yu. Netflix uses data science and machine learning across all facets of the company, powering a wide range of business applications from our internal infrastructure and content demand modeling to media understanding.
Gatekeeper accomplishes its prescribed task by aggregating data from multiple upstream systems, applying some business logic, then producing an output detailing the status of each video in each country. There is no eviction policy, and there are no cache misses.
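A sketch of what "no eviction policy, no cache misses" implies in practice: periodically rebuild the entire derived dataset and swap it in atomically, so every read is a local hit. This is a simplified illustration under those assumptions, not Gatekeeper's actual design; all names are made up:

```python
# Sketch: a fully materialized cache refreshed in the background.
import threading, time

class FullSnapshotCache:
    def __init__(self, build_snapshot, refresh_seconds=60):
        self._build = build_snapshot
        self._snapshot = build_snapshot()       # never empty after startup
        threading.Thread(target=self._refresh_loop,
                         args=(refresh_seconds,), daemon=True).start()

    def _refresh_loop(self, period):
        while True:
            time.sleep(period)
            self._snapshot = self._build()      # atomic reference swap

    def get(self, key):
        return self._snapshot.get(key)          # always a hit, never evicted

cache = FullSnapshotCache(lambda: {("tt001", "US"): "AVAILABLE"})
print(cache.get(("tt001", "US")))
```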
Redis has high throughput and runs from memory, but also has the ability to persist data on disk. It is a great caching solution for highly demanding applications, and there are […]. In fact, it is the number one key-value store and the eighth most popular database in the world.
These workflows also utilize Davis®, the Dynatrace causal AI engine, and all your observability and security data across all platforms, in context, at scale, and in real time. Storing frequently accessed data in faster storage, usually in-memory caching, improves data retrieval speed and overall system performance.
Logging provides additional data but is typically viewed in isolation from the broader system context. Observability is the ability to understand a system’s internal state by analyzing the data it generates, such as logs, metrics, and traces. Monitoring typically provides a limited view of system data focused on individual metrics.
Key Takeaways: Critical performance indicators such as latency, CPU usage, memory utilization, hit rate, and the number of connected clients, replicas, and evictions must be monitored to maintain Redis’s high-throughput, low-latency capabilities. These essential data points heavily influence both stability and efficiency within the system.
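A minimal sketch of polling several of those indicators with redis-py; the INFO field names are standard Redis ones, and the host is assumed:

```python
# Sketch: read key Redis health indicators from the INFO command.
import redis

r = redis.Redis(host="localhost", port=6379)
info = r.info()

hits, misses = info["keyspace_hits"], info["keyspace_misses"]
hit_rate = hits / (hits + misses) if (hits + misses) else 0.0

print("connected_clients:", info["connected_clients"])
print("used_memory_human:", info["used_memory_human"])
print("evicted_keys:     ", info["evicted_keys"])
print(f"hit rate:          {hit_rate:.1%}")
```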
Disk caching: MezzFS can be configured to cache objects on the local disk. Regional caching: If an application in region A is using MezzFS to read from an object stored in region B, MezzFS will cache the object in region A. That way, we only pay the transfer costs for one worker, and the rest use the cached object.
Entry (10) lives on a different origin again, so we have more connection overhead to contend with, and the file seems to take an incredibly long time to download (as evidenced by the sheer amount of dark green—sending data). The file is so big because it actually contains all of our fonts as Base64 encoded data URIs.
The shortcomings and drawbacks of batch-oriented data processing were widely recognized by the Big Data community quite a long time ago. This system has been designed to supplement and succeed the existing Hadoop-based system, which had too high a data-processing latency and too high maintenance costs.
This post looks at the other side of search: how to index data and make it searchable. [Figure: various creatives Marketing Tech supports] To enable our marketing stakeholders to manage these creatives, we need to pull together data that is spread across many services. GraphQL makes this aggregation easy.
Fast Data is an emerging industry term for information that is arriving at high volume and incredible rates, faster than traditional databases can manage. Three years ago, as part of our AWS Fast Data journey we introduced Amazon ElastiCache for Redis , a fully managed in-memory data store that operates at sub-millisecond latency.
Of course writes were much less common than reads, so I added a caching layer for reads, and that did the trick. So in addition to all the optimization work we did for Google Docs, I got to spend a lot of time and energy working on the measurement problem: how can we get end-to-end latency numbers?