What is RTT? Round-trip time (RTT) is basically a measure of latency: how long did it take to get from one endpoint to another and back again? RTT isn't a you-thing, it's a them-thing. This gives fascinating insights into the network topography of our visitors, and how much we might be impacted by high-latency regions.
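For a concrete sense of what RTT looks like from the client's side, here is a minimal sketch (not from the article) that times TCP handshakes as a rough RTT estimate; the host and port are placeholder values.

```python
import socket
import time

def estimate_rtt(host: str, port: int = 443, samples: int = 5) -> float:
    """Estimate round-trip time by timing TCP handshakes (a rough RTT proxy)."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        # The TCP three-way handshake costs roughly one network round trip.
        with socket.create_connection((host, port), timeout=5):
            pass
        timings.append((time.perf_counter() - start) * 1000)  # milliseconds
    return min(timings)  # the minimum sample is closest to the true network RTT

if __name__ == "__main__":
    print(f"~{estimate_rtt('example.com'):.1f} ms RTT to example.com")
```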
We introduce a caching mechanism in the API gateway layer, allowing us to offload processing from singleton, leader-elected controllers without giving up the strict data consistency guarantees that clients observe. We had started seeing increased response latencies and leader servers running at dangerously high utilization.
By the summer of 2020, many UI engineers were ready to move to GraphQL. The GraphQL shim enabled client engineers to move quickly onto GraphQL, figure out client-side concerns like cache normalization, experiment with different GraphQL clients, and investigate client performance without being blocked by server-side migrations.
These include challenges with tail latency and idempotency, managing “wide” partitions with many rows, handling single large “fat” columns, and slow response pagination. It also serves as a central configuration point for access patterns such as consistency or latency targets. This is useful for keeping the “n-newest” entries or for prefix path deletion.
Caches are very useful software components that all engineers must know. In this article, we are going to describe what a cache is and explain specific use cases, focusing on the frontend and client side. What Is a Cache?
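As a minimal illustration of the concept (not code from the article), here is a sketch of a tiny in-memory cache with a time-to-live, the kind of structure a client might keep in front of a slow data source; the class and key names are illustrative.

```python
import time
from typing import Any, Callable

class TTLCache:
    """A tiny in-memory cache: store results and reuse them until they expire."""
    def __init__(self, ttl_seconds: float = 60.0):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        entry = self._store.get(key)
        if entry and (time.monotonic() - entry[0]) < self.ttl:
            return entry[1]                      # cache hit: skip the slow load
        value = loader()                         # cache miss: fetch from origin
        self._store[key] = (time.monotonic(), value)
        return value

# Usage: wrap any expensive call, e.g. a network fetch.
cache = TTLCache(ttl_seconds=30)
profile = cache.get_or_load("user:42", lambda: {"id": 42, "name": "Ada"})
```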
Because microprocessors are so fast, computer architecture design has evolved towards adding various levels of caching between compute units and the main memory, in order to hide the latency of bringing the bits to the brains. This avoids thrashing caches too much for B and evens out the pressure on the L3 caches of the machine.
This is a guest post by Ankit Sirmorya, a Machine Learning Engineer at Amazon who has led several machine-learning initiatives across the Amazon ecosystem. FUN FACT: In this talk, Rodrigo Schmidt, director of engineering at Instagram, talks about the different challenges they have faced in scaling the data infrastructure at Instagram.
The new Amazon capability enables customers to improve the startup latency of their functions from several seconds to as low as sub-second (up to 10 times faster) at P99 (the 99th latency percentile). This can cause latency outliers and may lead to a poor end-user experience for latency-sensitive applications.
Yet, many are confined to a brief temporal window due to constraints in serving latency or training costs. Key insights from this shift include: A Data-Centric Approach: shifting focus from model-centric strategies, which heavily rely on feature engineering, to a data-centric one.
The RAG process begins by summarizing and converting user prompts into queries that are sent to a search platform that uses semantic similarities to find relevant data in vector databases, semantic caches, or other online data sources.
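To make that flow concrete, here is a toy sketch of the "check the semantic cache first, then fall back to fresh retrieval" step. The embedding and similarity functions are deliberate stand-ins (a hashed bag-of-words and cosine similarity), not a real model or vector database client; every name here is hypothetical.

```python
import hashlib
import math

def toy_embed(text: str, dims: int = 64) -> list[float]:
    """Stand-in embedding: hash each word into a fixed-size vector (not a real model)."""
    vec = [0.0] * dims
    for word in text.lower().split():
        idx = int(hashlib.md5(word.encode()).hexdigest(), 16) % dims
        vec[idx] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

semantic_cache: list[tuple[list[float], str]] = []  # (embedding, cached answer)

def retrieve(query: str, threshold: float = 0.8) -> str:
    q_vec = toy_embed(query)
    # 1. Check the semantic cache for a close-enough previous answer.
    for vec, answer in semantic_cache:
        if cosine(q_vec, vec) >= threshold:
            return answer
    # 2. Otherwise hit the (hypothetical) search platform / vector store.
    answer = f"fresh retrieval for: {query}"
    semantic_cache.append((q_vec, answer))
    return answer
```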
Spring Boot 2 uses Micrometer as its default application metrics collector and automatically registers metrics for a wide variety of technologies, like JVM, CPU Usage, Spring MVC, and WebFlux request latencies, cache utilization, data source utilization, Rabbit MQ connection factories, and more. That’s a large amount of data to handle.
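Micrometer-registered metrics in a Spring Boot 2 application are typically exposed through the Actuator; as a rough sketch (assuming the Actuator metrics endpoint is enabled and reachable on localhost:8080), a script can pull individual metrics like this.

```python
import requests  # assumes the Spring Boot Actuator metrics endpoint is exposed

BASE = "http://localhost:8080/actuator/metrics"

def read_metric(name: str) -> dict:
    """Fetch one Micrometer metric (e.g. 'jvm.memory.used') from the Actuator."""
    resp = requests.get(f"{BASE}/{name}", timeout=5)
    resp.raise_for_status()
    body = resp.json()
    # The response lists one or more measurements (VALUE, COUNT, TOTAL_TIME, ...).
    return {m["statistic"]: m["value"] for m in body["measurements"]}

print(read_metric("jvm.memory.used"))
print(read_metric("http.server.requests"))
```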
Dynomite is a Netflix open source wrapper around Redis that provides a few additional features like auto-sharding and cross-region replication, and it provided Pushy with low latency and easy record expiry, both of which are critical for Pushy’s workload. As Pushy’s portfolio grew, we experienced some pain points with Dynomite.
Uploading and downloading data always come with a penalty, namely latency.
Figure 3: Video Processing with Index and Virtual Assembly
Using virtual assembly greatly improves the latency performance of the ProRes 422 HQ proxy generation by removing one round trip of cloud downloading and cloud uploading by the physical assembler.
MongoDB offers several storage engines that cater to various use cases. The default storage engine in earlier versions was MMAPv1, which utilized memory-mapped files and document-level locking. The newer, pluggable storage engine, WiredTiger, addresses this by using prefix compression, collection-level locking, and row-based storage.
With the Dynatrace Data Explorer, you can easily analyze metrics, such as client read/write latency by Cassandra nodes and disk space usage by keyspaces. You can also analyze table metrics, such as cache hits and misses. Provide a foundation for calculating metrics in dashboard charts.
This allowed Android engineers to have much more control and observability over how we get our data. This allows the app to query a list of “paths” in each HTTP request, and get specially formatted JSON (jsonGraph) that we use to cache the data and hydrate the UI. This meant that data that was static (e.g.
Engineers want their alerting system to be realtime, reliable, and actionable. A few years ago, we were paged by our SRE team due to our Metrics Alerting System falling behind — critical application health alerts reached engineers 45 minutes late! It opens doors to support more exciting use-cases.
By Drew Koszewnik. This is the story about how the Content Setup Engineering team used Hollow, a Netflix OSS technology, to re-architect and simplify an essential component in our content pipeline, delivering […] there is no eviction policy, and there are no cache misses. The Idea: We decided to employ a total high-density near cache (i.e.,
Uber’s interactive analytics team shares how they integrated Alluxio’s data caching into Presto, the SQL query engine powering thousands of daily active users on petabyte scale at Uber, to dramatically reduce data scan latencies through leveraging Presto on local disks.
By Rajiv Shringi, Vinay Chella, Kaidan Fullerton, Oleksii Tkachuk, and Joey Lynch. Introduction: As Netflix continues to expand and diversify into various sectors like Video on Demand and Gaming, the ability to ingest and store vast amounts of temporal data — often reaching petabytes — with millisecond access latency has become increasingly vital.
Without these integrations, projects would be stuck at the prototyping stage, or they would have to be maintained as outliers outside the systems maintained by our engineering teams, incurring unsustainable operational overhead. Importantly, all the use cases were engineered by practitioners themselves.
These workflows also utilize Davis® , the Dynatrace causal AI engine, and all your observability and security data across all platforms, in context, at scale, and in real-time. Storing frequently accessed data in faster storage, usually in-memory caching, improves data retrieval speed and overall system performance. Beyond
Mounting object storage in Netflix’s media processing platform. By Barak Alon (on behalf of Netflix’s Media Cloud Engineering team). MezzFS (short for “Mezzanine File System”) is a tool we’ve developed at Netflix that mounts cloud objects as local files via FUSE. Disk caching, regional caching, Netflix replays, and more.
Amazon ElastiCache is a fully managed, in-memory caching service for customers to optimize the latency, performance and cost of their read workloads. Similar to how we offer multiple engines in Amazon RDS, starting today, we are supporting Redis as a new engine choice in Amazon ElastiCache, in addition to Memcached.
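Because ElastiCache for Redis speaks the standard Redis protocol, an application can use it like any other Redis endpoint. Below is a hedged cache-aside sketch using the redis-py client; the endpoint, key layout, and the load_from_database helper are all placeholders, not real names from the announcement.

```python
import json
import redis  # standard redis-py client; ElastiCache exposes the Redis protocol

# Placeholder endpoint: substitute your ElastiCache primary endpoint.
r = redis.Redis(host="my-cache.abc123.use1.cache.amazonaws.com", port=6379)

def get_product(product_id: str) -> dict:
    key = f"product:{product_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)             # served from the in-memory cache
    product = load_from_database(product_id)  # hypothetical slow read path
    r.setex(key, 300, json.dumps(product))    # cache the result for 5 minutes
    return product
```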
For example, when monitoring a database, you’ll want to know about any latency when writing data to a disk or average query response time. Examples include a spike in memory utilization, a decrease in cache hit ratio, or an increase in CPU utilization.
Since memory management is not something one usually associates with classification problems, this blog focuses on formulating the problem as an ML problem and the data engineering that goes along with it. Some nuances while creating this dataset come from the on-field domain knowledge of our engineers. […] of the time (False Positives).
That meant I started having regular meetings with the hardware engineers who were working with IBM on the CPU, which gave me even more expertise on this CPU, which was critical in helping me discover a design flaw in one of its instructions , and in helping game developers master this finicky beast. […] register files? arithmetic units?)
Of course writes were much less common than reads, so I added a caching layer for reads, and that did the trick. So in addition to all the optimization work we did for Google Docs, I got to spend a lot of time and energy working on the measurement problem: how can we get end-to-end latency numbers?
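As an aside on the measurement problem mentioned here, a minimal sketch (not the author's code) of capturing end-to-end latency samples around a request path might look like this; the handler is a stand-in.

```python
import time
from functools import wraps

latency_samples_ms: list[float] = []  # collected end-to-end timings

def measure_latency(fn):
    """Record wall-clock latency for each call so the distribution can be studied later."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            latency_samples_ms.append((time.perf_counter() - start) * 1000)
    return wrapper

@measure_latency
def handle_request(doc_id: str) -> str:
    time.sleep(0.01)          # stand-in for the real read path
    return f"document {doc_id}"
```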
Amazon DynamoDB offers low, predictable latencies at any scale. This is not just predictability of median performance and latency, but also at the end of the distribution (the 99.9th percentile), so we could provide acceptable performance for virtually every customer. Another important requirement for Dynamo was predictability.
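Tail percentiles like the 99.9th describe the slow end of the latency distribution rather than the average; a small, illustrative sketch of reading them from a list of samples (nearest-rank method, example numbers made up):

```python
import math

def percentile(samples_ms: list[float], p: float) -> float:
    """Nearest-rank percentile: the value below which roughly p% of the samples fall."""
    ordered = sorted(samples_ms)
    rank = max(0, math.ceil(p / 100 * len(ordered)) - 1)
    return ordered[rank]

samples = [4.2, 5.1, 4.8, 90.0, 5.0, 4.9, 5.3, 4.7, 5.2, 4.6]
print(f"median={percentile(samples, 50):.1f} ms  p99.9={percentile(samples, 99.9):.1f} ms")
```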
Streams provide you with the underlying infrastructure to create new applications, such as continuously updated free-text search indexes, caches, or other creative extensions requiring up-to-date table changes. DynamoDB Streams enables your application to get real-time notifications of your tables’ item-level changes.
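As a rough sketch of consuming those item-level change notifications with boto3 (the table name and region are placeholders; production consumers typically use Lambda triggers or the Kinesis adapter rather than polling shards by hand):

```python
import boto3

streams = boto3.client("dynamodbstreams", region_name="us-east-1")
dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# Look up the stream attached to a (placeholder) table with streams enabled.
stream_arn = dynamodb.describe_table(TableName="orders")["Table"]["LatestStreamArn"]
shards = streams.describe_stream(StreamArn=stream_arn)["StreamDescription"]["Shards"]

for shard in shards:
    iterator = streams.get_shard_iterator(
        StreamArn=stream_arn,
        ShardId=shard["ShardId"],
        ShardIteratorType="TRIM_HORIZON",
    )["ShardIterator"]
    records = streams.get_records(ShardIterator=iterator)["Records"]
    for record in records:
        # Each record describes an INSERT, MODIFY, or REMOVE on a single item.
        print(record["eventName"], record["dynamodb"].get("Keys"))
```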
Identifying key Redis metrics such as latency, CPU usage, and memory metrics is crucial for effective Redis monitoring. To monitor Redis instances effectively, collect Redis metrics focusing on cache hit ratio, memory allocated, and latency threshold.
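A minimal sketch of pulling those figures from a Redis instance with redis-py via the INFO command; the field names are standard Redis INFO fields, while the endpoint is a placeholder.

```python
import redis

r = redis.Redis(host="localhost", port=6379)
info = r.info()  # INFO returns server, memory, and stats sections as one dict

hits = info.get("keyspace_hits", 0)
misses = info.get("keyspace_misses", 0)
hit_ratio = hits / (hits + misses) if (hits + misses) else 0.0

print(f"cache hit ratio  : {hit_ratio:.2%}")
print(f"used memory      : {info['used_memory_human']}")
print(f"connected clients: {info['connected_clients']}")
# Command latency can be sampled separately, e.g. by timing a PING round trip.
```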
Use cases such as gaming, ad tech, and IoT lend themselves particularly well to the key-value data model where the access patterns require low-latency Gets/Puts for known key values. The purpose of DynamoDB is to provide consistent single-digit millisecond latency for any scale of workloads.
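The low-latency Get/Put access pattern maps directly onto DynamoDB's item API; here is a brief boto3 sketch, where the table name and key schema are placeholders for illustration.

```python
import boto3

table = boto3.resource("dynamodb", region_name="us-east-1").Table("game_sessions")

# Put: write one item keyed by a partition key.
table.put_item(Item={"session_id": "abc-123", "player": "p42", "score": 1780})

# Get: read it back by key; a single-item lookup is the access pattern that
# supports the consistent single-digit millisecond latency described above.
item = table.get_item(Key={"session_id": "abc-123"}).get("Item")
print(item)
```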
Last week we looked at a function shipping solution to the problem; Cloudburst uses the more common data shipping to bring data to caches next to function runtimes (though you could also make a case that the scheduling algorithm placing function execution in locations where the data is cached is a flavour of function-shipping too).
Key Takeaways Redis offers complex data structures and additional features for versatile data handling, while Memcached excels in simplicity with a fast, multi-threaded architecture for basic caching needs. Redis is better suited for complex data models, and Memcached is better suited for high-throughput, string-based caching scenarios.
Once you're done with the app development, everything may look good on the simulators and internal network devices. But out there in the wild, with bandwidth restrictions, TCP congestion, cache hits and misses, and varying device configurations, your users may not experience what you intend to provide. And not every unhappy customer leaves feedback; they just stop coming.
As developers, we rightfully obsess about the customer experience, relentlessly working to squeeze every millisecond out of the critical rendering path, optimize input latency, and eliminate jank. On top of this foundation, we add layers of caching, prerendering and edge delivery optimizations — not the other way around.
Here’s how the same test performed when running Percona Distribution for PostgreSQL 14 on these same servers:

            Queries: reads   Queries: writes   Queries: other   Queries: total   Transactions   Latency (95th)
MySQL (A)   1584986          1645000           245322           3475308          122277         20137.61
MySQL (B)   2517529          2610323           389048           5516900          194140         11523.48
Today, I'm excited to announce the general availability of Amazon DynamoDB Accelerator (DAX) , a fully managed, highly available, in-memory cache that can speed up DynamoDB response times from milliseconds to microseconds, even at millions of requests per second. Adding caching when your app is already experiencing load is not easy.
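The DAX client is designed to be call-compatible with the low-level DynamoDB client, so moving reads onto the cache is mostly a client swap. Below is a hedged sketch assuming the amazondax Python package; the cluster endpoint and table are placeholders, and constructor details may vary by package version.

```python
import botocore.session
import amazondax  # assumes the amazondax package is installed

session = botocore.session.get_session()
# Placeholder DAX cluster endpoint; reads are now served from the in-memory cache tier.
dax = amazondax.AmazonDaxClient(
    session,
    region_name="us-east-1",
    endpoints=["my-dax-cluster.abc123.dax-clusters.us-east-1.amazonaws.com:8111"],
)

# Same call shape as the low-level DynamoDB client, but answered by DAX.
resp = dax.get_item(
    TableName="game_sessions",
    Key={"session_id": {"S": "abc-123"}},
)
print(resp.get("Item"))
```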
An SSR application will generally have a templating engine that injects the variables into the HTML given to the client. Active Memory Caching: when you want to quickly get data that you already had, you need caching; caching stores data that a user recently retrieved. Caching Schemes.
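In Python terms, the simplest form of this active memory caching is memoizing the data-fetching function so repeated renders reuse the stored result; a small illustrative sketch (not from the article, function names are hypothetical):

```python
from functools import lru_cache

@lru_cache(maxsize=256)
def load_article(article_id: str) -> dict:
    """Pretend data fetch; the first call does the work, repeats are served from memory."""
    # In a real SSR server this would query a database or an API.
    return {"id": article_id, "title": f"Article {article_id}"}

html = "<h1>{title}</h1>".format(**load_article("42"))        # template injection step
html_again = "<h1>{title}</h1>".format(**load_article("42"))  # served from the memory cache
```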
This enables customers to serve content to their end users with low latency, giving them the best application experience. In 2008, AWS opened a point of presence (PoP) in Hong Kong to enable customers to serve content to their end users with low latency. Since then, AWS has added two more PoPs in Hong Kong, the latest in 2016.
There is more than one Werner Vogels in this world, and although I never get emails, snail mail, or phone calls for any of my peers, I am sure they are somewhat frustrated if they type our name into a search engine :-). There are two main types of DNS servers: authoritative servers and caching resolvers.
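The split between caching resolvers and authoritative servers is easy to see from a client: a stub query normally goes to a caching resolver, whose answers are reused until the record's TTL expires. A small sketch, assuming the dnspython package (the domain and resolver address are example values):

```python
import dns.resolver  # assumes the dnspython package is installed

# 1. Ask the system's configured (caching) resolver; repeat lookups are served
#    from its cache until the record's TTL runs out.
answer = dns.resolver.resolve("allthingsdistributed.com", "A")
print([rr.address for rr in answer], "TTL:", answer.rrset.ttl)

# 2. Ask a specific resolver directly (example: a public caching resolver).
custom = dns.resolver.Resolver(configure=False)
custom.nameservers = ["8.8.8.8"]
answer = custom.resolve("allthingsdistributed.com", "A")
print([rr.address for rr in answer])
```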