By Rajiv Shringi, Oleksii Tkachuk, Kartik Sathyanarayanan. Introduction: In our previous blog post, we introduced Netflix’s TimeSeries Abstraction, a distributed service designed to store and query large volumes of temporal event data with low millisecond latencies. Today, we’re excited to present the Distributed Counter Abstraction.
Caching them at the other end: how long should we cache files on a user’s device? Plotted on the same horizontal axis of 1.6s, the waterfalls speak for themselves: 201ms of cumulative latency and 109ms of cumulative download, versus 4,362ms of cumulative latency and 240ms of cumulative download. Cache: this is the easy one.
Teams often consider external caches when the existing database cannot meet the required service-level agreement (SLA). However, external caches are not as simple as they are often made out to be. This is a clear performance-oriented decision.
Caching is the process of storing frequently accessed data or resources in a temporary storage location, such as memory or disk, to improve retrieval speed and reduce the need for repetitive processing. Bandwidth optimization: Caching reduces the amount of data transferred over the network, minimizing bandwidth usage and improving efficiency.
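To make that definition concrete, here is a minimal sketch of an in-memory cache with a time-to-live; the hypothetical fetch_profile call stands in for whatever slow work is being avoided:

```python
import time
from functools import wraps

def ttl_cache(ttl_seconds=60):
    """Cache a function's results in memory for a fixed time-to-live."""
    def decorator(fn):
        store = {}  # maps args -> (expiry_timestamp, value)

        @wraps(fn)
        def wrapper(*args):
            now = time.monotonic()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]          # fresh entry: skip the expensive call
            value = fn(*args)
            store[args] = (now + ttl_seconds, value)
            return value
        return wrapper
    return decorator

@ttl_cache(ttl_seconds=30)
def fetch_profile(user_id):
    # Stand-in for a slow database or network call.
    time.sleep(0.5)
    return {"user_id": user_id, "name": "example"}

fetch_profile(42)   # slow: misses the cache
fetch_profile(42)   # fast: served from memory for the next 30 seconds
```

Every hit served from the store is a network round trip and a repeated computation saved, which is exactly the bandwidth and efficiency win described above.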
One of the quickest wins—and one of the first things I recommend my clients do—to make websites faster can at first seem counter-intuitive: you should self-host all of your static assets, forgoing others’ CDNs/infrastructure. The usual counter-argument is that users might already have the file cached; the risk is slowdowns and outages: if the third-party host suffers, you’re going to suffer, too.
Vidhya Arvind, Rajasekhar Ummadisetty, Joey Lynch, Vinay Chella. Introduction: At Netflix, our ability to deliver seamless, high-quality streaming experiences to millions of users hinges on robust, global backend infrastructure. It also serves as central configuration of access patterns such as consistency or latency targets.
The new Amazon capability enables customers to improve the startup latency of their functions from several seconds to as low as sub-second (up to 10 times faster) at P99 (the 99th latency percentile). This can cause latency outliers and may lead to a poor end-user experience for latency-sensitive applications.
The GraphQL shim enabled client engineers to move quickly onto GraphQL, figure out client-side concerns like cache normalization, experiment with different GraphQL clients, and investigate client performance without being blocked by server-side migrations. To launch Phase 1 safely, we used AB Testing.
Because microprocessors are so fast, computer architecture design has evolved towards adding various levels of caching between compute units and the main memory, in order to hide the latency of bringing the bits to the brains. This avoids thrashing caches too much for B and evens out the pressure on the L3 caches of the machine.
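As a rough illustration of why access patterns matter to those cache levels, the sketch below (my own, not from the original post) times a row-major versus a column-major pass over the same NumPy matrix; the sequential pass uses each fetched cache line fully, while the strided pass keeps missing:

```python
import time
import numpy as np

n = 4000
matrix = np.random.rand(n, n)  # stored row-major (C order) by default

def time_it(fn):
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start

# Row-wise pass walks memory sequentially: each cache line is fully used.
row_time = time_it(lambda: sum(float(matrix[i].sum()) for i in range(n)))

# Column-wise pass strides across rows: each access touches a new cache line.
col_time = time_it(lambda: sum(float(matrix[:, j].sum()) for j in range(n)))

print(f"row-major traversal:    {row_time:.3f}s")
print(f"column-major traversal: {col_time:.3f}s (slower due to cache misses)")
```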
The RAG process begins by summarizing and converting user prompts into queries that are sent to a search platform that uses semantic similarities to find relevant data in vector databases, semantic caches, or other online data sources. Estimates show that NVIDIA, a semiconductor manufacturer, could release 1.5
Yet, many are confined to a brief temporal window due to constraints in serving latency or training costs. In recommendation systems, context windows during inference are often limited to hundreds of events, not due to model capability but because these services typically require millisecond-level latency.
It also removes the need for developers and database administrators to manage infrastructure or update database versions. From there, you can dive deeper into infrastructure metrics (cluster, datacenter, racks, and nodes) and data metrics (keyspaces and tables). You can also analyze table metrics, such as cache hits and misses.
Rajiv Shringi, Vinay Chella, Kaidan Fullerton, Oleksii Tkachuk, Joey Lynch. Introduction: As Netflix continues to expand and diversify into various sectors like Video on Demand and Gaming, the ability to ingest and store vast amounts of temporal data — often reaching petabytes — with millisecond access latency has become increasingly vital.
This allows the app to query a list of “paths” in each HTTP request, and get specially formatted JSON (jsonGraph) that we use to cache the data and hydrate the UI. The big difference from the monolith, though, is that this is now a standalone service deployed as a separate “application” (service) in our cloud infrastructure.
This is particularly important as we build out new functionality that relies on Pushy; a strong, stable infrastructure foundation allows our partners to continue to build on top of Pushy with confidence. In our case, we value low latency — the faster we can read from KeyValue, the faster these messages can get delivered.
FUN FACT: In this talk, Rodrigo Schmidt, director of engineering at Instagram, talks about the different challenges they have faced in scaling the data infrastructure at Instagram. When a user requests their feed, two parallel threads are involved in fetching the feed data, to optimize for latency.
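A minimal sketch of that fan-out pattern, with hypothetical fetchers standing in for Instagram’s actual feed sources:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fetchers standing in for the two feed sources described above.
def fetch_followed_posts(user_id):
    return [f"post-from-followee-{user_id}"]

def fetch_recommended_posts(user_id):
    return [f"recommended-post-{user_id}"]

def build_feed(user_id):
    # Issue both reads concurrently so total latency is max(a, b), not a + b.
    with ThreadPoolExecutor(max_workers=2) as pool:
        followed = pool.submit(fetch_followed_posts, user_id)
        recommended = pool.submit(fetch_recommended_posts, user_id)
        return followed.result() + recommended.result()

print(build_feed(7))
```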
Examples of observability data include metrics, logs, and traces which provide visibility into the app’s behavior and performance at different levels of the stack, including the application code, infrastructure, and network. Load time and network latency metrics. Issue remediation. Performance optimization. Capacity planning.
The first—and often most surprising for people to learn—thing that I want to draw your attention to is that TTFB counts one whole round trip of latency. The reason is that mobile networks are, as a rule, high-latency connections. only to find that the resource they’re requesting isn’t in that PoP’s cache.
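One way to see that round trip in the number: a small sketch (assuming plain HTTP on port 80 for simplicity) that times the gap between sending a GET and receiving the first response byte. However fast the server generates the response, the result can never be smaller than one network round trip:

```python
import socket
import time

def measure_ttfb(host, port=80, path="/"):
    """Time from sending a GET request to receiving the first response byte."""
    request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    with socket.create_connection((host, port), timeout=10) as sock:
        start = time.perf_counter()
        sock.sendall(request.encode())
        sock.recv(1)                      # block until the first byte arrives
        return time.perf_counter() - start

# The request must travel to the server and the first byte must travel back,
# so one full round trip of latency is always included in this figure.
print(f"TTFB: {measure_ttfb('example.com') * 1000:.0f} ms")
```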
Berg, Romain Cledat, Kayla Seeley, Shashank Srikanth, Chaoying Wang, Darin Yu. Netflix uses data science and machine learning across all facets of the company, powering a wide range of business applications from our internal infrastructure and content demand modeling to media understanding.
The Tech Hollow, an OSS technology we released a few years ago, has been best described as a total high-density near cache. Total: the entire dataset is cached on each node; there is no eviction policy, and there are no cache misses. Near: the cache exists in RAM on any instance which requires access to the dataset.
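A toy sketch of the idea (not Hollow itself): the whole dataset lives in local memory and a background thread swaps in fresh snapshots, so every read is a plain dictionary lookup with no misses and no eviction:

```python
import threading
import time

class TotalNearCache:
    """Holds the entire dataset in local RAM and refreshes it periodically.

    No eviction and no cache misses: every read is a local lookup against
    the most recently loaded full snapshot.
    """

    def __init__(self, load_snapshot, refresh_seconds=300):
        self._load_snapshot = load_snapshot   # returns the full dataset
        self._data = load_snapshot()
        self._refresh_seconds = refresh_seconds
        threading.Thread(target=self._refresh_loop, daemon=True).start()

    def _refresh_loop(self):
        while True:
            time.sleep(self._refresh_seconds)
            self._data = self._load_snapshot()  # atomic reference swap

    def get(self, key):
        return self._data.get(key)

def load_all_titles():
    # Stand-in for pulling a published dataset snapshot from blob storage.
    return {1: "Stranger Things", 2: "The Crown"}

cache = TotalNearCache(load_all_titles)
print(cache.get(1))   # always served from local RAM
```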
Data lakehouses deliver the query response with minimal latency. A dynamic map of interactions and relationships between applications and the underlying infrastructure also helps to zoom in and out of an issue at different stages of analysis. Massively parallel processing. This is simply not possible with conventional architectures.
The Site Reliability Guardian helps automate release validation based on SLOs and important signals that define the expected behavior of your applications in terms of availability, performance errors, throughput, latency, etc. A study by Amazon found that increasing page load time by just 100 milliseconds costs 1% in sales.
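As a sketch of what such a validation gate might check (hypothetical thresholds, not the Site Reliability Guardian’s actual implementation):

```python
import statistics

def check_slos(latencies_ms, error_count, request_count,
               p99_budget_ms=250.0, error_budget=0.001):
    """Pass/fail a release candidate against latency and error-rate SLOs."""
    p99 = statistics.quantiles(latencies_ms, n=100)[98]   # 99th percentile
    error_rate = error_count / request_count
    passed = p99 <= p99_budget_ms and error_rate <= error_budget
    return {"p99_ms": round(p99, 1), "error_rate": error_rate, "passed": passed}

samples = [120, 130, 110, 145, 160, 900, 125, 140, 135, 150] * 20
print(check_slos(samples, error_count=1, request_count=2000))
```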
Since you now have lots of choices to address your high performance database needs, I decided to write this blog to help you select the most appropriate services for your workload using lessons I have learnt by scaling the infrastructure for Amazon.com.
Are you comfortable setting up your own cloud infrastructure through AWS or Azure? Amazon Virtual Private Clouds (VPC) and Azure Virtual Networks (VNET) are private, isolated sections of the cloud infrastructure where you can launch resources. This becomes really important for cache solutions like Redis™.
To further exacerbate the problem, the 302 response has a Cache-Control: must-revalidate, private header, meaning that we will always make an outgoing request for this resource regardless of whether or not we’re hitting the site from a cold or a warm cache. com, which introduces yet more latency for the connection setup.
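For contrast, here is a sketch of how those header choices look on the server side (a toy Python server, not the site in question). The must-revalidate, private response carries no freshness lifetime, so a cache has to check back with the origin before reusing it, while the long-lived response can be reused for a year:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        if self.path == "/uncacheable":
            # No max-age: the response is stale immediately, and must-revalidate
            # forbids reusing a stale copy without asking the origin first.
            self.send_header("Cache-Control", "must-revalidate, private")
        else:
            # Browsers and CDNs may reuse this response for a year.
            self.send_header("Cache-Control", "public, max-age=31536000, immutable")
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"hello\n")

HTTPServer(("localhost", 8000), Handler).serve_forever()
```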
As developers, we rightfully obsess about the customer experience, relentlessly working to squeeze every millisecond out of the critical rendering path, optimize input latency, and eliminate jank. On top of this foundation, we add layers of caching, prerendering and edge delivery optimizations — not the other way around.
Identifying key Redis metrics such as latency, CPU usage, and memory metrics is crucial for effective Redis monitoring. To monitor Redis instances effectively, collect Redis metrics focusing on cache hit ratio, memory allocated, and latency threshold.
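Using redis-py, those signals can be sampled from a running instance roughly like this (a sketch assuming a local, unauthenticated Redis):

```python
import time
import redis

r = redis.Redis(host="localhost", port=6379)

# Cache hit ratio from the server's keyspace counters.
stats = r.info("stats")
hits, misses = stats["keyspace_hits"], stats["keyspace_misses"]
hit_ratio = hits / (hits + misses) if (hits + misses) else 0.0

# Memory allocated, as reported by the memory section.
memory = r.info("memory")

# Round-trip latency to the server: time a PING.
start = time.perf_counter()
r.ping()
latency_ms = (time.perf_counter() - start) * 1000

print(f"cache hit ratio:  {hit_ratio:.2%}")
print(f"memory allocated: {memory['used_memory_human']}")
print(f"ping latency:     {latency_ms:.2f} ms")
```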
This enables customers to serve content to their end users with low latency, giving them the best application experience. In 2008, AWS opened a point of presence (PoP) in Hong Kong to enable customers to serve content to their end users with low latency. Since then, AWS has added two more PoPs in Hong Kong, the latest in 2016.
Last week we looked at a function shipping solution to the problem; Cloudburst uses the more common data shipping to bring data to caches next to function runtimes (though you could also make a case that the scheduling algorithm placing function execution in locations where the data is cached is a flavour of function-shipping too).
Amazon DynamoDB offers low, predictable latencies at any scale. These services also require the ability to scale infrastructure incrementally to accommodate growth in request rates or dataset sizes. s read latency, particularly as dataset sizes grow. Amazon DynamoDB provides high throughput at very low latency.
Multiple data indirections mean multiple cache misses. Mark LaPedus: MRAM, a next-generation memory type, is being touted as a replacement for embedded flash and cache applications. The third wing of the architecture piece is the “domain specific system-on-chip.” They are very expensive.
Streams provide you with the underlying infrastructure to create new applications, such as continuously updated free-text search indexes, caches, or other creative extensions requiring up-to-date table changes. DynamoDB Streams enables your application to get real-time notifications of your tables’ item-level changes.
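A minimal boto3 sketch of consuming such a stream (placeholder ARN and region; production consumers would typically use the Kinesis adapter or Lambda triggers rather than polling shards by hand):

```python
import boto3

streams = boto3.client("dynamodbstreams", region_name="us-east-1")

# Stream ARN as exposed on the table description; placeholder value here.
stream_arn = ("arn:aws:dynamodb:us-east-1:123456789012:"
              "table/Orders/stream/2024-01-01T00:00:00.000")

shards = streams.describe_stream(StreamArn=stream_arn)["StreamDescription"]["Shards"]
for shard in shards:
    iterator = streams.get_shard_iterator(
        StreamArn=stream_arn,
        ShardId=shard["ShardId"],
        ShardIteratorType="TRIM_HORIZON",   # read from the oldest available change
    )["ShardIterator"]
    for record in streams.get_records(ShardIterator=iterator)["Records"]:
        # Each record describes one item-level change: INSERT, MODIFY, or REMOVE.
        print(record["eventName"], record["dynamodb"].get("Keys"))
```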
We are standing on the eve of the 5G era… 5G, as a monumental shift in cellular communication technology, holds tremendous potential for spurring innovations across many vertical industries, with its promised multi-Gbps speed, sub-10 ms low latency, and massive connectivity. Throughput and latency; energy consumption.
Three years ago, as part of our AWS Fast Data journey we introduced Amazon ElastiCache for Redis , a fully managed in-memory data store that operates at sub-millisecond latency. While caching continues to be a dominant use of ElastiCache for Redis, we see customers increasingly use it as an in-memory NoSQL database.
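The dominant caching use is usually the cache-aside pattern; a sketch with a placeholder endpoint and a hypothetical load_user_from_db standing in for the system of record:

```python
import json
import redis

# Placeholder endpoint; an ElastiCache cluster endpoint would go here.
r = redis.Redis(host="my-cluster.example.cache.amazonaws.com", port=6379)

def load_user_from_db(user_id):
    # Hypothetical database call standing in for the system of record.
    return {"id": user_id, "plan": "standard"}

def get_user(user_id):
    """Cache-aside read: try Redis first, fall back to the database."""
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)          # sub-millisecond hit
    user = load_user_from_db(user_id)
    r.setex(key, 300, json.dumps(user))    # cache for 5 minutes
    return user
```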
DNS is an absolutely critical piece of the internet infrastructure. There are two main types of DNS servers: authoritative servers and caching resolvers. But the real robustness of the DNS system comes through the way lookups are handled, which is what caching resolvers do. Authoritative servers hold the definitive mappings.
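A toy version of what a caching resolver does, using dnspython and honouring the answer’s TTL (a sketch only; a real resolver also handles negative caching, CNAME chains, and much more):

```python
import time
import dns.resolver   # pip install dnspython

_cache = {}  # name -> (expiry_timestamp, list_of_addresses)

def cached_lookup(name):
    """Resolve a hostname, caching the answer for the record's TTL."""
    now = time.monotonic()
    hit = _cache.get(name)
    if hit is not None and hit[0] > now:
        return hit[1]                          # answered from cache, no network
    answer = dns.resolver.resolve(name, "A")   # recursive lookup
    addresses = [rr.address for rr in answer]
    _cache[name] = (now + answer.rrset.ttl, addresses)
    return addresses

print(cached_lookup("example.com"))   # network lookup
print(cached_lookup("example.com"))   # served from cache until the TTL expires
```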
Common Infrastructure Expenses. Your first step in optimizing CDN expenses isn’t to look for the best-priced solution but to remember that a cheaper price isn’t always the best deal. For example, if you’re deploying the infrastructure for an e-commerce website, security becomes a fundamental requirement.
Typically, organizations build specialized infrastructure for each of these use cases. This, however, creates silos of data and processing, and results in a complex, expensive, and harder-to-maintain infrastructure. Oh, and in addition to low latency, “we require access to fresh data.” Cache all the things.
About 5 years ago, I introduced you to AWS Availability Zones, which are distinct locations within a Region that are engineered to be insulated from failures in other Availability Zones and provide inexpensive, low latency network connectivity to other Availability Zones in the same region.
Redis's microsecond latency has made it a de facto choice for caching. Four years ago, as part of our AWS fast data journey, we introduced Amazon ElastiCache for Redis , a fully managed, in-memory data store that operates at microsecond latency. TB of in-memory capacity in a single cluster.
Likewise, object access paths must be heavily multi-threaded and avoid lock contention to minimize access latency and maximize throughput. Given all this, we thought it would be a good opportunity to see how we are doing relative to the competition, and in particular, relative to Microsoft’s AppFabric caching for Windows on-premise servers.
The mean and percentile measurements hide this structure, but the rest of this post will show how the structure can be measured and analyzed so that you can figure out a useful model of your system, understand what is driving the long tail of latencies and come up with better SLAs and measures of capacity.
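A small self-contained illustration of the point: synthetic bimodal latencies whose mean looks unremarkable, while the p99 and a simple histogram expose the second mode hiding in the tail:

```python
import random
import statistics
from collections import Counter

# Synthetic bimodal latencies: most requests are fast, a slow mode lurks in the tail.
random.seed(1)
latencies = ([random.gauss(20, 3) for _ in range(9500)] +
             [random.gauss(200, 30) for _ in range(500)])

mean = statistics.fmean(latencies)
cuts = statistics.quantiles(latencies, n=100)
p50, p99 = cuts[49], cuts[98]

print(f"mean: {mean:.1f} ms")   # pulled upward, but hides the structure
print(f"p50:  {p50:.1f} ms")
print(f"p99:  {p99:.1f} ms")    # exposes the slow mode the mean smooths over

# Bucketing into a coarse histogram reveals the two modes directly.
buckets = Counter(int(x // 25) * 25 for x in latencies)
for bucket in sorted(buckets):
    print(f"{bucket:>4}-{bucket + 24} ms: {buckets[bucket]}")
```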
Our straining database infrastructure on Oracle led us to evaluate if we could develop a purpose-built database that would support our business needs for the long term. Performant – DynamoDB consistently delivers single-digit millisecond latencies even as your traffic volume increases.