Yet many models are confined to a brief temporal window due to constraints in serving latency or training costs. This facilitates the distribution of these learnings to other models, either through shared model weights for fine-tuning or directly through embeddings.
By Rajiv Shringi, Oleksii Tkachuk, and Kartik Sathyanarayanan. In our previous blog post, we introduced Netflix's TimeSeries Abstraction, a distributed service designed to store and query large volumes of temporal event data with low millisecond latencies. Today, we're excited to present the Distributed Counter Abstraction.
CPU isolation and efficient system management are critical for any application that requires low-latency, high-performance computing. To achieve this level of performance, such systems require dedicated CPU cores that are free from interruption by other processes, together with wider system tuning.
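As a minimal sketch of one such measure, assuming a Linux host and an illustrative core number, the snippet below pins a process to a dedicated CPU core; in practice the core would also be isolated from the general scheduler, for example via the isolcpus kernel parameter.

```python
import os

# Pin the current process to core 2 (illustrative choice). On a real
# low-latency host this core would also be excluded from general
# scheduling, e.g. with the isolcpus= kernel boot parameter. Linux-only.
DEDICATED_CORE = 2

os.sched_setaffinity(0, {DEDICATED_CORE})  # pid 0 means "this process"
print(f"Now restricted to cores: {os.sched_getaffinity(0)}")
```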
We can experiment with different content placements or promotional strategies to boost visibility and engagement. Analyzing impression history, for example, might help determine how well a specific row on the home page is functioning or assess the effectiveness of a merchandising strategy.
With its exchange feature, RabbitMQ enables advanced routing strategies, making it well suited for workflows that require controlled message flow and guaranteed delivery. Kafka's partitioned log architecture, by contrast, supports both queuing and publish-subscribe models, allowing it to handle large-scale event processing with minimal latency.
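A minimal sketch of that routing control, using the pika client against a hypothetical local broker; the exchange, queue, and routing-key names are illustrative.

```python
import pika

# Connection details are placeholders for a local broker.
connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# A direct exchange routes each message only to queues bound with a
# matching routing key, giving controlled message flow.
channel.exchange_declare(exchange="logs", exchange_type="direct")
channel.queue_declare(queue="errors", durable=True)  # survives broker restart
channel.queue_bind(exchange="logs", queue="errors", routing_key="error")

# Only messages published with routing_key="error" reach the errors queue.
channel.basic_publish(
    exchange="logs",
    routing_key="error",
    body=b"disk almost full",
    properties=pika.BasicProperties(delivery_mode=2),  # persist the message
)
connection.close()
```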
The Challenge of Title Launch Observability: as engineers, we're wired to track system metrics like error rates, latencies, and CPU utilization, but what about the metrics that matter to a title's success? Stay tuned for a closer look at the innovation behind the scenes!
Stream processing systems, designed for continuous, low-latency processing, demand swift recovery mechanisms to tolerate and mitigate failures effectively. After failures, Kafka Streams’ partition assignment strategy, triggered by rebalances, causes its executions to accumulate more lag. This significantly increases event latency.
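One way to observe that lag buildup from the outside is to compare committed offsets against log-end offsets. A rough sketch with the kafka-python client, where the broker address, topic, and group id are assumptions:

```python
from kafka import KafkaConsumer, TopicPartition

# Broker, topic, and consumer group are illustrative placeholders.
consumer = KafkaConsumer(bootstrap_servers="localhost:9092", group_id="stream-app")

topic = "events"
partitions = [TopicPartition(topic, p) for p in consumer.partitions_for_topic(topic)]

end_offsets = consumer.end_offsets(partitions)   # latest offset per partition
for tp in partitions:
    committed = consumer.committed(tp) or 0      # last committed offset, if any
    lag = end_offsets[tp] - committed            # events not yet processed
    print(f"partition {tp.partition}: lag={lag}")
```

Watching this number spike after a rebalance makes the recovery cost described above concrete.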
Migrating Critical Traffic At Scale with No Downtime — Part 1, by Shyam Gala, Javier Fernandez-Ivern, Anup Rokkam Pratap, and Devang Shah. Hundreds of millions of customers tune into Netflix every day, expecting an uninterrupted and immersive streaming experience. This approach has a handful of benefits.
You'll also learn strategies for maintaining data safety and managing node failures so your RabbitMQ setup is always up to the task. Queues can be mirrored and configured for either availability or consistency, providing different strategies for managing network partitions.
The three strategies we will discuss today are A/B Testing, Replay Testing, and Sticky Canaries. Let's discuss the three testing strategies in further detail. To determine customer impact, we could compare various metrics such as error rates, latencies, and time to render.
Compare latency: ScaleGrid delivers lower latency compared to DigitalOcean for PostgreSQL. Now, let's take a look at the throughput and latency performance of our comparison. Next, we are going to test and compare the latency performance between ScaleGrid and DigitalOcean for PostgreSQL: PostgreSQL DigitalOcean latency averages (ms).
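A round-trip latency benchmark of this kind can be reproduced with a few lines of psycopg2; the connection string is a placeholder to be pointed at each provider in turn.

```python
import statistics
import time

import psycopg2

# Connection string is a placeholder; point it at each provider in turn.
conn = psycopg2.connect("host=localhost dbname=test user=postgres")
cur = conn.cursor()

samples = []
for _ in range(100):
    start = time.perf_counter()
    cur.execute("SELECT 1")   # minimal query: measures round-trip latency
    cur.fetchone()
    samples.append((time.perf_counter() - start) * 1000)  # milliseconds

print(f"avg={statistics.mean(samples):.2f} ms  p95={sorted(samples)[94]:.2f} ms")
cur.close()
conn.close()
```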
These include challenges with tail latency and idempotency, managing "wide" partitions with many rows, handling single large "fat" columns, and slow response pagination. It also serves as a central configuration point for access patterns such as consistency or latency targets, and is useful for keeping the "n-newest" entries or for prefix-path deletion.
By Rajiv Shringi, Vinay Chella, Kaidan Fullerton, Oleksii Tkachuk, and Joey Lynch. As Netflix continues to expand and diversify into sectors like Video on Demand and Gaming, the ability to ingest and store vast amounts of temporal data, often reaching petabytes, with millisecond access latency has become increasingly vital.
Traces are used for performance analysis, latency optimization, and root cause analysis. OpenTelemetry provides extensive documentation and examples to help you fine-tune your configuration for maximum effectiveness. Capture critical performance indicators such as request latency, error rates, and resource usage.
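A minimal OpenTelemetry tracing setup in Python might look like the sketch below; the span and attribute names are illustrative, and a console exporter stands in for a real collector.

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# Console exporter keeps the example self-contained; production setups
# would export to an OpenTelemetry collector instead.
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)

# Each span records the duration of the wrapped work, which is what
# latency analysis and root-cause investigation are built on.
with tracer.start_as_current_span("handle_request") as span:
    span.set_attribute("http.route", "/titles")  # illustrative attribute
```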
A well-planned multi-cloud strategy can seriously upgrade your business's tech game, making you more agile. Key takeaways: multi-cloud strategies have become increasingly popular due to the need for flexibility, innovation, and the avoidance of vendor lock-in. They can also bolster uptime and limit latency issues or potential downtime.
By collecting and analyzing key performance metrics of the service over time, we can assess the impact of the new changes and determine if they meet the availability, latency, and performance requirements. They enable us to further fine-tune and configure the system, ensuring the new changes are integrated smoothly and seamlessly.
While there are plenty of well-documented benefits to using a connection pooler, there are some arguments to be made against using one: introducing middleware into the communication path inevitably adds some latency, and that middleware becomes a single point of failure. Should You Use a PostgreSQL Connection Pooler?
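For a feel of the trade-off, here is a small sketch using psycopg2's built-in client-side pool (the connection string is a placeholder); a dedicated pooler such as PgBouncer plays the same role as shared middleware.

```python
from psycopg2 import pool

# DSN is a placeholder. The pool keeps connections warm so each request
# skips the TCP + authentication handshake, at the cost of one more
# moving part (and potential failure point) in the path.
pg_pool = pool.SimpleConnectionPool(
    minconn=1, maxconn=10, dsn="host=localhost dbname=test user=postgres"
)

conn = pg_pool.getconn()        # borrow a pooled connection
try:
    with conn.cursor() as cur:
        cur.execute("SELECT now()")
        print(cur.fetchone())
finally:
    pg_pool.putconn(conn)       # return it to the pool instead of closing

pg_pool.closeall()
```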
If we had an ID for each streaming session, then distributed tracing could easily reconstruct a session failure by providing service topology, retry and error tags, and latency measurements for all service calls. Our engineering teams tuned their services for performance after factoring in the increased resource utilization due to tracing.
These principles reduce resource usage by being more efficient and effective while lowering end-to-end latency in data processing. Orient: gather tuning parameters for a particular table that changed. In short, AutoAnalyze finds the best tuning and configuration parameters for a table. More processing resources.
Operational Reporting is a reporting paradigm specialized in covering high-resolution, low-latency data sets, serving the detailed day-to-day activities¹ and processes of a business domain. Most of the business views created on top of the Iceberg tables can tolerate a few minutes of latency. Please stay tuned! ¹ Dehghani, Zhamak.
The eval process combines human review, model-based evaluation, and A/B testing. The results then inform two parallel streams: fine-tuning with carefully curated data, and prompt-engineering improvements. Both feed into model improvements, which starts the cycle again. We're experiencing high latency in responses.
Hola Mexico! We've launched our new point of presence (POP) in Mexico City. The POP is strategically located within the country and lowers latency overall. KeyCDN is always on the lookout for ways to minimize latency and accelerate asset delivery worldwide.
They also realized that, although LlamaIndex was cool for getting this POC out the door, they couldn't easily figure out what prompt it was sending to the LLM, which embedding model was being used, what the chunking strategy was, and so on. They used some local embeddings and played around with different chunking strategies.
Key takeaways: critical performance indicators such as latency, CPU usage, memory utilization, hit rate, and the number of connected clients, slaves, and evictions must be monitored to maintain Redis's high-throughput, low-latency capabilities. Similarly, increased throughput signifies a more intensive workload on a server and, typically, higher latency.
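Most of these indicators are exposed by a single INFO command; a minimal redis-py sketch, assuming a local instance:

```python
import redis

r = redis.Redis(host="localhost", port=6379)  # connection details illustrative

info = r.info()  # one INFO call surfaces most of the indicators above
hits, misses = info["keyspace_hits"], info["keyspace_misses"]

print("connected_clients :", info["connected_clients"])
print("used_memory_human :", info["used_memory_human"])
print("ops_per_sec       :", info["instantaneous_ops_per_sec"])
print("hit_rate          : %.2f%%" % (100 * hits / max(hits + misses, 1)))
```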
This enables us to use our scale to increase throughput and reduce latencies. Here, the workflow is initiated based on the video length, the throughput and latency requirements, the available scale, and so on, and VQS is called using the measureQuality endpoint. Stay tuned for more details on these algorithmic innovations.
As developers, we rightfully obsess over the customer experience, relentlessly working to squeeze every millisecond out of the critical rendering path, optimize input latency, and eliminate jank. At the limit, statically generated, edge-delivered, HTML-first pages look like the optimal strategy. Stay tuned for more in 2022!
We also cover the distinction between non-changing attributes and runtime attributes, and strategies for joining them into one cohesive dataset for OOM-kill prediction. We covered labeling strategies that involved graded, window-based approaches and explored some graphical analysis of the structured dataset.
Improved performance : MongoDB continually fine-tunes its database engine, resulting in faster query execution and reduced latency. Regulatory compliance : Upgrading your database is vital for compliance with various legal and regulatory standards, where data management and security play a pivotal role.
The software also allows consumption parameters to be fine-tuned through QoS (Quality of Service) prefetch limits geared toward balancing load among numerous consumers, thus preventing any single consumer from being overwhelmed. Take Softonic's platform as an example.
While there is no magic bullet for MySQL performance tuning, there are a few areas that can be focused on upfront that can dramatically improve the performance of your MySQL installation. What are the Benefits of MySQL Performance Tuning? A finely tuned database processes queries more efficiently, leading to swifter results.
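One upfront area is simply confirming what the server is configured to do. A small sketch with mysql-connector-python, using placeholder credentials, that checks a few variables commonly adjusted during tuning:

```python
import mysql.connector

# Credentials are placeholders.
conn = mysql.connector.connect(host="localhost", user="root", password="secret")
cur = conn.cursor()

# Two quick checks that often guide upfront tuning: is the slow query
# log on, and how large is the InnoDB buffer pool?
for var in ("slow_query_log", "long_query_time", "innodb_buffer_pool_size"):
    cur.execute("SHOW VARIABLES LIKE %s", (var,))
    print(cur.fetchone())

cur.close()
conn.close()
```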
As our business scales globally, the demand for data is growing, and the need for scalable, low-latency incremental processing is beginning to emerge. We are taking Big Data Orchestration to the next level and constantly solving new problems and challenges, so please stay tuned. There are three common issues that dataset owners usually face.
Unfortunately, this means that the age-old telco bugbears will rear their ugly heads again, including latency. 5G, as a fundamental requirement, mandates a 1 millisecond latency from the data source to its destination, which in turn requires 1 ms network latency. The post "Latency: Will it undermine the most interesting 5G use cases?" digs into this question.
However, this strategy does not work for all databases. Passive instances across regions are also possible, though it is recommended to operate in the same region as the database host in order to keep change-capture latencies low. In the latter case, write traffic is blocked until the dump completes. Figure 4: Delta Connector.
Performant – DynamoDB consistently delivers single-digit millisecond latencies even as your traffic volume increases. DynamoDB automatically redistributes your data to healthy servers to ensure there are always multiple replicas of your data, without you needing to intervene.
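A single-item read, the access pattern behind those single-digit millisecond latencies, looks roughly like this with boto3; the table name, region, and key are illustrative.

```python
import boto3

# Table, region, and key names are illustrative placeholders.
dynamodb = boto3.resource("dynamodb", region_name="us-east-1")
table = dynamodb.Table("Sessions")

# Key-based single-item reads like this are what DynamoDB serves at
# single-digit millisecond latency, regardless of table size.
response = table.get_item(Key={"session_id": "abc-123"})
print(response.get("Item"))
```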
Here are 8 fallacies of data pipelines: the pipeline is reliable; topology is stateless; the pipeline is infinitely scalable; processing latency is minimal; everything is observable; there is no domino effect; the pipeline is cost-effective; data is homogeneous. Take the first, "the pipeline is reliable": the inconvenient truth is that the pipeline is not reliable.
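Designing for that first fallacy means assuming delivery can fail. A sketch of retry with exponential backoff and jitter, where deliver is a hypothetical stand-in for the pipeline's downstream sink:

```python
import random
import time

def send_with_retry(record: dict, max_attempts: int = 5) -> None:
    """Deliver one record downstream, retrying with exponential backoff.

    `deliver` stands in for whatever sink the pipeline writes to.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            deliver(record)
            return
        except ConnectionError:
            if attempt == max_attempts:
                raise  # give up: surface the failure instead of hiding it
            # Exponential backoff with jitter avoids thundering-herd retries.
            time.sleep(min(2 ** attempt, 30) * random.random())

def deliver(record: dict) -> None:
    # Placeholder sink that fails randomly to simulate an unreliable pipeline.
    if random.random() < 0.3:
        raise ConnectionError("downstream unavailable")

send_with_retry({"event": "page_view", "user": 42})
```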
Preventive maintenance is a strategy aimed at anticipating equipment failures before they occur, enabling timely interventions that can prevent unplanned downtimes and extend the lifespan of machinery. Beyond just predicting failures, real-time data can be used to fine-tune equipment operations for optimal performance.
Predictive maintenance is a strategy aimed at anticipating equipment failures before they occur, enabling timely interventions that can prevent unplanned downtimes and extend the lifespan of machinery. Beyond just predicting failures, real-time data can be used to fine-tune equipment operations for optimal performance.
The main objective of this post is to share my experience over the past years tuning MongoDB, and to centralize in one place the diverse sources I came across on that journey. The CFQ scheduler works well for many general use cases but lacks latency guarantees. Spoiler alert: this post focuses on MongoDB 3.6.x and kernel settings such as net.ipv4.tcp_fin_timeout = 30 and net.ipv4.tcp_keepalive_intvl.
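On Linux, such kernel settings can be verified straight from /proc/sys; a small sketch where the target values mirror the excerpt above and are illustrative, not recommendations:

```python
from pathlib import Path

# Kernel parameters commonly touched when tuning a MongoDB host; the
# target values here are illustrative, not recommendations. Linux-only.
CHECKS = {
    "net/ipv4/tcp_fin_timeout": "30",
    "net/ipv4/tcp_keepalive_intvl": None,  # target value elided in the excerpt
}

for name, expected in CHECKS.items():
    current = Path("/proc/sys", name).read_text().strip()
    status = "ok" if expected in (None, current) else f"expected {expected}"
    print(f"{name.replace('/', '.')} = {current} ({status})")
```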
Finally, not inlining resources has an added latency cost because the file needs to be requested separately. As such, this is again a micro-optimization: you will probably need to fine-tune things at a low level to really benefit from it. There is no need to change your bundling or inlining strategy, nor to consolidate or shard your server farm.
Many high-end disk subsystems provide high-speed cache facilities to reduce the latency of read and write operations. Important Always consult with your hardware manufacturer for proper stable media strategies. For specific information on I/O tuning and balancing, you will find more details in the following document.
Tuning Autovacuum in PostgreSQL: this may help you tune your table-level autovacuum settings appropriately. How do we identify the tables that need their autovacuum settings tuned? In order to tune autovacuum for tables individually, you must know the number of inserts, deletes, and updates on a table over a given interval.
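Those per-table counts live in pg_stat_user_tables; a quick psycopg2 sketch (placeholder connection string) that surfaces the tables with the most dead tuples:

```python
import psycopg2

# Connection string is a placeholder.
conn = psycopg2.connect("host=localhost dbname=test user=postgres")
cur = conn.cursor()

# pg_stat_user_tables tracks inserts/updates/deletes and dead tuples per
# table; tables with many dead tuples and stale autovacuum timestamps
# are the first candidates for per-table autovacuum settings.
cur.execute("""
    SELECT relname, n_tup_ins, n_tup_upd, n_tup_del, n_dead_tup, last_autovacuum
    FROM pg_stat_user_tables
    ORDER BY n_dead_tup DESC
    LIMIT 10
""")
for row in cur.fetchall():
    print(row)

cur.close()
conn.close()
```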
Restore backups on secondary systems frequently to make sure that your complete backup strategy is functioning correctly. Make sure that you understand any write-caching strategies that the utility uses.