Latency and Tuning - Technology Performance Pulse

Foundation Model for Personalized Recommendation

The Netflix TechBlog

MARCH 28, 2025

Yet, many are confined to a brief temporal window due to constraints in serving latency or training costs. It facilitates the distribution of these learnings to other models, either through shared model weights for fine tuning or directly through embeddings.

Tuning

Tuning Efficiency Latency Strategy

How to Optimize CPU Performance Through Isolation and System Tuning

DZone

MAY 1, 2023

CPU isolation and efficient system management are critical for any application which requires low-latency and high-performance computing. To achieve this level of performance, such systems require dedicated CPU cores that are free from interruptions by other processes, together with wider system tuning.

Tuning

Tuning Systems Latency Performance

Netflix’s Distributed Counter Abstraction

The Netflix TechBlog

NOVEMBER 12, 2024

By: Rajiv Shringi , Oleksii Tkachuk , Kartik Sathyanarayanan Introduction In our previous blog post, we introduced Netflix’s TimeSeries Abstraction , a distributed service designed to store and query large volumes of temporal event data with low millisecond latencies. Today, we’re excited to present the Distributed Counter Abstraction.

Latency

Latency Cache Infrastructure Strategy

Introducing Impressions at Netflix

The Netflix TechBlog

FEBRUARY 14, 2025

This dual-path approach leverages Kafkas capability for low-latency streaming and Icebergs efficient management of large-scale, immutable datasets, ensuring both real-time responsiveness and comprehensive historical data availability. million impression events globally every second, with each event approximately 1.2KB in size.

Tuning

Tuning Latency Efficiency Storage

RabbitMQ vs. Kafka: Key Differences

Scalegrid

FEBRUARY 6, 2025

Its partitioned log architecture supports both queuing and publish-subscribe models, allowing it to handle large-scale event processing with minimal latency. Apache Kafka uses a custom TCP/IP protocol for high throughput and low latency. Apache Kafka, designed for distributed event streaming, maintains low latency at scale.

Latency

Latency Analytics Architecture Storage

Why applying chaos engineering to data-intensive applications matters

Dynatrace

MAY 23, 2024

Stream processing systems, designed for continuous, low-latency processing, demand swift recovery mechanisms to tolerate and mitigate failures effectively. This significantly increases event latency. Spark Structured Streaming can also provide consistent fault recovery for applications where latency is not a critical requirement.

Engineering

Engineering Tuning Latency Open Source

Performance Tuning Java Applications in Linux

DZone

DECEMBER 4, 2019

You may also like: How to Properly Plan JVM Performance Tuning. While Performance Tuning an application both Code and Hardware running the code should be accounted for. For low latency, applications use Concurrent Mark and Sweep Algorithm — CMS or G1 GC. Learn how to make your Java applications performance perfectly.

Java

Java Tuning Performance Hardware

Migrating Critical Traffic At Scale with No Downtime?—?Part 1

The Netflix TechBlog

MAY 4, 2023

Migrating Critical Traffic At Scale with No Downtime — Part 1 Shyam Gala , Javier Fernandez-Ivern , Anup Rokkam Pratap , Devang Shah Hundreds of millions of customers tune into Netflix every day, expecting an uninterrupted and immersive streaming experience. Logging is selective to cases where the old and new responses do not match.

Traffic

Traffic Latency Tuning Systems

Title Launch Observability at Netflix Scale

The Netflix TechBlog

DECEMBER 17, 2024

The Challenge of Title Launch Observability As engineers, were wired to track system metrics like error rates, latencies, and CPU utilizationbut what about metrics that matter to a titlessuccess? Stay tuned for a closer look at the innovation behind thescenes!

Traffic

Traffic Scalability Strategy Monitoring

Comparing PostgreSQL DigitalOcean Performance & Pricing – ScaleGrid vs. DigitalOcean Managed Databases

Scalegrid

JUNE 4, 2020

Compare Latency. lower latency compared to DigitalOcean for PostgreSQL. Now, let’s take a look at the throughput and latency performance of our comparison. Next, we are going to test and compare the latency performance between ScaleGrid and DigitalOcean for PostgreSQL. PostgreSQL DigitalOcean Latency Averages (ms).

Database

Database Latency Benchmarking Performance

Optimizing your Kubernetes clusters without breaking the bank

Dynatrace

JANUARY 14, 2022

Tuning thousands of parameters has become an impossible task to achieve via a manual and time-consuming approach. The optimization goal was to improve the application efficiency, that is to improve the ratio between service throughput and cloud costs while not increasing the application latency (e.g. The Akamas approach.

Latency

Latency Tuning Efficiency AWS

Introducing Netflix’s Key-Value Data Abstraction Layer

The Netflix TechBlog

SEPTEMBER 18, 2024

These include challenges with tail latency and idempotency, managing “wide” partitions with many rows, handling single large “fat” columns, and slow response pagination. It also serves as central configuration of access patterns such as consistency or latency targets. Useful for keeping “n-newest” or prefix path deletion.

Latency

Latency Storage Cache Servers

Best Practice for Creating Indexes on your MySQL Tables

Scalegrid

NOVEMBER 20, 2019

95th Percentile Latency. The 95th percentile latency of queries was also 1.8 Stay tuned for my follow-on blog post with more details! Workload Throughput (Queries Per Second). Index Creation on Master. Rolling Index Creation. times higher when the index creation happened on the master server.

Best Practices

Best Practices Latency Tuning Database

Best MySQL DigitalOcean Performance – ScaleGrid vs. DigitalOcean Managed Databases

Scalegrid

JUNE 22, 2020

Compare Latency. On average, ScaleGrid achieves almost 30% lower latency over DigitalOcean for the same deployment configurations. Now that we’ve compared throughput performance, let’s take a look at ScaleGrid vs. DigitalOcean latency for MySQL. Read-Intensive Latency Benchmark. Balanced Workload Latency Benchmark.

Database

Database Benchmarking Latency Performance

Bending pause times to your will with Generational ZGC

The Netflix TechBlog

MARCH 5, 2024

Reduced tail latencies In both our GRPC and DGS Framework services, GC pauses are a significant source of tail latencies. For a given CPU utilization target, ZGC improves both average and P99 latencies with equal or better CPU utilization when compared to G1. No explicit tuning has been required to achieve these results.

Latency

Latency Java Tuning Efficiency

Introducing Netflix TimeSeries Data Abstraction Layer

The Netflix TechBlog

OCTOBER 8, 2024

Rajiv Shringi Vinay Chella Kaidan Fullerton Oleksii Tkachuk Joey Lynch Introduction As Netflix continues to expand and diversify into various sectors like Video on Demand and Gaming , the ability to ingest and store vast amounts of temporal data — often reaching petabytes — with millisecond access latency has become increasingly vital.

Latency

Latency Storage Traffic Tuning

Migrating Netflix to GraphQL Safely

The Netflix TechBlog

JUNE 14, 2023

To determine customer impact, we could compare various metrics such as error rates, latencies, and time to render. The control group’s traffic utilized the legacy Falcor stack, while the experiment population leveraged the new GraphQL client and was directed to the GraphQL Shim. We then collect and analyze the performance of the two clusters.

Traffic

Traffic Latency Metrics Cache

Allegro Reduces Kafka Producer Latency Outliers by 82% After Switching to XFS

InfoQ

APRIL 26, 2024

Allegro experimented with different performance optimization options to improve Apache Kafka producer tail latency and eventually switched all its clusters to the XFS filesystem. The company used Kafka protocol sniffing, JVM profiling, and eBPF, which proved instrumental in identifying and eliminating performance bottlenecks.

Latency

Latency Performance Tuning Scalability

Pushy to the Limit: Evolving Netflix’s WebSocket proxy for the future

The Netflix TechBlog

SEPTEMBER 10, 2024

Dynomite is a Netflix open source wrapper around Redis that provides a few additional features like auto-sharding and cross-region replication, and it provided Pushy with low latency and easy record expiry, both of which are critical for Pushy’s workload. As Pushy’s portfolio grew, we experienced some pain points with Dynomite.

Latency

Latency Cache Tuning Efficiency

Best Practices for Scaling RabbitMQ

Scalegrid

FEBRUARY 24, 2025

While clustering across wide-area networks (WANs) is discouraged due to latency issues, leased links can mitigate some connectivity challenges. With 24/7 expert support, ScaleGrid assists with troubleshooting, performance tuning, and migration processes. Keeping queues short maintains a responsive and efficient RabbitMQ setup.

Best Practices

Best Practices Traffic Strategy Scalability

OpenTelemetry 101: A nontechnical guide for IT leaders and enthusiasts

Dynatrace

JULY 22, 2024

Traces are used for performance analysis, latency optimization, and root cause analysis. OpenTelemetry provides [extensive documentation]([link] and examples to help you fine-tune your configuration for maximum effectiveness. Capture critical performance indicators such as request latency, error rates, and resource usage.

Latency

Latency Best Practices Metrics Open Source

How to Improve MySQL AWS Performance 2X Over Amazon RDS at The Same Cost

Scalegrid

OCTOBER 24, 2019

As organizations continue to migrate to the cloud, it’s important to get in front of performance issues, such as high latency, low throughput, and replication lag with higher distances between your users and cloud infrastructure. ScaleGrid also maintains 53% lower latency on average throughout the entire MySQL AWS performance tests.

AWS

AWS Latency Performance Performance Testing

Bulldozer: Batch Data Moving from Data Warehouse to Online Key-Value Stores

The Netflix TechBlog

OCTOBER 27, 2020

The data warehouse is not designed to serve point requests from microservices with low latency. Therefore, we must efficiently move data from the data warehouse to a global, low-latency and highly-reliable key-value store. How Bulldozer leverages Spark, Protobuf and KV DAL for moving the data.

Latency

Latency Storage Big Data Tuning

How Dynatrace boosts production resilience with Site Reliability Guardian

Dynatrace

MAY 17, 2023

In this case, the four golden signals (latency, traffic, errors, and saturation) are derived from span attributes and DQL metric queries via Dynatrace Grail™. Based on those insights, they implemented automated validation tasks, and shifted left in their software delivery pipeline.

DevOps

DevOps Traffic Latency Best Practices

The Netflix Cosmos Platform

The Netflix TechBlog

MARCH 1, 2021

It supports both high throughput services that consume hundreds of thousands of CPUs at a time, and latency-sensitive workloads where humans are waiting for the results of a computation. The subsystems all communicate with each other asynchronously via Timestone, a high-scale, low-latency priority queuing system. Warm capacity.

Serverless

Serverless Media Latency Social Media

Faster time to value with enhanced handling of OneAgent runtime data

Dynatrace

SEPTEMBER 23, 2020

Storage mount points in a system might be larger or smaller, local or remote, with high or low latency, and various speeds. For example: All subfolders of the /opt directory are mounted as local, low latency, high-throughput drives, with relatively low storage capacity. Stay tuned for upcoming news about these changes.

Storage

Storage Latency Operating System Network

Telltale: Netflix Application Monitoring Simplified

The Netflix TechBlog

AUGUST 13, 2020

You’re half awake and wondering, “Is there really a problem or is this just an alert that needs tuning? Telltale learns what constitutes typical health for an application, no alert tuning required. For example, a latency increase is less critical than error rate increase and some error codes are less critical than others.

Monitoring

Monitoring Tuning Traffic Metrics

LinkedIn Migrates Espresso to HTTP2 and Reduces Connections by 88% and Latency by 75%

InfoQ

DECEMBER 4, 2023

to HTTP2, resulting in a reduction in the number of connections, latency, and garbage collection times. LinkedIn was able to dramatically improve the scalability and performance of its Espresso database by migrating it from HTTP1.1 To achieve these gains, the team had to optimize the Netty’s default HTTP2 stack to make it fit their needs.

Latency

Latency Scalability Database Performance

Rebuilding Netflix Video Processing Pipeline with Microservices

The Netflix TechBlog

JANUARY 10, 2024

This architecture shift greatly reduced the processing latency and increased system resiliency. We expanded pipeline support to serve our studio/content-development use cases, which had different latency and resiliency requirements as compared to the traditional streaming use case. divide the input video into small chunks 2.

Processing

Processing Media Latency Innovation

What is serverless computing? Driving efficiency without sacrificing observability

Dynatrace

JANUARY 26, 2021

When an application is triggered, it can cause latency as the application starts. This creates latency when they need to restart. The platform builds the trigger to initiate the app. Every time the trigger executes, the function runs on an available resource. How does serverless computing tackle inefficiencies?

Serverless

Serverless Efficiency Lambda Azure

How BizDevOps can “shift left” using SLOs to automate quality gates

Dynatrace

MAY 5, 2021

For example, improving latency by as little as 0.1 latency is the number one reason consumers abandon mobile sites. Fine-tuning the service-level indicators that make up quality gates will improve with the help of upcoming features. Organizations can feel the impact of even a minor roadblock in the user experience.

Benchmarking

Benchmarking Latency Speed Software

Enhancing Kubernetes cluster management key to platform engineering success

Dynatrace

MARCH 29, 2024

You can ask for the best configuration to reduce latency or improve the user experience.” And with automatic application tuning, teams spend less time on manually testing and reviewing configurations, resulting in up to five times the productivity of performance engineers, DevOps, and SREs when it comes to application optimization.

Engineering

Engineering DevOps Operating System Open Source

Netflix Cloud Packaging in the Terabyte Era

The Netflix TechBlog

SEPTEMBER 24, 2021

Uploading and downloading data always come with a penalty, namely latency. Figure 3: Video Processing with Index and Virtual Assembly Using virtual assembly greatly improves the latency performance of the ProRes 422 HQ proxy generation by removing one round trip of cloud downloading and cloud uploading by the physical assembler.

Cloud

Cloud Media Storage Cache

PostgreSQL Connection Pooling: Part 1 – Pros & Cons

Scalegrid

OCTOBER 17, 2019

While there is plenty of well-documented benefits to using a connection pooler, there are some arguments to be made against using one: Introducing a middleware in the communication inevitably introduces some latency. A middleware becomes a single point of failure. Should You Use a PostgreSQL Connection Pooler?

Architecture

Architecture Database Latency Servers

Applying Netflix DevOps Patterns to Windows

The Netflix TechBlog

AUGUST 22, 2019

The canary stage will determine a score based on metrics such as CPU, threads, latency, and GC pauses. Running a canary for each change and testing the AMI in production allows us to capture insights around impact on Windows updates, script changes, tuning web server configuration, among others.

DevOps

DevOps AWS Tuning Infrastructure

What’s New at ScaleGrid – September 2024

Scalegrid

SEPTEMBER 10, 2024

Additionally, we’ve added the Philadelphia AWS Local Zone , helping to reduce latency for customers operating in the eastern U.S. This enables ScaleGrid users in Australia and nearby regions to access lower-latency services and improved performance. Stay tuned for more exciting updates in the months to come! <p>The

Latency

Latency AWS Storage Tuning

Build and operate multicloud FaaS with enhanced, intelligent end-to-end observability

Dynatrace

APRIL 25, 2023

Higher latency and cold start issues due to the initialization time of the functions. Understanding cold-start behavior is essential to tune your cloud applications cost or performance to meet your operational needs. Enable faster development and deployment cycles by abstracting away the infrastructure complexity.

Serverless

Serverless Lambda Azure AWS

Automated observability, security, and reliability at scale

Dynatrace

JULY 18, 2023

Whether tracking internal, workload-centric indicators such as errors, duration, or saturation or focusing on the golden signals and other user-centric views such as availability, latency, traffic, or engagement, SLOs-as-code enables coherent and consistent monitoring throughout the environment at scale.

Best Practices

Best Practices Code Infrastructure Latency

Top 10 Tips for Making the Spark + Alluxio Stack Blazing Fast

DZone

JULY 10, 2019

In addition, compute and storage are increasingly being separated causing larger latencies for queries. This article provides the top 10 tips for performance tuning for real-world workloads when running Spark on Alluxio with data locality, giving the most bang for the buck.

Best Practices

Best Practices Storage Latency Tuning

Tuning SQL Server Reporting Services

SQL Performance

SEPTEMBER 17, 2019

This article will cover many areas that database administrators need to be aware of in order to properly license, recover, and tune a Reporting Services installation. Tuning Options. Tuning SSRS is much like any other application. Disk latency for ReportServer and ReportServerTempDB are very important. General Tuning.

Tuning

Tuning Servers Database Best Practices

Optimizing data warehouse storage

The Netflix TechBlog

DECEMBER 21, 2020

These principles reduce resource usage by being more efficient and effective while lowering the end-to-end latency in data processing. Orient: Gather tuning parameters for a particular table that changed. AutoAnalyze In short, AutoAnalyze finds the best tuning/configuration parameters for a table. More processing resources.

Storage

Storage Latency Efficiency Data Engineering

Get up to 300 new metrics out of the box with AWS supporting services (GA)

Dynatrace

DECEMBER 22, 2019

The example below visualizes average latency by API name and stage for a specific AWS API Gateway. Stay tuned for updates in Q1 2020. You can observe the metrics across service instances split by region (in this example, API Gateways in us-east-1 and us-east-2 ). You can also create custom charts. Requirements. AWS supporting services.

AWS

AWS Metrics IoT Storage

Rapid Event Notification System at Netflix

The Netflix TechBlog

FEBRUARY 18, 2022

This separation allows us to tune system configuration and scaling policies independently for different event priorities and traffic patterns. We thus assigned a priority to each use case and sharded event traffic by routing to priority-specific queues and the corresponding event processing clusters.

Systems

Systems Traffic Architecture Mobile

Evolving from Rule-based Classifier: Machine Learning Powered Auto Remediation in Netflix Data…

The Netflix TechBlog

MARCH 4, 2024

Operational automation–including but not limited to, auto diagnosis, auto remediation, auto configuration, auto tuning, auto scaling, auto debugging, and auto testing–is key to the success of modern data platforms. We have also noted a great potential for further improvement by model tuning (see the section of Rollout in Production).

Tuning

Tuning Efficiency Big Data Engineering

Foundation Model for Personalized Recommendation

How to Optimize CPU Performance Through Isolation and System Tuning

Trending Sources

Netflix’s Distributed Counter Abstraction

Introducing Impressions at Netflix

RabbitMQ vs. Kafka: Key Differences

Why applying chaos engineering to data-intensive applications matters

Performance Tuning Java Applications in Linux

Migrating Critical Traffic At Scale with No Downtime?—?Part 1

Title Launch Observability at Netflix Scale

Comparing PostgreSQL DigitalOcean Performance & Pricing – ScaleGrid vs. DigitalOcean Managed Databases

Optimizing your Kubernetes clusters without breaking the bank

Introducing Netflix’s Key-Value Data Abstraction Layer

Best Practice for Creating Indexes on your MySQL Tables

Best MySQL DigitalOcean Performance – ScaleGrid vs. DigitalOcean Managed Databases

Bending pause times to your will with Generational ZGC

Introducing Netflix TimeSeries Data Abstraction Layer

Migrating Netflix to GraphQL Safely

Allegro Reduces Kafka Producer Latency Outliers by 82% After Switching to XFS

Pushy to the Limit: Evolving Netflix’s WebSocket proxy for the future

Best Practices for Scaling RabbitMQ

OpenTelemetry 101: A nontechnical guide for IT leaders and enthusiasts

How to Improve MySQL AWS Performance 2X Over Amazon RDS at The Same Cost

Bulldozer: Batch Data Moving from Data Warehouse to Online Key-Value Stores

How Dynatrace boosts production resilience with Site Reliability Guardian

The Netflix Cosmos Platform

Faster time to value with enhanced handling of OneAgent runtime data

Telltale: Netflix Application Monitoring Simplified

LinkedIn Migrates Espresso to HTTP2 and Reduces Connections by 88% and Latency by 75%

Rebuilding Netflix Video Processing Pipeline with Microservices

What is serverless computing? Driving efficiency without sacrificing observability

How BizDevOps can “shift left” using SLOs to automate quality gates

Enhancing Kubernetes cluster management key to platform engineering success

Netflix Cloud Packaging in the Terabyte Era

PostgreSQL Connection Pooling: Part 1 – Pros & Cons

Applying Netflix DevOps Patterns to Windows

What’s New at ScaleGrid – September 2024

Build and operate multicloud FaaS with enhanced, intelligent end-to-end observability

Automated observability, security, and reliability at scale

Top 10 Tips for Making the Spark + Alluxio Stack Blazing Fast

Tuning SQL Server Reporting Services

Optimizing data warehouse storage

Get up to 300 new metrics out of the box with AWS supporting services (GA)

Rapid Event Notification System at Netflix

Evolving from Rule-based Classifier: Machine Learning Powered Auto Remediation in Netflix Data…

Stay Connected