Design a photo-sharing platform similar to Instagram where users can upload their photos and share them with their followers. Problem Statement. High Level Design. Component Design. API Design. We have provided the API design of posting an image on Instagram below. Architecture. Fetching User Feed.
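The article's actual API design isn't reproduced in this excerpt; as a stand-in, here is a minimal sketch of what a "post an image" endpoint could look like, assuming a Flask-style REST service. The path `/v1/posts`, the `caption` field, and the storage stubs are illustrative assumptions, not the article's design.

```python
# Hypothetical sketch of a "post an image" endpoint; the path, fields,
# and response shape are illustrative assumptions.
import uuid
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/v1/posts", methods=["POST"])
def create_post():
    # Multipart upload: the image bytes plus a caption field.
    image = request.files["image"]           # raw image payload
    caption = request.form.get("caption", "")
    post_id = str(uuid.uuid4())              # server-generated identifier
    # In a real system the image would go to blob storage and the
    # metadata row to a database; both are stubbed out here.
    return jsonify({"post_id": post_id, "caption": caption,
                    "image_name": image.filename}), 201

if __name__ == "__main__":
    app.run()
```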
This gives fascinating insights into the network topology of our visitors, and how much we might be impacted by high-latency regions. Round-trip time (RTT) is basically a measure of latency: how long did it take to get from one endpoint to another and back again? What is RTT? Where Does CrUX's RTT Data Come From?
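One rough way to sample RTT yourself is to time a TCP handshake; a minimal sketch, with the host and port as arbitrary examples (a single connect only approximates RTT, since it includes connection-setup overhead):

```python
# Time a TCP handshake to a host as a crude RTT sample.
import socket
import time

def tcp_rtt_ms(host: str, port: int = 443, timeout: float = 3.0) -> float:
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # handshake completed; close immediately
    return (time.perf_counter() - start) * 1000

print(f"~{tcp_rtt_ms('example.com'):.1f} ms")
```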
By: Rajiv Shringi, Oleksii Tkachuk, Kartik Sathyanarayanan. Introduction. In our previous blog post, we introduced Netflix's TimeSeries Abstraction, a distributed service designed to store and query large volumes of temporal event data with low millisecond latencies. Today, we're excited to present the Distributed Counter Abstraction.
RabbitMQ is designed for flexible routing and message reliability, while Kafka handles high-throughput event streaming and real-time data processing. Kafka's design prioritizes high availability and efficient data transfer with minimal overhead, making it a practical choice for handling real-time data pipelines and distributed event processing.
How To Design For High-Traffic Events And Prevent Your Website From Crashing, by Saad Khan. This article is sponsored by Cloudways. Product launches and sales typically attract large volumes of traffic.
API performance optimization is the process of improving the speed, scalability, and reliability of APIs. It involves a combination of techniques and best practices aimed at reducing latency, improving user experience, and increasing the overall efficiency of the system.
Timestone: Netflix's High-Throughput, Low-Latency Priority Queueing System with Built-in Support for Non-Parallelizable Workloads, by Kostas Christidis. Introduction. Timestone is a high-throughput, low-latency priority queueing system we built in-house to support the needs of Cosmos, our media encoding platform. Over the past 2.5
Remote calls are never free; they impose extra latency, increase the probability of an error, and consume network bandwidth. When we process a request, it is often beneficial to know which fields the caller is interested in and which ones they ignore. How can we achieve similar functionality when designing our gRPC APIs?
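The technique the article is gesturing at is a field mask: the caller lists the paths it wants, and the server prunes its response to match. A schematic sketch of the pruning step in plain Python (the response shape and mask paths are made-up examples; in real gRPC/protobuf code this role is played by `google.protobuf.FieldMask`):

```python
# Prune a nested response down to the caller-requested paths.
def apply_field_mask(message: dict, paths: list[str]) -> dict:
    result: dict = {}
    for path in paths:
        src, dst = message, result
        keys = path.split(".")
        for key in keys[:-1]:
            src = src[key]
            dst = dst.setdefault(key, {})
        dst[keys[-1]] = src[keys[-1]]
    return result

user = {"id": 7, "name": "Ada", "email": "ada@example.com",
        "settings": {"theme": "dark", "locale": "en"}}
print(apply_field_mask(user, ["name", "settings.theme"]))
# {'name': 'Ada', 'settings': {'theme': 'dark'}}
```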
We note that MongoDB's update latency is very low (lower is better) compared to the other databases; however, its read latency is on the higher side. The latency table shows that the 99th-percentile latency for Yugabyte is quite high compared to the others (lower is better). Again, Yugabyte's latency is quite high. Conclusion.
With the rise of microservices architecture, there has been a rapid acceleration in the modernization of legacy platforms, leveraging cloud infrastructure to deliver highly scalable, low-latency, and more responsive services. Why Use Spring WebFlux?
A quick canary test was free of errors and showed lower latency, which is expected given that our standard canary setup routes an equal amount of traffic to both the baseline running on 4xl and the canary on 12xl. What’s worse, average latency degraded by more than 50%, with both CPU and latency patterns becoming more “choppy.”
The architecture of RabbitMQ is meticulously designed for complex message routing, enabling dynamic and flexible interactions between producers and consumers. While clustering across wide-area networks (WANs) is discouraged due to latency issues, leased links can mitigate some connectivity challenges.
ScaleGrid MySQL on Azure, so you can see which provider offers the best throughput and latency performance. We measure latency as 95th-percentile latency in ms. During read-intensive workloads, ScaleGrid manages to achieve up to 3 times higher throughput and averages 66% better latency compared to Azure Database.
When we talk about downloading files, we—generally speaking—have two things to consider: latency and bandwidth. Plotted on the same horizontal axis of 1.6s, the waterfalls speak for themselves: 201ms of cumulative latency; 109ms of cumulative download. It gets worse: 4,362ms of cumulative latency; 240ms of cumulative download.
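To make the latency-versus-bandwidth point concrete, a back-of-the-envelope sketch with assumed numbers (not the article's measurements): when many small files are fetched sequentially, per-request round trips dominate the raw transfer time.

```python
# Assumed inputs: 20 files of 30 KB each, 100 ms RTT, 20 Mbps link.
rtt_ms = 100          # assumed round-trip time
bandwidth_mbps = 20   # assumed bandwidth, megabits per second
files, size_kb = 20, 30

download_ms = files * size_kb * 8 / (bandwidth_mbps * 1000) * 1000
latency_ms = files * rtt_ms  # one sequential round trip per file
print(f"download: {download_ms:.0f} ms, latency: {latency_ms:.0f} ms")
# latency (2000 ms) dwarfs raw download time (240 ms)
```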
These include challenges with tail latency and idempotency, managing “wide” partitions with many rows, handling single large “fat” columns, and slow response pagination. It also serves as central configuration of access patterns such as consistency or latency targets.
The service should serve real-time (i.e., UI) applications: it will be used by many internal UI applications, so CRUD and search operations must be achieved with low latency. Search latency for generic text queries is in milliseconds.
We have run these benchmarks on AWS EC2 instances and designed a custom dataset to make it as close as possible to real application use cases. We compare throughput, operations per second, and latency under different loads, namely at the P90 and P99 percentiles.
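The P90/P99 figures mentioned here are tail-latency percentiles; a small sketch of how they are typically computed from raw samples (the sample data is made up for illustration):

```python
# Compute P90/P99 latency percentiles via the nearest-rank method.
import random

random.seed(42)
samples_ms = sorted(random.gauss(20, 5) + random.expovariate(0.1)
                    for _ in range(10_000))

def percentile(sorted_vals: list[float], p: float) -> float:
    idx = max(0, int(round(p / 100 * len(sorted_vals))) - 1)
    return sorted_vals[idx]

print(f"P90 = {percentile(samples_ms, 90):.1f} ms")
print(f"P99 = {percentile(samples_ms, 99):.1f} ms")
```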
Stream processing systems, designed for continuous, low-latency processing, demand swift recovery mechanisms to tolerate and mitigate failures effectively. We designed experimental scenarios inspired by chaos engineering; failure recovery significantly increases event latency, as measured by the recovery time of the throughput metric.
The Challenge of Title Launch Observability. As engineers, we're wired to track system metrics like error rates, latencies, and CPU utilization, but what about metrics that matter to a title's success? How can we design systems that recognize these nuances and empower every title to shine and bring joy to our members?
The circuit breaker is a design pattern that prevents cascading failures and improves the overall availability and performance of a system. A dependency can become unhealthy or unavailable for various reasons, such as network failures, high latency, timeouts, errors, or overload. What Is a Circuit Breaker?
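A minimal circuit-breaker sketch (illustrative, not a production implementation): after a configurable number of consecutive failures the circuit "opens" and calls fail fast until a cooldown elapses, after which one trial call is allowed through.

```python
import time

class CircuitBreaker:
    def __init__(self, max_failures: int = 3, reset_after: float = 30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: allow a trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0  # success closes the circuit again
        return result
```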
Allegro experimented with different performance optimization options to improve Apache Kafka producer tail latency and eventually switched all its clusters to the XFS filesystem. The company used Kafka protocol sniffing, JVM profiling, and eBPF, which proved instrumental in identifying and eliminating performance bottlenecks.
Every new origin we need to visit needs a connection opening, and that can be very costly: DNS resolution, TCP handshakes, and TLS negotiation all add up, and the story gets worse the higher the latency of the connection is. On a slower, higher-latency connection, the story is much, much worse. All completely avoidable.
Rajiv Shringi, Vinay Chella, Kaidan Fullerton, Oleksii Tkachuk, Joey Lynch. Introduction. As Netflix continues to expand and diversify into various sectors like Video on Demand and Gaming, the ability to ingest and store vast amounts of temporal data — often reaching petabytes — with millisecond access latency has become increasingly vital.
A typical design pattern is the use of a semantic search over a domain-specific knowledge base, like internal documentation, to provide the required context in the prompt. With these latency, reliability, and cost measurements in place, your operations team can now define their own OpenAI dashboards and SLOs.
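A schematic sketch of the retrieval step in that pattern: embed the query, rank knowledge-base documents by similarity, and splice the winners into the prompt. The embeddings and documents below are toy stand-ins for a real embedding model and knowledge base.

```python
import numpy as np

docs = {
    "runbook.md": np.array([0.9, 0.1, 0.0]),
    "faq.md":     np.array([0.2, 0.8, 0.1]),
    "design.md":  np.array([0.4, 0.4, 0.6]),
}

def top_k(query_vec: np.ndarray, k: int = 2) -> list[str]:
    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    ranked = sorted(docs, key=lambda d: cosine(query_vec, docs[d]),
                    reverse=True)
    return ranked[:k]

context = top_k(np.array([0.8, 0.2, 0.1]))  # toy query embedding
prompt = f"Answer using these internal docs: {context}\nQuestion: ..."
print(prompt)
```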
The network latency between cluster nodes should be around 10 ms or less. For Premium HA, this has been extended from 10 ms latency (in the same network region) to around 100 ms network latency due to asynchronous data replication between regions. In the image below, three downed nodes make an entire cluster unavailable.
Uploading and downloading data always come with a penalty, namely latency. This initial architecture was designed at a time when packaging from a list of chunks was not possible and terabyte-sized files were not considered, thus making this design viable. (Figure 2: Cloud Resource and Job Sizes.)
Welcome back to the blog series in which we share how you can easily solve three common problem scenarios by using Dynatrace and xMatters Flow Designer. This is where xMatters Flow Designer comes into play, by automating remediation steps at the touch of a button. Step 5 – xMatters triggers a runbook in Ansible to fix the disk latency.
Because microprocessors are so fast, computer architecture design has evolved towards adding various levels of caching between compute units and the main memory, in order to hide the latency of bringing the bits to the brains. This avoids thrashing caches too much for B and evens out the pressure on the L3 caches of the machine.
Monitors signals. The first attribute of a good SLO is the ability to monitor the four "golden signals": latency, traffic, error rates, and resource saturation. In practice, however, SLOs' value varies significantly based on how teams design, deploy, and manage them.
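As a concrete companion to the SLO point, a tiny worked example (with assumed numbers) of turning an availability target into an error budget:

```python
# Translate an availability SLO into an error budget over 30 days.
slo = 0.999                      # assumed 99.9% availability target
window_minutes = 30 * 24 * 60    # 30-day window
budget = (1 - slo) * window_minutes
print(f"error budget: {budget:.1f} minutes of downtime per 30 days")
# ~43.2 minutes; burning it faster than this rate should page someone
```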
To support this growth, we’ve revisited Pushy’s past assumptions and design decisions with an eye towards both Pushy’s future role and future stability. In our case, we value low latency — the faster we can read from KeyValue, the faster these messages can get delivered.
It supports both high throughput services that consume hundreds of thousands of CPUs at a time, and latency-sensitive workloads where humans are waiting for the results of a computation. The subsystems all communicate with each other asynchronously via Timestone, a high-scale, low-latency priority queuing system. Warm capacity.
LinkedIn was able to dramatically improve the scalability and performance of its Espresso database by migrating it from HTTP1.1 to HTTP2, resulting in a reduction in the number of connections, latency, and garbage collection times. To achieve these gains, the team had to optimize Netty's default HTTP2 stack to make it fit their needs.
In that scenario, the system would need to deal with the data propagation latency directly, for example, by use of timeouts or client-originated update tracking mechanisms. We started seeing increased response latencies and leader servers running at dangerously high utilization. The query rate in this test is set to 1K requests/second.
For each route we migrated, we wanted to make sure we were not introducing any regressions: either in the form of missing (or worse, wrong) data, or by increasing the latency of each endpoint. Being able to canary a new route let us verify latency and error rates were within acceptable limits. This meant that data that was static (e.g.
You can eliminate the latency issues caused by cold starts — an increase in normal response time when a new instance receives its first request — by using edge-optimized functions that run code closer to users and other projects. AWS continues to improve how it handles latency issues. It helps SRE teams automate responses.
As Google's Ben Treynor explains, "Fundamentally, it's what happens when you ask a software engineer to design an operations function." Designating and managing Service Level Objectives (SLOs) as availability targets for a service. Reduced latency. Efficiency. Streamlined change management.
The Dynatrace Site Reliability Guardian is designed for this practice; it allows development teams to define quality objectives in their code, which is validated throughout the delivery process before the code reaches production. The queries are depicted below (sensitive data has been removed).
Our approach to NN-based video downscaling The deep downscaler is a neural network architecture designed to improve the end-to-end video quality by learning a higher-quality video downscaler. We employed an adaptive network design that is applicable to the wide variety of resolutions we use for encoding.
Amazon DynamoDB: a Fast and Scalable NoSQL Database Service Designed for Internet Scale Applications. Today is a very exciting day as we release Amazon DynamoDB, a fast, highly reliable and cost-effective NoSQL database service designed for internet scale applications. Amazon DynamoDB offers low, predictable latencies at any scale.
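A minimal usage sketch, assuming boto3 credentials are configured and an existing table named "users" with partition key "user_id" (the table name and key are illustrative, not from the announcement):

```python
import boto3

# Connect to DynamoDB and reference an assumed, pre-created table.
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("users")

table.put_item(Item={"user_id": "u-123", "name": "Ada"})
resp = table.get_item(Key={"user_id": "u-123"})
print(resp.get("Item"))
```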
Scaling Policies. To address the thundering herd problem and to keep latencies under acceptable thresholds, the cluster scale-up policies are configured to be more aggressive than the scale-down policies. This approach enables the computing power to catch up quickly when the queues grow; a sketch of the idea follows.
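An illustrative sketch of such asymmetric policies: scale up aggressively on queue growth, scale down cautiously. The thresholds and step sizes are made-up assumptions, not the system's actual configuration.

```python
# Asymmetric autoscaling: double on a spike, shed one node at a time.
def desired_replicas(current: int, queue_depth: int) -> int:
    if queue_depth > 1000:          # thundering herd: react hard
        return current * 2          # aggressive scale-up
    if queue_depth > 200:
        return current + 1
    if queue_depth < 50:
        return max(1, current - 1)  # conservative scale-down
    return current

print(desired_replicas(current=4, queue_depth=1500))  # -> 8
```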
For production models, this provides observability of service-level agreement (SLA) performance metrics, such as token consumption, latency, availability, response time, and error count. Beyond SLAs, the emergence of machine learning technical debt poses an additional challenge for model observability.
Providing insight into the service latency to help developers identify poorly performing code. UX designers can use that data to better understand how users interact with an application and how developers can streamline the interface. There are also some limitations of real user monitoring.
This architecture shift greatly reduced the processing latency and increased system resiliency. We expanded pipeline support to serve our studio/content-development use cases, which had different latency and resiliency requirements as compared to the traditional streaming use case.