By Rajiv Shringi, Oleksii Tkachuk, and Kartik Sathyanarayanan. In our previous blog post, we introduced Netflix's TimeSeries Abstraction, a distributed service designed to store and query large volumes of temporal event data with low millisecond latencies. Today, we're excited to present the Distributed Counter Abstraction.
This article outlines the key differences in architecture, performance, and use cases to help determine the best fit for your workload. RabbitMQ follows a message broker model with advanced routing, while Kafka's event streaming architecture uses partitioned logs for distributed processing. What is RabbitMQ? What is Apache Kafka?
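As a rough illustration of the two models, here is a minimal Python sketch using the pika and kafka-python client libraries; the broker addresses, queue name, and topic name are placeholders, not anything from the article:

```python
import pika                      # RabbitMQ client
from kafka import KafkaProducer  # Kafka client (kafka-python)

# RabbitMQ: publish through a broker that routes each message to a queue;
# once a consumer acks it, the message is gone.
conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = conn.channel()
channel.queue_declare(queue="orders")
channel.basic_publish(exchange="", routing_key="orders", body=b"order #42")
conn.close()

# Kafka: append to a partitioned, replayable log; consumers track their
# own offsets and can re-read the stream.
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("orders", b"order #42")
producer.flush()
```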
The Multicore Era: Over the past ~15 years, server processors from Intel and AMD have evolved from the early quad-core processors to the current monsters with over 50 cores per socket. The example below is for a 2005-era processor with 60 ns memory latency and 6.4 GB/s of memory bandwidth: to sustain full bandwidth, enough cache lines must be kept in flight to cover the entire memory latency.
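The excerpt's arithmetic was garbled in extraction; as a rough reconstruction via Little's law (outstanding bytes = latency × bandwidth), here is a sketch. The 60 ns and 6.4 figures come from the excerpt; the GB/s unit and the 64-byte cache line size are assumptions:

```python
# Little's law applied to memory: to keep the pipe full, the bytes in
# flight must equal latency times bandwidth.
LATENCY_S = 60e-9          # 60 ns memory latency (from the excerpt)
BANDWIDTH_BPS = 6.4e9      # 6.4 GB/s memory bandwidth (unit assumed)
CACHE_LINE_BYTES = 64      # typical x86 line size (assumed)

outstanding_bytes = LATENCY_S * BANDWIDTH_BPS
lines_in_flight = outstanding_bytes / CACHE_LINE_BYTES
print(f"{outstanding_bytes:.0f} bytes -> {lines_in_flight:.0f} lines in flight")
# 384 bytes -> 6 cache lines for the 2005-era part; modern processors with
# far higher bandwidth need proportionally more misses outstanding.
```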
Architecture Overview: The first pivotal step in managing impressions begins with the creation of a Source-of-Truth (SOT) dataset. Impression events are promptly relayed from the client side to our servers, entering a centralized event processing queue. This queue ensures we are consistently capturing raw events from our global user base.
Timestone: Netflix's High-Throughput, Low-Latency Priority Queueing System with Built-in Support for Non-Parallelizable Workloads, by Kostas Christidis. Timestone is a high-throughput, low-latency priority queueing system we built in-house to support the needs of Cosmos, our media encoding platform.
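Timestone's internals aren't shown in this excerpt; as a toy sketch of the idea of a priority queue that serializes non-parallelizable work, consider the following. The serial_key mechanism and all names here are illustrative assumptions, not Timestone's actual design:

```python
import heapq
import itertools

class SerialAwarePriorityQueue:
    """Toy priority queue: items sharing a serial_key are handed out one
    at a time (non-parallelizable); everything else goes by priority."""
    def __init__(self):
        self._heap = []
        self._seq = itertools.count()   # tie-breaker keeps FIFO within a priority
        self._in_flight = set()         # serial keys currently checked out

    def push(self, priority, item, serial_key=None):
        heapq.heappush(self._heap, (priority, next(self._seq), item, serial_key))

    def pop(self):
        skipped, result = [], None
        while self._heap:
            entry = heapq.heappop(self._heap)
            _, _, item, key = entry
            if key is not None and key in self._in_flight:
                skipped.append(entry)   # blocked: its serial lane is busy
                continue
            if key is not None:
                self._in_flight.add(key)
            result = (item, key)
            break
        for e in skipped:               # put blocked entries back
            heapq.heappush(self._heap, e)
        return result

    def ack(self, serial_key):
        """Mark a serial item done so the next one in that lane can run."""
        self._in_flight.discard(serial_key)

q = SerialAwarePriorityQueue()
q.push(1, "encode-chunk-A1", serial_key="video-A")
q.push(1, "encode-chunk-A2", serial_key="video-A")
q.push(2, "thumbnail-B")
print(q.pop())        # ('encode-chunk-A1', 'video-A')
print(q.pop())        # ('thumbnail-B', None): A2 is blocked until ack
q.ack("video-A")
print(q.pop())        # ('encode-chunk-A2', 'video-A')
```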
When undertaking system migrations, one of the main challenges is establishing confidence and seamlessly transitioning the traffic to the upgraded architecture without adversely impacting the customer experience. It provides a good read on the availability and latency ranges under different production conditions.
These include challenges with tail latency and idempotency, managing "wide" partitions with many rows, handling single large "fat" columns, and slow response pagination. Data Model: At its core, the KV abstraction is built around a two-level map architecture, which is useful for keeping the "n-newest" items or for prefix path deletion.
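A minimal sketch of the two-level map idea (an outer id keyed to a sorted inner map), with all names assumed for illustration; a real KV abstraction keeps the inner map sorted in storage rather than sorting on read:

```python
from collections import defaultdict

# Outer map: record id -> inner map of sorted keys to values.
store: dict[str, dict[str, bytes]] = defaultdict(dict)

def put(record_id: str, key: str, value: bytes) -> None:
    store[record_id][key] = value

def scan(record_id: str, limit: int | None = None) -> list[tuple[str, bytes]]:
    """Range-scan the inner map in key order (e.g. for 'n-newest' reads)."""
    items = sorted(store[record_id].items())
    return items[:limit] if limit else items

def delete_prefix(record_id: str, prefix: str) -> None:
    """Prefix path deletion: drop every inner key under a given prefix."""
    inner = store[record_id]
    for k in [k for k in inner if k.startswith(prefix)]:
        del inner[k]

put("user:1", "2024-01-02/post", b"hello")
put("user:1", "2024-01-03/post", b"world")
print(scan("user:1", limit=1))
delete_prefix("user:1", "2024-01-02/")
```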
Despite the name, serverless computing still uses servers. This means companies can access the exact resources they need whenever they need them, rather than paying for server space and computing power they only need occasionally. If servers reach maximum load and capacity in-house, something has to give before adding new services.
Moving to a multithreaded architecture will require extensive rewrites. But short transactions cause a problem with PostgreSQL's process-per-connection architecture: forking a process becomes expensive when transactions are very short, as the common wisdom dictates they should be. This is the problem the connection pool architecture addresses.
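A minimal client-side pooling sketch with psycopg2, assuming a placeholder DSN; production deployments more often put an external pooler such as PgBouncer in front, but the principle is the same, reuse a warm backend instead of forking a new one per transaction:

```python
from psycopg2 import pool

db_pool = pool.SimpleConnectionPool(
    minconn=1, maxconn=10,
    dsn="dbname=app user=app host=127.0.0.1",  # placeholder DSN
)

conn = db_pool.getconn()          # borrow an already-forked backend
try:
    with conn.cursor() as cur:
        cur.execute("SELECT 1")
        print(cur.fetchone())
    conn.commit()
finally:
    db_pool.putconn(conn)         # return it, don't close: no new fork
db_pool.closeall()
```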
As more organizations embrace microservices-based architecture to deliver goods and services digitally, maintaining customer satisfaction has become exponentially more challenging. Latency is the time it takes for a request to be served. In this example, "Reverse proxy" and "Front-end server" are clearly in the critical path.
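As a toy illustration of why the critical path dominates end-to-end latency, here is a sketch that computes the slowest dependency chain through a call graph; the service names and latencies are made up, loosely echoing the example above:

```python
from functools import lru_cache

# Hypothetical call graph: service -> (own latency in ms, downstream calls).
GRAPH = {
    "reverse-proxy": (2,  ["front-end"]),
    "front-end":     (10, ["auth", "catalog"]),
    "auth":          (5,  []),
    "catalog":       (20, ["db"]),   # slowest chain: the critical path
    "db":            (15, []),
}

@lru_cache(maxsize=None)
def critical_path_ms(service: str) -> int:
    own, deps = GRAPH[service]
    # Downstream calls fan out in parallel; the slowest one gates latency.
    return own + max((critical_path_ms(d) for d in deps), default=0)

print(critical_path_ms("reverse-proxy"))  # 2 + 10 + 20 + 15 = 47 ms
```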
Cloud-based application architectures commonly leverage microservices. When symptoms such as high latency, missing responses, or a soaring number of active connections appear, API manager monitoring from the application server perspective, which is what Dynatrace delivers with the WSO2 API Manager monitoring extension, can save you hours of bug-hunting time.
Example 1: Architecture boundaries. First, they took a big step back and looked at their end-to-end architecture (Figure 2). In their new dashboard, an SLO dashboard defined by architectural boundary, they added dimensions for load, latency, and open problems for each component. Not all attempts succeed on the first try.
To this end, we developed a Rapid Event Notification System (RENO) to support use cases that require server-initiated communication with devices in a scalable and extensible manner. Architecture: As shown in the diagram above, the RENO service can be broken down into the following components.
Within this paradigm, it is possible to run entire architectures without touching a traditional virtual server, either locally or in the cloud. In a serverless architecture, applications are distributed to meet demand and scale requirements efficiently, and billing is pay-per-use; the trade-off is that functions incur latency when they need to restart.
By Rajiv Shringi, Vinay Chella, Kaidan Fullerton, Oleksii Tkachuk, and Joey Lynch. As Netflix continues to expand and diversify into various sectors like Video on Demand and Gaming, the ability to ingest and store vast amounts of temporal data — often reaching petabytes — with millisecond access latency has become increasingly vital.
The original assumptions and architectural choices were no longer viable: we started seeing increased response latencies and leader servers running at dangerously high utilization. Overview: The figure below depicts a simplified high-level architecture of a single Titus cluster.
Plus, the architecture of the Edge tier was evolving to a PaaS (platform as a service) model, and we had some tough decisions to make about how, and where, to handle identity tokens. The API server orchestrates backend systems to authenticate the user. The whole system was quite complex and starting to become brittle.
Customers can use AWS Lambda Response Streaming to improve performance for latency-sensitive applications and return larger payload sizes. The difference is that the owner of the Lambda function does not have to worry about provisioning and managing servers.
High-Level Design: When the server receives a request for an action (post, like, etc.), the post gets added to the feeds of all the followers in the columnar data storage. When a user requests their feed, two parallel threads are involved in fetching it, to optimize for latency.
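A minimal sketch of this fan-out-on-write pattern, with all names made up for illustration: the write path pushes the post into every follower's feed, so the read path is a cheap prefix scan with no joins.

```python
from collections import defaultdict, deque

followers = {"alice": ["bob", "carol"]}          # author -> followers
feeds: dict[str, deque] = defaultdict(lambda: deque(maxlen=1000))

def publish(author: str, post: str) -> None:
    """Fan-out on write: push the post into each follower's feed."""
    for user in followers.get(author, []):
        feeds[user].appendleft((author, post))   # newest first

def read_feed(user: str, n: int = 20):
    return list(feeds[user])[:n]                 # O(n) read, no joins

publish("alice", "hello world")
print(read_feed("bob"))   # [('alice', 'hello world')]
```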
Reduced tail latencies: In both our GRPC and DGS Framework services, GC pauses are a significant source of tail latencies. That's particularly true of our GRPC clients and servers, where request cancellations due to timeouts interact with reliability features such as retries, hedging, and fallbacks.
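Hedging itself is straightforward to sketch; here is a minimal asyncio version, where the delay and attempt count are illustrative values, not any framework's actual defaults:

```python
import asyncio, random

async def hedged(make_call, hedge_delay=0.05, max_attempts=2):
    """Fire a backup request if the first hasn't answered within hedge_delay."""
    tasks = [asyncio.ensure_future(make_call())]
    try:
        while True:
            done, _ = await asyncio.wait(
                tasks, timeout=hedge_delay, return_when=asyncio.FIRST_COMPLETED)
            if done:
                return done.pop().result()   # first responder wins
            if len(tasks) >= max_attempts:
                hedge_delay = None           # no more hedges: wait it out
                continue
            tasks.append(asyncio.ensure_future(make_call()))
    finally:
        for t in tasks:
            t.cancel()                       # cancel stragglers (no-op if done)

async def flaky_call():
    await asyncio.sleep(random.uniform(0.01, 0.2))  # simulated slow server
    return "ok"

print(asyncio.run(hedged(flaky_call)))
```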
The 2014 launch of AWS Lambda marked a milestone in how organizations use cloud services to deliver their applications more efficiently, by running functions at the edge of the cloud without the cost and operational overhead of on-premises servers. AWS continues to improve how it handles latency issues. What is AWS Lambda?
It supports both high throughput services that consume hundreds of thousands of CPUs at a time, and latency-sensitive workloads where humans are waiting for the results of a computation. The subsystems all communicate with each other asynchronously via Timestone, a high-scale, low-latency priority queuing system.
Retrieval-augmented generation emerges as the standard architecture for LLM-based applications. Given that LLMs can generate factually incorrect or nonsensical responses, retrieval-augmented generation (RAG) has emerged as an industry standard for building GenAI applications.
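A minimal RAG sketch, assuming a toy keyword-overlap retriever and a hypothetical generate() stand-in for any LLM call; real systems use vector embeddings and a proper index, but the retrieve-then-ground shape is the same:

```python
DOCS = [
    "Kafka stores events in partitioned, replicated logs.",
    "RabbitMQ routes messages through exchanges to queues.",
    "Redis is an in-memory data store used for caching.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Toy retriever: rank documents by keyword overlap with the query."""
    words = set(query.lower().split())
    ranked = sorted(DOCS, key=lambda d: -len(words & set(d.lower().split())))
    return ranked[:k]

def generate(prompt: str) -> str:            # hypothetical LLM call
    return f"[LLM answer grounded in]\n{prompt}"

def answer(query: str) -> str:
    context = "\n".join(retrieve(query))     # ground the prompt in evidence
    prompt = f"Answer using only this context:\n{context}\n\nQ: {query}"
    return generate(prompt)

print(answer("How does Kafka store events?"))
```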
The roles and responsibilities of ITOps team members include the following: a system administrator configures servers, installs applications, monitors the health of the system, and fixes and upgrades hardware. Performance metrics include response time, accuracy, speed, throughput, uptime, CPU utilization, and latency.
We tried a few iterations of what this new service should look like, and eventually settled on a modern architecture that aimed to give more control of the API experience to the client teams. For us, it means that we now need to have ~15 MDN tabs open when writing routes :) Let’s briefly discuss the architecture of this microservice.
Microservices-based architectures and software containers enable organizations to deploy and modify applications with unprecedented speed. At the lowest level, SLIs provide a view of service availability, latency, performance, and capacity across systems. However, cloud complexity has made software delivery challenging.
Organizations are depending more and more on distributed architectures to provide application services; this trend is prompting advances in both observability and monitoring. For example, when monitoring a database, you'll want to know about any latency when writing data to a disk, or the average query response time.
Metrics are measures of critical system values, such as CPU utilization or average write latency to persistent storage. They are particularly important in distributed systems, such as microservices architectures. Observability platforms are becoming essential as the complexity of cloud-native architectures increases.
Achieving 100 Gbps intrusion prevention on a single server, Zhao et al. Today's paper choice is a wonderful example of pushing the state of the art on a single server. This makes the whole system latency-sensitive. Moreover, Pigasus wants to do all this on a single server! Can you really do all this on a single server??
While this abundance of dashboards and information is by no means unique to Netflix, it certainly holds true within our microservices architecture. A span represents a unit of work, such as a network call from one service to another (a client/server relationship) or a purely internal action (e.g., starting and finishing a method).
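A sketch of emitting such spans with the OpenTelemetry Python SDK; Netflix runs its own tracing stack, so this is an analogous, widely available illustration rather than their implementation, and the span names are made up:

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

# Wire up a provider that prints finished spans to stdout.
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("feed-service")

with tracer.start_as_current_span("GET /feed"):        # client/server call
    with tracer.start_as_current_span("render-feed"):  # purely internal work
        pass  # ... the work being measured ...
```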
Behind the scenes, Amazon DynamoDB automatically spreads the data and traffic for a table over a sufficient number of servers to meet the request capacity specified by the customer. Amazon DynamoDB offers low, predictable latencies at any scale, in contrast to SimpleDB's read latency, which degrades as dataset sizes grow.
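A minimal boto3 sketch, with the table and key names as placeholders, showing the consistency knob DynamoDB exposes: reads are eventually consistent by default and strongly consistent on request.

```python
import boto3

table = boto3.resource("dynamodb").Table("users")   # placeholder table name

table.put_item(Item={"user_id": "42", "name": "Ada"})

eventual = table.get_item(Key={"user_id": "42"})                       # default
strong   = table.get_item(Key={"user_id": "42"}, ConsistentRead=True)  # strong
print(strong.get("Item"))
```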
RISELabs, those wonderfully innovative folks over at Berkeley, have uplifted their Anna database (a shared-nothing, thread-per-core architecture that achieves lightning-fast speeds by avoiding all coordination mechanisms) to become cloud-aware. While database neogenesis has slowed down considerably, it has not gone necrotic.
We are standing on the eve of the 5G era… 5G, as a monumental shift in cellular communication technology, holds tremendous potential for spurring innovations across many vertical industries, with its promised multi-Gbps speed, sub-10 ms low latency, and massive connectivity. The study measures throughput, latency, and energy consumption.
IPC clients are instantiated targeting that VIP or SVIP, and the Eureka client code handles the translation of that VIP to a set of IP and port pairs by fetching them from the Eureka server. In this architecture, service to service communication no longer goes through the single point of failure of a load balancer.
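A toy sketch of that client-side discovery pattern; the registry contents and names are made up, and a real Eureka client fetches and caches this VIP-to-instances mapping from the Eureka server and refreshes it periodically rather than hard-coding it:

```python
import itertools

REGISTRY = {   # VIP -> healthy (ip, port) pairs, as fetched from discovery
    "playback-api": [("10.0.0.1", 7001), ("10.0.0.2", 7001), ("10.0.0.3", 7001)],
}

_rr = {vip: itertools.cycle(instances) for vip, instances in REGISTRY.items()}

def resolve(vip: str) -> tuple[str, int]:
    """Pick an instance directly: no load balancer in the request path."""
    return next(_rr[vip])

print(resolve("playback-api"))
print(resolve("playback-api"))   # round-robins across instances
```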
Tim Berners-Lee tweets that 'This is for everyone' at the 2012 Olympic Games opening ceremony, using the NeXT computer on which he built the first browser and web server. The difference in weight between the two architectures is interesting, but what we should focus on is the per-interaction loop.
The Azure MySQL dashboard serves as a comprehensive overview of your MySQL servers and database services. This means that you can improve performance, scale your application, and enable complex application architectures like IaaS and PaaS, on-premises + cloud, or multi-cloud hybrid environments.
This first post looks at the general architecture of proxy browsers with a performance focus. In a typical browser architecture, the browser must establish TCP connection(s) to the server(s). How quickly these steps happen, and their cost, depend mostly on the characteristics of the network: the bandwidth, latency, cost of data, etc.
By collecting and analyzing key performance metrics of the service over time, we can assess the impact of the new changes and determine if they meet the availability, latency, and performance requirements. A/B testing is also a key technique in migrations where the updates to the architecture involve changing device contracts as well.
Let's talk about the elephant in the room: serverless doesn't really mean that there are no software or hardware servers. It just means that, from a software development perspective, servers are abstracted and outsourced to another entity, so you don't need to worry about them. Advantages and Disadvantages of Serverless.
Architecture: To understand more about the deployment procedure, we need to look a little more at Neon's architecture. I prefer to test a distributed deployment where each component is placed on a different server or virtual machine, which is why I do not put it into docker-compose.
Understanding the ins and outs of monitoring is essential for any tech professional, and Redis, a powerful in-memory data store with widespread use in modern application architectures, is no exception. Identifying key Redis metrics such as latency, CPU usage, and memory consumption is crucial for effective Redis monitoring.
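A small redis-py sketch, assuming a Redis instance on localhost, that samples the kinds of metrics the post calls out: latency, CPU, and memory. The INFO fields used here are standard Redis server statistics.

```python
import time
import redis

r = redis.Redis(host="localhost", port=6379)

t0 = time.perf_counter()
r.ping()                                     # crude round-trip latency probe
latency_ms = (time.perf_counter() - t0) * 1000

info = r.info()                              # server-side stats via INFO
print(f"ping latency:   {latency_ms:.2f} ms")
print(f"used memory:    {info['used_memory_human']}")
print(f"ops/sec:        {info['instantaneous_ops_per_sec']}")
print(f"cpu (sys+user): {info['used_cpu_sys'] + info['used_cpu_user']:.1f} s")
```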
Remote calls are never free; they impose extra latency, increase the probability of an error, and consume network bandwidth. Suppose we want to rename the field title to title_name and publish version 2.0 of our message definition: in this chart, the producer (server) utilizes the new descriptors, with field number 2 named title_name.
Building general purpose architectures has always been hard; there are often so many conflicting requirements that you cannot derive an architecture that will serve all, so we have often ended up focusing on one side of the requirements that allow you to serve that area really well.
Introduction: Caching serves a dual purpose in web development, speeding up client requests and reducing server load. Key takeaways: Redis offers complex data structures and additional features for versatile data handling, while Memcached excels in simplicity with a fast, multi-threaded architecture for basic caching needs.