By: Rajiv Shringi, Oleksii Tkachuk, Kartik Sathyanarayanan. In our previous blog post, we introduced Netflix’s TimeSeries Abstraction, a distributed service designed to store and query large volumes of temporal event data with low millisecond latencies. Today, we’re excited to present the Distributed Counter Abstraction.
A significant feature of Chronicle Queue Enterprise is support for TCP replication across multiple servers to ensure the high availability of application infrastructure. Little’s Law explains why latency matters: in many cases, the assumption is that as long as throughput is high enough, latency won’t be a problem.
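As a quick sanity check on that assumption, Little’s Law ties the three quantities together directly; the worked numbers below are illustrative, not from the article:

```latex
% Little's Law: the average number of requests in flight (L) equals the
% arrival rate, i.e. throughput (\lambda), times the average latency (W).
\[
  L = \lambda W
\]
% Illustrative example: at 10,000 req/s and 50 ms average latency,
\[
  L = 10{,}000\ \mathrm{req/s} \times 0.050\ \mathrm{s} = 500\ \text{requests in flight,}
\]
% so latency directly determines how much concurrency (and therefore how
% many resources) a server must sustain at a given throughput.
```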
Take your monitoring, data exploration, and storytelling to the next level with outstanding data visualization. All your applications and underlying infrastructure produce vast volumes of data that you need to monitor or analyze for insights. Infrastructure health: a honeycomb chart is often used to visualize infrastructure health.
A critical component of this success was that the Dynatrace team itself uses the Dynatrace Platform to monitor every single Dynatrace cluster in the cloud, and trusts the Dynatrace Davis AI to alert in case there are any issues, whether with a new feature, a configuration change, or the infrastructure our servers are running on.
Sure, cloud infrastructure requires comprehensive performance visibility, as Dynatrace provides, but the services that leverage cloud infrastructures also require close attention. Extend infrastructure observability to WSO2 API Manager. High latency or lack of responses. Soaring number of active connections.
Its partitioned log architecture supports both queuing and publish-subscribe models, allowing it to handle large-scale event processing with minimal latency. Kafka clusters can be deployed in Kubernetes using Helm charts to simplify scaling and management across multiple servers.
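To make the queuing and publish-subscribe models concrete, here is a minimal sketch using the kafka-python client; the broker address, topic name, and consumer group are placeholder assumptions, not details from the excerpt:

```python
# pip install kafka-python
from kafka import KafkaProducer, KafkaConsumer

# Producer appends events to a partitioned log on the broker.
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("events", b"user-signup")
producer.flush()

# Consumers in the same group_id split partitions between them (queuing);
# a different group_id would receive its own full copy of the stream
# (publish-subscribe).
consumer = KafkaConsumer(
    "events",
    bootstrap_servers="localhost:9092",
    group_id="analytics",
    auto_offset_reset="earliest",
)
for record in consumer:
    print(record.value)
    break
```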
Citrix is a sophisticated, efficient, and highly scalable application delivery platform that itself comprises anywhere from hundreds to thousands of servers. Dynatrace Extension: database performance as experienced by the SAP ABAP server. OneAgent: Citrix infrastructure performance. SAP server. Citrix VDA.
Three steps to set up hybrid Kubernetes observability: Setting up hybrid Kubernetes observability involves a few straightforward steps to deploy Dynatrace into your environment, enabling effective instrumentation of both application and infrastructure nodes. The containers are listed as individual PaaS hosts after successful deployment.
Vidhya Arvind, Rajasekhar Ummadisetty, Joey Lynch, Vinay Chella. At Netflix, our ability to deliver seamless, high-quality streaming experiences to millions of users hinges on robust, global backend infrastructure. It also serves as central configuration of access patterns such as consistency or latency targets.
Despite the name, serverless computing still uses servers. This means companies can access the exact resources they need whenever they need them, rather than paying for server space and computing power they only need occasionally. If servers reach maximum load and capacity in-house, something has to give before adding new services.
Microsoft Azure is one of the most popular cloud providers in the world, and a natural fit for database hosting on applications leveraging Microsoft across their infrastructure. We benchmark ScaleGrid MySQL on Azure so you can see which provider offers the best throughput and latency performance, measuring latency as 95th-percentile latency in ms.
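For illustration, a 95th-percentile figure can be computed from raw latency samples with a nearest-rank method; the sample values in this sketch are made up, and they show why p95 surfaces tail latency that an average would hide:

```python
def p95(samples):
    """Nearest-rank 95th percentile: the value below which ~95% of samples fall."""
    ordered = sorted(samples)
    idx = max(0, round(0.95 * len(ordered)) - 1)
    return ordered[idx]

# A tail-heavy distribution: the mean is ~79 ms, but p95 reports 400 ms,
# which is what the slowest 5% of your users actually experience.
print(p95([12, 15, 11, 280, 14, 13, 16, 12, 18, 400]))
```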
The first step is determining whether the problem originates from the application or the underlying infrastructure. On Titus , our multi-tenant compute platform, a "noisy neighbor" refers to a container or system service that heavily utilizes the server's resources, causing performance degradation in adjacent containers.
A lot of people surmise that TTFB is merely time spent on the server, but that is only a small fraction of the true extent of things. The first—and often most surprising for people to learn—thing that I want to draw your attention to is that TTFB counts one whole round trip of latency. But what else is TTFB?
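A rough decomposition makes the round-trip point concrete; the numbers below are assumptions for illustration, not measurements from the article:

```python
# Hypothetical figures: a single request's TTFB includes at least one full
# round trip (request out, first byte back) plus server processing time.
rtt_ms = 80           # assumed client-server round-trip time
server_time_ms = 120  # assumed backend processing ("time on server")

# Even with a zero-cost backend, one whole RTT is unavoidable.
ttfb_ms = rtt_ms + server_time_ms
print(f"TTFB ≈ {ttfb_ms} ms ({rtt_ms / ttfb_ms:.0%} of it is pure network latency)")
```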
Before GraphQL, our API layer consisted of a monolithic server built with Falcor. A single API team maintained both the Java implementation of the Falcor framework and the API server. To launch Phase 1 safely, we used A/B testing.
One of the quickest wins—and one of the first things I recommend my clients do—to make websites faster can at first seem counter-intuitive: you should self-host all of your static assets, forgoing others’ CDNs/infrastructure. Critical assets are far too valuable to leave on someone else’s servers. You’re going to suffer, too.
SLOs can be a great way for DevOps and infrastructure teams to use data and performance expectations to make decisions, such as whether to release and where engineers should focus their time. In this example, “Reverse proxy” and “Front-end server” are clearly in the critical path. SLOs aid decision making. SLOs promote automation.
Benefits of caching: Improved performance: caching eliminates the need to retrieve data from the original source every time, resulting in faster response times and reduced latency. Reduced server load: by serving cached content, the load on the server is reduced, allowing it to handle more requests and improving overall scalability.
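A minimal sketch of the idea, assuming a Python service and an in-process cache with a fixed time-to-live; the decorator and function names here are hypothetical:

```python
import time
from functools import wraps

def ttl_cache(ttl_seconds=60):
    """Serve a stored result instead of recomputing/refetching on every call."""
    def decorator(fn):
        store = {}  # key -> (value, expiry timestamp)

        @wraps(fn)
        def wrapper(*args):
            now = time.monotonic()
            hit = store.get(args)
            if hit and hit[1] > now:   # fresh entry: skip the origin entirely
                return hit[0]
            value = fn(*args)          # cache miss: pay the full cost once
            store[args] = (value, now + ttl_seconds)
            return value
        return wrapper
    return decorator

@ttl_cache(ttl_seconds=30)
def fetch_profile(user_id):
    time.sleep(0.2)  # stand-in for a slow database or HTTP call
    return {"id": user_id}
```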
Concatenating our files on the server: Are we going to send many smaller files, or are we going to send one monolithic file? Plotted on the same horizontal axis of 1.6s, the waterfalls speak for themselves: 201ms of cumulative latency; 109ms of cumulative download. 4,362ms of cumulative latency; 240ms of cumulative download.
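A toy model of those waterfalls (assumed per-request latency, deliberately ignoring parallelism, HTTP/2 multiplexing, and connection reuse) shows how cumulative latency scales with request count rather than with bytes:

```python
# Back-of-the-envelope model, not the article's exact methodology.
rtt_ms = 50                      # assumed per-request latency cost
files_small, files_big = 20, 1   # many small files vs. one concatenated file

latency_small = files_small * rtt_ms  # 20 requests' worth of waiting
latency_big = files_big * rtt_ms      # one request's worth of waiting
print(latency_small, latency_big)     # 1000 ms vs. 50 ms of cumulative latency
```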
Rajiv Shringi, Vinay Chella, Kaidan Fullerton, Oleksii Tkachuk, Joey Lynch. As Netflix continues to expand and diversify into various sectors like Video on Demand and Gaming, the ability to ingest and store vast amounts of temporal data — often reaching petabytes — with millisecond access latency has become increasingly vital.
Endpoints include on-premises servers, Kubernetes infrastructure, cloud-hosted infrastructure and services, and open-source technologies. Observability across the full technology stack gives teams comprehensive, real-time insight into the behavior, performance, and health of applications and their underlying infrastructure.
The network latency between cluster nodes should be around 10 ms or less. For Premium HA, this has been extended from 10 ms latency (in the same network region) to around 100 ms network latency due to asynchronous data replication between regions. In the image below, three downed nodes make an entire cluster unavailable.
By Karthik Yagna, Baskar Odayarkoil, and Alex Ellis. Pushy is Netflix’s WebSocket server that maintains persistent WebSocket connections with devices running the Netflix application. In our case, we value low latency — the faster we can read from KeyValue, the faster these messages can get delivered.
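For flavor, here is a minimal persistent-connection server in the same spirit, using the third-party websockets library; this is an illustrative sketch, not Pushy’s implementation (recent versions of the library pass only the connection object to the handler):

```python
# pip install websockets
import asyncio
import websockets

connections = set()  # one long-lived connection per device

async def handler(ws):
    connections.add(ws)
    try:
        async for message in ws:   # the connection stays open between messages
            await ws.send(f"ack: {message}")
    finally:
        connections.discard(ws)

async def main():
    async with websockets.serve(handler, "localhost", 8765):
        await asyncio.Future()  # run forever

asyncio.run(main())
```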
Customers can use AWS Lambda Response Streaming to improve performance for latency-sensitive applications and return larger payload sizes. What is a Lambda serverless function? Despite being serverless, the function still requires infrastructure on which to run.
ITOps is an IT discipline involving actions and decisions made by the operations team responsible for an organization’s IT infrastructure. Besides the traditional system hardware, storage, routers, and software, ITOps also includes virtual components of the network and cloud infrastructure. What is ITOps?
They were either running their own infrastructure, where installing and deploying Brotli everywhere proved non-trivial, or they were using a CDN that didn’t have readily available support for the new algorithm. Taking a very reductive and simplistic view of how files are transmitted from server to client, we need to look at TCP.
The 2014 launch of AWS Lambda marked a milestone in how organizations use cloud services to deliver their applications more efficiently, by running functions at the edge of the cloud without the cost and operational overhead of on-premises servers. AWS continues to improve how it handles latency issues. What is AWS Lambda?
One of the crucial success factors for delivering cost-efficient and high-quality AI-agent services, following the approach described above, is to closely observe their cost, latency, and reliability. With these latency, reliability, and cost measurements in place, your operations team can now define their own OpenAI dashboards and SLOs.
However, serverless applications have unique characteristics that make observability more difficult than in traditional server-based applications. These functions are executed by a serverless platform or provider (such as AWS Lambda, Azure Functions or Google Cloud Functions) that manages the underlying infrastructure, scaling and billing.
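A minimal sketch of such a function in the AWS Lambda Python model: the handler signature is Lambda’s standard one, while the payload fields are hypothetical. The platform, not your code, provisions servers, scales instances, and bills per invocation.

```python
import json

def lambda_handler(event, context):
    # 'event' carries the trigger payload (e.g., an API Gateway request);
    # 'context' exposes runtime metadata such as the remaining execution time.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello, {name}"}),
    }
```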
How site reliability engineering affects organizations’ bottom line SRE applies the disciplines of software engineering to infrastructure management, both on-premises and in the cloud. There are now many more applications, tools, and infrastructure variables that impact an application’s performance and availability.
Within this paradigm, it is possible to run entire architectures without touching a traditional virtual server, either locally or in the cloud. When an application is triggered, it can cause latency as the application starts. Unlike on-premises machines, shared servers, or rented virtual machines, there is no cost for downtime.
It supports both high throughput services that consume hundreds of thousands of CPUs at a time, and latency-sensitive workloads where humans are waiting for the results of a computation. The subsystems all communicate with each other asynchronously via Timestone, a high-scale, low-latency priority queuing system.
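The priority-queue idea can be illustrated in a few lines of heapq; this is a generic sketch, not Timestone’s implementation:

```python
import heapq
import itertools

queue, counter = [], itertools.count()

def push(priority, job):
    # Lower number = higher priority; the counter breaks ties FIFO-style.
    heapq.heappush(queue, (priority, next(counter), job))

def pop():
    priority, _, job = heapq.heappop(queue)
    return job

push(2, "batch-render")
push(0, "interactive-query")  # latency-sensitive work jumps the line
push(1, "encode-segment")
print(pop())                  # -> interactive-query
```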
Determining the root cause of these issues can be difficult when the underlying “hardware” is a virtualization software stack rather than a bare-metal server.
As a software intelligence platform, Dynatrace is woven into the fabric of your business systems, actively managing and providing self-healing capabilities for all aspects of your applications and vital infrastructure. Metrics are provided for general host info like CPU usage and memory consumption, OneAgent traffic, and network latency.
Data dependencies and framework intricacies require observing the lifecycle of an AI-powered application end to end, from infrastructure and model performance to semantic caches and workflow orchestration. million AI server units annually by 2027, consuming 75.4+ terawatt hours yearly—more than the annual consumption of some countries.
When the server receives a request for an action (post, like, etc.)… FUN FACT: In this talk, Rodrigo Schmidt, director of engineering at Instagram, talks about the different challenges they have faced in scaling the data infrastructure at Instagram. High Level Design. Architecture. System Components. Fetching User Feed. Optimization.
The big difference from the monolith, though, is that this is now a standalone service deployed as a separate “application” (service) in our cloud infrastructure. Being able to canary a new route let us verify latency and error rates were within acceptable limits. For the migration, testing was a first-class citizen.
While measuring app response time under different circumstances provides a latency value, for example, it doesn’t tell you why the app is slow, fast, or somewhere in between. Complete visibility of IT infrastructure sets the stage for predictive analytics that can help inform key business objectives. Predictive analysis.
Artisan Crafted Images: In the Netflix full cycle DevOps culture, the team responsible for building a service is also responsible for deploying, testing, infrastructure, and operation of that service. Now each change in the infrastructure is tested, canaried, and deployed like any other code change.
Cloud-based AI enables organizations to run AI in the cloud without the hassle of managing, provisioning, or housing servers. Containerization enables organizations to package AI applications and dependencies into a single unit, which can be easily deployed on any server with the necessary dependencies.
Edge computing involves processing data locally, near the source of data generation, rather than relying on centralized cloud servers. This proximity reduces latency and enables real-time decision-making. Assess your infrastructure Evaluate your current IIoT network and identify areas where edge computing could add value.
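As a minimal illustration (the sensor readings and threshold are hypothetical), an edge node might summarize raw data locally and make latency-sensitive decisions without a cloud round trip, shipping only the summary upstream:

```python
from statistics import mean

def summarize_window(readings, threshold=90.0):
    """Aggregate a window of raw sensor readings into a compact summary."""
    summary = {"count": len(readings), "mean": mean(readings), "max": max(readings)}
    # Local decision-making: react immediately instead of waiting on the cloud.
    summary["alert"] = summary["max"] > threshold
    return summary

print(summarize_window([71.2, 69.8, 93.4]))  # alert fires at the edge
```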
Generally speaking, cloud migration involves moving from on-premises infrastructure to cloud-based services. In cloud computing environments, infrastructure and services are maintained by the cloud vendor, allowing you to focus on how best to serve your customers. However, it can also mean migrating from one cloud to another.
Gartner estimates that by 2025, 70% of digital business initiatives will require infrastructure and operations (I&O) leaders to include digital experience metrics in their business reporting. With DEM solutions, organizations can operate over on-premise network infrastructure or private or public cloud SaaS or IaaS offerings.
When a server experiences an outage, the system promptly triggers an alert and initiates actions like restarting a server or redirecting traffic to a redundant server. Using advanced causal AI and context-aware decision-making, it identifies the root cause behind server failures. But it doesn’t stop there.
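A simplified sketch of that detect-and-redirect loop; the URLs and the /health endpoint are hypothetical, and a real system would also raise an alert and attempt a restart rather than only failing over:

```python
import urllib.request

PRIMARY = "https://primary.example.com"
STANDBY = "https://standby.example.com"

def healthy(url, timeout=2):
    """Probe a health endpoint; any network error counts as unhealthy."""
    try:
        return urllib.request.urlopen(url + "/health", timeout=timeout).status == 200
    except OSError:
        return False

def pick_backend():
    if healthy(PRIMARY):
        return PRIMARY
    # Outage detected: redirect traffic to the redundant server.
    return STANDBY
```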