I have generally held the view that replicating data to a secondary system is faster than syncing to disk, assuming the round-trip network delay is kept low by quality networks and co-located redundant servers. Little's Law explains why latency matters here.
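For reference, Little's Law ties together the quantities this excerpt is reasoning about; a worked statement (my summary, not from the article):

```latex
% Little's Law: average concurrency equals arrival rate times time in system.
L = \lambda W
% Example: at \lambda = 1000 requests/s and W = 50 ms of latency,
% L = 1000 \times 0.05 = 50 requests are in flight on average.
```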
Traditional network-based security approaches are evolving. Enhanced security measures, such as encryption and zero-trust, are making it increasingly difficult to analyze security threats using network packets. While network security remains relevant, the emphasis is now on application observability and threat detection.
This gives fascinating insights into the network topology of our visitors, and how much we might be impacted by high-latency regions. What is RTT? Round-trip time (RTT) is simply a measure of latency: how long it takes for data to get from one endpoint to another and back again. RTT isn't a you-thing, it's a them-thing.
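A rough way to sample RTT yourself is to time a TCP handshake; a minimal sketch (host and port are illustrative, and connect time only approximates ping-style RTT):

```python
# Rough RTT probe: time a TCP handshake to a host.
import socket
import time

def tcp_rtt(host: str, port: int = 443, timeout: float = 3.0) -> float:
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # connection established; we only care about the handshake time
    return (time.perf_counter() - start) * 1000  # milliseconds

print(f"RTT ~ {tcp_rtt('example.com'):.1f} ms")
```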
These releases often assumed ideal conditions such as zero latency, infinite bandwidth, and no network loss, as highlighted in Peter Deutsch's eight fallacies of distributed computing. With Dynatrace, teams can seamlessly monitor the entire system, including network switches, database storage, and third-party dependencies.
To this end, we developed a Rapid Event Notification System (RENO) to support use cases that require server initiated communication with devices in a scalable and extensible manner. In this blog post, we will give an overview of the Rapid Event Notification System at Netflix and share some of the learnings we gained along the way.
Multimodal data processing is the evolving need of the latest data platforms powering applications like recommendation systems, autonomous vehicles, and medical diagnostics. Handling multimodal data spanning text, images, videos, and sensor inputs requires resilient architecture to manage the diversity of formats and scale.
Introduction to Message Brokers: Message brokers enable applications, services, and systems to communicate by acting as intermediaries between senders and receivers. This decoupling simplifies system architecture and supports scalability in distributed environments.
Scaling RabbitMQ ensures your system can handle growing traffic and maintain high performance. Key Takeaways: RabbitMQ improves scalability and fault tolerance in distributed systems by decoupling applications, enabling reliable message exchanges.
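As a concrete illustration of that decoupling, a minimal RabbitMQ producer using the pika client; the queue name, message body, and local broker address are illustrative assumptions:

```python
# Minimal RabbitMQ producer sketch (assumes a broker on localhost).
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="tasks", durable=True)  # queue survives broker restarts
channel.basic_publish(
    exchange="",
    routing_key="tasks",
    body=b"hello",
    properties=pika.BasicProperties(delivery_mode=2),  # persistent message
)
connection.close()
```

The producer never learns who consumes the message, which is exactly the decoupling the excerpt describes.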
In the fast-paced digital world, where every millisecond counts, understanding the nuances of network latency becomes paramount for developers and system architects. Latency, the delay before a transfer of data begins following an instruction for its transfer, can significantly impact user experience and system performance.
When it comes to network performance, there are two main limiting factors that will slow you down: bandwidth and latency. Latency is defined as how long it takes for a bit of data to travel across the network from one node or endpoint to another.
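The two factors combine in a simple first-order model of transfer time (a simplification that ignores TCP slow start and packet loss):

```latex
% First-order transfer time: propagation delay plus serialization time.
t \approx t_{\text{latency}} + \frac{\text{size}}{\text{bandwidth}}
% Example: 10 MB over a 100 Mbit/s link with 50 ms of latency:
% t \approx 0.05 + \frac{10 \times 8}{100} = 0.85\ \text{s}
```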
In this article, we will explore one of the most common and useful resilience patterns in distributed systems: the circuit breaker. The circuit breaker is a design pattern that prevents cascading failures and improves the overall availability and performance of a system. What Is a Circuit Breaker?
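A minimal circuit-breaker sketch, assuming a simple consecutive-failure threshold and a fixed reset timeout (the article may use a different policy):

```python
# Circuit breaker: after max_failures consecutive errors the breaker
# opens and fails fast; after reset_timeout seconds it lets one trial
# call through (half-open), closing again on success.
import time

class CircuitBreaker:
    def __init__(self, max_failures: int = 3, reset_timeout: float = 30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = 0.0

    def call(self, fn, *args, **kwargs):
        if self.failures >= self.max_failures:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            # timeout elapsed: half-open, allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            self.opened_at = time.monotonic()
            raise
        self.failures = 0  # success closes the circuit
        return result
```

Failing fast while the breaker is open is what stops a struggling downstream dependency from dragging the whole call chain down with it.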
Therefore, it requires multidimensional and multidisciplinary monitoring. Infrastructure health: automatically monitor the compute, storage, and network resources available to the Citrix system to ensure a stable platform. Citrix latency represents the end-to-end “screen lag” experienced by a server’s users. Citrix VDA.
As the number of Titus users increased over the years, the load and pressure on the system increased substantially. Titus Job Coordinator is a leader-elected process managing the active state of the system (a cell). For example, a batch workflow orchestration system may create multiple jobs which are part of a single workflow execution.
The streaming data store makes the system extensible to support other use cases. System Components: The system will comprise several microservices, each performing a separate task. When a user requests their feed, two parallel threads fetch the user's feed to optimize for latency, as sketched below.
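A hedged sketch of that parallel fetch; both source functions are hypothetical stand-ins, and the point is that total latency becomes the max of the two calls rather than their sum:

```python
# Fetch two feed sources concurrently instead of sequentially.
from concurrent.futures import ThreadPoolExecutor

def fetch_precomputed_feed(user_id: int) -> list:
    return ["cached-item"]  # stand-in for a feed-store read

def fetch_live_updates(user_id: int) -> list:
    return ["fresh-item"]   # stand-in for a recent-activity read

with ThreadPoolExecutor(max_workers=2) as pool:
    f1 = pool.submit(fetch_precomputed_feed, 42)
    f2 = pool.submit(fetch_live_updates, 42)
    feed = f1.result() + f2.result()  # total wait ~ max of the two calls
```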
The network latency between cluster nodes should be around 10 ms or less. Minimized cross-data center network traffic. For Premium HA, this has been extended from 10 ms latency (in the same network region) to around 100 ms network latency due to asynchronous data replication between regions.
Mobile applications (apps) are an increasingly important channel for reaching customers, but the distributed nature of mobile app platforms and delivery networks can cause performance problems that leave users frustrated, or worse, turning to competitors. Load time and network latency metrics. Minimize network requests.
These include challenges with tail latency and idempotency, managing “wide” partitions with many rows, handling single large “fat” columns, and slow response pagination. It also serves as central configuration of access patterns such as consistency or latency targets. Useful for keeping “n-newest” or prefix path deletion.
To determine customer impact, we could compare various metrics such as error rates, latencies, and time to render. The AB experiment results hinted that GraphQL’s correctness was not up to par with the legacy system. Zuul , our primary edge gateway, assigns traffic to either cluster based on the experiment parameters.
Microsoft Hyper-V is a virtualization platform that manages virtual machines (VMs) on Windows-based systems. It enables multiple operating systems to run simultaneously on the same physical hardware and integrates closely with Windows-hosted services. This leads to a more efficient and streamlined experience for users.
Operating systems are not always set up in the same way. Storage mount points in a system might be larger or smaller, local or remote, with high or low latency, and various speeds. Storage and network transfer of files are a measurable cost. Recent improvements in OneAgent runtime-data handling. See details below.
Every organization’s goal is to keep its systems available and resilient to support business demands. Lastly, error budgets, as the difference between a current state and the target, represent the maximum amount of time a system can fail under the contractual agreement without repercussions.
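The arithmetic behind an error budget is simple; an illustrative calculation for a 99.9% monthly availability target:

```python
# Error-budget arithmetic (illustrative numbers): a 99.9% monthly
# availability target leaves ~43 minutes of allowable downtime.
slo = 0.999
minutes_per_month = 30 * 24 * 60          # 43,200 minutes
error_budget = (1 - slo) * minutes_per_month
print(f"{error_budget:.1f} minutes/month")  # 43.2
```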
High latency or lack of responses. You receive an alert message from Dynatrace (your infrastructure observability hub) letting you know that the average response latency of all deployed APIs has tripled. Is it the WSO2-AM gateway itself, a networking issue, a sudden increase in demand, or something else entirely?
Uptime Institute’s 2022 Outage Analysis report found that over 60% of system outages resulted in at least $100,000 in total losses, up from 39% in 2019. The growing amount of data processed at the network edge, where failures are more difficult to prevent, magnifies complexity. Make SLOs realistic.
This transition to public, private, and hybrid cloud is driving organizations to automate and virtualize IT operations to lower costs and optimize cloud processes and systems. Besides the traditional system hardware, storage, routers, and software, ITOps also includes virtual components of the network and cloud infrastructure.
Because microprocessors are so fast, computer architecture design has evolved towards adding various levels of caching between compute units and the main memory, in order to hide the latency of bringing the bits to the brains. This avoids thrashing caches too much for B and evens out the pressure on the L3 caches of the machine.
This is a set of best practices and guidelines that help you design and operate reliable, secure, efficient, cost-effective, and sustainable systems in the cloud. Storing frequently accessed data in faster storage, usually in-memory caching, improves data retrieval speed and overall system performance.
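A minimal in-memory caching sketch using Python's functools.lru_cache; fetch_user is a hypothetical slow lookup standing in for a database or remote call:

```python
# Cache frequently accessed lookups in process memory.
from functools import lru_cache

@lru_cache(maxsize=1024)
def fetch_user(user_id: int) -> dict:
    # ...expensive database or network call would go here...
    return {"id": user_id}

fetch_user(42)  # miss: executes the lookup
fetch_user(42)  # hit: served from memory, no I/O
```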
A distributed storage system is foundational in today’s data-driven landscape, ensuring data spread over multiple servers is reliable, accessible, and manageable. This guide delves into how these systems work, the challenges they solve, and their essential role in businesses and technology.
This proximity to data generation reduces latency, conserves bandwidth and enables real-time decision-making. Understanding Edge Computing Orchestration Edge computing orchestration is the art and science of managing the deployment, coordination, and scaling of workloads across a network of edge devices.
This proximity reduces latency and enables real-time decision-making. However, these technologies are on a path of rapid convergence as factories scale up their IIoT networks and demand faster, more autonomous decision-making. It will also allow for powerful cost-saving capabilities such as preventive maintenance.
AWS Lambda enables organizations to access many types of functions from AWS’ cloud-based services, such as: Data processing, to execute code based on triggers, system states, or user actions. You will likely need to write code to integrate systems and handle complex tasks or incoming network requests.
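A minimal Lambda handler sketch in Python; the event shape shown is illustrative, since it depends on the trigger (S3, SQS, API Gateway, and so on):

```python
# AWS Lambda invokes this function with the triggering event and a
# runtime context object; the return value goes back to the caller.
import json

def handler(event, context):
    records = event.get("Records", [])  # e.g., from an S3 or SQS trigger
    return {
        "statusCode": 200,
        "body": json.dumps({"processed": len(records)}),
    }
```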
This means a system that is not merely available but is also engineered with extensive redundant measures to continue to work as its users expect. Fault tolerance The ability of a system to continue to be dependable (both available and reliable) in the presence of certain component or subsystem failures.
Lastly, the packager kicks in, adding a system layer to the asset, making it ready to be consumed by the clients. Uploading and downloading data always come with a penalty, namely latency. It is worth pointing out that cloud processing is always subject to variable network conditions.
It represents the percentage of time a system or service is expected to be accessible and functioning correctly. Response time refers to the total time it takes for a system to process a request or complete an operation. Note: you might hear the term latency used instead of response time.
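Because a few slow requests dominate user experience, response time is usually summarized by percentiles rather than the mean; a small sketch with illustrative samples:

```python
# Percentile summary of response times (sample data illustrative);
# tail percentiles matter more than the mean for user experience.
import statistics

samples_ms = [12, 15, 14, 210, 13, 16, 15, 14, 13, 500]
cuts = statistics.quantiles(samples_ms, n=100)  # 99 percentile cut points
print(f"p50={cuts[49]:.0f} ms, p99={cuts[98]:.0f} ms")
```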
Rajiv Shringi Vinay Chella Kaidan Fullerton Oleksii Tkachuk Joey Lynch Introduction As Netflix continues to expand and diversify into various sectors like Video on Demand and Gaming , the ability to ingest and store vast amounts of temporal data — often reaching petabytes — with millisecond access latency has become increasingly vital.
Sample system diagram for an Alexa voice command. The other main use case was RENO, the Rapid Event Notification System mentioned above. Rewriting always comes with a risk, and it’s never the first solution we reach for, particularly when working with a system that’s in place and working well.
Think about items such as general system metrics (for example, CPU utilization, free memory, number of services), the connectivity status, details of our web server, or even more granular in-application tasks like database queries. DNS query time indicates the average response times of DNS requests across the system.
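A rough sketch of measuring DNS query time from a host (hostname illustrative; results include local resolver caching, so the first lookup is usually the interesting one):

```python
# Time a DNS resolution from this host.
import socket
import time

def dns_query_ms(hostname: str) -> float:
    start = time.perf_counter()
    socket.getaddrinfo(hostname, None)  # resolve via the system resolver
    return (time.perf_counter() - start) * 1000

print(f"DNS lookup ~ {dns_query_ms('example.com'):.1f} ms")
```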
Snap: a microkernel approach to host networking, Marty et al., SOSP’19. This paper describes the networking stack, Snap, that has been running in production at Google for the last three-plus years. The desire for CPU efficiency and lower latencies is easy to understand. It reminds me of ZeroMQ.
Troubleshooting distributed systems is difficult. Reconstructing a streaming session was a tedious and time-consuming process that involved tracing all interactions (requests) between the Netflix app, our Content Delivery Network (CDN), and backend microservices.
This architecture shift greatly reduced the processing latency and increased system resiliency. By integrating with studio content systems, we enabled the pipeline to leverage rich metadata from the creative side and create more engaging member experiences like interactive storytelling.
As organizations continue to migrate to the cloud, it’s important to get in front of performance issues, such as high latency, low throughput, and replication lag that grows with the distance between your users and your cloud infrastructure. AWS High Performance XLarge (see system details below). MySQL on AWS Performance Test. Amazon RDS.
Performance monitoring Dynatrace can collect performance metrics from Nutanix clusters, including latency, IOPS (Input/Output Operations Per Second), and network throughput. Network metrics Gain visibility into network performance, helping you ensure smooth data transfer and communication.
For each route we migrated, we wanted to make sure we were not introducing any regressions: either in the form of missing (or worse, wrong) data, or by increasing the latency of each endpoint. Being able to canary a new route let us verify latency and error rates were within acceptable limits. This meant that data that was static (e.g.
System Setup Architecture The following diagram summarizes the architecture description: Figure 1: Event-sourcing architecture of the Device Management Platform. When a new hardware device is connected, the Local Registry detects and collects a set of information about it, such as networking information and ESN.
By bringing computation closer to the data source, edge-based deployments reduce latency, enhance real-time capabilities, and optimize network bandwidth. Unlike centralized systems, where data resides in a single, well-protected environment, edge computing increases the attack surface, making systems vulnerable to breaches.