What is RTT? Round-trip time (RTT) is basically a measure of latency: how long did it take to get from one endpoint to another and back again? RTT isn't a you-thing, it's a them-thing. This gives fascinating insights into the network topology of our visitors, and how much we might be impacted by high-latency regions.
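As a rough illustration, here is a minimal Python sketch (not from the article; the host and port are placeholders) that approximates RTT by timing a TCP connect, which completes after one handshake round trip:

```python
import socket
import time

def tcp_rtt(host: str, port: int = 443, timeout: float = 3.0) -> float:
    """Return the TCP connect time (a rough RTT proxy) in milliseconds."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass  # the three-way handshake completes before connect() returns
    return (time.perf_counter() - start) * 1000

if __name__ == "__main__":
    samples = [tcp_rtt("example.com") for _ in range(5)]
    print(f"median RTT ~ {sorted(samples)[len(samples) // 2]:.1f} ms")
```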
Stream processing enables software engineers to model their applications' business logic as high-level representations in a directed acyclic graph (DAG) without explicitly defining a physical execution plan. We designed experimental scenarios inspired by chaos engineering; under these failure scenarios, event latency increases significantly.
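To make the idea concrete, here is a toy Python sketch (not any particular engine's API) of business logic declared as a small logical DAG of operators, with the physical execution left to a runtime:

```python
# Logical plan: source -> filter -> aggregate (a linear DAG here).
# How and where each stage physically runs is the runtime's concern.
def source():
    for event in ["click", "view", "click", "purchase"]:
        yield event

def filter_clicks(events):
    return (e for e in events if e == "click")

def count(events):
    total = 0
    for _ in events:
        total += 1
    return total

print(count(filter_clicks(source())))  # -> 2
```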
What is site reliability engineering? Site reliability engineering (SRE) is the practice of applying software engineering principles to operations and infrastructure processes to help organizations create highly reliable and scalable software systems. SRE focuses on automation, and organizations can integrate these skilled engineers at key points in the DevOps life cycle.
Part 1: Creating the Source of Truth for Impressions, by Tulika Bhatt. Imagine scrolling through Netflix, where each movie poster or promotional banner competes for your attention. Every image you hover over isn't just a visual placeholder; it's a critical data point that fuels our sophisticated personalization engine.
Business-focused, unified platform approach: A unified platform approach enables platform engineering and self-service portals, simplifying operations and reducing costs. Davis, the causal AI engine, instantly identifies root causes and predicts service degradation before it impacts users.
Cloud-native environments bring speed and agility to software development and operations (DevOps) practices. But with that speed and agility come new complications and complexity, all while teams must maintain performance and reliability with less than 1% downtime per year. Among the themes: reduced latency, efficiency, and SRE as an application of DevOps.
Site reliability engineering (SRE) has become a critical discipline in recent years as the world has shifted in favor of web-based interactions. This shift is leading more organizations to hire site reliability engineers to guarantee the reliability and resiliency of their services.
The Akamas vision is that only an autonomous optimization approach powered by AI can effectively enable performance engineers, SREs, and architects to identify the best configurations that ensure maximum service performance and resilience, at the lowest possible cost and at business speed.
This is where site reliability engineering (SRE) practices are applied. SREs use service-level indicators (SLIs) to see the complete picture of service availability, latency, performance, and capacity across various systems, especially revenue-critical ones.
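As a hedged illustration of request-based SLIs, the following Python sketch computes availability and latency SLIs from made-up request records (the field names and the 500 ms threshold are illustrative, not from the article):

```python
# Each record marks whether the request succeeded and how long it took.
requests = [
    {"ok": True,  "latency_ms": 120},
    {"ok": True,  "latency_ms": 480},
    {"ok": False, "latency_ms": 900},
    {"ok": True,  "latency_ms": 95},
]

# Availability SLI: good requests / total requests.
availability_sli = sum(r["ok"] for r in requests) / len(requests)
# Latency SLI: fraction of requests served under the threshold.
latency_sli = sum(r["latency_ms"] < 500 for r in requests) / len(requests)

print(f"availability SLI: {availability_sli:.2%}")
print(f"latency SLI (<500 ms): {latency_sli:.2%}")
```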
As organizations digitally transform, they're also accelerating the speed of software delivery. Note: you might hear the term latency used instead of response time. Both latency and response time are critical to ensuring reliability, but latency primarily focuses on the time spent in transit.
Annie leads the Chrome Speed Metrics team at Google, which has arguably had the most significant impact on web performance of the past decade. It's really important to acknowledge that none of this would have been possible without the great work from Annie and her small-but-mighty Speed Metrics team at Google. Nice job, everyone!
So, for the last several years, I, along with other performance engineers like me, have been recommending that our clients move from Gzip to Brotli instead. The uncompressed version, on the other hand, takes a full two round trips more to be fully transferred, which, particularly on a high-latency connection, could be quite noticeable.
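For a rough feel of the difference, this Python sketch compares Gzip and Brotli output sizes; it assumes the third-party brotli package (pip install brotli), and the payload is made up:

```python
import gzip
import brotli  # third-party package, not in the standard library

payload = b"<html><body>" + b"<p>hello latency</p>" * 500 + b"</body></html>"

gz = gzip.compress(payload, compresslevel=9)   # best Gzip setting
br = brotli.compress(payload, quality=11)      # best Brotli setting

print(f"raw: {len(payload)} B, gzip: {len(gz)} B, brotli: {len(br)} B")
```

Fewer bytes on the wire generally means fewer packets and, on slow links, fewer round trips before the resource is usable.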
At Perform 2021, Dynatrace's Kristof Renders, Services Practice Manager for Autonomous Cloud Enablement, joined Sumit Nagal, Principal Engineer at Intuit, to demonstrate how service-level objectives (SLOs) and business-level objectives (BLOs) can "shift left."
However, getting reliable answers from observability data, so teams can automate more processes to ensure speed, quality, and reliability, can be challenging. This drive for speed has a cost: 22% of leaders admit they're under so much pressure to innovate faster that they must sacrifice code quality.
RISELabs, those wonderfully innovative folks over at Berkeley, have uplifted their Anna database, a shared-nothing, thread-per-core architecture that achieves lightning-fast speeds by avoiding all coordination mechanisms, to become cloud-aware. Our monitoring engine automatically moves data between tiers based on access patterns.
Uploading and downloading data always come with a penalty, namely latency. Figure 3: Video processing with index and virtual assembly. Using virtual assembly greatly improves the latency of ProRes 422 HQ proxy generation by removing one round trip of cloud downloading and cloud uploading by the physical assembler.
Dynomite is a Netflix open source wrapper around Redis that provides a few additional features like auto-sharding and cross-region replication, and it provided Pushy with low latency and easy record expiry, both of which are critical for Pushy’s workload. As Pushy’s portfolio grew, we experienced some pain points with Dynomite.
In a recent webinar , Dynatrace DevOps activist Andi Grabner and senior software engineer Yarden Laifenfeld explored developer observability. Why is developer observability important for engineers? Dynatrace enables teams to specify SLOs, such as latency, uptime, availability, and more.
In that scenario, the system would need to deal with the data propagation latency directly, for example by using timeouts or client-originated update-tracking mechanisms. We started seeing increased response latencies and leader servers running at dangerously high utilization.
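One way such a client-originated mechanism might look, as a hedged sketch; read_version is a hypothetical accessor standing in for a real replica read, not an API from the article:

```python
import time

def wait_for_propagation(read_version, expected_version: int,
                         timeout_s: float = 2.0,
                         interval_s: float = 0.05) -> bool:
    """Poll until the replica reflects `expected_version`, or time out."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if read_version() >= expected_version:
            return True  # our write is visible on this replica
        time.sleep(interval_s)
    return False  # caller falls back, e.g. reads from the leader
```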
A Cassandra database cluster had switched to Ubuntu and noticed that write latency increased by over 30%. Measuring the speed of time: is there already a microbenchmark for os::javaTimeMillis()? I happened to be speaking at a technical conference while still debugging this, and mentioned what I was working on to a processor engineer.
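The original investigation concerns the JVM's os::javaTimeMillis(), but an analogous microbenchmark is easy to sketch in Python, estimating the per-call cost of reading the clock:

```python
import time

N = 1_000_000
start = time.perf_counter()
for _ in range(N):
    time.time()  # the clock call under test
elapsed = time.perf_counter() - start

print(f"~{elapsed / N * 1e9:.0f} ns per time.time() call")
```

A slow clock source (e.g., a change in the underlying timer hardware or OS configuration) shows up directly in a loop like this.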
These workflows also utilize Davis®, the Dynatrace causal AI engine, and all your observability and security data across all platforms, in context, at scale, and in real time. Storing frequently accessed data in faster storage, usually an in-memory cache, improves data retrieval speed and overall system performance.
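As a minimal illustration of the in-memory caching idea (not Dynatrace's implementation), here is a small TTL cache sketch in Python; the eviction policy is deliberately naive:

```python
import time

class TTLCache:
    def __init__(self, ttl_s: float = 60.0):
        self._ttl = ttl_s
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry and entry[0] > time.monotonic():
            return entry[1]          # hit: served from memory
        self._store.pop(key, None)   # miss or expired entry
        return None

    def put(self, key, value):
        self._store[key] = (time.monotonic() + self._ttl, value)
```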
Data collected on page-load events, for example, can include navigation start (when performance begins to be measured), request start (right before the user makes a request from the server), and speed index metrics (which measure page-load speed), as well as characteristics (connectivity, access, user count, latency) of geographic regions.
Uber's interactive analytics team shares how they integrated Alluxio's data caching into Presto, the SQL query engine powering thousands of daily active users on petabyte scale at Uber, to dramatically reduce data-scan latencies by leveraging Presto on local disks.
What does IT operations do? Key performance metrics include response time, accuracy, speed, throughput, uptime, CPU utilization, and latency. The Dynatrace AI engine, Davis, provides precise answers with automatic root-cause analysis to help DevOps and ITOps continuously improve workflows.
Today, I'm excited to announce the general availability of Amazon DynamoDB Accelerator (DAX), a fully managed, highly available, in-memory cache that can speed up DynamoDB response times from milliseconds to microseconds, even at millions of requests per second. "For Tinder, performance is absolutely key."
Mounting object storage in Netflix’s media processing platform By Barak Alon (on behalf of Netflix’s Media Cloud Engineering team) MezzFS (short for “Mezzanine File System”) is a tool we’ve developed at Netflix that mounts cloud objects as local files via FUSE. Netflix operates in multiple AWS regions.
SLOs can be a great way for DevOps and infrastructure teams to use data and performance expectations to make decisions, such as whether to release and where engineers should focus their time. You can set SLOs based on individual indicators, such as batch throughput, request latency, and failures per second.
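As a hedged sketch of SLO-driven decision making, the following Python snippet computes the remaining error budget for a 99.9% availability target; the target, request counts, and 10% release threshold are illustrative, not from the article:

```python
SLO_TARGET = 0.999  # illustrative availability target

def error_budget_remaining(good: int, total: int,
                           target: float = SLO_TARGET) -> float:
    """Fraction of the error budget left: 1.0 = untouched, <0 = exhausted."""
    allowed_bad = (1 - target) * total
    actual_bad = total - good
    return 1 - actual_bad / allowed_bad if allowed_bad else 0.0

budget = error_budget_remaining(good=998_700, total=1_000_000)
print("ship it" if budget > 0.1 else "freeze releases, focus on reliability")
```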
In traditional database architectures, database engines often run a small search engine or data warehouse engine on the same hardware as the database. A more scalable option is to decouple these systems and build a pipe that connects the engines and feeds all change records from the source database to the data warehouse.
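A minimal sketch of that decoupled pipe, with hypothetical read_changes and sink objects standing in for a real change-data-capture stream and the downstream engines:

```python
def fan_out_changes(read_changes, sinks):
    """Ship every change record from the source DB to each downstream engine."""
    for change in read_changes():   # e.g., a CDC stream or binlog tail
        for sink in sinks:          # search index, data warehouse, ...
            sink.apply(change)

# Usage sketch (all names hypothetical):
# fan_out_changes(binlog_reader, [search_index_sink, warehouse_sink])
```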
In this fast-paced ecosystem, two vital elements determine the efficiency of this traffic: latency and throughput. Latency is the waiting game: it is like the time you spend waiting in line at your local coffee shop. All these moments combined represent latency, the time it takes for your order to reach your hands.
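To make the distinction concrete, this toy Python sketch measures both for a simulated request handler: per-request latency percentiles versus completed requests per second:

```python
import random
import time

def handle_request():
    time.sleep(random.uniform(0.005, 0.02))  # stand-in for real work

latencies = []
start = time.perf_counter()
for _ in range(100):
    t0 = time.perf_counter()
    handle_request()
    latencies.append(time.perf_counter() - t0)
elapsed = time.perf_counter() - start

latencies.sort()
print(f"p50 latency: {latencies[49] * 1000:.1f} ms")  # how long one wait is
print(f"throughput: {100 / elapsed:.1f} req/s")       # how many get served
```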
Data compression: MongoDB offers various block compression methods used by the WiredTiger storage engine, like snappy, zlib, and zstd. Snappy is a compression library developed by Google. Percona Server for MongoDB (PSMDB) supports all types of compression and enterprise-grade features for free; I am using PSMDB 6.0.4.
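As a hedged sketch (the connection string and names are placeholders, and a running server is assumed), a per-collection block compressor can be selected through PyMongo like this:

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["appdata"]

# Create a collection whose blocks are compressed with zstd instead of
# the server-wide default compressor.
db.create_collection(
    "events",
    storageEngine={"wiredTiger": {"configString": "block_compressor=zstd"}},
)
```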
I summarized these topics and more in a plenary conference talk, including my own predictions (as a senior performance engineer) for the future of computing performance, with a focus on back-end servers.
As our business scales globally, the demand for data is growing and the need for scalable, low-latency incremental processing begins to emerge. It also improves engineering productivity by simplifying the existing pipelines and unlocking new patterns. There are three common issues that dataset owners usually face.
I propose four key ingredients. Definition: what is "performance" beyond page speed? What are its goals? The chief effect of the architectural difference is to shift the distribution of latency within the loop. Improving latency for one scenario can degrade it in another. No silver bullets, only engineering.
Identifying key Redis metrics such as latency, CPU usage, and memory usage is crucial for effective Redis monitoring. To monitor Redis instances effectively, collect metrics focusing on cache hit ratio, memory allocated, and latency thresholds. It is important to understand these challenges properly in order to find solutions for them.
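A small sketch of collecting those metrics with redis-py (host and port are placeholders; a PING round trip stands in for a latency probe):

```python
import time
import redis  # third-party redis-py package

r = redis.Redis(host="localhost", port=6379)
info = r.info()

hits = info["keyspace_hits"]
misses = info["keyspace_misses"]
hit_ratio = hits / (hits + misses) if hits + misses else 0.0

print(f"cache hit ratio: {hit_ratio:.2%}")
print(f"memory allocated: {info['used_memory_human']}")

t0 = time.perf_counter()
r.ping()  # rough round-trip latency to the instance
print(f"PING latency: {(time.perf_counter() - t0) * 1000:.2f} ms")
```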
Balancing low latency, high availability, and cloud choice: cloud hosting is no longer just an option; it's now, in many cases, the default choice. Low usage or over-engineered legacy systems: until the arrival of cloud hosting, nobody knew how to size anything properly, or they did know but never got a chance to.
You need a lot of software engineers and the willingness to rewrite a lot of software to entertain that idea. Here are the bombshell paragraphs: "Our datacenter applications seek ever more CPU-efficient and lower-latency communication, which Pony Express delivers." The desire for CPU efficiency and lower latencies is easy to understand.
The eval process combines human review, model-based evaluation, and A/B testing. The results then inform two parallel streams: fine-tuning with carefully curated data, and prompt-engineering improvements. Both feed into model improvements, which starts the cycle again. We're experiencing high latency in responses.
In this article, we uncover how PageSpeed calculates its critical speed score. It's no secret that speed has become a crucial factor in increasing revenue and lowering abandonment rates. Now that Google uses page speed as a ranking factor, many organizations have become laser-focused on performance. A key component is the Speed Index.
This freshness measurement can then be used by out-of-the-box Dynatrace anomaly detection to actively alert on abnormal changes in data-ingest latency, ensuring the expected freshness of all data records. Data observability is becoming a mandatory part of business analytics, automation, and AI.
Edge servers are the middle ground: more compute power than a mobile device, but with latency of just a few ms. One test is a physics engine that simulates 3D cubes falling from the air; the wasm version showed a 1.4x improvement (here the compute speed-up is offset by the long transmission time to send the input/output images).
In the back-to-basics readings this week, I am re-reading a paper from 1995 about the work that I did together with Thorsten on solving the problem of end-to-end low-latency communication on high-speed networks.