On one hand, they enable our engineers to get their latest enhancements deployed into production. Sydney, we have a disk write latency problem! On August 25th at 14:00, Davis first alerted on a disk write latency issue with Elastic File System (EFS) on one of our EC2 instances in AWS's Sydney data center.
The network latency between cluster nodes should be around 10 ms or less. Our Premium High Availability comes with the following features: Active-active deployment model for optimum hardware utilization. – A Dynatrace customer, Head of Performance Engineering. Automatic recovery from outages lasting up to 72 hours.
Growth Engineering at Netflix - Automated. In the Growth Engineering team, we refer to this as the top of the signup funnel. For more background on the signup funnel and Growth Engineering's role in it, please read our initial post on the topic: Growth Engineering at Netflix - Accelerating Innovation.
Because microprocessors are so fast, computer architecture design has evolved towards adding various levels of caching between compute units and the main memory, in order to hide the latency of bringing the bits to the brains. This avoids thrashing caches too much for B and evens out the pressure on the L3 caches of the machine.
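A minimal, illustrative micro-benchmark (not from the article) makes the point: visiting the same data sequentially is friendly to cache lines and hardware prefetchers, while visiting it in random order defeats them. The effect is muted by interpreter overhead in CPython but still measurable.

```python
import random
import time

N = 2_000_000
data = list(range(N))

seq_idx = list(range(N))
rand_idx = list(range(N))
random.shuffle(rand_idx)

def walk(indices):
    # sum data[] in the given visiting order and time it
    t0 = time.perf_counter()
    total = 0
    for i in indices:
        total += data[i]
    return time.perf_counter() - t0

seq_t = walk(seq_idx)
rand_t = walk(rand_idx)
print(f"sequential: {seq_t:.2f}s  random: {rand_t:.2f}s  "
      f"slowdown: {rand_t / seq_t:.1f}x")
```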
This is where Lambda comes in: Developers can deploy programs with no concern for the underlying hardware, connecting to services in the broader ecosystem, creating APIs, preparing data, or sending push notifications directly in the cloud, to list just a few examples. AWS continues to improve how it handles latency issues.
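For a sense of what "no concern for the underlying hardware" looks like in practice, here is a minimal, hypothetical Python Lambda handler; the event shape and names are illustrative, not taken from the excerpt.

```python
import json

def handler(event, context):
    # 'event' carries the request payload; 'context' exposes runtime metadata
    # (request id, remaining execution time). No servers to size or patch.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello, {name}"}),
    }
```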
While clustering across wide-area networks (WANs) is discouraged due to latency issues, leased links can mitigate some connectivity challenges. Keeping queues short minimizes latency and enhances the overall efficiency of message delivery, maintaining a responsive and efficient RabbitMQ setup.
Besides the traditional system hardware, storage, routers, and software, ITOps also includes virtual components of the network and cloud infrastructure. Although modern cloud systems simplify tasks, such as deploying apps and provisioning new hardware and servers, hybrid cloud and multicloud environments are often complex. Performance.
By bringing computation closer to the data source, edge-based deployments reduce latency, enhance real-time capabilities, and optimize network bandwidth. Use hardware-based encryption and ensure regular over-the-air updates to maintain device security. Increased latency during peak loads. Data interception during transit.
It requires purchasing, powering, and configuring physical hardware, training and retaining the staff capable of servicing and securing the machines, operating a data center, and so on. They need enough hardware to serve their anticipated volume and keep things running smoothly without buying too much or too little. Reduced cost.
Things always feel fast when we're developing because, more often than not, we're working on high-spec machines on dedicated networks, and also serving from localhost, which removes the bulk of the latency and bandwidth issues that a real user would suffer. Who: Engineers. Who: Engineers, Product Owners.
To be robust and scalable, this key/value store needs to be distributed for durability and availability, to protect against network partitions or hardware failures. This architecture affords Amazon ECS high availability, low latency, and high throughput because the data store is never pessimistically locked.
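The phrase "never pessimistically locked" points at optimistic concurrency control. A toy sketch of the idea, with names invented for illustration rather than taken from ECS:

```python
class VersionedStore:
    """Toy key/value store: writes carry the version the writer last read."""

    def __init__(self):
        self._data = {}  # key -> (value, version)

    def get(self, key):
        return self._data.get(key, (None, 0))

    def put(self, key, value, expected_version):
        _, current = self._data.get(key, (None, 0))
        if current != expected_version:
            return False  # lost the race: caller re-reads and retries
        self._data[key] = (value, current + 1)  # no lock held at any point
        return True

store = VersionedStore()
_, ver = store.get("task-42")
assert store.put("task-42", "RUNNING", ver)      # first writer wins
assert not store.put("task-42", "STOPPED", ver)  # stale version is rejected
```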
Amazon DynamoDB offers low, predictable latencies at any scale. Another important requirement for Dynamo was predictability. This is not just predictability of median performance and latency, but also at the end of the distribution (the 99.9th percentile), so we could provide acceptable performance for virtually every customer.
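To see why the 99.9th percentile is the more honest number, here is a small sketch with made-up latency samples: one slow outlier barely moves the median but dominates the tail.

```python
import math

# hypothetical latency samples: nine fast requests and one slow outlier
latencies_ms = sorted([2, 2, 3, 3, 3, 4, 5, 5, 8, 120])

def percentile(sorted_vals, p):
    # nearest-rank method: smallest value covering p percent of the samples
    k = math.ceil(p / 100 * len(sorted_vals))
    return sorted_vals[max(k - 1, 0)]

print("p50  :", percentile(latencies_ms, 50), "ms")    # median looks healthy
print("p99.9:", percentile(latencies_ms, 99.9), "ms")  # the tail tells the story
```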
This makes the whole system latency sensitive. So we need low latency, but we also need very high throughput: A recurring theme in IDS/IPS literature is the gap between the workloads they need to handle and the capabilities of existing hardware/software implementations. The target FPGA for Pigasus has 16MB of BRAM.
Balancing Low Latency, High Availability and Cloud Choice Cloud hosting is no longer just an option — it’s now, in many cases, the default choice. Low usage or over-engineered legacy systems Until the arrival of cloud hosting, nobody knew how to size anything properly, or they did know but never got a chance to. Why are they refusing?
That meant I started having regular meetings with the hardware engineers who were working with IBM on the CPU, which gave me even more expertise on this CPU. That was critical in helping me discover a design flaw in one of its instructions, and in helping game developers master this finicky beast. So, anyway. Standard stuff.
I summarized these topics and more as a plenary conference talk, including my own predictions (as a senior performance engineer) for the future of computing performance, with a focus on back-end servers. This was a chance to talk about other things I've been working on, such as the present and future of hardware performance.
This enables customers to serve content to their end users with low latency, giving them the best application experience. In 2011, AWS opened a Point of Presence (PoP) in Stockholm to enable customers to serve content to their end users with low latency. As well as AWS Regions, we also have 24 AWS Edge Network Locations in Europe.
Marvin Theimer, Amazon Distinguished Engineer, once jokingly said that the evolution of Amazon S3 could best be described as starting off as a single engine Cessna plane, but over time the plane was upgraded to a 737, then a group of 747s, all the way to the large fleet of Airbus 380s that it is now. Primitives not frameworks.
You need a lot of software engineers and the willingness to rewrite a lot of software to entertain that idea. Here are the bombshell paragraphs: Our datacenter applications seek ever more CPU-efficient and lower-latency communication, which Pony Express delivers. The desire for CPU efficiency and lower latencies is easy to understand.
In traditional database architectures, database engines often run a small search engine or data warehouse engine on the same hardware as the database. However, in the past, you had to write code to manage the data changes and keep the search and data warehousing engines in sync.
Hardware Memory: The amount of RAM to be provisioned for database servers can vary greatly depending on the size of the database and the specific requirements of the company. Some servers may need a few GBs of RAM, while others may need hundreds of GBs or even terabytes of RAM.
For example, the most fundamental abstraction trade-off has always been latency versus throughput. Modern CPUs strongly favor lower latency of operations with clock cycles in the nanoseconds and we have built general purpose software architectures that can exploit these low latencies very well. Where to go from here?
Identifying key Redis metrics such as latency, CPU usage, and memory metrics is crucial for effective Redis monitoring. To monitor Redis instances effectively, collect Redis metrics focusing on cache hit ratio, memory allocated, and latency threshold. It is important to understand these challenges properly to find solutions for them.
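A sketch of collecting those metrics with the redis-py client; the host, port, and the ping-based latency probe are assumptions, not prescriptions from the excerpt.

```python
import time
import redis  # pip install redis

r = redis.Redis(host="localhost", port=6379)

info = r.info()  # server-wide stats from the INFO command
hits, misses = info["keyspace_hits"], info["keyspace_misses"]
hit_ratio = hits / (hits + misses) if (hits + misses) else float("nan")

t0 = time.perf_counter()
r.ping()  # crude round-trip latency probe
rtt_ms = (time.perf_counter() - t0) * 1000

print(f"cache hit ratio: {hit_ratio:.2%}")
print(f"used memory    : {info['used_memory_human']}")
print(f"ping round-trip: {rtt_ms:.2f} ms")
```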
A site reliability engineer, or SRE, is a role that encompasses aspects of both software engineering and operations/infrastructure. The term site reliability engineering first came into existence at Google in 2003 when a site reliability team was created. At that time, the team was made up of software engineers.
Improved performance : MongoDB continually fine-tunes its database engine, resulting in faster query execution and reduced latency. You should also review your hardware resources, how you use MongoDB, and any custom configurations.
Estimated Input Latency. It's time to come to terms with the fact that your customers aren't using the same powerful hardware as you. An excellent substitute for using a real device is Chrome DevTools' hardware emulation mode. There are six metrics used to create the overall performance score.
Edge servers are the middle ground – more compute power than a mobile device, but with latency of just a few ms. One example workload is a physics engine that simulates 3D cubes falling from the air. These use their regression models to estimate processing time (which will depend on the hardware available, current load, etc.).
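A toy offload decision in that spirit, with made-up cost constants standing in for the regression models' estimates:

```python
def choose_target(work_units: float) -> str:
    device_ms = work_units * 4.0          # slower silicon, zero network cost
    edge_ms = work_units * 0.5 + 2 * 3.0  # faster silicon plus ~3 ms each way
    return "edge" if edge_ms < device_ms else "device"

for w in (0.5, 2, 10):
    print(f"work={w:>4}: run on {choose_target(w)}")
```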
Apple forces developers of competing browsers to use their engine for all browsers on iOS , restricting their ability to deliver a better version of the web platform. They are, pound for pound, some of the best engine developers globally and genuinely want good things for the web. So is speedy resolution and agreement.
In particular this has been true for applications based on algorithms - often MPI-based - that depend on frequent low-latency communication and/or require significant cross sectional bandwidth. There is no more need for hardware tinkering to keep the clusters up and running (I spent many nights doing this; there is no glory in it).
This is a fascinating paper from members of Netflix’s Resilience Engineering team describing their chaos engineering initiatives: automated controlled experiments designed to verify hypotheses about how the system should behave under gray failure conditions, and to probe for and flush out any weaknesses. The Chaos automation platform.
At the same time that I see database engineers relying on the tool, sites such as StackOverflow are banning ChatGPT. However, it's important to note that the optimal value may vary depending on your specific workload and hardware configuration; find it by experimenting with different settings (e.g., 16) and monitoring the server's performance. Let's make it harder now.
Another related variable, innodb_buffer_pool_instances, determines the number of buffer pool instances for the InnoDB storage engine, which can improve the performance of multi-core systems by reducing contention on the buffer pool latch.
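A small sketch of inspecting these settings from Python (connection parameters are placeholders; assumes the mysql-connector-python package). Note that innodb_buffer_pool_instances is read-only at runtime, so changing it means editing my.cnf and restarting the server.

```python
import mysql.connector  # pip install mysql-connector-python

conn = mysql.connector.connect(host="localhost", user="root", password="...")
cur = conn.cursor()
cur.execute("SHOW GLOBAL VARIABLES LIKE 'innodb_buffer_pool%'")
for name, value in cur.fetchall():
    print(f"{name} = {value}")
cur.close()
conn.close()
```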
Fortunately, we as engineers can avoid, or at least mitigate, the impact of breakages in the web apps we build. Engineers need visibility into the root cause behind a degraded experience. Even errors not surfaced to end users (secondary errors) must propagate to engineers. Observability.
As we saw with the SOAP paper last time out, even with a fixed model variant and hardware there are a lot of different ways to map a training workload over the available hardware. The following figure highlights how just one of these variables, batch size, impacts throughput and latency on ResNet50.
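A toy cost model (hypothetical numbers, not the paper's measurements) shows the shape of that trade-off: larger batches amortize fixed per-batch overhead, raising throughput, while each item waits longer.

```python
FIXED_OVERHEAD_MS = 5.0  # per-batch cost: launch, synchronization, etc.
PER_ITEM_MS = 0.8        # marginal cost of one more item in the batch

for batch in (1, 8, 32, 128):
    latency_ms = FIXED_OVERHEAD_MS + PER_ITEM_MS * batch
    throughput = batch / (latency_ms / 1000)  # items per second
    print(f"batch={batch:4d}  latency={latency_ms:6.1f} ms  "
          f"throughput={throughput:7.0f} items/s")
```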
My personal opinion is that I don't see a widespread need for more capacity given horizontal scaling and servers that can already exceed 1 Tbyte of DRAM; bandwidth is also helpful, but I'd be concerned about the increased latency for adding a hop to more memory. Ford, et al., “TCP
It takes you through the thinking processes and engineering practices behind the design of a key part of the control plane for AWS Elastic Block Storage (EBS): the Physalia database that stores configuration information. Engineering decisions involve making lots of trade-offs. Larger cells have better tolerance of tail latency (e.g.
Within an organization, the responsibility of monitoring these large distributed systems typically falls on site reliability engineering (SRE) teams. Software and hardware components are autonomous and execute tasks concurrently. This also includes latency, or the time it takes for data or a request to get through a network.
They can run applications in Sweden, serve end users across the Nordics with lower latency, and leverage advanced technologies such as containers, serverless computing, and more. million vehicles in more than 75 countries with services like car locator, engine remote start, driving journal, heater start, and stolen vehicle tracking.
This article is an effort to explore techniques used by developers of in-stream data processing systems, trace the connections of these techniques to massive batch processing and OLTP/OLAP databases, and discuss how one unified query engine can support in-stream, batch, and OLAP processing at the same time. Modularity and flexibility.
They are demand on the system, albeit for software resources rather than hardware resources. They aren't idle. Decomposing Linux load averages: can the Linux load average value be fully decomposed into components? Yes, I'd say so. Latency was acceptable and no one complained. Load averages measured in a modern tool.
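In that spirit, a Linux-only sketch that counts the two task states feeding the Linux load average, runnable (R) and uninterruptible sleep (D), next to /proc/loadavg itself (it scans processes, not individual threads, so it is an approximation):

```python
import glob

counts = {"R": 0, "D": 0}
for stat_path in glob.glob("/proc/[0-9]*/stat"):
    try:
        with open(stat_path) as f:
            # the task state is the first field after the "(comm)" column
            state = f.read().rsplit(")", 1)[1].split()[0]
    except (OSError, IndexError):
        continue  # process exited mid-scan
    if state in counts:
        counts[state] += 1

with open("/proc/loadavg") as f:
    print("loadavg:", f.read().strip())
print(f"runnable (R): {counts['R']}  uninterruptible (D): {counts['D']}")
```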
These data pipelines can process data at petabyte scale, and to some extent their success can be attributed to an army of engineers devoted to building and maintaining internal data pipelines. Not everyone operates a data engineering function at Netflix or Spotify scale. A data pipeline is software that runs on hardware.
This boils down to a single-digit µs latency tolerance in the tail for far memory, which, in addition to security and privacy concerns, rules out remote memory solutions. Thus we're fundamentally trading (de)compression latency at access time for the ability to pack more data in memory.
A Cassandra database cluster had switched to Ubuntu and noticed write latency increased by over 30%. As a Xen guest, this profile was gathered using perf(1) and the kernel's software cpu-clock soft interrupts, not the hardware NMI. A quick check of basic performance statistics showed over 30% higher CPU consumption in total.