Architecture, Latency and Metrics - Technology Performance Pulse

Cut costs and complexity: 5 strategies for reducing tool sprawl with Dynatrace

Dynatrace

APRIL 10, 2025

As an executive, I am always seeking simplicity and efficiency to make sure the architecture of the business is as streamlined as possible. With separate tools tracking metrics, logs, traces, and user behavior, crucial, interconnected details end up scattered across different storage.

Strategy

Strategy Storage Network Architecture

What is observability? Not just logs, metrics and traces

Dynatrace

OCTOBER 1, 2021

As dynamic systems architectures increase in complexity and scale, IT teams face mounting pressure to track and respond to conditions and issues across their multi-cloud environments. Why is it important, and what can it actually help organizations achieve? What is observability? How do you make a system observable?

Metrics

Metrics Open Source Monitoring Cloud

RabbitMQ vs. Kafka: Key Differences

Scalegrid

FEBRUARY 6, 2025

This article outlines the key differences in architecture, performance, and use cases to help determine the best fit for your workload. RabbitMQ follows a message broker model with advanced routing, while Kafka's event streaming architecture uses partitioned logs for distributed processing. What is RabbitMQ? What is Apache Kafka?

Latency

Latency Analytics Architecture Storage

Introducing Impressions at Netflix

The Netflix TechBlog

FEBRUARY 14, 2025

Architecture Overview: The first pivotal step in managing impressions begins with the creation of a Source-of-Truth (SOT) dataset. (Figure: Impression Source-of-Truth architecture.) Ensuring High Quality Impressions: Maintaining the highest quality of impressions is a top priority.

Tuning

Tuning Latency Efficiency Storage

Best Practices for Scaling RabbitMQ

Scalegrid

FEBRUARY 24, 2025

This decoupling is crucial in modern architectures where scalability and fault tolerance are paramount. The architecture of RabbitMQ is meticulously designed for complex message routing, enabling dynamic and flexible interactions between producers and consumers. Keeping queues short maintains a responsive and efficient RabbitMQ setup.

Best Practices

Best Practices Traffic Strategy Scalability

Migrating Critical Traffic At Scale with No Downtime – Part 1

The Netflix TechBlog

MAY 4, 2023

When undertaking system migrations, one of the main challenges is establishing confidence and seamlessly transitioning the traffic to the upgraded architecture without adversely impacting the customer experience. It provides a good read on the availability and latency ranges under different production conditions.

Traffic

Traffic Latency Tuning Systems

Implementing service-level objectives to improve software quality

Dynatrace

DECEMBER 27, 2022

As more organizations embrace microservices-based architecture to deliver goods and services digitally, maintaining customer satisfaction has become exponentially more challenging. By implementing service-level objectives, teams can avoid collecting and checking a huge amount of metrics for each service. Define SLOs for each service.

Software

Software Software Benchmarking Latency

Who will watch the watchers? Extended infrastructure observability for WSO2 API Manager

Dynatrace

SEPTEMBER 18, 2020

Cloud-based application architectures commonly leverage microservices. High latency or lack of responses. You receive an alert message from Dynatrace (your infrastructure observability hub) letting you know that the average response latency of all deployed APIs has tripled. Dynatrace monitors 29 WSO2 API Manager–related metrics.

Infrastructure

Infrastructure Latency Metrics Cloud

Optimize your observability pipeline for AWS Lambda serverless functions

Dynatrace

NOVEMBER 10, 2022

As companies accelerate digital transformation, cloud services such as AWS Lambda help companies to modernize their application architectures to quickly adapt to the needs of their customers while offloading the operational complexity to their cloud vendor. The need for a simplified approach to capture telemetry. How to get started.

Lambda

Lambda Serverless AWS Latency

Dynatrace supports SnapStart for Lambda as an AWS launch partner

Dynatrace

NOVEMBER 28, 2022

The new Amazon capability enables customers to improve the startup latency of their functions from several seconds to as low as sub-second (up to 10 times faster) at P99 (the 99th latency percentile). This can cause latency outliers and may lead to a poor end-user experience for latency-sensitive applications.

Lambda

Lambda AWS Serverless Latency

Lessons learned from enterprise service-level objective management

Dynatrace

MAY 19, 2022

Example 1: Architecture boundaries. First, they took a big step back and looked at their end-to-end architecture (Figure 2). SLO dashboard defined by architectural boundary. In their new dashboard, they added dimensions for load, latency, and open problems for each component. Not all attempts succeed on the first try.

Automotive

Automotive Latency Architecture Azure

Dynatrace supports Azure Managed Instance for Apache Cassandra

Dynatrace

MAY 13, 2022

Because of its scalability and distributed architecture, thousands of companies trust it to run their cloud and hybrid-based workloads at high availability without compromising performance. Once you deploy the Dynatrace extension, Dynatrace ingests your Cassandra metrics and analyzes them in context with the entire stack.

Azure

Azure Latency Metrics Infrastructure

Rapid Event Notification System at Netflix

The Netflix TechBlog

FEBRUARY 18, 2022

Motivation: With the rapid growth in Netflix member base and the increasing complexity of our systems, our architecture has evolved into an asynchronous one that enables both online and offline computation. Architecture: As shown in the diagram above, the RENO service can be broken down into the following components.

Systems

Systems Traffic Architecture Mobile

Observability vs. monitoring: What’s the difference?

Dynatrace

NOVEMBER 3, 2021

Organizations are depending more and more on distributed architectures to provide application services. Monitoring focuses on watching specific metrics. Observability is the ability to understand a system’s internal state by analyzing the data it generates, such as logs, metrics, and traces.

Monitoring

Monitoring Metrics DevOps Scalability

Improved Alerting with Atlas Streaming Eval

The Netflix TechBlog

APRIL 27, 2023

A few years ago, we were paged by our SRE team due to our Metrics Alerting System falling behind — critical application health alerts reached engineers 45 minutes late! Hence, we started down the path of alert evaluation via real-time streaming metrics. OK, Results?

Storage

Storage Cache Metrics Database

Why applying chaos engineering to data-intensive applications matters

Dynatrace

MAY 23, 2024

Stream processing: One approach to such a challenging scenario is stream processing, a computing paradigm and software architectural style for data-intensive software systems that emerged to cope with requirements for near real-time processing of massive amounts of data. This significantly increases event latency.

Engineering

Engineering Tuning Latency Open Source

Dynatrace supports the newly released AWS Lambda Response Streaming

Dynatrace

APRIL 7, 2023

Customers can use AWS Lambda Response Streaming to improve performance for latency-sensitive applications and return larger payload sizes. Customers can use response streaming to achieve the following: Improve Time to First Byte (TTFB) performance for latency-sensitive applications. Return larger payload sizes. How does Dynatrace help?

Lambda

Lambda AWS Serverless Latency

Site reliability done right: 5 SRE best practices that deliver on business objectives

Dynatrace

MAY 31, 2023

As a result, site reliability has emerged as a critical success metric for many organizations. Microservices-based architectures and software containers enable organizations to deploy and modify applications with unprecedented speed. The following three metrics are commonly used to measure success: Service-level agreements (SLAs).

Best Practices

Best Practices DevOps Latency Metrics

Observability platform vs. observability tools

Dynatrace

DECEMBER 22, 2021

Observability is made up of three key pillars: metrics, logs, and traces. Metrics are measures of critical system values, such as CPU utilization or average write latency to persistent storage. They are particularly important in distributed systems, such as microservices architectures.

Artificial Intelligence

Artificial Intelligence Metrics Architecture DevOps

Dynatrace accelerates business transformation with new AI observability solution

Dynatrace

JANUARY 31, 2024

Retrieval-augmented generation emerges as the standard architecture for LLM-based applications: Given that LLMs can generate factually incorrect or nonsensical responses, retrieval-augmented generation (RAG) has emerged as an industry standard for building GenAI applications.

Cache

Cache Azure Infrastructure Monitoring

What is serverless computing? Driving efficiency without sacrificing observability

Dynatrace

JANUARY 26, 2021

Within this paradigm, it is possible to run entire architectures without touching a traditional virtual server, either locally or in the cloud. In a serverless architecture, applications are distributed to meet demand and scale requirements efficiently. When an application is triggered, it can cause latency as the application starts.

Serverless

Serverless Efficiency Lambda Azure

Seamlessly Swapping the API backend of the Netflix Android app

The Netflix TechBlog

SEPTEMBER 8, 2020

We tried a few iterations of what this new service should look like, and eventually settled on a modern architecture that aimed to give more control of the API experience to the client teams. For us, it means that we now need to have ~15 MDN tabs open when writing routes :) Let’s briefly discuss the architecture of this microservice.

Latency

Latency Cache Java Traffic

Extending Vector with eBPF to inspect host and container performance

The Netflix TechBlog

FEBRUARY 20, 2019

Today we are excited to announce latency heatmaps and improved container support for our on-host monitoring solution, Vector, to the broader community. Remotely view real-time process scheduler latency and TCP throughput with Vector and eBPF. What is Vector? Vector is open source and in use by multiple companies.

Performance

Performance Latency Open Source Metrics

Unlock the power of contextual log analytics

Dynatrace

OCTOBER 2, 2024

Davis AI contextually aligns all relevant data points—such as logs, traces, and metrics—enabling teams to act quickly and accurately while still providing power users with the flexibility and depth they desire and need. The Clouds app provides a view of all available cloud-native services.

Analytics

Analytics AWS DevOps Cloud

What is AWS Lambda?

Dynatrace

APRIL 5, 2021

Real-time stream processing to perform live activity tracking, data cleansing, metrics generation, and more. Lambda’s highly efficient, on-demand computing environment aligns with today’s microservices-centric architectures, and readily integrates with other popular AWS offerings that an organization may already be using.

Lambda

Lambda AWS Serverless Hardware

Migrating Critical Traffic At Scale with No Downtime – Part 2

The Netflix TechBlog

JUNE 13, 2023

By collecting and analyzing key performance metrics of the service over time, we can assess the impact of the new changes and determine if they meet the availability, latency, and performance requirements. The results are then evaluated using specific metrics to determine whether the hypothesis is valid.

Traffic

Traffic Metrics Systems Strategy

Implementing AWS well-architected pillars with automated workflows

Dynatrace

SEPTEMBER 13, 2023

Because Google offers its own Google Cloud Architecture Framework and Microsoft its Azure Well-Architected Framework, organizations that use a combination of these platforms triple the challenge of integrating their performance frameworks into a cohesive strategy. SRG validates the status of the resiliency SLOs for the experiment period.

AWS

AWS Efficiency Azure Cloud

Introducing Netflix TimeSeries Data Abstraction Layer

The Netflix TechBlog

OCTOBER 8, 2024

By Rajiv Shringi, Vinay Chella, Kaidan Fullerton, Oleksii Tkachuk, and Joey Lynch. Introduction: As Netflix continues to expand and diversify into various sectors like Video on Demand and Gaming, the ability to ingest and store vast amounts of temporal data (often reaching petabytes) with millisecond access latency has become increasingly vital.

Latency

Latency Storage Traffic Tuning

What are SLOs? How service-level objectives work with SLIs to deliver on SLAs

Dynatrace

DECEMBER 2, 2021

As organizations adopt microservices-based architecture, service-level objectives (SLOs) have become a vital way for teams to set specific, measurable targets that ensure users are receiving agreed-upon service levels. SLIs provide the actual metrics and measurements that indicate whether you are meeting your SLO.

Metrics

Metrics Best Practices DevOps Infrastructure

Enhancing Kubernetes cluster management key to platform engineering success

Dynatrace

MARCH 29, 2024

“We use AI to optimize the configuration of the software stack,” Doni said, highlighting how Akamas works by taking into account infrastructure and application metrics at the same time to achieve its optimization goals. “You can ask for the best configuration to reduce latency or improve the user experience.”

Engineering

Engineering DevOps Operating System Open Source

Keeping Netflix Reliable Using Prioritized Load Shedding

The Netflix TechBlog

NOVEMBER 2, 2020

(Figure: High-level playback architecture with priority throttling and chaos testing.) Building a request taxonomy: We decided to focus on three dimensions in order to categorize request traffic: throughput, functionality, and criticality. Those two metrics are approximate indicators of failures and latency.

Traffic

Traffic Metrics Infrastructure Architecture

Edge Authentication and Token-Agnostic Identity Propagation

The Netflix TechBlog

FEBRUARY 9, 2021

Plus, the architecture of the Edge tier was evolving to a PaaS (platform as a service) model, and we had some tough decisions to make about how, and where, to handle identity token handling. The system architecture now takes the form of: Notice that tokens never traverse past the Edge gateway / EAS boundary. We are serving over 2.5

Architecture

Architecture Latency Servers Website

Towards a Reliable Device Management Platform

The Netflix TechBlog

AUGUST 30, 2021

System Setup, Architecture: The following diagram summarizes the architecture description. (Figure 1: Event-sourcing architecture of the Device Management Platform.) Thus, the implemented solution must integrate with Netflix Spring facilities for authentication and metrics support at the very minimum – the million elements.

Latency

Latency Traffic Transportation Cloud

No need to compromise visibility in public clouds with the new Azure services supported by Dynatrace

Dynatrace

JULY 6, 2020

While the Azure overview page in Dynatrace has long featured monitoring data detected by OneAgent, with additional metrics pulled from Azure Monitor and topology information from Azure Resource Graph, the overview page now gives you quick access to the newly added services, which are listed under Supporting services.

Azure

Azure Cloud Big Data Virtualization

Netflix Video Quality at Scale with Cosmos Microservices

The Netflix TechBlog

NOVEMBER 2, 2021

In particular, the VMAF metric lies at the core of improving the Netflix member’s streaming video quality. The Reloaded system is a well-matured and scalable system, but its monolithic architecture can slow down rapid innovation. This enables us to use our scale to increase throughput and reduce latencies. via bug fixes).

Media

Media Innovation Metrics Latency

Telltale: Netflix Application Monitoring Simplified

The Netflix TechBlog

AUGUST 13, 2020

A metric crossed a threshold. Metrics are a key part of understanding application health. But sometimes you can have too many metrics, too many graphs, and too many dashboards. Telltale uses a variety of signals from multiple sources to assemble a constantly evolving model of the application’s health: Atlas time series metrics.

Monitoring

Monitoring Tuning Traffic Metrics

Under the Hood of Amazon EC2 Container Service

All Things Distributed

JULY 20, 2015

Today, I want to explore the Amazon ECS architecture and what this architecture enables. This architecture affords Amazon ECS high availability, low latency, and high throughput because the data store is never pessimistically locked. Below is a diagram of the basic components of Amazon ECS: How we coordinate the cluster.

Latency

Latency Architecture AWS Open Source

What is ITOps? Why IT operations is more crucial than ever in a multicloud world

Dynatrace

DECEMBER 15, 2022

ITOps teams use more technical IT incident metrics, such as mean time to repair, mean time to acknowledge, mean time between failures, mean time to detect, and mean time to failure, to ensure long-term network stability. This includes response time, accuracy, speed, throughput, uptime, CPU utilization, and latency. Performance.

Artificial Intelligence

Artificial Intelligence DevOps Hardware Virtualization

Open Observability – Part 1: Distributed tracing and observability

Dynatrace

JUNE 25, 2021

Already in the 2000s, service-oriented architectures (SOA) became popular, and operations teams discovered the need to understand how transactions traverse through all tiers and how these tiers contributed to the execution time and latency. OpenTelemetry aims to support three so-called observability signals, namely: metrics.

Open Source

Open Source Monitoring Google Systems

Redis® Monitoring Strategies for 2025

Scalegrid

JANUARY 21, 2025

With its widespread use in modern application architectures, understanding the ins and outs of Redis monitoring is essential for any tech professional. Identifying key Redis metrics such as latency, CPU usage, and memory metrics is crucial for effective Redis monitoring.

Strategy

Strategy Monitoring Latency DevOps

Managing risk for financial services: The secret to visibility and control during times of volatility

Dynatrace

APRIL 8, 2024

In summary, the Dynatrace platform enables banks to do the following: Capture any data type: logs, metrics, traces, topology, behavior, code, metadata, network, security, web, and real-user monitoring data, and business events. Maximize performance for high-frequency and low-latency trading strategies. Break down data silos.

Analytics

Analytics Infrastructure Efficiency Technology

Amazon DynamoDB – a Fast and Scalable NoSQL Database.

All Things Distributed

JANUARY 18, 2012

Amazon DynamoDB offers low, predictable latencies at any scale. This architectural pattern was a response to the scaling challenges that had challenged Amazon.com through its first 5 years, when direct database access was one of the major bottlenecks in scaling and operating the business. This impacts the predictability of a Domain’s

Scalability

Scalability Database Ecommerce Latency

Redis® Monitoring Strategies for 2024

Scalegrid

DECEMBER 21, 2023

With its widespread use in modern application architectures, understanding the ins and outs of Redis® monitoring is essential for any tech professional. Identifying key Redis® metrics such as latency, CPU usage, and memory metrics is crucial for effective Redis monitoring.

Strategy

Strategy Monitoring Latency DevOps

Data Movement in Netflix Studio via Data Mesh

The Netflix TechBlog

JULY 26, 2021

Netflix is known for its loosely coupled microservice architecture and with a global studio footprint, surfacing and connecting the data from microservices into a studio data catalog in real time has become more important than ever. Most of the business views created on top of the Iceberg tables can tolerate a few minutes of latency.

Big Data

Big Data Government Processing Analytics
