Cloud infrastructure requires comprehensive performance visibility, as Dynatrace provides, but the services that leverage cloud infrastructure also require close attention. Extend infrastructure observability to WSO2 API Manager to catch symptoms such as high latency, missing responses, or a soaring number of active connections.
Now let's look at how we designed the tracing infrastructure that powers Edgar. Reconstructing a streaming session was a tedious and time-consuming process that involved tracing all interactions (requests) between the Netflix app, our Content Delivery Network (CDN), and backend microservices.
Future blogs will provide deeper dives into each service, sharing insights and lessons learned from this process. The Netflix video processing pipeline went live with the launch of our streaming service in 2007.
As modern multicloud environments become more distributed and complex, having real-time insights into applications and infrastructure while keeping data residency in local markets is crucial. Dynatrace on Microsoft Azure allows enterprises to streamline deployment, gain critical insights, and automate manual processes. The result?
Therefore, it requires multidimensional and multidisciplinary monitoring: Infrastructure health — automatically monitor the compute, storage, and network resources available to the Citrix system to ensure a stable platform. OneAgent: Citrix infrastructure performance. OneAgent: SAP infrastructure performance. Citrix VDA.
The first step is determining whether the problem originates from the application or the underlying infrastructure. One issue that often complicates this process is the "noisy neighbor" problem. Learn how Linux kernel instrumentation can improve your infrastructure observability with deeper insights and enhanced monitoring.
By: Rajiv Shringi, Oleksii Tkachuk, Kartik Sathyanarayanan. Introduction: In our previous blog post, we introduced Netflix's TimeSeries Abstraction, a distributed service designed to store and query large volumes of temporal event data with low millisecond latencies. Today, we're excited to present the Distributed Counter Abstraction.
When organizations implement SLOs, they can improve software development processes and application performance. SLOs can be a great way for DevOps and infrastructure teams to use data and performance expectations to make decisions, such as whether to release and where engineers should focus their time. SLOs improve software quality.
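To make that concrete, here is a minimal sketch of how an error budget derived from an SLO could gate a release decision. The 99.9% target, request counts, and 80% freeze threshold are all illustrative assumptions, not figures from the article.

```python
# Illustrative only: the SLO target and request counts below are made up.
SLO_TARGET = 0.999          # 99.9% of requests must succeed

total_requests = 1_000_000
failed_requests = 700

error_budget = (1 - SLO_TARGET) * total_requests   # ~1,000 failures allowed
budget_consumed = failed_requests / error_budget   # ~0.7, i.e. 70% consumed

print(f"Error budget consumed: {budget_consumed:.0%}")
# A team might freeze risky releases once, say, 80% of the budget is spent.
if budget_consumed > 0.8:
    print("Hold the release; focus engineering time on reliability work.")
else:
    print("Budget remaining; safe to ship.")
```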
Stream processing: One approach to such a challenging scenario is stream processing, a computing paradigm and software architectural style for data-intensive software systems that emerged to cope with requirements for near real-time processing of massive amounts of data.
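As a toy illustration of the paradigm, the sketch below consumes events one at a time and emits tumbling-window aggregates instead of batch-loading the whole dataset; the event names and window size are invented.

```python
from collections import Counter
from typing import Iterable, Iterator

def windowed_counts(events: Iterable[str], window_size: int = 3) -> Iterator[Counter]:
    """Tumbling-window event count: process records as they arrive,
    emit per-window aggregates rather than one big batch result."""
    window: Counter = Counter()
    for i, event in enumerate(events, start=1):
        window[event] += 1
        if i % window_size == 0:     # window closes: emit and reset
            yield window
            window = Counter()
    if window:                       # flush the final partial window
        yield window

stream = iter(["click", "view", "click", "buy", "click", "view"])
for counts in windowed_counts(stream, window_size=3):
    print(dict(counts))
```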
As an open-source database, it's a highly popular choice for enterprise applications looking to modernize their infrastructure and reduce their total cost of ownership, as well as for startup and developer applications looking for a powerful, flexible, and cost-effective database to work with.
The new Amazon capability enables customers to improve the startup latency of their functions from several seconds to as low as sub-second (up to 10 times faster) at P99 (the 99th latency percentile). This can cause latency outliers and may lead to a poor end-user experience for latency-sensitive applications.
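For readers unfamiliar with percentile latencies, the sketch below computes P50 and P99 over a synthetic sample in which a small tail of slow cold starts dominates the high percentiles; all numbers are invented for illustration.

```python
import random

# Synthetic latencies (ms): most requests are fast, a small tail of
# cold starts is very slow.
random.seed(42)
latencies = [random.gauss(120, 20) for _ in range(985)] + \
            [random.gauss(3000, 400) for _ in range(15)]

def percentile(samples, p):
    """Nearest-rank percentile: the value below which ~p% of samples fall."""
    ordered = sorted(samples)
    k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[k]

print(f"P50: {percentile(latencies, 50):.0f} ms")   # the typical request
print(f"P99: {percentile(latencies, 99):.0f} ms")   # the tail, set by cold starts
```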
Vidhya Arvind, Rajasekhar Ummadisetty, Joey Lynch, Vinay Chella. Introduction: At Netflix, our ability to deliver seamless, high-quality streaming experiences to millions of users hinges on robust, global backend infrastructure. It also serves as central configuration of access patterns such as consistency or latency targets.
Using OpenTelemetry, developers can collect and process telemetry data from applications, services, and systems. Logs are text-based records of events and activities generated by applications and infrastructure components. Traces are used for performance analysis, latency optimization, and root cause analysis.
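A minimal Python sketch of that collection step, assuming the opentelemetry-sdk package is installed: it wires a console exporter and records one trace with a nested span. A real deployment would export to a collector or observability backend instead of stdout.

```python
# Requires: pip install opentelemetry-sdk
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter

# Print finished spans to stdout; swap the exporter for a collector in production.
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("checkout-service")

with tracer.start_as_current_span("handle_request") as span:
    span.set_attribute("http.route", "/checkout")      # contextual metadata
    with tracer.start_as_current_span("query_database"):
        pass  # the child span records this operation's own latency
```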
In this post, I'm going to break these processes down. Plotted on the same horizontal axis of 1.6s, the waterfalls speak for themselves: 201ms of cumulative latency and 109ms of cumulative download versus 4,362ms of cumulative latency and 240ms of cumulative download. Read the complete test methodology. It gets worse.
To remain competitive in today’s fast-paced market, organizations must not only ensure that their digital infrastructure is functioning optimally but also that software deployments and updates are delivered rapidly and consistently. Vulnerability score: The highest vulnerability risk score for a process group.
Site reliability engineering (SRE) is the practice of applying software engineering principles to operations and infrastructure processes to help organizations create highly reliable and scalable software systems. Shift-left using an SRE approach means that reliability is baked into each process, app, and code change.
Caching is the process of storing frequently accessed data or resources in a temporary storage location, such as memory or disk, to improve retrieval speed and reduce the need for repetitive processing.
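A minimal example of the idea using Python's standard-library memoization; the slow function is a stand-in for any repetitive computation or fetch.

```python
from functools import lru_cache
import time

@lru_cache(maxsize=128)          # keep up to 128 recent results in memory
def fetch_report(report_id: int) -> str:
    time.sleep(1)                # stand-in for a slow query or computation
    return f"report-{report_id}"

start = time.perf_counter()
fetch_report(7)                  # cache miss: pays the full cost
first = time.perf_counter() - start

start = time.perf_counter()
fetch_report(7)                  # cache hit: served from memory
second = time.perf_counter() - start

print(f"first call: {first:.3f}s, cached call: {second:.6f}s")
```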
It supports both high throughput services that consume hundreds of thousands of CPUs at a time, and latency-sensitive workloads where humans are waiting for the results of a computation. The subsystems all communicate with each other asynchronously via Timestone, a high-scale, low-latency priority queuing system.
It also removes the need for developers and database administrators to manage infrastructure or update database versions. From there, you can dive deeper into infrastructure metrics (cluster, datacenter, racks, and nodes) and data metrics (keyspaces and tables). Add your visualization to your dashboards for easy access and sharing.
Optimize the IT infrastructure supporting risk management processes and controls for maximum performance and resilience. The IT infrastructure, services, and applications that enable processes for risk management must perform optimally. If system failures occur, teams must resolve them quickly and resolutely.
In these modern environments, every hardware, software, and cloud infrastructure component and every container, open-source tool, and microservice generates records of every activity. An advanced observability solution can also be used to automate more processes, increasing efficiency and innovation among Ops and Apps teams.
Endpoints include on-premises servers, Kubernetes infrastructure, cloud-hosted infrastructure and services, and open-source technologies. Observability across the full technology stack gives teams comprehensive, real-time insight into the behavior, performance, and health of applications and their underlying infrastructure.
Response time: Response time refers to the total time it takes for a system to process a request or complete an operation. Low response times ensure that customers can quickly navigate through product listings, add items to their cart, and complete the checkout process without experiencing noticeable delays.
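A small sketch of measuring response time around an operation with a monotonic clock; the checkout function and the 500 ms budget are hypothetical stand-ins.

```python
import time

def checkout(cart):                      # hypothetical operation to time
    time.sleep(0.05)                     # stand-in for real work
    return {"status": "ok", "items": len(cart)}

start = time.perf_counter()              # monotonic clock: safe for intervals
result = checkout(["book", "mug"])
response_time_ms = (time.perf_counter() - start) * 1000

print(f"checkout took {response_time_ms:.1f} ms")
# Illustrative budget: flag the operation if it exceeds, say, 500 ms.
if response_time_ms > 500:
    print("response time budget exceeded")
```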
FUN FACT: In this talk, Rodrigo Schmidt, director of engineering at Instagram, talks about the different challenges they have faced in scaling the data infrastructure at Instagram. Two major processes get executed when a user posts a photo on Instagram.
AWS Lambda is a serverless compute service that can run code in response to predetermined events or conditions and automatically manage all the computing resources required for those processes. A common use case is real-time file processing: quickly indexing files, processing logs, and validating content, as sketched below.
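As a sketch of that file-processing use case, here is a minimal Python Lambda handler for an S3 put event. The bucket/key extraction follows the standard S3 event shape; the processing step itself is a placeholder.

```python
import json
import urllib.parse

def lambda_handler(event, context):
    """Triggered by an S3 put event; indexes/validates the new object.
    The processing below is a placeholder for real work."""
    record = event["Records"][0]
    bucket = record["s3"]["bucket"]["name"]
    key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])

    # e.g. fetch the object with boto3, parse logs, validate content ...
    print(f"processing s3://{bucket}/{key}")

    return {"statusCode": 200, "body": json.dumps({"processed": key})}
```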
Customers can use AWS Lambda Response Streaming to improve performance for latency-sensitive applications and return larger payload sizes. Despite being serverless, the function still requires infrastructure on which to run.
Rajiv Shringi, Vinay Chella, Kaidan Fullerton, Oleksii Tkachuk, Joey Lynch. Introduction: As Netflix continues to expand and diversify into various sectors like Video on Demand and Gaming, the ability to ingest and store vast amounts of temporal data — often reaching petabytes — with millisecond access latency has become increasingly vital.
How site reliability engineering affects organizations’ bottom line SRE applies the disciplines of software engineering to infrastructure management, both on-premises and in the cloud. There are now many more applications, tools, and infrastructure variables that impact an application’s performance and availability.
The RAG process begins by summarizing and converting user prompts into queries that are sent to a search platform that uses semantic similarities to find relevant data in vector databases, semantic caches, or other online data sources. RAG augments user prompts with relevant data retrieved from outside the LLM.
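The sketch below walks through that flow with toy stand-ins: embed, vector_search, and generate are hypothetical placeholders for a real embedding model, vector database, and LLM client, not any specific library's API.

```python
import re

# Toy corpus standing in for a vector database.
DOCS = [
    "Items may be returned within 30 days of delivery.",
    "Orders ship within 2 business days.",
]

def embed(text: str) -> set[str]:
    # Toy "embedding": a bag of words. Real systems use dense vectors.
    return set(re.findall(r"[a-z]+", text.lower()))

def vector_search(query_vec: set[str], top_k: int = 1) -> list[str]:
    # Toy semantic similarity: word overlap instead of cosine distance.
    ranked = sorted(DOCS, key=lambda d: len(query_vec & embed(d)), reverse=True)
    return ranked[:top_k]

def generate(prompt: str) -> str:
    return f"[an LLM would answer here, grounded in]\n{prompt}"  # placeholder

def answer(user_prompt: str) -> str:
    passages = vector_search(embed(user_prompt))    # 1. semantic retrieval
    augmented = ("Answer using only this context:\n"
                 + "\n".join(passages)
                 + f"\n\nQuestion: {user_prompt}")  # 2. augment the prompt
    return generate(augmented)                      # 3. grounded generation

print(answer("Can items be returned after delivery?"))
```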
The voice service then constructs a message for the device and places it on the message queue, which is then processed and sent to Pushy to deliver to the device. The previous version of the message processor was a Mantis stream-processing job that processed messages from the message queue.
The network latency between cluster nodes should be around 10 ms or less. For Premium HA, this has been extended from 10 ms latency (in the same network region) to around 100 ms network latency due to asynchronous data replication between regions. In the image below, three downed nodes make an entire cluster unavailable.
This approach enhances key DORA metrics and enables early detection of failures in the release process, allowing SREs more time for innovation. These releases often assumed ideal conditions such as zero latency, infinite bandwidth, and no network loss, as highlighted in Peter Deutsch’s eight fallacies of distributed systems.
REST APIs, authentication, databases, email, and video processing all have a home on serverless platforms. When an application is triggered, starting it up can add latency, but the average request is handled, processed, and returned quickly, and services scale to meet demand.
RabbitMQ is designed for flexible routing and message reliability, while Kafka handles high-throughput event streaming and real-time data processing. RabbitMQ follows a message broker model with advanced routing, while Kafka's event streaming architecture uses partitioned logs for distributed processing.
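To make the routing-versus-log contrast concrete, here is a minimal publish sketch for each system. It assumes local brokers and the pika and kafka-python client libraries, so treat it as illustrative rather than production code.

```python
# Illustrative only: assumes local RabbitMQ/Kafka brokers and
# `pip install pika kafka-python`.
import pika
from kafka import KafkaProducer

# RabbitMQ: broker-side routing. The exchange decides which queues
# receive the message based on the routing key.
conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = conn.channel()
channel.exchange_declare(exchange="orders", exchange_type="topic")
channel.basic_publish(exchange="orders",
                      routing_key="orders.eu.created",
                      body=b'{"id": 1}')
conn.close()

# Kafka: an append-only partitioned log. Messages with the same key
# land in the same partition, preserving their order for consumers.
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("orders", key=b"order-1", value=b'{"id": 1}')
producer.flush()
```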
Without distributed tracing, pinpointing the cause of increased latency could take hours or even days. Dynatrace Davis® AI will process logs automatically, independent of the technique used for ingestion. Interact with data intuitively and easily, and benefit from immediate, AI-supported insights.
This transition to public, private, and hybrid cloud is driving organizations to automate and virtualize IT operations to lower costs and optimize cloud processes and systems. ITOps is an IT discipline involving actions and decisions made by the operations team responsible for an organization's IT infrastructure.
Because microprocessors are so fast, computer architecture design has evolved towards adding various levels of caching between compute units and the main memory, in order to hide the latency of bringing the bits to the brains. The scheduler's goal is to assign running processes to time slices of the CPU in a "fair" way. So why mess with it?
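A rough illustration of why those cache levels matter: summing the same array sequentially versus in random order. Interpreter overhead dominates in Python, so the measured gap only hints at the underlying cache and prefetching effects.

```python
import random
import time

N = 2_000_000
data = list(range(N))
shuffled = list(range(N))
random.shuffle(shuffled)

def timed(label, indices):
    start = time.perf_counter()
    total = sum(data[i] for i in indices)
    print(f"{label}: {time.perf_counter() - start:.3f}s (sum={total})")

timed("sequential access", range(N))     # cache-friendly linear walk
timed("random access    ", shuffled)     # defeats prefetching and caching
```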
Advanced AI applications using OpenAI services don’t just forward user input to OpenAI models; they also require client-side pre- and post-processing. It shows critical SLOs for latency and availability, as well as the most important OpenAI generative AI service metrics, such as response time, error count, and the overall number of requests.
DevOps is focused on optimizing software development and delivery, and SRE is focused on operations processes. DevOps is not a specific process, but rather a general collection of flexible software creation and delivery practices that looks to close the gap between software development and IT operations. Reduced latency.
As software development grows more complex, managing components using an automated onboarding process becomes increasingly important. The validation process is automated based on events that occur, while the objectives’ configuration, which is validated by the Site Reliability Guardian , is stored in a separate file.
In Part 1, we explored how DevOps teams can prevent a process crash from taking down services across an organization in five easy steps. Step 1 – Let Dynatrace analyze your infrastructure health in real time. Step 5 – xMatters triggers a runbook in Ansible to fix the disk latency.
Full stack observability with the Dynatrace Hyper-V extension Use Dynatrace for deeper insights into the Microsoft ecosystem with the new Hyper-V extension; it provides crucial virtualization layer characteristics for Windows infrastructure observability.