Infrastructure, Latency and Monitoring - Technology Performance Pulse

Cloud infrastructure monitoring in action: Dynatrace on Dynatrace

Dynatrace

SEPTEMBER 29, 2020

As of September 2020, we run 51 clusters on 1,100 EC2 instances distributed across six AWS Regions, ensuring that all our users can leverage the Dynatrace Software Intelligence Platform to monitor their hybrid multicloud environments. Sydney, we have a disk write latency problem!

Infrastructure

Infrastructure Cloud Monitoring AWS

Next-level interaction and customization of data visualizations in Dynatrace Dashboards and Notebooks

Dynatrace

OCTOBER 10, 2024

Take your monitoring, data exploration, and storytelling to the next level with outstanding data visualization. All your applications and underlying infrastructure produce vast volumes of data that you need to monitor or analyze for insights. Have a look at them on our Dynatrace Playground.

Latency

Latency Infrastructure Monitoring Metrics

Cut costs and complexity: 5 strategies for reducing tool sprawl with Dynatrace

Dynatrace

APRIL 10, 2025

Dynatrace integrates application performance monitoring (APM), infrastructure monitoring, and real-user monitoring (RUM) into a single platform, with its Foundation & Discovery mode offering a cost-effective, unified view of the entire infrastructure, including non-critical applications previously monitored using legacy APM tools.

Strategy

Strategy Storage Network Architecture

Netflix’s Distributed Counter Abstraction

The Netflix TechBlog

NOVEMBER 12, 2024

By: Rajiv Shringi, Oleksii Tkachuk, Kartik Sathyanarayanan. Introduction: In our previous blog post, we introduced Netflix’s TimeSeries Abstraction, a distributed service designed to store and query large volumes of temporal event data with low millisecond latencies. Today, we’re excited to present the Distributed Counter Abstraction.

Latency

Latency Cache Infrastructure Strategy

Who will watch the watchers? Extended infrastructure observability for WSO2 API Manager

Dynatrace

SEPTEMBER 18, 2020

Sure, cloud infrastructure requires comprehensive performance visibility, as Dynatrace provides, but the services that leverage cloud infrastructures also require close attention. Extend infrastructure observability to WSO2 API Manager. Save hours of bug hunting with out-of-the-box WSO2 API Manager monitoring.

Infrastructure

Infrastructure Latency Metrics Cloud

RabbitMQ vs. Kafka: Key Differences

Scalegrid

FEBRUARY 6, 2025

RabbitMQ can be deployed in distributed environments and includes monitoring tools through a built-in dashboard and CLI. Kafka’s partitioned log architecture supports both queuing and publish-subscribe models, allowing it to handle large-scale event processing with minimal latency.

Latency

Latency Analytics Architecture Storage

Building Netflix’s Distributed Tracing Infrastructure

The Netflix TechBlog

OCTOBER 19, 2020

Now let’s look at how we designed the tracing infrastructure that powers Edgar. If we had an ID for each streaming session then distributed tracing could easily reconstruct session failure by providing service topology, retry and error tags, and latency measurements for all service calls.

Infrastructure

Infrastructure Transportation Storage Open Source

How to Configure Istio, Prometheus and Grafana for Monitoring

DZone

AUGUST 29, 2023

You can implement security and advanced networking policies for all communication across your infrastructure using Istio. You can use Istio to observe the performance and behavior of all your microservices in your infrastructure (see the image below). But another important feature of Istio is observability.

Monitoring

Monitoring Latency Network Infrastructure

What is real user monitoring (RUM)?

Dynatrace

JANUARY 13, 2022

Real user monitoring can help you catch these issues before they impact the bottom line. What is real user monitoring? Real user monitoring (RUM) is a performance monitoring process that collects detailed data about a user’s interaction with an application. Real user monitoring collects data on a variety of metrics.

Monitoring

Monitoring Mobile Latency Best Practices

Solve hybrid Kubernetes performance and reliability problems with unified observability

Dynatrace

APRIL 10, 2025

A primary challenge in managing these hybrid Kubernetes clusters is the fragmented monitoring caused by using siloed tools with varying support for the different operating systems. This inconsistency leads to gaps in monitoring and alerting, making it difficult to maintain a unified view of the cluster’s health.

Performance

Performance Java Operating System Infrastructure

What is API monitoring?

Dynatrace

OCTOBER 4, 2021

As a result, API monitoring has become a must for DevOps teams. So what is API monitoring? What is API Monitoring? API monitoring is the process of collecting and analyzing data about the performance of an API in order to identify problems that impact users. The need for API monitoring. Ways to monitor APIs.

Monitoring

Monitoring Latency Metrics Availability

How Park ‘N Fly eliminated silos and improved customer experience with Dynatrace cloud monitoring

Dynatrace

APRIL 7, 2021

But your infrastructure teams don’t see any issue on their AWS or Azure monitoring tools, your platform team doesn’t see anything too concerning in Kubernetes logging, and your apps team says there are green lights across the board. This scenario has become all too common as digital infrastructure has grown increasingly complex.

Cloud

Cloud Monitoring Latency Games

Title Launch Observability at Netflix Scale

The Netflix TechBlog

DECEMBER 17, 2024

The Challenge of Title Launch Observability: As engineers, we’re wired to track system metrics like error rates, latencies, and CPU utilization, but what about metrics that matter to a title’s success? Option 1: Log Processing. Log processing offers a straightforward solution for monitoring and analyzing title launches.

Traffic

Traffic Scalability Strategy Monitoring

Optimize Citrix platform performance and user experience with Dynatrace (GA)

Dynatrace

JANUARY 15, 2020

Having released this functionality in a Preview Release back in September 2019, we’re now happy to announce the General Availability of our Citrix monitoring extension. Synthetic monitoring: Citrix login availability and performance. OneAgent: Citrix infrastructure performance. OneAgent: SAP infrastructure performance.

Latency

Latency Performance Virtualization Infrastructure

Dynatrace automatically monitors OpenAI ChatGPT for companies that deliver reliable, cost-effective services powered by generative AI

Dynatrace

JUNE 7, 2023

One of the crucial success factors for delivering cost-efficient and high-quality AI-agent services, following the approach described above, is to closely observe their cost, latency, and reliability. With these latency, reliability, and cost measurements in place, your operations team can now define their own OpenAI dashboards and SLOs.

Monitoring

Monitoring Latency Metrics Azure

How digital experience monitoring helps deliver business observability

Dynatrace

APRIL 26, 2022

Digital experience monitoring (DEM) allows an organization to optimize customer experiences by taking into account the context surrounding digital experience metrics. What is digital experience monitoring? Primary digital experience monitoring tools.

Monitoring

Monitoring Social Media IoT Metrics

Noisy Neighbor Detection with eBPF

The Netflix TechBlog

SEPTEMBER 10, 2024

The first step is determining whether the problem originates from the application or the underlying infrastructure. In this blog post, we'll reveal how we leveraged eBPF to achieve continuous, low-overhead instrumentation of the Linux scheduler, enabling effective self-serve monitoring of noisy neighbor issues.

Latency

Latency Metrics Programming Monitoring

What is observability? Not just logs, metrics and traces

Dynatrace

OCTOBER 1, 2021

In these modern environments, every hardware, software, and cloud infrastructure component and every container, open-source tool, and microservice generates records of every activity. What is the difference between monitoring and observability? Is observability really monitoring by another name? In short, no.

Metrics

Metrics Open Source Monitoring Cloud

Telltale: Netflix Application Monitoring Simplified

The Netflix TechBlog

AUGUST 13, 2020

Over the years we’ve learned from on-call engineers about the pain points of application monitoring: too many alerts, too many dashboards to scroll through, and too much configuration and maintenance. Our streaming teams need a monitoring system that enables them to quickly diagnose and remediate problems; seconds count!

Monitoring

Monitoring Tuning Traffic Metrics

Implementing service-level objectives to improve software quality

Dynatrace

DECEMBER 27, 2022

SLOs can be a great way for DevOps and infrastructure teams to use data and performance expectations to make decisions, such as whether to release and where engineers should focus their time. In what follows, we explore some of these best practices and guidance for implementing service-level objectives in your monitored environment.

Software

Software Software Benchmarking Latency

Dynatrace supports SnapStart for Lambda as an AWS launch partner

Dynatrace

NOVEMBER 28, 2022

The new Amazon capability enables customers to improve the startup latency of their functions from several seconds to as low as sub-second (up to 10 times faster) at P99 (the 99th latency percentile). This can cause latency outliers and may lead to a poor end-user experience for latency-sensitive applications.

Lambda

Lambda AWS Serverless Latency

Best Practices for Scaling RabbitMQ

Scalegrid

FEBRUARY 24, 2025

Optimizing RabbitMQ performance through strategies such as keeping queues short, enabling lazy queues, and monitoring health checks is essential for maintaining system efficiency and effectively managing high traffic loads. Monitoring the cluster nodes preemptively addresses potential issues, ensuring the system operates smoothly.

Best Practices

Best Practices Traffic Strategy Scalability

Dynatrace supports Azure Managed Instance for Apache Cassandra

Dynatrace

MAY 13, 2022

This extension provides fully app-centric Cassandra performance monitoring for Azure Managed Instance for Apache Cassandra. Cassandra is also essential to Dynatrace because it is integral to our monitoring solution. It also removes the need for developers and database administrators to manage infrastructure or update database versions.

Azure

Azure Latency Metrics Infrastructure

OpenTelemetry 101: A nontechnical guide for IT leaders and enthusiasts

Dynatrace

JULY 22, 2024

Unlike traditional monitoring, which focuses on watching individual metrics for system health indicators with no overall context, observability goes deeper, analyzing telemetry data for a comprehensive view of the system’s internal state in the context of the wider system. Contextualize data. Employ efficient sampling.

Latency

Latency Best Practices Metrics Open Source

Best practices and key metrics for improving mobile app performance

Dynatrace

DECEMBER 13, 2023

As a result, organizations need to monitor mobile app performance metrics that are meaningful and actionable by gaining adequate observability of mobile app performance. Closely monitoring mobile app performance will help ensure customer interactions via mobile apps are meeting the expectations of the customers. Proactive monitoring.

Best Practices

Best Practices Mobile Metrics Performance

What is full stack observability?

Dynatrace

APRIL 6, 2022

Endpoints include on-premises servers, Kubernetes infrastructure, cloud-hosted infrastructure and services, and open-source technologies. Observability across the full technology stack gives teams comprehensive, real-time insight into the behavior, performance, and health of applications and their underlying infrastructure.

DevOps

DevOps Innovation Infrastructure Cloud

How to maximize serverless benefits and overcome its challenges

Dynatrace

OCTOBER 10, 2022

Reduced latency. By using cloud providers with multiple server sites, organizations can reduce function latency for end users. No infrastructure to maintain. Because cloud providers own and manage back-end infrastructure, local IT teams aren’t responsible for ongoing maintenance and upgrades. Difficult to monitor.

Serverless

Serverless Infrastructure Lambda Latency

Dynatrace Managed turnkey Premium High Availability for globally distributed data centers (Early Adopter)

Dynatrace

JUNE 25, 2020

The network latency between cluster nodes should be around 10 ms or less. Near-zero RPO and RTO—monitoring continues seamlessly and without data loss in failover scenarios. Achieve high SLOs with seamless monitoring when entire data centers experience outages. Automatic recovery for outages for up to 72 hours.

Availability

Availability Hardware Latency Traffic

Dynatrace accelerates business transformation with new AI observability solution

Dynatrace

JANUARY 31, 2024

Data dependencies and framework intricacies require observing the lifecycle of an AI-powered application end to end, from infrastructure and model performance to semantic caches and workflow orchestration. Estimates show that NVIDIA, a semiconductor manufacturer, could release 1.5 million AI server units annually by 2027, consuming 75.4+

Cache

Cache Azure Infrastructure Monitoring

Optimize your environment: Unveiling Dynatrace Hyper-V extension for enhanced performance and efficient troubleshooting

Dynatrace

OCTOBER 23, 2023

Lastly, monitoring and maintaining system health within a virtual environment, which includes efficient troubleshooting and issue resolution, can pose a significant challenge for IT teams. Start monitoring Hyper-V: Navigate to the Dynatrace Hub and activate the Microsoft Hyper-V Extension. What’s next?

Efficiency

Efficiency Virtualization Hardware Performance

Build systems more reliably with Dynatrace: Chaos Engineering

Dynatrace

AUGUST 21, 2024

These releases often assumed ideal conditions such as zero latency, infinite bandwidth, and no network loss, as highlighted in Peter Deutsch’s eight fallacies of distributed systems. With Dynatrace, teams can seamlessly monitor the entire system, including network switches, database storage, and third-party dependencies.

Engineering

Engineering Systems Latency Metrics

Service level objectives: 5 SLOs to get started

Dynatrace

JUNE 1, 2023

Note: you might hear the term latency used instead of response time. Both latency and response time are critical to ensure reliability. Latency typically refers to the time it takes for a single request to travel from its source to its destination. Latency primarily focuses on the time spent in transit.

Website

Website Latency Traffic Virtualization

Site reliability engineering: 5 things you need to know

Dynatrace

FEBRUARY 4, 2021

Site reliability engineering (SRE) is the practice of applying software engineering principles to operations and infrastructure processes to help organizations create highly reliable and scalable software systems. This can be anything from adjusting monitoring and alerting to making code changes in production.

Engineering

Engineering DevOps Government Latency

Site reliability done right: 5 SRE best practices that deliver on business objectives

Dynatrace

MAY 31, 2023

How site reliability engineering affects organizations’ bottom line: SRE applies the disciplines of software engineering to infrastructure management, both on-premises and in the cloud. There are now many more applications, tools, and infrastructure variables that impact an application’s performance and availability.

Best Practices

Best Practices DevOps Latency Metrics

Dynatrace supports the newly released AWS Lambda Response Streaming

Dynatrace

APRIL 7, 2023

Customers can use AWS Lambda Response Streaming to improve performance for latency-sensitive applications and return larger payload sizes. Despite being serverless, the function still requires infrastructure on which to run. Auto-detection starts monitoring new virtual machines as they are deployed. Return larger payload sizes.

Lambda

Lambda AWS Serverless Latency

Get seamless insights into Nutanix clusters with Dynatrace

Dynatrace

NOVEMBER 9, 2023

Easily monitor your Nutanix clusters with Dynatrace: The Dynatrace Nutanix Cluster Extension offers straightforward yet powerful features to help you streamline your monitoring with an easy one-click activation via Dynatrace Hub. However, this approach doesn’t provide seamless monitoring coverage.

Virtualization

Virtualization Storage Metrics Monitoring

What is serverless computing? Driving efficiency without sacrificing observability

Dynatrace

JANUARY 26, 2021

When an application is triggered, it can cause latency as the application starts. Cloud-hosted managed services eliminate the minute day-to-day tasks associated with hosting IT infrastructure on-premises. This creates latency when they need to restart. Monitoring serverless applications.

Serverless

Serverless Efficiency Lambda Azure

What is AWS Lambda?

Dynatrace

APRIL 5, 2021

But with the benefits also come concerns about observability, and how to monitor and manage ever-expanding cloud software stacks. Organizations can offload much of the burden of managing app infrastructure and transition many functions to the cloud by going serverless with the help of Lambda. The Amazon Web Services ecosystem.

Lambda

Lambda AWS Serverless Hardware

Redis® Monitoring Strategies for 2025

Scalegrid

JANUARY 21, 2025

In today’s data-driven world, the ability to effectively monitor and manage data is of paramount importance. With its widespread use in modern application architectures, understanding the ins and outs of Redis monitoring is essential for any tech professional. Redis, a powerful in-memory data store, is no exception.

Strategy

Strategy Monitoring Latency DevOps

What are quality gates? How to use quality gates to deliver better software at speed and scale

Dynatrace

FEBRUARY 21, 2024

To remain competitive in today’s fast-paced market, organizations must not only ensure that their digital infrastructure is functioning optimally but also that software deployments and updates are delivered rapidly and consistently. In this example, unlike latency, the remaining three signals did not receive a “pass.”

Speed

Speed Software Software Latency

Build and operate multicloud FaaS with enhanced, intelligent end-to-end observability

Dynatrace

APRIL 25, 2023

These functions are executed by a serverless platform or provider (such as AWS Lambda, Azure Functions or Google Cloud Functions) that manages the underlying infrastructure, scaling and billing. Enable faster development and deployment cycles by abstracting away the infrastructure complexity. Sign up for a free trial.

Serverless

Serverless Lambda Azure AWS

Introducing Netflix TimeSeries Data Abstraction Layer

The Netflix TechBlog

OCTOBER 8, 2024

By: Rajiv Shringi, Vinay Chella, Kaidan Fullerton, Oleksii Tkachuk, Joey Lynch. Introduction: As Netflix continues to expand and diversify into various sectors like Video on Demand and Gaming, the ability to ingest and store vast amounts of temporal data, often reaching petabytes, with millisecond access latency has become increasingly vital.

Latency

Latency Storage Traffic Tuning

Optimize Citrix platform performance and user experience with a new extension (Preview)

Dynatrace

SEPTEMBER 25, 2019

Therefore, it requires multidimensional and multidisciplinary monitoring: Infrastructure health: automatically monitor the compute, storage, and network resources available to the Citrix system to ensure a stable platform. Synthetic monitoring: Citrix login availability and performance.

Latency

Latency Performance Virtualization Infrastructure
