Metrics, Network and Systems - Technology Performance Pulse

How Netflix uses eBPF flow logs at scale for network insight

The Netflix TechBlog

JUNE 7, 2021

By Alok Tiagi, Hariharan Ananthakrishnan, Ivan Porto Carrero and Keerti Lakshminarayan. Netflix has developed a network observability sidecar called Flow Exporter that uses eBPF tracepoints to capture TCP flows in near real time. Without network visibility, it’s difficult to improve our reliability, security and capacity posture.

Network

Network Transportation AWS Cloud

Rapid Event Notification System at Netflix

The Netflix TechBlog

FEBRUARY 18, 2022

To this end, we developed a Rapid Event Notification System (RENO) to support use cases that require server initiated communication with devices in a scalable and extensible manner. In this blog post, we will give an overview of the Rapid Event Notification System at Netflix and share some of the learnings we gained along the way.

Systems

Systems Traffic Architecture Mobile

How to Configure Custom Metrics in AWS Elastic Beanstalk Using Memory Metrics Example

DZone

JULY 2, 2024

Recently, I encountered a task where a business was using AWS Elastic Beanstalk but was struggling to understand the system state due to the lack of comprehensive metrics in CloudWatch. By default, CloudWatch only provides a few basic metrics such as CPU and Networks.

Metrics

Metrics AWS Virtualization Network

Build systems more reliably with Dynatrace: Chaos Engineering

Dynatrace

AUGUST 21, 2024

This approach enhances key DORA metrics and enables early detection of failures in the release process, allowing SREs more time for innovation. This blog post explores the Reliability metric, which measures modern operational practices. Why reliability?

Engineering

Engineering Systems Latency Metrics

Power Dashboarding, Part I: Start your exploration journey with Dashboards

Dynatrace

FEBRUARY 6, 2025

Even if infrastructure metrics aren’t your thing, you’re welcome to join us on this creative journey; simply swap out the suggested metrics for ones that interest you. For our example dashboard, we’ll only focus on some selected key infrastructure metrics. Click on Select metric. Change it now to sum.

Metrics

Metrics Infrastructure Monitoring Best Practices

Dynatrace Cost & Carbon Optimization certified for accuracy and transparency

Dynatrace

MARCH 5, 2025

Integration with existing systems and processes: Integration with existing IT infrastructure, observability solutions, and workflows often requires significant investment and customization. We implemented a wasted energy metric in the app to enhance practitioner actionability.

Energy

Energy Analytics Traffic Cloud

Dynatrace extends Synthetic Monitoring capabilities with Network Availability Monitors to validate the availability of infrastructure and services

Dynatrace

JULY 15, 2024

Combined with Dynatrace OneAgent®, you gain a precise view of the status of your systems at a glance. Why browser and HTTP monitors might not be sufficient: In modern IT environments, which are complex and dynamically changing, you often need deeper insights into the Transport or Network layers. But is this all you need?

Availability

Availability Network Monitoring Infrastructure

Better dashboarding with Dynatrace Davis AI: Instant meaningful insights

Dynatrace

JANUARY 21, 2025

Chances are, you’re a seasoned expert who visualizes meticulously identified key metrics across several sophisticated charts. For example, if you’re monitoring network traffic and the average over the past 7 days is 500 Mbps, the threshold will adapt to this baseline. This ensures optimal resource utilization and cost efficiency.

Traffic

Traffic Metrics Analytics Monitoring

HTTP monitors on the latest Dynatrace platform extend insights into the health of your API endpoints and simplify test management

Dynatrace

DECEMBER 18, 2024

But nowadays, with complex and dynamically changing modern IT systems, the last result details might not be enough in some cases. It now fully supports not only Network Availability Monitors but also HTTP synthetic monitors. The new Dynatrace Synthetic app allows you to analyze these results.

Monitoring

Monitoring Testing Metrics Analytics

9 key DevOps metrics for success

Dynatrace

SEPTEMBER 28, 2021

As we look at today’s applications, microservices, and DevOps teams, we see leaders are tasked with supporting complex distributed applications using new technologies spread across systems in multiple locations. The emerging concepts of working with DevOps metrics and DevOps KPIs have really come a long way. Deployment frequency.

DevOps

DevOps Metrics Traffic Efficiency

The keys to selecting a platform for end-to-end observability

Dynatrace

DECEMBER 2, 2024

Clearly, continuing to depend on siloed systems, disjointed monitoring tools, and manual analytics is no longer sustainable. It also helps to have access to OpenTelemetry, a collection of tools for examining applications that export metrics, logs, and traces for analysis.

Artificial Intelligence

Artificial Intelligence DevOps Architecture Cloud

Best practices and key metrics for improving mobile app performance

Dynatrace

DECEMBER 13, 2023

Mobile applications (apps) are an increasingly important channel for reaching customers, but the distributed nature of mobile app platforms and delivery networks can cause performance problems that leave users frustrated, or worse, turning to competitors. Some of the most important KPIs are listed below.

Best Practices

Best Practices Mobile Metrics Performance

What is log management? How to tame distributed cloud system complexities

Dynatrace

SEPTEMBER 8, 2022

Log management is an organization’s rules and policies for managing and enabling the creation, transmission, analysis, storage, and other tasks related to IT systems’ and applications’ log data. Metrics, logs, and traces make up three vital prongs of modern observability. How log management systems optimize performance and security.

Cloud

Cloud Systems Analytics DevOps

Part 1: A Survey of Analytics Engineering Work at Netflix

The Netflix TechBlog

DECEMBER 17, 2024

Analytics Engineers deliver these insights by establishing deep business and product partnerships; translating business challenges into solutions that unblock critical decisions; and designing, building, and maintaining end-to-end analytical systems. DJ acts as a central store where metric definitions can live and evolve.

Analytics

Analytics Engineering Entertainment Metrics

Observability engineering: Getting Prometheus metrics right for Kubernetes with Dynatrace and Kepler

Dynatrace

DECEMBER 18, 2023

For busy site reliability engineers, ensuring system reliability, scalability, and overall health is an imperative that’s getting harder to achieve in ever-expanding, cloud-native, container-based environments. To get a more granular look into telemetry data, many analysts rely on custom metrics using Prometheus.

Metrics

Metrics Engineering Energy Tuning

Cut costs and complexity: 5 strategies for reducing tool sprawl with Dynatrace

Dynatrace

APRIL 10, 2025

Break data silos and add context for faster, more strategic decisions: Unifying metrics, logs, traces, and user behavior within a single platform enables real-time decisions rooted in full context, not guesswork. Traditional network-based security approaches are evolving.

Strategy

Strategy Storage Network Architecture

Dynatrace innovates again with the release of topology-driven auto-adaptive metric baselines

Dynatrace

JULY 23, 2020

With the advent and ingestion of thousands of custom metrics into Dynatrace, we’ve once again pushed the boundaries of automatic, AI-based root cause analysis with the introduction of auto-adaptive baselines as a foundational concept for Dynatrace topology-driven timeseries measurements. In many cases, metric behavior changes over time.

Metrics

Metrics Innovation Strategy Monitoring

How AI and observability help to safeguard government networks from new threats

Dynatrace

MARCH 27, 2024

This is further exacerbated by the fact that a significant portion of their IT budgets are allocated to maintaining outdated legacy systems. By combining AI and observability, government agencies can create more intelligent and responsive systems that are better equipped to tackle the challenges of today and tomorrow.

Government

Government Network Artificial Intelligence Cloud

Distributed Network Service for Users Activity Limiting (Part 1)

DZone

FEBRUARY 8, 2022

Any service provider tries to reach several metrics in their activity. One group of these metrics is service quality. Quality metrics contain: The ratio of successfully processed requests. But what is the metric that shows service hardware monopolization by a group of users? Number of requests dependent curves.

Network

Network Hardware Metrics Processing

The road to observability with OpenTelemetry demo part 1: Identifying metrics and traces

Dynatrace

MAY 17, 2023

Anyone who’s concerned with developing, delivering, and operating software knows the importance of making software and the systems it runs on observable. That is, relying on metrics, logs, and traces to understand what software is doing and where it’s running into snags. OpenTelemetry is a free and open source take on observability.

Metrics

Metrics Open Source Traffic Cache

Best Practices for Scaling RabbitMQ

Scalegrid

FEBRUARY 24, 2025

Scaling RabbitMQ ensures your system can handle growing traffic and maintain high performance. Key takeaways: RabbitMQ improves scalability and fault tolerance in distributed systems by decoupling applications, enabling reliable message exchanges.

Best Practices

Best Practices Traffic Strategy Efficiency

Power dashboarding part 2: Dynatrace dashboard tutorial to gain better, faster answers using AI and formatting

Dynatrace

MARCH 31, 2025

You can either continue with the custom infrastructure metrics dashboard you created in Part I or use the dashboard we prepared here (Dynatrace login required). In our Dynatrace Dashboard tutorial, we want to add a chart that shows the bytes in and out per host over time to enhance visibility into network traffic.

Metrics

Metrics Infrastructure Network Best Practices

Five-nines availability: Always-on infrastructure delivers system availability during the holidays’ peak loads

Dynatrace

NOVEMBER 22, 2022

The nirvana state of system uptime at peak loads is known as “five-nines availability.” In its pursuit, IT teams hover over system performance dashboards hoping their preparations will deliver five nines—or even four nines—availability. How can IT teams deliver system availability under peak loads that will satisfy customers?

Infrastructure

Infrastructure Availability Systems Retail

What is MTTR? How mean time to repair helps define DevOps incident management

Dynatrace

NOVEMBER 1, 2022

DevOps and ITOps teams rely on incident management metrics such as mean time to repair (MTTR). These metrics help to keep a network system up and running. Other such metrics include uptime, downtime, number of incidents, time between incidents, and time to respond to and resolve an issue. So, what is MTTR?

DevOps

DevOps Artificial Intelligence Metrics Network

Get up to 300 new metrics out of the box with AWS supporting services (GA)

Dynatrace

DECEMBER 22, 2019

AWS offers a broad set of global, cloud-based services including computing, storage, networking, Internet of Things (IoT), and many others. To reduce your CloudWatch costs and throttling, you can now select from additional services and metrics to monitor. Get up to 300 new AWS metrics out of the box. Dynatrace news. Amazon EMR.

AWS

AWS Metrics IoT Storage

From syslog to AWS Firehose: Dynatrace log management innovations that enhance observability

Dynatrace

SEPTEMBER 5, 2024

Native support for syslog messages: Syslog messages are generated by default in Linux and Unix operating systems, security devices, network devices, and applications such as web servers and databases. Native support for syslog messages extends our infrastructure log support to all Linux/Unix systems and network devices.

Innovation

Innovation AWS Analytics Storage

Simplified observability for your SNMP devices

Dynatrace

MARCH 22, 2021

As a Network Engineer, you need to ensure the operational functionality, availability, efficiency, backup/recovery, and security of your company’s network. But manual configuration of observability for systems like this is nearly impossible. While not the newest protocol, SNMP is still actively used and popular.

Metrics

Metrics Network Infrastructure Traffic

Migrating Netflix to GraphQL Safely

The Netflix TechBlog

JUNE 14, 2023

So, we relied on higher-level metrics-based testing: AB Testing and Sticky Canaries. To determine customer impact, we could compare various metrics such as error rates, latencies, and time to render. The AB experiment results hinted that GraphQL’s correctness was not up to par with the legacy system.

Traffic

Traffic Latency Metrics Cache

Mastering Kubernetes with Dynatrace

Dynatrace

AUGUST 24, 2020

To make this possible, the application code should be instrumented with telemetry data for deep insights, including: Metrics to find out how the behavior of a system has changed over time. Traces help find the flow of a request through a distributed system. Logs represent event data in plain-text, structured or binary format.

Analytics

Analytics Infrastructure AWS Operating System

Business observability: From IT monitoring to driving digital transformation

Dynatrace

JULY 11, 2024

In today’s digital-first world, data resides across dozens of different IT systems, from critical business applications to the modern cloud platforms that underpin them. Often, these metrics are unable to even identify trends from past to present, never mind helping teams to predict future trends. Operational optimization.

Monitoring

Monitoring Retail Energy Cloud

Detecting RegreSSHion with Dynatrace (CVE-2024-6387)

Dynatrace

JULY 2, 2024

The Qualys Threat Research Unit (TRU) has discovered a Remote Unauthenticated Code Execution (RCE) vulnerability in OpenSSH server (sshd) in glibc-based Linux systems. This can result in a complete system takeover, malware installation, data manipulation, and the creation of backdoors for persistent access.

AWS

AWS Network Traffic Servers

Dynatrace supports Amazon Linux 2023 as an AWS launch partner

Dynatrace

MARCH 14, 2023

Available directly from the AWS Marketplace, Dynatrace provides full-stack observability and AI to help IT teams optimize the resiliency of their cloud applications from the user experience down to the underlying operating system, infrastructure, and services. How does Dynatrace help?

AWS

AWS Lambda Serverless Virtualization

The Dynatrace Platform Subscription model enables broad Infrastructure Monitoring

Dynatrace

JUNE 27, 2023

This subscription model offers the flexibility to deploy Dynatrace even more broadly to gain greater visibility into system performance, improve the ability to detect and prevent bottlenecks, and quickly detect and diagnose problems. With DPS, metrics are available as a pool per tenant.

Infrastructure

Infrastructure Monitoring Metrics Network

Observe syslog with Dynatrace ActiveGate, a secure, trusted edge component

Dynatrace

JULY 15, 2024

These include traditional on-premises network devices and servers for infrastructure applications like databases, websites, or email. A local endpoint in a protected network or DMZ is required to capture these messages. The key to success is making data in this complex ecosystem actionable, as many types of syslog producers exist.

Infrastructure

Infrastructure Network Azure Monitoring

Dynatrace adds support for AWS Transit Gateway with VPC Flow Logs

Dynatrace

JULY 25, 2022

This new service enhances the user visibility of network details with direct delivery of Flow Logs for Transit Gateway to your desired endpoint via Amazon Simple Storage Service (S3) bucket or Amazon CloudWatch Logs. AWS Transit Gateway is a service offering from Amazon Web Services that connects network resources via a centralized hub.

AWS

AWS Transportation Network Traffic

10 digital experience monitoring best practices

Dynatrace

JUNE 21, 2024

They collect data from multiple sources through real user monitoring, synthetic monitoring, network monitoring, and application performance monitoring systems. Align business and development teams’ input on what user experience metrics to measure to understand users’ most critical digital experience aspects.

Best Practices

Best Practices Monitoring Metrics Transportation

Trace, diagnose, resolve: Introducing the Infrastructure & Operations app for streamlined troubleshooting

Dynatrace

FEBRUARY 1, 2024

The complex interconnections in cloud-based systems make it crucial to always have a topological overview to understand dependencies. To overcome these complex issues, teams must quickly find root causes among numerous alerts and metrics. For the most granular metrics and network insights, OneAgent is the optimal choice.

Infrastructure

Infrastructure Metrics Network Monitoring

Different CPU Times: Unix/Linux ‘top’

DZone

DECEMBER 27, 2020

CPU consumption in Unix/Linux operating systems is studied using eight different metrics: User CPU time, System CPU time, nice CPU time, Idle CPU time, Waiting CPU time, Hardware Interrupt CPU time, Software Interrupt CPU time, Stolen CPU time. User CPU Time and System CPU Time.

Operating System

Operating System Hardware Network Systems

Python at Netflix

The Netflix TechBlog

APRIL 29, 2019

Open Connect: Open Connect is Netflix’s content delivery network (CDN). video streaming) takes place in the Open Connect network. Various software systems are needed to design, build, and operate this CDN infrastructure, and a significant number of them are written in Python. are you logged in?

Open Source

Open Source Network Infrastructure Big Data

Scale your enterprise cloud environment with enhanced AI-powered observability of all AWS services

Dynatrace

AUGUST 27, 2020

Dynatrace’s ability to ingest metrics from all 95 AWS services will be available within the next 60 days. The latest batch of services cover databases, networks, machine learning and computing. AWS SDK Metrics for Enterprise Support. AWS Systems Manager Run Command. Achieve full observability of all AWS services.

AWS

AWS Cloud IoT Database

The road to observability demo part 3: Collect, instrument, and analyze telemetry data automatically with Dynatrace

Dynatrace

MAY 17, 2023

Making applications observable—relying on metrics, logs, and traces to understand what software is doing and how it’s performing—has become increasingly important as workloads are shifting to multicloud environments. We also introduced our demo app and explained how to define the metrics and traces it uses. How can we verify that?

Metrics

Metrics Database Monitoring Network

Infrastructure Monitoring tools: 3 steps to evolve ITOps into AIOps

Dynatrace

JUNE 28, 2021

The result is a production paradox: with each new cloud service, container environment, and open-source solution, the number of technologies and dependencies increases, which makes it more difficult for ITOps teams to actively monitor systems at scale and address performance problems as they emerge. Stage 2: Service monitoring. Worth noting?

Infrastructure

Infrastructure Monitoring Artificial Intelligence Open Source

Get seamless insights into Nutanix clusters with Dynatrace

Dynatrace

NOVEMBER 9, 2023

Get ready for Nutanix insights: Here’s how Dynatrace helps. The extension comes with a comprehensive set of essential metrics that can quickly identify the root causes of performance issues, saving time and minimizing disruptions. With Dynatrace, Nutanix metrics can be leveraged for various use cases.

Virtualization

Virtualization Storage Metrics Monitoring
