Migrating Critical Traffic At Scale with No Downtime — Part 1, by Shyam Gala, Javier Fernandez-Ivern, Anup Rokkam Pratap, and Devang Shah. Hundreds of millions of customers tune into Netflix every day, expecting an uninterrupted and immersive streaming experience. This approach has a handful of benefits.
To this end, we developed a Rapid Event Notification System (RENO) to support use cases that require server initiated communication with devices in a scalable and extensible manner. In this blog post, we will give an overview of the Rapid Event Notification System at Netflix and share some of the learnings we gained along the way.
Chances are, you're a seasoned expert who visualizes meticulously identified key metrics across several sophisticated charts. For example, if you're monitoring network traffic and the average over the past 7 days is 500 Mbps, the threshold will adapt to this baseline.
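As an illustration of that idea, here is a minimal Python sketch that derives an adaptive threshold from a rolling 7-day baseline; the function name, the 20% tolerance, and the sample values are assumptions for illustration, not the product's actual algorithm.

```python
from statistics import mean

def adaptive_threshold(samples_mbps, tolerance=0.2):
    """Derive an alert threshold from a rolling baseline.

    samples_mbps: traffic samples (Mbps) from the past 7 days.
    tolerance: fraction above the baseline that still counts as normal.
    """
    baseline = mean(samples_mbps)       # e.g. ~500 Mbps in the example above
    return baseline * (1 + tolerance)   # alert only when traffic exceeds this

# With a ~500 Mbps weekly baseline, the threshold adapts to ~600 Mbps.
week_of_samples = [480, 510, 495, 505, 520, 490, 500]
print(adaptive_threshold(week_of_samples))  # -> 600.0
```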
The first part of this blog post briefly explores the integration of SLO events with AI. The AI bases its analysis on the related events, and an issue arose due to the detection parameters (threshold, period, analysis interval, detection frequency, etc.). By analogy, envision an apple tree from which an apple drops.
It should also be possible to analyze data in context to proactively address events, optimize performance, and remediate issues in real time. This enables proactive changes such as resource autoscaling, traffic shifting, or preventative rollbacks of bad code deployments ahead of time.
To extend Dynatrace diagnostic visibility into network traffic, we’ve added out-of-the-box DNS request tracking to our infrastructure monitoring capabilities. While our competitors only provide generic traffic monitoring without artificial intelligence, Dynatrace automatically analyzes DNS-related anomalies.
The Challenge of Title Launch Observability: As engineers, we're wired to track system metrics like error rates, latencies, and CPU utilization, but what about the metrics that matter to a title's success? Using the source of truth: Logs serve as a reliable source of truth by providing a comprehensive record of system events.
Collecting Raw Impression Events As Netflix members explore our platform, their interactions with the user interface spark a vast array of raw events. These events are promptly relayed from the client side to our servers, entering a centralized event processing queue.
To do this, we devised a novel way to simulate the projected traffic weeks ahead of launch by building upon the traffic migration framework described here. New content or national events may drive brief spikes, but, by and large, traffic is usually smoothly increasing or decreasing.
They need event-driven automation that not only responds to events and triggers but also analyzes and interprets the context to deliver precise and proactive actions. These initial automation endeavors paved the way for greater advancements, leading to the next evolution of event-driven automation.
Scaling RabbitMQ ensures your system can handle growing traffic and maintain high performance. Optimizing RabbitMQ performance through strategies such as keeping queues short, enabling lazy queues, and monitoring health checks is essential for maintaining system efficiency and effectively managing high traffic loads.
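As a rough illustration of those tuning strategies, the sketch below declares a durable lazy queue and caps consumer prefetch using the pika client; the broker address, queue name, and prefetch value are placeholders rather than recommended settings.

```python
import pika  # assumes a reachable RabbitMQ broker and the pika client library

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()

# Declare a durable "lazy" queue: messages are moved to disk as early as possible,
# which keeps memory usage stable under heavy traffic at the cost of some throughput.
channel.queue_declare(
    queue="orders",
    durable=True,
    arguments={"x-queue-mode": "lazy"},
)

# Limit unacknowledged deliveries per consumer so queues stay short.
channel.basic_qos(prefetch_count=100)

connection.close()
```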
So, we relied on higher-level metrics-based testing: AB Testing and Sticky Canaries. The control group’s traffic utilized the legacy Falcor stack, while the experiment population leveraged the new GraphQL client and was directed to the GraphQL Shim. The Replay Tester tool samples raw traffic streams from Mantis.
RabbitMQ is designed for flexible routing and message reliability, while Kafka is optimized for high-throughput event streaming, excelling in real-time analytics and large-scale data ingestion. What is Apache Kafka?
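To make the throughput-oriented design concrete, here is a hedged kafka-python producer sketch; the broker address, topic name, event payload, and batching settings are illustrative assumptions only.

```python
import json
from kafka import KafkaProducer  # assumes the kafka-python package and a reachable broker

# Kafka favors high-throughput streaming: batch and compress records on the producer side.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    linger_ms=20,             # wait briefly so records form larger batches
    compression_type="gzip",  # trade CPU for network throughput
    acks="all",               # trade a little latency for durability
)

producer.send("playback-events", {"event": "play", "title_id": 123})
producer.flush()
```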
In my last blog, I provided an example of this happening, where traffic spiked to four times the usual incoming volume. These are all interesting metrics from a marketing point of view, and they are also highly interesting to you, as they allow you to engage with the teams that are driving the traffic against your IT system.
This opens the door to auto-scalable applications, which effortlessly match the demands of rapidly growing and varying user traffic. Containers can be replicated or deleted on the fly to meet varying end-user traffic. Event logs for ad-hoc analysis and auditing. In production, containers are easy to replicate. What is Docker?
Each SNMP-enabled device provides access to its state and performance metrics in a simple and robust way that allows Dynatrace to fetch the metrics and run them through Davis®, our AI causation engine. Events and alerts. Some SNMP-enabled devices are designed to report events on their own with so-called SNMP traps.
To emit a run queue latency metric, we leveraged three eBPF hooks: sched_wakeup, sched_wakeup_new, and sched_switch. They let us identify when a process is ready to run and is waiting for CPU time. During this event, we generate a timestamp and store it in an eBPF hash map using the process ID as the key.
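A condensed sketch of that pattern, modeled loosely on bcc's runqlat tool and assuming the bcc Python bindings, looks roughly like this; it illustrates the timestamp-in-hash-map idea rather than the exact production code described above.

```python
import time
from bcc import BPF  # assumes the bcc toolkit is installed

bpf_text = r"""
#include <uapi/linux/ptrace.h>

BPF_HASH(start, u32, u64);      // pid -> timestamp when the task became runnable
BPF_HISTOGRAM(runq_usecs);      // log2 histogram of run queue latency (microseconds)

static __always_inline int mark_runnable(u32 pid) {
    u64 ts = bpf_ktime_get_ns();
    start.update(&pid, &ts);
    return 0;
}

TRACEPOINT_PROBE(sched, sched_wakeup)     { return mark_runnable(args->pid); }
TRACEPOINT_PROBE(sched, sched_wakeup_new) { return mark_runnable(args->pid); }

TRACEPOINT_PROBE(sched, sched_switch) {
    // The task being switched in has just finished waiting on the run queue.
    u32 pid = args->next_pid;
    u64 *tsp = start.lookup(&pid);
    if (tsp == 0)
        return 0;                             // wakeup not seen; skip
    u64 delta = bpf_ktime_get_ns() - *tsp;
    runq_usecs.increment(bpf_log2l(delta / 1000));
    start.delete(&pid);
    return 0;
}
"""

b = BPF(text=bpf_text)
print("Tracing run queue latency... hit Ctrl-C to print the histogram.")
try:
    time.sleep(3600)
except KeyboardInterrupt:
    pass
b["runq_usecs"].print_log2_hist("usecs")
```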
A quick canary test was free of errors and showed lower latency, which is expected given that our standard canary setup routes an equal amount of traffic to both the baseline running on 4xl and the canary on 12xl. What’s worse, average latency degraded by more than 50%, with both CPU and latency patterns becoming more “choppy.”
Problems API v2 now includes event evidence data as part of the event details. Custom events for alerting using the Build tab and advanced query mode now apply the same metric dimension limits that are applied to Code-tab-based configurations. Events API v2 has been updated. Local self-monitoring metrics.
Furthermore, with this update you can: Get insights into connection pool metrics in context with the applications and services that use them. On the overview page you'll find the metrics aggregated across all detected pools.
VPC Flow Logs is a feature that gives you the capability to capture more robust IP traffic data that traverses your VPCs. A full list of metrics can be found here and includes dimensions such as the following: Packets. Problems have defined lifespans and are updated in real time with all incoming events and findings. Log Events.
It also enhances syslog messages with additional context and optimizes network traffic, improving overall system resilience and security. Logs are immediately available for troubleshooting, security investigations, and auditing, becoming integral to the platform alongside traces and metrics.
In cloud-native environments, there can also be dozens of additional services and functions all generating data from user-driven events. Metrics, logs , and traces make up three vital prongs of modern observability. As a result, logging tools record large event volumes in real time. What else did the initial event affect?
Existing data was updated to be backward compatible without impacting running production traffic. After reading the asset ids using one of these approaches, an event is created per asset id and processed synchronously or asynchronously, depending on the use case. Generally, this flow is used for small datasets.
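A hypothetical sketch of that dispatch step, with made-up function names and a made-up size threshold, might look like this:

```python
from concurrent.futures import ThreadPoolExecutor

def process_event(asset_id: str) -> str:
    # Placeholder for the real per-asset work (backfill, re-index, etc.).
    return f"processed {asset_id}"

def dispatch(asset_ids, async_threshold: int = 100):
    """Fan out one event per asset id: inline for small datasets, pooled otherwise."""
    if len(asset_ids) <= async_threshold:
        return [process_event(a) for a in asset_ids]      # small dataset: synchronous
    with ThreadPoolExecutor(max_workers=8) as pool:       # larger dataset: asynchronous fan-out
        return list(pool.map(process_event, asset_ids))

print(dispatch(["asset-1", "asset-2", "asset-3"]))
```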
Dynatrace is fully committed to the OpenTelemetry community and to the seamless integration of OpenTelemetry data , including ingestion of custom metrics , into the Dynatrace open analytics platform. With Dynatrace OneAgent you also benefit from support for traffic routing and traffic control.
However, understanding the performance of different application types requires an emphasis on different performance metrics, that is, key performance metrics. For many traditional web applications , User action duration is considered the best metric available for web-performance optimization.
VPC Flow Logs is an Amazon service that enables IT pros to capture information about the IP traffic that traverses network interfaces in a virtual private cloud, or VPC. By default, each record captures a network internet protocol (IP), a destination, and the source of the traffic flow that occurs within your environment.
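For illustration, a minimal Python parser for a default-format flow log record could look like the sketch below; the field list follows the default record layout, and the sample line is fabricated.

```python
# Field order of a default-format VPC Flow Log record (space separated).
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]

def parse_flow_record(line: str) -> dict:
    """Map a whitespace-separated flow log line onto named fields."""
    return dict(zip(FIELDS, line.split()))

record = parse_flow_record(
    "2 123456789012 eni-0a1b2c3d 10.0.1.5 10.0.2.9 443 49152 6 20 4249 "
    "1620000000 1620000060 ACCEPT OK"
)
print(record["srcaddr"], "->", record["dstaddr"], record["action"])
```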
Rexed, Singh, and Stull outline the importance of metrics, traces, logs, events, and the role they play in achieving full-context Kubernetes observability and driving automated responses in hybrid and multi-cloud environments. To ensure everything runs smoothly, they employ the Dynatrace automated monitoring and observability solution.
Organizations that have transitioned to agile software development strategies (including the adoption of a DevOps culture and continuous delivery automation) enforce automated solutions for such decision making—or at the very least, use automation in the gathering of release-quality metrics. Events ingestion. Kubernetes metadata.
A metric crossed a threshold. Metrics are a key part of understanding application health. But sometimes you can have too many metrics, too many graphs, and too many dashboards. An application is part of an ecosystem that can be subtly influenced by property changes or radically altered by region-wide events. By Andrei U.
Certain SLOs can help organizations get started on measuring and delivering metrics that matter. With this objective, the app ensures that users experience real-time feedback and immediate updates when logging workouts, recording sets and reps, or tracking performance metrics. The Apdex score of 0.85
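For reference, the Apdex formula behind a score like 0.85 is simple to reproduce; the sample counts below are illustrative only.

```python
def apdex(satisfied: int, tolerating: int, frustrated: int) -> float:
    """Apdex = (satisfied + tolerating / 2) / total samples."""
    total = satisfied + tolerating + frustrated
    return (satisfied + tolerating / 2) / total

# Illustrative numbers: 75 satisfied, 20 tolerating, and 5 frustrated user actions
# yield the 0.85 score mentioned above: (75 + 20/2) / 100 = 0.85.
print(apdex(75, 20, 5))  # -> 0.85
```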
For example, to handle traffic spikes and pay only for what they use. Serverless applications are composed of event-driven functions that run on demand in response to triggers from various sources, such as HTTP requests, messages, or timers. Scale automatically based on the demand and traffic patterns.
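A minimal sketch of such an event-driven function, written in the style of an AWS Lambda handler with an assumed event shape, might look like this; the platform invokes it on demand (HTTP request, queue message, or timer) and scales it with traffic.

```python
import json

def handler(event, context):
    """Hypothetical on-demand function: parse the triggering event and respond."""
    payload = json.loads(event.get("body", "{}"))
    # ... do the actual work here: write a record, emit a metric, enqueue a job ...
    return {
        "statusCode": 200,
        "body": json.dumps({"received": payload.get("action", "unknown")}),
    }
```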
Open-source metric sources automatically map to our Smartscape model for AI analytics. We’ve just enhanced Dynatrace OneAgent with an open metric API. Here’s a quick overview of what you can achieve now that the Dynatrace Software Intelligence Platform has been extended to ingest third-party metrics. Dynatrace news.
In the Device Management Platform, this is achieved by having device updates be event-sourced through the control plane to the cloud so that NTS will always have the most up-to-date information about the devices available for testing. The RAE is configured to be effectively a router that devices under test (DUTs) are connected to.
Look for timeout events Exploitation attempts for this vulnerability can be identified by many lines of “Timeout before authentication” in the logs. Using the VPC flow log default pattern available in DPL Architect, we can extract the meaningful fields to see only the network traffic targeting the SSH port.
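Outside of DPL, a rough Python stand-in for that first check might scan an sshd log for those lines and tally them per source address; the log path and the exact message format are assumptions.

```python
import re
from collections import Counter

# Count "Timeout before authentication" lines per source IP to spot repeated attempts.
TIMEOUT_RE = re.compile(r"Timeout before authentication.*?(\d{1,3}(?:\.\d{1,3}){3})")

def timeout_attempts(log_path: str) -> Counter:
    hits = Counter()
    with open(log_path) as fh:
        for line in fh:
            match = TIMEOUT_RE.search(line)
            if match:
                hits[match.group(1)] += 1
    return hits

for ip, count in timeout_attempts("/var/log/auth.log").most_common(10):
    print(f"{ip}: {count} timeouts before authentication")
```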
Even worse, if your service logs record critical events such as errors in a non-standard way, those errors might go unnoticed by your observability team. Whether a web server, mobile app, backend service, or other custom application, log data can provide you with deep insights into your software’s operations and events.
RUM gathers information on a variety of performance metrics. Data collected on page load events, for example, can include navigation start (when performance begins to be measured), request start (right before the user makes a request from the server), and speed index metrics (measure page load speed). Tools may be limited.
To effectively address such warning signs, organizations need to focus on putting observability data into context—mapping and visualizing relationships and dependencies within all collected telemetry data—not only traces, metrics, and logs. With Dynatrace OneAgent you also benefit from support for traffic routing and traffic control.
By implementing service-level objectives, teams can avoid collecting and checking a huge amount of metrics for each service. First, it helps to understand that applications and all the services and infrastructure that support them generate telemetry data based on traffic from real users. So how can teams start implementing SLOs?
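A minimal, illustrative example of such an objective, computed from real-user request outcomes rather than dozens of per-service metrics, might look like this (the objective and request counts are made up):

```python
def slo_compliance(total_requests: int, failed_requests: int) -> float:
    """Percentage of requests that succeeded over the evaluation window."""
    return 100.0 * (total_requests - failed_requests) / total_requests

OBJECTIVE = 99.5  # percent of requests that must succeed

actual = slo_compliance(total_requests=1_200_000, failed_requests=3_600)
status = "met" if actual >= OBJECTIVE else "breached"
print(f"{actual:.2f}% vs objective {OBJECTIVE}% -> {status}")
```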
You also might be required to capture syslog messages from cloud services on AWS, Azure, and Google Cloud related to resource provisioning, scaling, and security events. Without seeing syslog data in the context of your infrastructure, metrics, and transaction traces, you’re slowed down by manual work with siloed data.
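As one hedged example, an application can forward such events to a syslog endpoint using Python's standard library; the host, port, facility defaults, and message below are placeholders.

```python
import logging
import logging.handlers

# Forward application events to a syslog collector (UDP port 514 by default)
# so they land alongside the rest of your syslog data.
handler = logging.handlers.SysLogHandler(address=("syslog.example.internal", 514))
handler.setFormatter(logging.Formatter("myservice: %(levelname)s %(message)s"))

logger = logging.getLogger("provisioning")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("scaled instance group web-frontend from 4 to 6 nodes")
```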
IoT is transforming how industries operate and make decisions, from agriculture to mining, energy utilities, and traffic management. Both methods allow you to ingest and process raw data and metrics. They enable real-time tracking and enhanced situational awareness for air traffic control and collision avoidance systems.
Exploratory data analytics is an analysis method that uses visualizations, including graphs and charts, to help IT teams investigate emerging data trends and circumvent issues, such as unexpected traffic spikes or performance degradations. Start by asking yourself what’s there, whether it’s logs, metrics, or traces.
Log data—the most verbose form of observability data, complementing other standardized signals like metrics and traces—is especially critical. Take the example of Amazon Virtual Private Cloud (VPC) flow logs, which provide insights into the IP traffic of your network interfaces. Managing this change is difficult.
Each tenant gets its own e-commerce site deployed on a shared Kubernetes cluster, isolated through separate namespaces and additional traffic isolation. There was not much traffic during the weekend, but as Monday came along, Dynatrace started sending alerts about a high HTTP failure rate across almost every tenant on the backend service.