Migrating Critical Traffic At Scale with No Downtime — Part 1 Shyam Gala, Javier Fernandez-Ivern, Anup Rokkam Pratap, Devang Shah Hundreds of millions of customers tune into Netflix every day, expecting an uninterrupted and immersive streaming experience.
Dynatrace Managed is intrinsically highly available as it stores three copies of all events, user sessions, and metrics across its cluster nodes. Our Premium High Availability comes with the following features: Active-active deployment model for optimum hardware utilization. Minimized cross-data center network traffic.
Migrating Critical Traffic At Scale with No Downtime — Part 2 Shyam Gala, Javier Fernandez-Ivern, Anup Rokkam Pratap, Devang Shah Picture yourself enthralled by the latest episode of your beloved Netflix series, delighting in an uninterrupted, high-definition streaming experience. This is where large-scale system migrations come into play.
Cloud service providers (CSPs) share carbon footprint data with their customers, but the focus of these tools is on reporting and trending, effectively targeting sustainability officers and business leaders. The certification results are now publicly available.
However, your responsibilities might change or expand, and you need to work with unfamiliar data sets. Activate Davis AI to analyze charts within seconds: Davis AI can help you expand your dashboards and dive deeper into your available data to extract additional information.
With all the data collected and powered by our Davis AI-driven causation engine, Dynatrace automatically identifies slowdowns in your applications and services and points you to their root cause. Ensure high-quality network traffic by tracking DNS requests out of the box. Network services visibility (DNS, NTP, Active Directory).
For retail organizations, peak traffic can be a mixed blessing. While high-volume traffic often boosts sales, it can also compromise uptime. The nirvana state of system uptime at peak loads is known as “five-nines availability.” But is five-nines availability attainable? It helps to look at the downtime each level permits per year: 90% availability (one nine) already allows more than 36 days of downtime annually.
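For context, the downtime budget for each availability level follows directly from the percentage. The sketch below is a plain Python calculation (assuming a 365.25-day year), not anything from the article itself:

```python
# Downtime allowed per year for each "nines" availability level.
MINUTES_PER_YEAR = 365.25 * 24 * 60

for label, availability in [("one nine", 0.90), ("two nines", 0.99),
                            ("three nines", 0.999), ("four nines", 0.9999),
                            ("five nines", 0.99999)]:
    downtime_minutes = (1 - availability) * MINUTES_PER_YEAR
    print(f"{availability:.3%} ({label}): {downtime_minutes:10.1f} minutes of downtime per year")
```

Five nines works out to roughly five minutes of downtime per year, which is why it is such a demanding target at peak loads.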
Accurately Reflecting Production Behavior A key part of our solution is insight into production behavior, which necessitates that our requests to the endpoint result in traffic to the real service functions, mimicking the same pathways the traffic would take if it came from the usual callers. Time Travel to validate ahead of time.
Managing High Availability (HA) in your PostgreSQL hosting is essential to ensure your database deployment clusters maintain exceptional uptime and strong operational performance, so your data is always available to your application. It reduces downtime and supports business continuity.
In the final post of this series, we will review the last solution, Patroni by Zalando, and compare all three at the end so you can determine which high availability framework is best for your PostgreSQL hosting deployment. Managing High Availability in PostgreSQL – Part I: PostgreSQL Automatic Failover. Patroni for PostgreSQL.
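As a taste of how Patroni exposes cluster roles, here is a minimal sketch (not from the article) that polls Patroni's REST API to see which node currently leads; the host names and the default port 8008 are assumptions about a typical deployment.

```python
import requests

# Hypothetical Patroni nodes; adjust hosts and port for your deployment.
NODES = ["pg-node-1", "pg-node-2", "pg-node-3"]

for host in NODES:
    try:
        # Patroni's REST API returns the node's role and state as JSON.
        info = requests.get(f"http://{host}:8008/patroni", timeout=2).json()
        print(f"{host}: role={info.get('role')} state={info.get('state')}")
    except requests.RequestException as exc:
        print(f"{host}: unreachable ({exc})")
```

A load balancer or health check can use the same endpoints to route writes only to the current leader.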
Scaling RabbitMQ ensures your system can handle growing traffic and maintain high performance. You'll also learn strategies for maintaining data safety and managing node failures so your RabbitMQ setup is always up to the task. This decoupling is crucial in modern architectures where scalability and fault tolerance are paramount.
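As one illustration of tolerating node failures while scaling, the sketch below (using the pika client; the host and queue names are made up) declares a replicated quorum queue and publishes a persistent message:

```python
import pika

# Connect to a RabbitMQ broker (hostname is a placeholder).
connection = pika.BlockingConnection(pika.ConnectionParameters(host="rabbitmq.example.com"))
channel = connection.channel()

# Quorum queues are replicated across cluster nodes, so a single node failure
# does not lose messages.
channel.queue_declare(queue="orders",
                      durable=True,
                      arguments={"x-queue-type": "quorum"})

# Persistent delivery mode asks the broker to write the message to disk.
channel.basic_publish(exchange="",
                      routing_key="orders",
                      body=b"order created",
                      properties=pika.BasicProperties(delivery_mode=2))
connection.close()
```

The queue type and durability settings are where much of the data-safety story lives; throughput scaling is then mostly a matter of adding nodes and consumers.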
The subject line said: “Success Story: Major Issue in single AWS Frankfurt Availability Zone!” The problem started at 1:24 PM PDT, with the services becoming available again about three hours later. The impact stayed low because the automatic traffic redirect kicked in so quickly.
From the moment a Netflix film or series is pitched and long before it becomes available on Netflix, it goes through many phases. Operational Reporting is a reporting paradigm specialized in covering high-resolution, low-latency data sets, serving detailed day-to-day activities¹ and processes of a business domain.
Log data—the most verbose form of observability data, complementing other standardized signals like metrics and traces—is especially critical. As cloud complexity grows, it brings more volume, velocity, and variety of log data. When trying to address this challenge, your cloud architects will likely choose Amazon Data Firehose.
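To make that concrete, here is a minimal, hedged sketch of shipping log lines to an Amazon Data Firehose delivery stream with boto3; the stream name and log records are placeholders, and the stream's destination is assumed to be configured separately.

```python
import json
import boto3

firehose = boto3.client("firehose")

log_lines = [
    {"level": "ERROR", "service": "checkout", "message": "payment timeout"},
    {"level": "INFO", "service": "checkout", "message": "order completed"},
]

# Firehose buffers the records and delivers them to the configured destination
# (for example, an observability backend or S3).
firehose.put_record_batch(
    DeliveryStreamName="example-log-stream",  # placeholder stream name
    Records=[{"Data": (json.dumps(line) + "\n").encode()} for line in log_lines],
)
```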
Every image you hover over isn't just a visual placeholder; it's a critical data point that fuels our sophisticated personalization engine. This nuanced integration of data and technology empowers us to offer bespoke content recommendations. This queue ensures we are consistently capturing raw events from our global user base.
In today’s data-driven world, businesses across various industry verticals increasingly leverage the Internet of Things (IoT) to drive efficiency and innovation. IoT is transforming how industries operate and make decisions, from agriculture to mining, energy utilities, and traffic management.
Rajiv Shringi Vinay Chella Kaidan Fullerton Oleksii Tkachuk Joey Lynch Introduction As Netflix continues to expand and diversify into various sectors like Video on Demand and Gaming, the ability to ingest and store vast amounts of temporal data — often reaching petabytes — with millisecond access latency has become increasingly vital.
This platform has evolved from supporting studio applications to data science applications and machine-learning applications that discover asset metadata and build various data facts. Hence, we built a data pipeline that can extract the existing assets' metadata and process it specifically for each new use case.
Dynatrace and the Dynatrace Intelligent Observability Platform have added support for the newly introduced delivery of Amazon VPC Flow Logs to Amazon Kinesis Data Firehose. This support enables customers to define specific endpoint delivery of real-time streaming data to platforms such as Dynatrace. What is VPC Flow Logs? Why Dynatrace?
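For readers curious how that delivery is wired up on the AWS side, a hedged boto3 sketch follows; the VPC ID and Firehose ARN are placeholders, and the required IAM permissions are left out.

```python
import boto3

ec2 = boto3.client("ec2")

# Publish VPC Flow Logs directly to an Amazon Kinesis Data Firehose delivery stream.
ec2.create_flow_logs(
    ResourceIds=["vpc-0123456789abcdef0"],   # placeholder VPC ID
    ResourceType="VPC",
    TrafficType="ALL",
    LogDestinationType="kinesis-data-firehose",
    LogDestination="arn:aws:firehose:us-east-1:123456789012:deliverystream/vpc-flow-logs",
)
```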
To do this, we devised a novel way to simulate the projected traffic weeks ahead of launch by building upon the traffic migration framework described here. New content or national events may drive brief spikes, but, by and large, traffic is usually smoothly increasing or decreasing.
Andreas Andreakis, Ioannis Papapanagiotou Overview Change-Data-Capture (CDC) allows capturing committed changes from a database in real time and propagating those changes to downstream consumers [1][2]. No locks on tables are ever acquired, which prevents impacting write traffic on the source database. Writing events to any output.
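The article describes Netflix's own CDC framework; purely as an illustration of lock-free, log-based change capture, the sketch below uses psycopg2's logical replication support to stream committed changes from PostgreSQL. The DSN, slot name, and handler are assumptions, and the replication slot is assumed to already exist with a decoding output plugin.

```python
import psycopg2
import psycopg2.extras

# Connect with a logical replication connection (DSN is a placeholder).
conn = psycopg2.connect("dbname=app user=cdc_reader",
                        connection_factory=psycopg2.extras.LogicalReplicationConnection)
cur = conn.cursor()

# Stream committed changes from an existing replication slot; no table locks
# are taken, so writes on the source database are unaffected.
cur.start_replication(slot_name="cdc_slot", decode=True)

def handle_change(msg):
    # Forward the decoded change event to a downstream consumer (here: stdout).
    print(msg.payload)
    msg.cursor.send_feedback(flush_lsn=msg.data_start)

cur.consume_stream(handle_change)
```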
In a digital-first world, site reliability engineers and IT data analysts face numerous challenges with data quality and reliability in their quest for cloud control. Increasingly, organizations seek to address these problems using AI techniques as part of their exploratory data analytics practices.
Having released this functionality in an Early Adopter Release with OneAgent version 1.173 and Dynatrace version 1.174 back in August 2019, we’re now happy to announce the General Availability of OneAgent full-stack monitoring for Linux on the IBM Z platform, sometimes informally referred to as Z/Linux. Release details.
Testing Strategies: A Summary Two key factors determined our testing strategies: functional vs. non-functional requirements, and idempotency. If we were testing functional requirements like data accuracy, and if the request was idempotent, we relied on Replay Testing. In such cases, we were not testing for response data but overall behavior.
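As a rough illustration of Replay Testing (the hostnames and endpoint below are hypothetical, not Netflix's actual tooling), an idempotent request can simply be sent to both the legacy and the migrated service and the responses compared:

```python
import requests

LEGACY = "https://legacy.example.com"      # hypothetical hosts
MIGRATED = "https://migrated.example.com"

def replay_and_compare(path: str, params: dict) -> bool:
    """Send the same idempotent request to both services and compare results."""
    old = requests.get(f"{LEGACY}{path}", params=params, timeout=5)
    new = requests.get(f"{MIGRATED}{path}", params=params, timeout=5)
    return old.status_code == new.status_code and old.json() == new.json()

# Example: verify that a read-only lookup returns identical data on both paths.
assert replay_and_compare("/titles/12345", {"fields": "name,rating"})
```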
Central to this infrastructure is our use of multiple online distributed databases such as Apache Cassandra , a NoSQL database known for its high availability and scalability. Second, developers had to constantly re-learn new data modeling practices and common yet critical data access patterns.
We recently announced Dynatrace Live Debugger, which gives developers unprecedented access to real-time data and runtime behavior insights. How can you tell if an algorithm or data source changed or a new feature flag worked? Test data collection: Accurate test data can mean life or death.
In this three-part blog series, we introduced a High Availability (HA) Framework for MySQL hosting in Part I, and discussed the details of MySQL semisynchronous replication in Part II. Now in Part III, we review how the framework handles some of the important MySQL failure scenarios and recovers to ensure high availability.
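For orientation, semisynchronous replication is enabled through server plugins and system variables. The sketch below uses mysql-connector-python; the connection details are placeholders, and the variable and plugin names are the pre-MySQL-8.0.26 spellings, so check them against your server version.

```python
import mysql.connector

# Enable semisynchronous replication on the source (placeholder credentials).
source = mysql.connector.connect(host="mysql-primary", user="admin", password="secret")
cur = source.cursor()
cur.execute("INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so'")
cur.execute("SET GLOBAL rpl_semi_sync_master_enabled = 1")
# Wait up to 10 seconds for a replica acknowledgment before falling back to async.
cur.execute("SET GLOBAL rpl_semi_sync_master_timeout = 10000")
```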
The data locked in your log files can be a goldmine for your application developers, operations teams, and your enterprise as a whole. However, it can be complicated, expensive, or even impossible to set up robust observability that makes use of this data. Log format inconsistency makes it a challenge to access critical data.
Over the last year, Dynatrace extended its AI-powered log monitoring capabilities by providing support for all log data sources. We added monitoring and analytics for log streams from Kubernetes and multicloud platforms like AWS, GCP, and Azure, as well as the most widely used open-source log data frameworks.
Large enterprise environments are often distributed across multiple data centers around the world. Unnecessary traffic between such data centers can result in wasted resources, unpredictable downtimes, and lost business. Optimizing traffic routing. Preventing unrelated traffic between data centers and regions.
Welcome back to our power dashboarding blog series, data enthusiasts! Query your data with natural language: Davis CoPilot is an excellent virtual assistant that helps you create queries using natural language. Exploring your data when you know your desired outcome but are unfamiliar with the available data.
How do you get more value from petabytes of exponentially exploding, increasingly heterogeneous data? The short answer: The three pillars of observability—logs, metrics, and traces—converging on a data lakehouse. To solve this problem, Dynatrace launched Grail, its causational data lakehouse, in 2022.
OpenTelemetry, the open source observability tool, has become the go-to standard for instrumenting custom applications to collect observability telemetry data. For this third and final part of our series, we saved the best for last: How you can enhance telemetry data even more and with less effort on your end with Dynatrace OneAgent.
RabbitMQ is designed for flexible routing and message reliability, while Kafka handles high-throughput event streaming and real-time data processing. Both serve distinct purposes, from managing message queues to ingesting large data volumes.
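To contrast the two models with the RabbitMQ example earlier, here is a minimal sketch of the Kafka side (using the kafka-python client; the broker address and topic are placeholders), where producers append events to a partitioned log rather than routing messages through exchanges:

```python
from kafka import KafkaProducer

# Producers append events to a partitioned, replicated log (broker is a placeholder).
producer = KafkaProducer(bootstrap_servers="kafka.example.com:9092")

for i in range(1000):
    # Messages with the same key land in the same partition, preserving their order.
    producer.send("clickstream", key=str(i % 10).encode(), value=b'{"event": "page_view"}')

producer.flush()
```

Consumers then read the log at their own pace, which is what makes Kafka a better fit for high-throughput streaming, while RabbitMQ's exchanges and bindings give finer-grained routing per message.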
Over the last two months, we've monitored key sites and applications across industries that have been receiving surges in traffic, including government, health insurance, retail, banking, and media. Readers who share our privacy concerns, please note: all the data we monitor is publicly available.
In my last blog, I provided an example of this happening, where traffic spiked to four times the usual incoming volume. These are all interesting metrics from a marketing point of view, and they are also highly interesting to you, as they allow you to engage with the teams that are driving the traffic against your IT system.
by Jun He, Akash Dwivedi, Natallia Dzenisenka, Snehal Chennuru, Praneeth Yenugutala, Pawan Dixit At Netflix, Data and Machine Learning (ML) pipelines are widely used and have become central for the business, representing diverse use cases that go beyond recommendations, predictions and data transformations.
The massive volumes of log data associated with a breach have made cybersecurity forensics a complicated, costly problem to solve. As organizations adopt more cloud-native technologies, observability data—telemetry from applications and infrastructure, including logs, metrics, and traces—and security data are converging.
The F5 BIG-IP Local Traffic Manager (LTM) is an application delivery controller (ADC) that ensures the availability, security, and optimal performance of network traffic flows. Business-critical applications typically rely on F5 for availability and success. It serves as a crucial component between applications and users.
Incremental Backups: Speeds up recovery and makes data management more efficient for active databases. Faster Write Operations: Enhancements to the write-ahead log (WAL) processing double PostgreSQL's ability to handle concurrent transactions, improving uptime and data accessibility.
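A rough sketch of how an incremental backup flows in PostgreSQL 17 is shown below, wrapped in Python for consistency with the other examples here; the paths are placeholders, and the exact flag names are my reading of the PostgreSQL 17 tooling, so verify them against your version's documentation.

```python
import subprocess

# Take a full base backup first (paths and connection details are placeholders).
subprocess.run(["pg_basebackup", "-D", "/backups/full", "--checkpoint=fast"], check=True)

# Later, take an incremental backup relative to the full backup's manifest.
# (Assumed PostgreSQL 17 syntax; check the docs for your release.)
subprocess.run(["pg_basebackup", "-D", "/backups/incr1",
                "--incremental=/backups/full/backup_manifest"], check=True)

# To restore, combine the full and incremental backups into one data directory.
subprocess.run(["pg_combinebackup", "-o", "/restore/data",
                "/backups/full", "/backups/incr1"], check=True)
```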
This opens the door to auto-scalable applications, which effortlessly match the demands of rapidly growing and varying user traffic. Containers can be replicated or deleted on the fly to meet varying end-user traffic. Application teams and Kubernetes/Swarm platform operators alike depend on detailed monitoring data.
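A minimal sketch of that kind of auto-scaling, using the official Kubernetes Python client to attach a HorizontalPodAutoscaler to a hypothetical Deployment named "web" (the names and thresholds are assumptions):

```python
from kubernetes import client, config

config.load_kube_config()
autoscaling = client.AutoscalingV1Api()

# Scale the "web" Deployment between 2 and 20 replicas, targeting 70% CPU utilization.
hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="web-hpa"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="web"),
        min_replicas=2,
        max_replicas=20,
        target_cpu_utilization_percentage=70,
    ),
)
autoscaling.create_namespaced_horizontal_pod_autoscaler(namespace="default", body=hpa)
```

Once the autoscaler is in place, the monitoring data mentioned above is what tells you whether the replica counts and CPU target actually track real end-user traffic.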
Even when the staging environment closely mirrors the production environment, achieving a complete replication of all potential scenarios, such as simulating extremely high traffic volumes to assess software performance, remains challenging. This can lead to a lack of insight into how the code will behave when exposed to heavy traffic.