Event and Traffic - Technology Performance Pulse

How To Design For High-Traffic Events And Prevent Your Website From Crashing

Smashing Magazine

JANUARY 7, 2025

By Saad Khan. This article is sponsored by Cloudways. Product launches and sales typically attract large volumes of traffic.

Traffic

Traffic Website Design Cache

Rapid Event Notification System at Netflix

The Netflix TechBlog

FEBRUARY 18, 2022

To this end, we developed a Rapid Event Notification System (RENO) to support use cases that require server initiated communication with devices in a scalable and extensible manner. In this blog post, we will give an overview of the Rapid Event Notification System at Netflix and share some of the learnings we gained along the way.

Systems

Systems Traffic Architecture Mobile

Migrating Critical Traffic At Scale with No Downtime - Part 1

The Netflix TechBlog

MAY 4, 2023

By Shyam Gala, Javier Fernandez-Ivern, Anup Rokkam Pratap, and Devang Shah. Hundreds of millions of customers tune into Netflix every day, expecting an uninterrupted and immersive streaming experience. This approach has a handful of benefits.

Traffic

Traffic Latency Tuning Systems

Black Friday traffic exposes gaps in observability strategies

Dynatrace

SEPTEMBER 2, 2022

What’s the problem with Black Friday traffic? But that’s difficult when Black Friday traffic brings overwhelming and unpredictable peak loads to retailer websites and exposes the weakest points in a company’s infrastructure, threatening application performance and user experience. Why Black Friday traffic threatens customer experience.

Traffic

Traffic Strategy Retail Ecommerce

Efficient SLO event integration powers successful AIOps

Dynatrace

APRIL 5, 2024

The first part of this blog post briefly explores the integration of SLO events with AI. Consequently, the AI is founded upon the related events, and due to the detection parameters (threshold, period, analysis interval, frequent detection, etc), an issue arose. By analogy, envision an apple tree where an apple drops.

Efficiency

Efficiency Traffic Tuning Metrics

AI-powered DNS request tracking extends infrastructure observability for high quality network traffic

Dynatrace

OCTOBER 1, 2020

To extend Dynatrace diagnostic visibility into network traffic, we’ve added out-of-the-box DNS request tracking to our infrastructure monitoring capabilities. While our competitors only provide generic traffic monitoring without artificial intelligence, Dynatrace automatically analyzes DNS-related anomalies.

Traffic

Traffic Network Infrastructure Artificial Intelligence

The keys to selecting a platform for end-to-end observability

Dynatrace

DECEMBER 2, 2024

It should also be possible to analyze data in context to proactively address events, optimize performance, and remediate issues in real time. This enables proactive changes such as resource autoscaling, traffic shifting, or preventative rollbacks of bad code deployment ahead of time.

Artificial Intelligence

Artificial Intelligence DevOps Architecture Cloud

Better dashboarding with Dynatrace Davis AI: Instant meaningful insights

Dynatrace

JANUARY 21, 2025

For example, if you’re monitoring network traffic and the average over the past 7 days is 500 Mbps, the threshold will adapt to this baseline. An anomaly will be identified if traffic suddenly drops below 200 Mbps or above 800 Mbps, helping you identify unusual spikes or drops.
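The adaptive threshold described above can be sketched roughly as follows. This is only an illustration and not Dynatrace's actual algorithm: the function names, the fixed 300 Mbps tolerance (chosen so the band matches the 200 to 800 Mbps range in the example), and the sample data are all assumptions.

```python
from statistics import mean

def adaptive_band(history_mbps, tolerance_mbps=300.0):
    """Return (lower, upper) bounds centred on the mean of recent samples.

    tolerance_mbps is a hypothetical fixed band width, not a Dynatrace setting.
    """
    baseline = mean(history_mbps)
    return baseline - tolerance_mbps, baseline + tolerance_mbps

def is_anomaly(value_mbps, history_mbps, tolerance_mbps=300.0):
    """Flag a reading that falls outside the band around the baseline."""
    lower, upper = adaptive_band(history_mbps, tolerance_mbps)
    return value_mbps < lower or value_mbps > upper

# Seven made-up daily averages whose mean is 500 Mbps.
last_7_days = [480, 510, 495, 520, 505, 490, 500]

print(adaptive_band(last_7_days))    # band around the 500 Mbps baseline
print(is_anomaly(150, last_7_days))  # sudden drop
print(is_anomaly(600, last_7_days))  # ordinary fluctuation
```

Because the band is recomputed from the most recent history, a sustained shift in traffic moves the baseline and the thresholds follow it, which is the adaptive behavior the excerpt describes.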

Traffic

Traffic Metrics Analytics Monitoring

Chaos Engineering With Litmus: A CNCF Incubating Project

DZone

FEBRUARY 6, 2025

We have developed a microservices architecture platform that encounters sporadic system failures when faced with heavy traffic events. System resilience stands as the key requirement for e-commerce platforms during scaling operations to keep services operational and deliver performance excellence to users.

Engineering

Engineering Traffic Architecture Network

DevOps automation: From event-driven automation to answer-driven automation [with causal AI]

Dynatrace

JULY 24, 2023

They need event-driven automation that not only responds to events and triggers but also analyzes and interprets the context to deliver precise and proactive actions. These initial automation endeavors paved the way for greater advancements, leading to the next evolution of event-driven automation.

DevOps

DevOps Traffic Efficiency Servers

Ensuring the Successful Launch of Ads on Netflix

The Netflix TechBlog

JUNE 1, 2023

To do this, we devised a novel way to simulate the projected traffic weeks ahead of launch by building upon the traffic migration framework described here. New content or national events may drive brief spikes, but, by and large, traffic is usually smoothly increasing or decreasing.

Traffic

Traffic Best Practices Systems Testing

Title Launch Observability at Netflix Scale

The Netflix TechBlog

MARCH 4, 2025

Accurately Reflecting Production Behavior: A key part of our solution is insights into production behavior, which necessitates that our requests to the endpoint result in traffic to the real service functions that mimics the same pathways the traffic would take if it came from the usual callers. We call this capability TimeTravel.

Traffic

Traffic Strategy Entertainment Innovation

Introducing Impressions at Netflix

The Netflix TechBlog

FEBRUARY 14, 2025

Collecting Raw Impression Events: As Netflix members explore our platform, their interactions with the user interface spark a vast array of raw events. These events are promptly relayed from the client side to our servers, entering a centralized event processing queue.

Tuning

Tuning Latency Efficiency Storage

Event-Based Autoscaling: Ensuring Smooth Operations on Your Peak Days

DZone

JANUARY 21, 2024

In today’s world, companies often find themselves grappling with unpredictable surges in workloads, especially during pivotal events. This incident serves as a stark illustration of insufficient infrastructure planning during a critical event.

Retail

Retail Games Latency Traffic

Title Launch Observability at Netflix Scale

The Netflix TechBlog

DECEMBER 17, 2024

Using the source of truth: Logs serve as a reliable source of truth by providing a comprehensive record of system events. To detect issues proactively, we need to simulate traffic and predict system behavior in advance. Once artificial traffic is generated, discarding the response object and relying solely on logs becomes inefficient.

Traffic

Traffic Scalability Strategy Monitoring

Best Practices for Scaling RabbitMQ

Scalegrid

FEBRUARY 24, 2025

Scaling RabbitMQ ensures your system can handle growing traffic and maintain high performance. Optimizing RabbitMQ performance through strategies such as keeping queues short, enabling lazy queues, and monitoring health checks is essential for maintaining system efficiency and effectively managing high traffic loads.

Best Practices

Best Practices Traffic Strategy Scalability

RabbitMQ vs. Kafka: Key Differences

Scalegrid

FEBRUARY 6, 2025

RabbitMQ is designed for flexible routing and message reliability, while Kafka handles high-throughput event streaming and real-time data processing. Kafka is optimized for high-throughput event streaming, excelling in real-time analytics and large-scale data ingestion. What is Apache Kafka?

Latency

Latency Analytics Architecture Storage

Automate CI/CD pipelines with Dynatrace: Part 2, Deploy stage

Dynatrace

NOVEMBER 28, 2023

Even when the staging environment closely mirrors the production environment, achieving a complete replication of all potential scenarios, such as simulating extremely high traffic volumes to assess software performance, remains challenging. This can lead to a lack of insight into how the code will behave when exposed to heavy traffic.

Traffic

Traffic Best Practices Strategy Engineering

5 powerful use cases beyond debugging for Dynatrace Live Debugger

Dynatrace

MARCH 25, 2025

Load generators simulate traffic. Or maybe you want to correlate an event with other events in your system. Performance benchmarking: Performance benchmarking is one of the unresolved mysteries of software engineering. In many ways, it’s more of an art than a science. Sometimes, you need heavyweight tools.

Benchmarking

Benchmarking Code Open Source Engineering

Mastering Scalability and Performance: A Deep Dive Into Azure Load Balancing Options

DZone

JANUARY 8, 2024

This article provides an overview of Azure's load balancing options, encompassing Azure Load Balancer, Azure Application Gateway, Azure Front Door Service, and Azure Traffic Manager. Each of these services addresses specific use cases, offering diverse functionalities to meet the demands of modern applications.

Azure

Azure Scalability Traffic Performance

COVID-19 and Digital Services: An Action Plan for the Unexpected

Dynatrace

APRIL 22, 2020

While most government agencies and commercial enterprises have digital services in place, the current volume of usage — including traffic to critical employment, health and retail/eCommerce services — has reached levels that many organizations have never seen before or tested against. So how do you know what to prepare for?

Traffic

Traffic Ecommerce Retail Government

Migrating Netflix to GraphQL Safely

The Netflix TechBlog

JUNE 14, 2023

The control group’s traffic utilized the legacy Falcor stack, while the experiment population leveraged the new GraphQL client and was directed to the GraphQL Shim. This helped us successfully migrate 100% of the traffic on the mobile homepage canvas to GraphQL in 6 months. After validating performance, we slowly built up scope.

Traffic

Traffic Latency Metrics Cache

Seeing through hardware counters: a journey to threefold performance increase

The Netflix TechBlog

NOVEMBER 9, 2022

A quick canary test was free of errors and showed lower latency, which is expected given that our standard canary setup routes an equal amount of traffic to both the baseline running on 4xl and the canary on 12xl. What’s worse, average latency degraded by more than 50%, with both CPU and latency patterns becoming more “choppy.”

Hardware

Hardware Cache Performance Latency

Kubernetes vs Docker: What’s the difference?

Dynatrace

SEPTEMBER 29, 2021

This opens the door to auto-scalable applications, which effortlessly match the demands of rapidly growing and varying user traffic. Containers can be replicated or deleted on the fly to meet varying end-user traffic. Event logs for ad-hoc analysis and auditing. In production, containers are easy to replicate. What is Docker?

Open Source

Open Source DevOps Traffic Cloud

Six causes of major software outages–And how to avoid them

Dynatrace

AUGUST 8, 2024

As recent events have demonstrated, major software outages are an ever-present threat in our increasingly digital world. Possible scenarios: A Distributed Denial of Service (DDoS) attack overwhelms servers with traffic, making a website or service unavailable.

Software

Software Infrastructure Network

2023 Black Friday and Cyber Monday retail and e-commerce IT performance observations

Dynatrace

NOVEMBER 30, 2023

What was once an onslaught of consumer traffic between Black Friday and Cyber Monday has turned into a weeklong event, with most retailers offering deals well ahead of Black Friday. Where retailers can look to start tying together third-party services is through their logs and events. However, logs alone won’t solve everything.

Retail

Retail Social Media Performance Benchmarking

A Dynatrace champions guide to get ahead of digital marketing campaigns

Dynatrace

JULY 1, 2020

In my last blog, I provided an example of this happening, whereby the traffic spiked and quadrupled the usual incoming traffic. These are all interesting metrics from a marketing point of view, and also highly interesting to you as they allow you to engage with the teams that are driving the traffic against your IT system.

Traffic

Traffic Analytics Metrics Servers

Get quick alerts and avoid false positives with the new baseline setting

Dynatrace

MARCH 26, 2020

This means that Dynatrace alerts more quickly when an error spike occurs in a high-traffic service (compared to a low-traffic service where statistical confidence is lower). Close issues sooner with shorter event timeouts. The screenshot below illustrates the timeout reduction for better understanding: Summary.
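One way to see why statistical confidence grows with traffic is to compare confidence intervals for the same observed error rate at different request volumes. The sketch below uses the Wilson score interval; it is not Dynatrace's baselining method, and the function names, the 3% baseline, and the request counts are hypothetical.

```python
from math import sqrt

def wilson_interval(errors, requests, z=1.96):
    """Wilson score interval for an observed error rate (z=1.96 is about 95%)."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    p = errors / requests
    denom = 1 + z * z / requests
    centre = (p + z * z / (2 * requests)) / denom
    half = z * sqrt(p * (1 - p) / requests + z * z / (4 * requests ** 2)) / denom
    return centre - half, centre + half

def should_alert(errors, requests, baseline_rate):
    """Alert only when even the lower bound of the interval exceeds the baseline."""
    lower, _ = wilson_interval(errors, requests)
    return lower > baseline_rate

BASELINE = 0.03  # hypothetical normal error rate

# Both services show a 5% error rate, but at very different volumes.
low_traffic = should_alert(errors=5, requests=100, baseline_rate=BASELINE)
high_traffic = should_alert(errors=500, requests=10_000, baseline_rate=BASELINE)
print(low_traffic, high_traffic)
```

With only 100 requests the interval is wide enough to still include the baseline, so more data is needed before alerting; with 10,000 requests the interval is narrow and the spike is clear right away.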

Traffic

Traffic Monitoring Efficiency Strategy

Data Reprocessing Pipeline in Asset Management Platform @Netflix

The Netflix TechBlog

MARCH 10, 2023

Existing data got updated to be backward compatible without impacting the existing running production traffic. After reading the asset ids using one of the ways, an event is created per asset id to be processed synchronously or asynchronously based on the use case. Generally, this flow is used for small datasets.

Traffic

Traffic Media Processing Design

Simplify troubleshooting with AI-powered insights into connection pool performance (Early Adopter)

Dynatrace

DECEMBER 9, 2020

You can even integrate Dynatrace into your CI/CD pipeline using the Events API. This allows you to create a deployment event that contains all important details each time a new version is released. Davis then watches for any new problems that might be related and associates them with the deployment event.

Traffic

Traffic Performance Database Metrics

Introducing Netflix TimeSeries Data Abstraction Layer

The Netflix TechBlog

OCTOBER 8, 2024

Building on these foundational abstractions, we developed the TimeSeries Abstraction — a versatile and scalable solution designed to efficiently store and query large volumes of temporal event data with low millisecond latencies, all in a cost-effective manner across various use cases. Let’s dive into the various aspects of this abstraction.

Latency

Latency Storage Traffic Infrastructure

Unlock end-to-end observability insights with Dynatrace PurePath 4 seamless integration of OpenTracing for Java

Dynatrace

DECEMBER 9, 2020

To address potentially high numbers of requests during online shopping events like Singles Day or Black Friday, it’s crucial that this online shop have a memory storage strategy that allows for speed, scaling, and resilience of all microservices, especially the shopping cart service.

Java

Java Traffic Architecture Strategy

Network performance monitoring top of mind for CloudOps teams

Dynatrace

MAY 19, 2023

Network traffic growth is the main reason for increasing spending, largely because of the adoption of hybrid and multi-cloud architectures. What are the issues with traffic losses and connectivity drops? “Without the network, nothing will happen,” Ziemianowicz said. This starts with a different approach to data aggregation.

Network

Network Monitoring Performance Traffic

Detecting RegreSSHion with Dynatrace (CVE-2024-6387)

Dynatrace

JULY 2, 2024

Look for timeout events: Exploitation attempts for this vulnerability can be identified by many lines of “Timeout before authentication” in the logs. Using the VPC flow log default pattern available in DPL Architect, we can extract the meaningful fields to see only the network traffic targeting the SSH port.

AWS

AWS Network Traffic Servers

Dynatrace adds support for AWS Transit Gateway with VPC Flow Logs

Dynatrace

JULY 25, 2022

VPC Flow Logs is a feature that gives you the capability to capture more robust IP traffic data that traverses your VPCs. Problems have defined lifespans and are updated in real time with all incoming events and findings. Log Events. What is VPC Flow Logs. Once Davis® detects a problem it lists the issue on your Problems feed.

AWS

AWS Transportation Network Traffic

Service level objectives: 5 SLOs to get started

Dynatrace

JUNE 1, 2023

Traffic: This SLO measures the amount of traffic or workload an application receives, either in terms of requests per second or data transfer rate. The traffic SLO targets the website’s ability to handle a high volume of transactional activity during periods of high demand. The Apdex score of 0.85
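For readers unfamiliar with Apdex, the standard formula counts requests at or below a target threshold T as satisfied, requests between T and 4T as tolerating (worth half), and the rest as frustrated. The sketch below is a minimal illustration; the 0.5-second threshold and the sample timings are invented so that the result comes out at 0.85, and they are not taken from the article.

```python
def apdex(response_times_s, threshold_s):
    """Apdex = (satisfied + tolerating / 2) / total samples."""
    if not response_times_s:
        raise ValueError("at least one sample is required")
    satisfied = sum(1 for t in response_times_s if t <= threshold_s)
    tolerating = sum(
        1 for t in response_times_s if threshold_s < t <= 4 * threshold_s
    )
    return (satisfied + tolerating / 2) / len(response_times_s)

# Hypothetical response times in seconds: 7 satisfied, 3 tolerating, 0 frustrated.
samples = [0.1, 0.2, 0.3, 0.4, 0.5, 0.2, 0.3, 0.8, 1.5, 2.0]
score = apdex(samples, threshold_s=0.5)
print(round(score, 2))
```

A score of 1.0 means every request met the target, so an SLO built on Apdex sets a floor that the score must stay above during the evaluation window.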

Latency

Latency Website Traffic DevOps

Dynatrace Managed release notes version 1.230

Dynatrace

NOVEMBER 19, 2021

Problems API v2 now includes event evidence data as part of the event details. Custom events for alerting using the Build tab and advanced query mode now apply the same metric dimension limits that are applied to Code-tab-based configurations. Events API v2 has been updated. You can override these at the monitor level.

Metrics

Metrics Monitoring Traffic Servers

Simplified observability for your SNMP devices

Dynatrace

MARCH 22, 2021

Events and alerts. Some SNMP-enabled devices are designed to report events on their own with so-called SNMP traps. This allows for almost instant notification as soon as an important event is reported. It’s essential to focus on those events that provide useful information or report potential device problems.

Metrics

Metrics Network Infrastructure Traffic

Kubernetes OOMKilled troubleshooting: Diagnosing out-of-memory issues automatically

Dynatrace

DECEMBER 5, 2022

Each tenant gets its own e-commerce site deployed on a shared Kubernetes cluster, isolated through separate namespaces and additional traffic isolation. There was not much traffic during the weekend, but as Monday came along, Dynatrace started sending alerts about a high HTTP failure rate across almost every tenant on the backend service.

Java

Java Traffic Education Testing

Towards a Reliable Device Management Platform

The Netflix TechBlog

AUGUST 30, 2021

In the Device Management Platform, this is achieved by having device updates be event-sourced through the control plane to the cloud so that NTS will always have the most up-to-date information about the devices available for testing. The RAE is configured to be effectively a router that devices under test (DUTs) are connected to.

Latency

Latency Traffic Transportation Cloud

DBLog: A Generic Change-Data-Capture Framework

The Netflix TechBlog

DECEMBER 17, 2019

In databases like MySQL and PostgreSQL, transaction logs are the source of CDC events. Some of DBLog’s features are: Processes captured log events in-order. Interleaves log with dump events, by taking dumps in chunks. No locks on tables are ever acquired, which prevents impacting write traffic on the source database.

Database

Database Traffic Transportation Open Source

Innovate. Collaborate. Deliver. Our digital hub is live

Dynatrace

APRIL 9, 2020

As the world socially distances, we are seeing significant increases in website traffic as people turn to their phones and devices to connect with loved ones, buy online, distance learn, work remotely, and continuously keep up with the news. We are hopeful that the world can, and will, quickly return to normal.

Innovation

Innovation Traffic Website Monitoring

What is log management? How to tame distributed cloud system complexities

Dynatrace

SEPTEMBER 8, 2022

In cloud-native environments, there can also be dozens of additional services and functions all generating data from user-driven events. Event logging and software tracing help application developers and operations teams understand what’s happening throughout their application flow and system.

Cloud

Cloud Systems Analytics DevOps

What is application security monitoring?

Dynatrace

MARCH 20, 2024

Continuously monitoring application behavior, network traffic, and system logs allows teams to identify abnormal or suspicious activities that could indicate a security breach. Incident detection and response: In the event of a security incident, there is a well-defined incident response process to investigate and mitigate the issue.

Monitoring

Monitoring Analytics Traffic Best Practices
