AWS, Systems and Traffic - Technology Performance Pulse

Breaking AWS Lambda: Chaos Engineering for Serverless Devs

DZone

MARCH 24, 2025

Our "serverless" order processing system built on AWS Lambda and API Gateway was humming along, handling 1,000 transactions/minute. A sudden spike in traffic caused Lambda timeouts, API Gateway threw 5xx errors, and customers started tweeting, "Why can't I check out?!" Then, disaster struck.

Lambda

Lambda Serverless AWS Engineering

Rapid Event Notification System at Netflix

The Netflix TechBlog

FEBRUARY 18, 2022

To this end, we developed a Rapid Event Notification System (RENO) to support use cases that require server initiated communication with devices in a scalable and extensible manner. In this blog post, we will give an overview of the Rapid Event Notification System at Netflix and share some of the learnings we gained along the way.

Systems

Systems Traffic Architecture Mobile

How Netflix Accurately Attributes eBPF Flow Logs

The Netflix TechBlog

APRIL 8, 2025

As noted in our previous blog post, our initial attribution approach relied on Sonar, an internal IP address tracking service that emits an event whenever an IP address in Netflix's AWS VPCs is assigned or unassigned to a workload. Netflix's cloud microservices operate across multiple AWS regions.

AWS

AWS Traffic Network Programming

Choosing the Appropriate AWS Load Balancer: ALB vs. NLB

DZone

SEPTEMBER 14, 2023

With the advent of cloud computing, managing network traffic and ensuring optimal performance have become critical aspects of system architecture. Amazon Web Services (AWS), a leading cloud service provider, offers a suite of load balancers to manage network traffic effectively for applications running on its platform.

AWS

AWS Traffic Network Architecture

AWS observability: AWS monitoring best practices for resiliency

Dynatrace

NOVEMBER 22, 2021

Visibility into system activity and behavior has become increasingly critical given organizations’ widespread use of Amazon Web Services (AWS) and other serverless platforms. These challenges make AWS observability a key practice for building and monitoring cloud-native applications. What is AWS observability?

Best Practices

Best Practices AWS Monitoring Serverless

Dynatrace Cost & Carbon Optimization certified for accuracy and transparency

Dynatrace

MARCH 5, 2025

Integration with existing systems and processes: Integration with existing IT infrastructure, observability solutions, and workflows often requires significant investment and customization. The certification focuses on accuracy and transparency in calculating greenhouse gas (GHG) emissions for AWS, Azure, GCP, and on-premises host instances.

Energy

Energy Analytics Traffic Cloud

From syslog to AWS Firehose: Dynatrace log management innovations that enhance observability

Dynatrace

SEPTEMBER 5, 2024

We’re excited to announce several log management innovations, including native support for Syslog messages, seamless integration with AWS Firehose, an agentless approach using Kubernetes Platform Monitoring solution with Fluent Bit, a new out-of-the-box ingest dashboard, and OpenPipeline ingest improvements.

Innovation

Innovation AWS Analytics Storage

Ensuring the Successful Launch of Ads on Netflix

The Netflix TechBlog

JUNE 1, 2023

To do this, we devised a novel way to simulate the projected traffic weeks ahead of launch by building upon the traffic migration framework described here. New content or national events may drive brief spikes, but, by and large, traffic is usually smoothly increasing or decreasing.

Traffic

Traffic Best Practices Systems Testing

Dynatrace adds support for AWS Transit Gateway with VPC Flow Logs

Dynatrace

JULY 25, 2022

Dynatrace has added support for the newly introduced Amazon Virtual Private Cloud (VPC) Flow Logs for AWS Transit Gateway. What is AWS Transit Gateway? AWS Transit Gateway is a service offering from Amazon Web Services that connects network resources via a centralized hub. What is VPC Flow Logs?

AWS

AWS Transportation Network Traffic

Using Dynatrace to master the 5 pillars of the AWS Well-Architected Framework (Part 1)

Dynatrace

NOVEMBER 24, 2020

As organizations plan, migrate, transform, and operate their workloads on AWS, it’s vital that they follow a consistent approach to evaluating both the on-premises architecture and the upcoming design for cloud-based architecture. AWS 5-pillars. Dynatrace and AWS. through our AWS integrations and monitoring support.

AWS

AWS Artificial Intelligence Best Practices Lambda

Netflix at AWS re:Invent 2019

The Netflix TechBlog

NOVEMBER 22, 2019

by Shefali Vyas Dalal AWS re:Invent is a couple weeks away and our engineers & leaders are thrilled to be in attendance yet again this year! Netflix shares how Amazon EC2 Auto Scaling allows its infrastructure to automatically adapt to changing traffic patterns in order to keep its audience entertained and its costs on target.

AWS

AWS Entertainment Open Source Benchmarking

Five-nines availability: Always-on infrastructure delivers system availability during the holidays’ peak loads

Dynatrace

NOVEMBER 22, 2022

For retail organizations, peak traffic can be a mixed blessing. While high-volume traffic often boosts sales, it can also compromise uptimes. The nirvana state of system uptime at peak loads is known as “five-nines availability.” How can IT teams deliver system availability under peak loads that will satisfy customers?

Infrastructure

Infrastructure Availability Systems Retail

Supporting Diverse ML Systems at Netflix

The Netflix TechBlog

MARCH 7, 2024

The Machine Learning Platform (MLP) team at Netflix provides an entire ecosystem of tools around Metaflow, an open source machine learning infrastructure framework we started, to empower data scientists and machine learning practitioners to build and manage a variety of ML systems. ETL workflows), as well as downstream (e.g.

Systems

Systems Media Cache Open Source

Migrating Netflix to GraphQL Safely

The Netflix TechBlog

JUNE 14, 2023

The control group’s traffic utilized the legacy Falcor stack, while the experiment population leveraged the new GraphQL client and was directed to the GraphQL Shim. The AB experiment results hinted that GraphQL’s correctness was not up to par with the legacy system. The Replay Tester tool samples raw traffic streams from Mantis.

Traffic

Traffic Latency Metrics Cache

Detecting RegreSSHion with Dynatrace (CVE-2024-6387)

Dynatrace

JULY 2, 2024

The Qualys Threat Research Unit (TRU) has discovered a Remote Unauthenticated Code Execution (RCE) vulnerability in OpenSSH server (sshd) in glibc-based Linux systems. This can result in a complete system takeover, malware installation, data manipulation, and the creation of backdoors for persistent access.

AWS

AWS Network Traffic Servers

Simplify complex cloud-native environments with AI-driven observability

Dynatrace

OCTOBER 3, 2024

In the latest enhancements of Dynatrace Log Management and Analytics, Dynatrace extends coverage for Native Syslog support: Use Dynatrace ActiveGate to automatically add context and optimize network traffic to your Syslog messages. AWS: Automate your AWS infrastructure with actions across EC2, S3, Lambda, and more.

Cloud

Cloud Lambda AWS Analytics

Architected for resiliency: How Dynatrace withstands data center outages

Dynatrace

JUNE 15, 2021

The fact is, Reliability and Resiliency must be rooted in the architecture of a distributed system. The subject line said: “Success Story: Major Issue in single AWS Frankfurt Availability Zone!” Fact #1: AWS EC2 outage properly documented. Ready to learn more? Then read on!

AWS

AWS Traffic Architecture Azure

Bring Your Own Cloud (BYOC) vs. Dedicated Hosting at ScaleGrid

Scalegrid

APRIL 16, 2020

Each of these models is suitable for production deployments and high traffic applications, and is available for all of our supported databases, including MySQL, PostgreSQL, Redis™ and MongoDB® database (Greenplum® database coming soon). AWS, Azure.

Cloud

Cloud Azure AWS Database

Ciao Milano! – An AWS Region is coming to Italy!

All Things Distributed

NOVEMBER 13, 2018

Today, I am happy to announce our plans to open a new AWS Region in Italy! The AWS Europe (Milan) Region is the 25th AWS Region that we've announced globally. It's the sixth AWS Region in Europe, joining existing regions in France, Germany, Ireland, the UK, and the new Region that we recently announced in Sweden.

AWS

AWS Energy Automotive Traffic

Build and operate multicloud FaaS with enhanced, intelligent end-to-end observability

Dynatrace

APRIL 25, 2023

For example, to handle traffic spikes and pay only for what they use. Observability is essential to ensure the reliability, security and quality of any software system. Scale automatically based on the demand and traffic patterns. The elasticity of serverless services helps organizations scale as needed.

Serverless

Serverless Lambda Azure AWS

AWS EKS Monitoring as a Self-Service with Dynatrace

Dynatrace

SEPTEMBER 17, 2019

AWS EKS for Integration and Production. When focusing on the LanguageController service we learn that it’s currently deployed in three pods across three EKS nodes across two AWS Availability Zones (AZ). 4 AWS EFS monitoring. Their technology stack looks like this: Spring Boot-based Microservices. NGINX as an API Gateway.

AWS

AWS Monitoring Ecommerce Lambda

Dynatrace adds support for VPC Flow Logs to Kinesis Data Firehose

Dynatrace

SEPTEMBER 7, 2022

VPC Flow Logs is an Amazon service that enables IT pros to capture information about the IP traffic that traverses network interfaces in a virtual private cloud, or VPC. By default, each record captures a network internet protocol (IP), a destination, and the source of the traffic flow that occurs within your environment.

Traffic

Traffic AWS Network Cloud

Expanding the AWS Cloud – Introducing the AWS Europe (Stockholm) Region

All Things Distributed

DECEMBER 12, 2018

In April 2017, Amazon Web Services announced that it would launch a new AWS infrastructure Region in Sweden. Today, I'm happy to announce that the AWS Europe (Stockholm) Region, our 20th Region globally, is now generally available for use by customers. Public sector.

AWS

AWS Cloud Games Serverless

Observe syslog with Dynatrace ActiveGate, a secure, trusted edge component

Dynatrace

JULY 15, 2024

You also might be required to capture syslog messages from cloud services on AWS, Azure, and Google Cloud related to resource provisioning, scaling, and security events. One change to send syslog to Dynatrace You can now use the syslog ingestion endpoint on Dynatrace Environment ActiveGate for performant network and system monitoring.

Infrastructure

Infrastructure Network Azure Monitoring

Expanding the Cloud – The Second AWS GovCloud (US) Region, AWS GovCloud (US-East)

All Things Distributed

NOVEMBER 12, 2018

Today, I'm happy to announce that the AWS GovCloud (US-East) Region, our 19th global infrastructure Region, is now available for use by customers in the US. With this launch, AWS now provides 57 Availability Zones, with another 12 zones and four Regions in Bahrain, Cape Town, Hong Kong SAR, and Stockholm expected to come online by 2020.

AWS

AWS Healthcare Cloud Government

Maximizing Performance of AWS RDS for MySQL with Dedicated Log Volumes

Percona

DECEMBER 11, 2023

A quick configuration change may do the trick in improving the performance of your AWS RDS for MySQL instance. This segregation facilitates optimized I/O operations, preventing potential bottlenecks and enhancing overall system performance. Amazon RDS extends support for DLVs across various database engines: MariaDB: 10.6.7

AWS

AWS Benchmarking Performance Traffic

Transform log data into actionable metrics and have Davis AI do the work for you

Dynatrace

MARCH 16, 2022

We added monitoring and analytics for log streams from Kubernetes and multicloud platforms like AWS, GCP, and Azure, as well as the most widely used open-source log data frameworks. Key information about your system and applications comes from logs.

Metrics

Metrics Lambda Infrastructure Monitoring

Dynatrace Managed release notes version 1.216

Dynatrace

MAY 6, 2021

To improve management of node capabilities, we added Enable/disable Web UI traffic operation for cluster node in Cluster Mission Control UI. Operating systems support. Future Dynatrace Managed operating systems support changes. Linux: Amazon Linux AMI 2017.x.

Operating System

Operating System AWS Metrics Storage

Save Money in AWS RDS: Don’t Trust the Defaults

Percona

MAY 1, 2023

Want to save money on your AWS RDS bill? I’ll show you some MySQL settings to tune to get better performance, and cost savings, with AWS RDS. Default settings can help you get started quickly – but they can also cost you performance and a higher cloud bill at the end of the month. After that, things went back to normal.

AWS

AWS Hardware Storage Tuning

What is cloud migration?

Dynatrace

SEPTEMBER 30, 2021

Cloud services platforms like AWS, Azure, and GCP are reshaping how organizations deliver value to their customers, making cloud migration an increasingly attractive option for running applications. In case of a spike in traffic, you can automatically spin up more resources, often in a matter of seconds. Inconsistent performance.

Cloud

Cloud Traffic Best Practices Strategy

Expanding the Cloud – An AWS Region is coming to Hong Kong

All Things Distributed

JUNE 20, 2017

Today, I am very excited to announce our plans to open a new AWS Region in Hong Kong! The new region will give Hong Kong-based businesses, government organizations, non-profits, and global companies with customers in Hong Kong, the ability to leverage AWS technologies from data centers in Hong Kong.

AWS

AWS Logistics Cloud Social Media

Evolving Regional Evacuation

The Netflix TechBlog

SEPTEMBER 23, 2019

In order to achieve this level of availability, we leverage an N+1 architecture where we treat Amazon Web Services (AWS) regions as fault domains, allowing us to withstand single region failures. So, if we evacuate South American traffic to North America, demand for CE and Android DRM won’t grow uniformly.

Traffic

Traffic Metrics Mobile Government

Python at Netflix

The Netflix TechBlog

APRIL 29, 2019

takes place in Amazon Web Services (AWS), whereas everything that happens afterwards (i.e., Various software systems are needed to design, build, and operate this CDN infrastructure, and a significant number of them are written in Python. We have created streams of events from a number of systems that get unified into a single tool.

Open Source

Open Source Network Infrastructure Big Data

Breaking data silos: Liquid Reply’s journey to custom API observability with OpenTelemetry and Dynatrace

Dynatrace

JUNE 20, 2023

Data is proliferating in separate silos from containers and Kubernetes to open source APIs and software to serverless compute services, such as AWS and Azure. In this case, the bank group could not rely solely on Google Cloud Trace because they needed to collect traces and monitor the applications across all their systems.

Open Source

Open Source Google Serverless Azure

Telltale: Netflix Application Monitoring Simplified

The Netflix TechBlog

AUGUST 13, 2020

Our streaming teams need a monitoring system that enables them to quickly diagnose and remediate problems; seconds count! Our Node team needs a system that empowers a small group to operate a large fleet. It usually has dependencies, talks to other services, and lives in different AWS regions. Regional traffic evacuations.

Monitoring

Monitoring Tuning Traffic Metrics

Välkommen till Stockholm – An AWS Region is coming to the Nordics

All Things Distributed

APRIL 4, 2017

Today, I am very excited to announce our plans to open a new AWS Region in the Nordics! The new region will give Nordic-based businesses, government organisations, non-profits, and global companies with customers in the Nordics, the ability to leverage the AWS technology infrastructure from data centers in Sweden.

AWS

AWS Airlines Latency Games

Consistent caching mechanism in Titus Gateway

The Netflix TechBlog

NOVEMBER 3, 2022

As the number of Titus users increased over the years, the load and pressure on the system increased substantially. cell): Titus Job Coordinator is a leader elected process managing the active state of the system. For example, a batch workflow orchestration system may create multiple jobs which are part of a single workflow execution.

Cache

Cache Latency Traffic Systems

Building Netflix’s Distributed Tracing Infrastructure

The Netflix TechBlog

OCTOBER 19, 2020

which is difficult when troubleshooting distributed systems. Troubleshooting a session in Edgar When we started building Edgar four years ago, there were very few open-source distributed tracing systems that satisfied our needs. Investigating a video streaming failure consists of inspecting all aspects of a member account.

Infrastructure

Infrastructure Transportation Storage Open Source

Looking back at 10 years of compartmentalization at AWS

All Things Distributed

MARCH 26, 2018

At AWS, we don't mark many anniversaries. A concept that has changed infrastructure architecture is now at the core of both AWS and customer reliability and operations. Powering the virtual instances and other resources that make up the AWS Cloud are real physical data centers with AWS servers in them.

AWS

AWS Latency Lambda Architecture

Orchestrating Data/ML Workflows at Scale With Netflix Maestro

The Netflix TechBlog

OCTOBER 18, 2022

Due to its popularity, the number of workflows managed by the system has grown exponentially. We started seeing signs of scale issues, like: Slowness during peak traffic moments like 12 AM UTC, leading to increased operational burden. The scheduler on-call has to closely monitor the system during non-business hours.

Java

Java Scalability Traffic Architecture

How Data Inspires Building a Scalable, Resilient and Secure Cloud Infrastructure At Netflix

The Netflix TechBlog

MARCH 5, 2019

All these micro-services are currently operated in AWS cloud infrastructure. Can we adjust our auto-scaling policies to be more efficient without risking our availability during traffic spikes? A majority of the Netflix product features are either partially or completely dependent on one of our many micro-services (e.g.,

Infrastructure

Infrastructure Cloud Scalability AWS

TTP-based threat hunting with Dynatrace Security Analytics and Falco Alerts solves alert noise

Dynatrace

AUGUST 9, 2023

Install Falco in AWS EKS to gather security-relevant events from all the happenings in Unguard. To keep it real, we have a load generator that creates benign traffic. For the demonstration in this blog post, we want to deploy Unguard in AWS EKS and hunt for attacks within that environment.

Analytics

Analytics AWS Infrastructure Strategy
