AWS, Engineering and Traffic - Technology Performance Pulse

Breaking AWS Lambda: Chaos Engineering for Serverless Devs

DZone

MARCH 24, 2025

Our "serverless" order processing system built on AWS Lambda and API Gateway was humming along, handling 1,000 transactions/minute. A sudden spike in traffic caused Lambda timeouts, API Gateway threw 5xx errors, and customers started tweeting, "Why can't I check out?!" Then, disaster struck.

Lambda

Lambda Serverless AWS Engineering

AWS observability: AWS monitoring best practices for resiliency

Dynatrace

NOVEMBER 22, 2021

Visibility into system activity and behavior has become increasingly critical given organizations’ widespread use of Amazon Web Services (AWS) and other serverless platforms. These challenges make AWS observability a key practice for building and monitoring cloud-native applications. What is AWS observability? AWS Lambda.

Best Practices

Best Practices AWS Monitoring Serverless

From syslog to AWS Firehose: Dynatrace log management innovations that enhance observability

Dynatrace

SEPTEMBER 5, 2024

We’re excited to announce several log management innovations, including native support for Syslog messages, seamless integration with AWS Firehose, an agentless approach using Kubernetes Platform Monitoring solution with Fluent Bit, a new out-of-the-box ingest dashboard, and OpenPipeline ingest improvements.

Innovation

Innovation AWS Analytics Storage

Dynatrace adds support for AWS Transit Gateway with VPC Flow Logs

Dynatrace

JULY 25, 2022

Dynatrace has added support for the newly introduced Amazon Virtual Private Cloud (VPC) Flow Logs for AWS Transit Gateway. What is AWS Transit Gateway? AWS Transit Gateway is a service offering from Amazon Web Services that connects network resources via a centralized hub. What is VPC Flow Logs.

AWS

AWS Transportation Network Traffic

Netflix at AWS re:Invent 2019

The Netflix TechBlog

NOVEMBER 22, 2019

By Shefali Vyas Dalal. AWS re:Invent is a couple of weeks away, and our engineers and leaders are thrilled to be in attendance yet again this year! Netflix shares how Amazon EC2 Auto Scaling allows its infrastructure to automatically adapt to changing traffic patterns in order to keep its audience entertained and its costs on target.

AWS

AWS Entertainment Open Source Benchmarking

Using Dynatrace to master the 5 pillars of the AWS Well-Architected Framework (Part 1)

Dynatrace

NOVEMBER 24, 2020

As organizations plan, migrate, transform, and operate their workloads on AWS, it’s vital that they follow a consistent approach to evaluating both the on-premises architecture and the upcoming design for cloud-based architecture. AWS 5-pillars. Dynatrace and AWS. through our AWS integrations and monitoring support.

AWS

AWS Artificial Intelligence Best Practices Lambda

Ensuring the Successful Launch of Ads on Netflix

The Netflix TechBlog

JUNE 1, 2023

To do this, we devised a novel way to simulate the projected traffic weeks ahead of launch by building upon the traffic migration framework described here. New content or national events may drive brief spikes, but, by and large, traffic is usually smoothly increasing or decreasing.

Traffic

Traffic Best Practices Systems Testing

Stream logs to Dynatrace with Amazon Data Firehose to boost your cloud-native journey

Dynatrace

MAY 3, 2024

Without the ability to see the logs that are relevant to your service, infrastructure, or cloud function—at exactly the right time and in exactly the right format—your cloud or DevOps engineers lose the ability to find the root causes of the issues they troubleshoot. These already provide a common integration with AWS log sources.

Cloud

Cloud Lambda AWS Analytics

Migrating Netflix to GraphQL Safely

The Netflix TechBlog

JUNE 14, 2023

By the summer of 2020, many UI engineers were ready to move to GraphQL. The GraphQL shim enabled client engineers to move quickly onto GraphQL, figure out client-side concerns like cache normalization, experiment with different GraphQL clients, and investigate client performance without being blocked by server-side migrations.

Traffic

Traffic Latency Metrics Cache

Architected for resiliency: How Dynatrace withstands data center outages

Dynatrace

JUNE 15, 2021

The subject line said: “Success Story: Major Issue in single AWS Frankfurt Availability Zone!” And the last sentence of the email was what made me want to share this story publicly, as it’s a testimonial to how modern software engineering and operations should make you feel. Fact #1: AWS EC2 outage properly documented.

AWS

AWS Traffic Architecture Azure

Rapid Event Notification System at Netflix

The Netflix TechBlog

FEBRUARY 18, 2022

Personalized Experience Refresh: The Netflix recommendation engine continuously refreshes recommendations for every member. We thus assigned a priority to each use case and sharded event traffic by routing to priority-specific queues and the corresponding event processing clusters.

Systems

Systems Traffic Architecture Mobile

Dynatrace adds support for VPC Flow Logs to Kinesis Data Firehose

Dynatrace

SEPTEMBER 7, 2022

VPC Flow Logs is an Amazon service that enables IT pros to capture information about the IP traffic that traverses network interfaces in a virtual private cloud, or VPC. By default, each record captures a network internet protocol (IP), a destination, and the source of the traffic flow that occurs within your environment.

Traffic

Traffic AWS Network Cloud

AWS EKS Monitoring as a Self-Service with Dynatrace

Dynatrace

SEPTEMBER 17, 2019

AWS EKS for Integration and Production. When focusing on the LanguageController service we learn that it’s currently deployed in three pods across three EKS nodes across two AWS Availability Zones (AZ). 4 AWS EFS monitoring. Their technology stack looks like this: Spring Boot-based Microservices. NGINX as an API Gateway.

AWS

AWS Monitoring Ecommerce Lambda

Auth0 Architecture: Running In Multiple Cloud Providers And Regions

High Scalability

AUGUST 27, 2018

This article was written by Dirceu Pereira Tiegs, Site Reliability Engineer at Auth0, and was originally published in Auth0. The number of services that compose our product in order to scale our organization and handle the increases in traffic went from under 10 to over 30 services. Core service architecture.

Architecture

Architecture Cloud Traffic Infrastructure

Growth Engineering at Netflix — Automated Imagery Generation

The Netflix TechBlog

FEBRUARY 9, 2021

In the Growth Engineering team, we refer to this as the top of the signup funnel. For more background on the signup funnel and Growth Engineering’s role in the signup funnel, please read our initial post on the topic: Growth Engineering at Netflix.

Engineering

Engineering Storage Latency Entertainment

Detecting RegreSSHion with Dynatrace (CVE-2024-6387)

Dynatrace

JULY 2, 2024

If you’re using AWS for your cloud platform and you have forwarded your VPC flow logs to Dynatrace, you can use DQL to look for your potential victims AND suspicious IP addresses that have tried to access your servers using SSH. Analyze network flow logs Last but not least, your network logs are the ultimate source of data.

AWS

AWS Network Traffic Servers

Platform Engineering Teams Done Right…

Adrian Cockcroft

FEBRUARY 9, 2023

There are three current underlying reasons for the platform engineering meme today. The virtualization and networking platform could be datacenter based, with something like VMware, or cloud based using one of the cloud providers such as AWS EC2. Above that there’s a deployment platform such as Kubernetes or AWS Lambda.

Engineering

Engineering Serverless Lambda AWS

Leverage automated and intelligent observability for OpenTelemetry for Go with Dynatrace PurePath 4

Dynatrace

JANUARY 28, 2021

OneAgent takes care of auto-instrumentation and creation of the Smartscape model, which enables the Davis AI causation engine. With Dynatrace OneAgent you also benefit from support for traffic routing and traffic control. OneAgent implements network zones to create traffic routing rules and limit cross-data-center traffic.

Traffic

Traffic Open Source Servers Cloud

Expanding the AWS Cloud – Introducing the AWS Europe (Stockholm) Region

All Things Distributed

DECEMBER 12, 2018

In April 2017, Amazon Web Services announced that it would launch a new AWS infrastructure Region in Sweden. Today, I'm happy to announce that the AWS Europe (Stockholm) Region, our 20th Region globally, is now generally available for use by customers. Public sector.

AWS

AWS Cloud Games Serverless

The Story of Apollo - Amazon’s Deployment Engine

All Things Distributed

NOVEMBER 12, 2014

Without efficient, reliable, and repeatable software updates, engineers need to redirect their focus from developing new features to managing and debugging their deployments. An automated deployment system needs to carefully sequence a software update across a fleet while it is actively receiving traffic.

Engineering

Engineering AWS Java Traffic

Explore and analyze Amazon CloudWatch Synthetics data in Dynatrace

Dynatrace

APRIL 28, 2020

Dynatrace is an AWS Partner Network (APN) Advanced Technology Partner that provides software intelligence for simplifying enterprise cloud complexity and accelerating digital transformation. Partner Solutions Architect at AWS and will also be published via the AWS Partner Network (APN) Blog. Dynatrace news. Solution Overview.

AWS

AWS Monitoring Traffic Website

Maximizing Performance of AWS RDS for MySQL with Dedicated Log Volumes

Percona

DECEMBER 11, 2023

A quick configuration change may do the trick in improving the performance of your AWS RDS for MySQL instance. Amazon RDS extends support for DLVs across various database engines: MariaDB: 10.6.7 and later v10 versions; MySQL: 8.0.28

AWS

AWS Benchmarking Performance Traffic

Python at Netflix

The Netflix TechBlog

APRIL 29, 2019

takes place in Amazon Web Services (AWS), whereas everything that happens afterwards (i.e., Python has long been a popular programming language in the networking space because it’s an intuitive language that allows engineers to quickly solve networking problems. are you logged in? what plan do you have? what do you want to watch?)

Open Source

Open Source Network Infrastructure Big Data

Building Netflix’s Distributed Tracing Infrastructure

The Netflix TechBlog

OCTOBER 19, 2020

a Netflix member via Twitter. This is an example of a question our on-call engineers need to answer to help resolve a member issue — which We needed to increase engineering productivity via distributed request tracing. That is the first question our engineering teams asked us when integrating the tracer library.

Infrastructure

Infrastructure Transportation Storage Open Source

Transform log data into actionable metrics and have Davis AI do the work for you

Dynatrace

MARCH 16, 2022

We added monitoring and analytics for log streams from Kubernetes and multicloud platforms like AWS, GCP, and Azure, as well as the most widely used open-source log data frameworks. Instead of manually looking for meaning in logs and reacting to discovered insights, you can hand the job over to the Dynatrace Davis® AI causation engine.

Metrics

Metrics Lambda Infrastructure Monitoring

Expanding the Cloud – An AWS Region is coming to Hong Kong

All Things Distributed

JUNE 20, 2017

Today, I am very excited to announce our plans to open a new AWS Region in Hong Kong! The new region will give Hong Kong-based businesses, government organizations, non-profits, and global companies with customers in Hong Kong, the ability to leverage AWS technologies from data centers in Hong Kong.

AWS

AWS Logistics Cloud Social Media

What is cloud migration?

Dynatrace

SEPTEMBER 30, 2021

Cloud services platforms like AWS, Azure, and GCP are reshaping how organizations deliver value to their customers, making cloud migration an increasingly attractive option for running applications. In case of a spike in traffic, you can automatically spin up more resources, often in a matter of seconds.

Cloud

Cloud Traffic Best Practices Strategy

Five-nines availability: Always-on infrastructure delivers system availability during the holidays’ peak loads

Dynatrace

NOVEMBER 22, 2022

For retail organizations, peak traffic can be a mixed blessing. While high-volume traffic often boosts sales, it can also compromise uptimes. Five-nines availability has long been the goal of site reliability engineers (SREs) to provide system availability that is “always on.”

Infrastructure

Infrastructure Availability Systems Retail

Välkommen till Stockholm – An AWS Region is coming to the Nordics

All Things Distributed

APRIL 4, 2017

Today, I am very excited to announce our plans to open a new AWS Region in the Nordics! The new region will give Nordic-based businesses, government organisations, non-profits, and global companies with customers in the Nordics, the ability to leverage the AWS technology infrastructure from data centers in Sweden.

AWS

AWS Airlines Latency Games

What Adrian Did Next — Part 4 — how I helped Netflix launch on iPad and iPhone — 2007 to 2010

Adrian Cockcroft

JANUARY 27, 2025

One of the Java engineers on my team, Jian Wu, joined me to help figure out the API. We found an engineer from another team who knew C++ and was willing to learn Objective C and borrowed him for the project (unfortunately I’ve forgotten his name). We simply didn’t have enough capacity in our datacenter to run the traffic, so it had to work.

C++

C++ Mobile Hardware Java

How Data Inspires Building a Scalable, Resilient and Secure Cloud Infrastructure At Netflix

The Netflix TechBlog

MARCH 5, 2019

Netflix’s engineering culture is predicated on Freedom & Responsibility, the idea that everyone (and every team) at Netflix is entrusted with a core responsibility and they are free to operate with freedom to satisfy their mission. All these micro-services are currently operated in AWS cloud infrastructure.

Infrastructure

Infrastructure Cloud Scalability AWS

Evolving Regional Evacuation

The Netflix TechBlog

SEPTEMBER 23, 2019

Niosha Behnam | Demand Engineering @ Netflix. At Netflix we prioritize innovation and velocity in pursuit of the best experience for our 150+ million global customers. In the event of an isolated failure we first pre-scale microservices in the healthy regions, after which we can shift traffic away from the failing one.

Traffic

Traffic Metrics Mobile Government

Telltale: Netflix Application Monitoring Simplified

The Netflix TechBlog

AUGUST 13, 2020

Over the years we’ve learned from on-call engineers about the pain points of application monitoring: too many alerts, too many dashboards to scroll through, and too much configuration and maintenance. It usually has dependencies, talks to other services, and lives in different AWS regions. Regional traffic evacuations.

Monitoring

Monitoring Tuning Traffic Metrics

Looking back at 10 years of compartmentalization at AWS

All Things Distributed

MARCH 26, 2018

At AWS, we don't mark many anniversaries. A concept that has changed infrastructure architecture is now at the core of both AWS and customer reliability and operations. Powering the virtual instances and other resources that make up the AWS Cloud are real physical data centers with AWS servers in them.

AWS

AWS Latency Lambda Architecture

Dynatrace PurePath 4 integrates OpenTelemetry and the latest cloud-native technologies and provides analytics and AI at scale

Dynatrace

NOVEMBER 17, 2020

PurePath 4 supports serverless computing out-of-the-box, including Kubernetes services from Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP). FaaS like AWS Lambda and Azure Functions are seamlessly integrated with no code changes. The only deterministic and open AI engine for observability data.

Analytics

Analytics Technology Cloud

Hyper Scale VPC Flow Logs enrichment to provide Network Insight

The Netflix TechBlog

MAY 26, 2020

Service Segmentation: The ease of the cloud deployments has led to the organic growth of multiple AWS accounts, deployment practices, interconnection practices, etc. VPC Flow Logs: VPC Flow Logs is an AWS feature that captures information about the IP traffic going to and from network interfaces in a VPC.

Network

Network Tuning AWS Traffic

So many bad takes — What is there to learn from the Prime Video microservices to monolith story

Adrian Cockcroft

MAY 6, 2023

Then they tried to scale it to cope with high traffic and discovered that some of the state transitions in their step functions were too frequent, and they had some overly chatty calls between AWS Lambda functions and S3. A real-time user experience analytics engine for live video, that looked at all users rather than a subsample.

Serverless

Serverless Lambda Best Practices Traffic

Orchestrating Data/ML Workflows at Scale With Netflix Maestro

The Netflix TechBlog

OCTOBER 18, 2022

We started seeing signs of scale issues, like: Slowness during peak traffic moments like 12 AM UTC, leading to increased operational burden. As the usage increased, we had to vertically scale the system to keep up and were approaching AWS instance type limits. Meson was based on a single leader architecture with high availability.

Java

Java Scalability Traffic Architecture

Supporting Diverse ML Systems at Netflix

The Netflix TechBlog

MARCH 7, 2024

Without these integrations, projects would be stuck at the prototyping stage, or they would have to be maintained as outliers outside the systems maintained by our engineering teams, incurring unsustainable operational overhead. Importantly, all the use cases were engineered by practitioners themselves.

Systems

Systems Media Cache Open Source

Zero Configuration Service Mesh with On-Demand Cluster Discovery

The Netflix TechBlog

AUGUST 29, 2023

A brief history of IPC at Netflix: Netflix was early to the cloud, particularly for large-scale companies: we began the migration in 2008, and by 2010, Netflix streaming was fully run on AWS. The ability to run in a degraded but available state during an outage is still a marked improvement over completely stopping traffic flow.

Traffic

Traffic Latency Cloud C++

Reducing PostgreSQL Costs in the Cloud

Percona

FEBRUARY 24, 2023

Figure 1: PMM Home Dashboard. From the Amazon Web Services (AWS) documentation, an instance is considered over-provisioned when at least one specification of your instance, such as CPU, memory, or network, can be sized down while still meeting the performance requirements of your workload and no specification is under-provisioned.

Cloud

Cloud Serverless AWS Storage

Using Encryption-at-Rest for PostgreSQL in Kubernetes

Percona

APRIL 20, 2023

This will instruct Container Storage Interface (CSI) to provision encrypted storage volume on your block storage (AWS EBS, GCP Persistent Disk, Ceph, etc.). For example, in Google Kubernetes Engine (GKE), you need to create the key in Cloud Key Management Service (KMS) and set it in the StorageClass: apiVersion: storage.k8s.io/v1beta1

Storage

Storage AWS Cloud Database
