Yet many are confined to a brief time window due to constraints in serving latency or training cost. The impetus for building a foundation model for recommendation comes from the paradigm shift in natural language processing (NLP) toward large language models (LLMs).
Netflix shares how Amazon EC2 Auto Scaling allows its infrastructure to automatically adapt to changing traffic patterns, keeping its audience entertained and its costs on target. Netflix runs dozens of stateful services on AWS under strict sub-millisecond tail-latency requirements, which brings unique challenges.
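As a minimal sketch of the kind of policy such a setup relies on, a target-tracking scaling policy can be attached to an Auto Scaling group with boto3. The group name and the 50% CPU target here are illustrative assumptions, not Netflix's actual configuration:

```python
# Sketch: attach a target-tracking scaling policy to an Auto Scaling group.
# The group name and 50% CPU target are illustrative assumptions.
import boto3

autoscaling = boto3.client("autoscaling", region_name="us-east-1")

autoscaling.put_scaling_policy(
    AutoScalingGroupName="example-service-asg",  # hypothetical group name
    PolicyName="cpu-target-tracking",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,  # scale out/in to hold average CPU near 50%
    },
)
```

With target tracking, the service adds or removes instances on its own to hold the chosen metric near the target, which is what lets capacity follow traffic without manual intervention.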
Behind these perfect moments of entertainment is a complex mechanism, with numerous gears and cogs working in harmony. Replay traffic testing gives us the initial foundation of validation, but as the migration unfolds, we need a carefully controlled rollout process.
In addition to Spark, we want to support last-mile data processing in Python, addressing use cases such as feature transformations, batch inference, and training. We use metaflow.Table to resolve all input shards, which are then distributed to Metaflow tasks that collectively process terabytes of data.
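A minimal sketch of that fan-out pattern using standard Metaflow primitives is below. The hardcoded shard paths stand in for the metaflow.Table resolver mentioned above, which is internal tooling:

```python
# Sketch of the shard fan-out pattern with standard Metaflow primitives.
# The shard list is a hypothetical stand-in for the metaflow.Table resolver.
from metaflow import FlowSpec, step

class ShardedProcessingFlow(FlowSpec):

    @step
    def start(self):
        # In the real system, input shards would be resolved from a table;
        # here we hardcode a couple of illustrative shard paths.
        self.shards = ["s3://bucket/shard-0", "s3://bucket/shard-1"]
        self.next(self.process, foreach="shards")

    @step
    def process(self):
        # Each task handles one shard, e.g. feature transformation
        # or batch inference over its slice of the data.
        self.result = f"processed {self.input}"
        self.next(self.join)

    @step
    def join(self, inputs):
        self.results = [i.result for i in inputs]
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    ShardedProcessingFlow()
```

The `foreach` fan-out is what lets each shard become its own task, so the collection of tasks can chew through terabytes in parallel.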
For example, a latency increase is less critical than an error-rate increase, and some error codes are less critical than others. This simplifies the post-incident review process for many teams. A healthy Netflix service enables us to entertain the world. Signals include client metrics and QoE changes, as well as alerts triggered by our alerting platform.
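One way to encode that ordering is a severity score that weights error-rate regressions above latency regressions and treats error codes unequally. The weights and codes below are assumptions for the sketch, not Netflix's actual rules:

```python
# Illustrative severity scoring: error-rate regressions outrank latency
# regressions, and some error codes (e.g. 5xx) outrank others (e.g. 404).
# All weights here are assumptions for the sketch.
ERROR_CODE_WEIGHT = {500: 1.0, 503: 1.0, 404: 0.3}

def severity(latency_regression_pct: float,
             error_rate_regression_pct: float,
             error_code: int) -> float:
    code_weight = ERROR_CODE_WEIGHT.get(error_code, 0.5)
    # Error-rate changes carry more weight than latency changes.
    return 0.3 * latency_regression_pct + error_rate_regression_pct * code_weight

# A 5% rise in 503s scores higher than a 20% latency regression alone.
print(severity(latency_regression_pct=20, error_rate_regression_pct=0, error_code=503))
print(severity(latency_regression_pct=0, error_rate_regression_pct=5, error_code=503))
```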
Server-generated assets are used, since client-side generation would require retrieving many individual images, which would increase latency and time-to-render. To reduce latency, assets should be generated offline rather than in real time (e.g., the background image shown above).
You need a lot of software engineers and the willingness to rewrite a lot of software to entertain that idea. Here are the bombshell paragraphs: "Our datacenter applications seek ever more CPU-efficient and lower-latency communication, which Pony Express delivers." The desire for CPU efficiency and lower latencies is easy to understand.
Customers with complex computational workloads, such as tightly coupled parallel processes, or with applications that are very sensitive to network performance, can now achieve the same high compute and networking performance provided by custom-built infrastructure while benefiting from the elasticity, flexibility, and cost advantages of Amazon EC2.
Introduction and account creation: highlight our value propositions and begin the account creation process. Given the flow and mode, the Orchestration Service can then process the request. One benefit is lower latency as a result of fewer service calls, which means fewer errors for our visitors. Each step of the flow serves a distinct purpose.
Role-Based Access Control (RBAC): RBAC manages permissions to ensure that only individuals, programs, or processes with the proper authorization can utilize particular resources. It is an invaluable tool for resolving complicated issues and streamlining processes due to its flexibility and scalability. Many organizations have adopted Kubernetes.
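A minimal sketch of the RBAC idea is below; the role and permission names are hypothetical, loosely styled after Kubernetes-like resource verbs:

```python
# Minimal RBAC sketch: roles map to permission sets, and a single check
# gates access. Role and permission names here are hypothetical.
ROLE_PERMISSIONS = {
    "viewer": {"pods:list", "pods:get"},
    "operator": {"pods:list", "pods:get", "pods:delete"},
}

def is_allowed(role: str, permission: str) -> bool:
    """Return True only if the role's permission set contains the permission."""
    return permission in ROLE_PERMISSIONS.get(role, set())

assert is_allowed("operator", "pods:delete")
assert not is_allowed("viewer", "pods:delete")
```

Because permissions hang off roles rather than individual identities, adding a new user or program is a one-line role assignment, which is where the flexibility and scalability come from.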
Users who rely on websites for their fundamental needs or entertainment will not tolerate even a few seconds of delay. A request is sent from the client side, and an HTTP check waits on the server port to receive the message, process it, and send back the response. The total delay comprises network latency plus the time spent processing and generating the response.
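A rough way to observe that total, sketched with the requests library (the URL is a placeholder): the wall-clock timing covers the full round trip, i.e. network latency plus server processing combined.

```python
# Rough latency measurement sketch: the round-trip time covers both
# network latency and server-side processing. The URL is a placeholder.
import time
import requests

start = time.perf_counter()
response = requests.get("https://example.com/health")
total = time.perf_counter() - start

print(f"status={response.status_code} round_trip={total * 1000:.1f} ms")
# response.elapsed measures the time from sending the request until the
# response headers were parsed -- another view of the same round trip.
print(f"elapsed={response.elapsed.total_seconds() * 1000:.1f} ms")
```

Separating the two components (network vs. processing) typically requires server-side timing or a lower-level probe; this client-side view only bounds their sum.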
Each queue is then processed by a separate Page Generation cluster that is launched to serve a particular partition. Once the generator is running, it processes the requests in the queue to compute the simulated pages. Generated pages are then persisted to an S3-backed Hive table for metrics processing.
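In outline, each generator's loop looks something like the sketch below; the in-process queue, page computation, and list sink are hypothetical stand-ins for the internal queueing system and the S3-backed Hive table named above:

```python
# Outline of a generator worker's loop. The queue, page computation, and
# sink are hypothetical stand-ins for the internal systems described above.
import queue

def run_generator(requests: "queue.Queue", partition: str, sink: list) -> None:
    """Drain one partition's queue, compute simulated pages, persist them."""
    while True:
        try:
            request = requests.get_nowait()
        except queue.Empty:
            break  # this partition has been fully processed
        # Compute the simulated page for this request (placeholder payload).
        page = {"partition": partition, "request": request, "rows": ["row-0"]}
        sink.append(page)  # real system: write to an S3-backed Hive table

work = queue.Queue()
for r in ["request-a", "request-b"]:
    work.put(r)

persisted: list = []
run_generator(work, partition="p0", sink=persisted)
print(len(persisted), "pages persisted")
```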
Real-time data platform defined: a real-time data platform is designed to ingest, process, analyze, and act upon data instantaneously, right when it is generated or received. Improved operational efficiency: real-time data platforms enhance operational efficiency by providing timely insights and automating processes.
Whenever I paused for too long, ChatGPT would interpret what I had said so far as my request and start processing it. Unpredictable wait times: wait times (latency) for ChatGPT's responses are unpredictable, and there aren't audio cues to help me establish an expectation for how long I need to wait before it responds.
But there was no single crisis point. Instead, it has been a process of continual improvements, by many engineers across the company.
## A Day in the Life (Performance Engineering)
What do I actually do all day? One day it might be a latency outlier issue that happened every 15 minutes; another, Java core dump analysis for a crashing JVM.
Combined with (delayed) advanced graphics APIs and threading support, WebXR enables critical immersive, low-friction commerce and entertainment on the web. For heavily latency-sensitive use cases like WebXR, this is a critical component in delivering a good experience. Related features include OffscreenCanvas and TextEncoderStream & TextDecoderStream.
There's also a ZFS send/recv code path that should try to use the TASK_INTERRUPTIBLE flag (as suggested by a coworker) to avoid a kernel hang (can't kill -9 the process). Here's my zfsdist tool, in bcc/BPF, which measures ZFS latency as a histogram on Linux:

# zfsdist
Tracing ZFS operation latency.
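For flavor, here is a minimal funclatency-style bcc sketch in the same spirit. Instrumenting only the zfs_read entry point is an assumption for brevity; zfsdist itself covers several ZFS operations. Run as root with bcc installed:

```python
# Minimal bcc sketch in the spirit of zfsdist: time one ZFS entry point
# (zfs_read is an assumed probe target) and print a log2 latency histogram.
from time import sleep
from bcc import BPF

program = """
#include <uapi/linux/ptrace.h>

BPF_HASH(start, u32);       // per-thread entry timestamps
BPF_HISTOGRAM(dist);        // log2 latency histogram

int trace_entry(struct pt_regs *ctx) {
    u32 tid = bpf_get_current_pid_tgid();
    u64 ts = bpf_ktime_get_ns();
    start.update(&tid, &ts);
    return 0;
}

int trace_return(struct pt_regs *ctx) {
    u32 tid = bpf_get_current_pid_tgid();
    u64 *tsp = start.lookup(&tid);
    if (tsp == 0)
        return 0;                     // missed the entry probe
    u64 delta = bpf_ktime_get_ns() - *tsp;
    dist.increment(bpf_log2l(delta / 1000));   /* microseconds */
    start.delete(&tid);
    return 0;
}
"""

b = BPF(text=program)
b.attach_kprobe(event="zfs_read", fn_name="trace_entry")
b.attach_kretprobe(event="zfs_read", fn_name="trace_return")

print("Tracing zfs_read latency for 10s...")
try:
    sleep(10)
except KeyboardInterrupt:
    pass
b["dist"].print_log2_hist("usecs")
```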
Although there was already a process for creating and comparing budgets for new productions against similar past projects, it was highly manual. Batch processing the data may provide a similar impact in significantly less time. Align on performance expectations: a major challenge during development was managing API latency.