Cache, Data Engineering and Processing - Technology Performance Pulse

Supporting Diverse ML Systems at Netflix

The Netflix TechBlog

MARCH 7, 2024

Data: Fast Data. Our main data lake is hosted on S3, organized as Apache Iceberg tables. For ETL and other heavy lifting of data, we mainly rely on Apache Spark. In addition to Spark, we want to support last-mile data processing in Python, addressing use cases such as feature transformations, batch inference, and training.

Systems, Media, Cache, Open Source

Pushy to the Limit: Evolving Netflix’s WebSocket proxy for the future

The Netflix TechBlog

SEPTEMBER 10, 2024

The voice service then constructs a message for the device and places it on the message queue, which is then processed and sent to Pushy to deliver to the device. The previous version of the message processor was a Mantis stream-processing job that processed messages from the message queue.

Latency, Cache, Tuning, Efficiency

Formulating ‘Out of Memory Kill’ Prediction on the Netflix App as a Machine Learning Problem

The Netflix TechBlog

JULY 21, 2022

Since memory management is not something one usually associates with classification problems, this blog focuses on formulating the problem as an ML problem and the data engineering that goes along with it. We now explore each of these components individually, while highlighting the nuances of the data pipeline and pre-processing.

Big Data, Cache, Engineering, Data Engineering

How Data Inspires Building a Scalable, Resilient and Secure Cloud Infrastructure At Netflix

The Netflix TechBlog

MARCH 5, 2019

... the order of the rows on your Netflix home page, issuing content licenses when you click play, finding the Open Connect cache closest to you with the content you requested, and many more). Can we leverage this lineage solution to help forecast SLA misses and address Data Lifecycle Management questions (job cost, table cost, and retention)?

Infrastructure, Cloud, Scalability, AWS

5 data integration trends that will define the future of ETL in 2018

Abhishek Tiwari

DECEMBER 27, 2017

A common theme across all these trends is to remove the complexity by simplifying data management as a whole. In 2018, we anticipate that ETL will either lose relevance or the ETL process will disintegrate and be consumed by new data architectures. Unified data management architecture.

Big Data, Artificial Intelligence, Storage, Hardware

Part 3: A Survey of Analytics Engineering Work at Netflix

The Netflix TechBlog

JANUARY 6, 2025

Data is a tool that enhances decision-making, complementing the deep expertise and industry knowledge of our creative professionals. Although there was already a process for creating and comparing budgets for new productions against similar past projects, it was highly manual.

Analytics, Engineering, Cache, Entertainment
