Cache and Data Engineering - Technology Performance Pulse

Cache

Data Engineering

Formulating ‘Out of Memory Kill’ Prediction on the Netflix App as a Machine Learning Problem

The Netflix TechBlog

JULY 21, 2022

Since memory management is not something one usually associates with classification problems, this blog focuses on formulating the problem as an ML problem and the data engineering that goes along with it. Some nuances while creating this dataset come from the on-field domain knowledge of our engineers.

Big Data, Cache, Engineering, Data Engineering

Supporting Diverse ML Systems at Netflix

The Netflix TechBlog

MARCH 7, 2024

Since then, open-source Metaflow has gained support for Argo Workflows, a Kubernetes-native orchestrator, as well as support for Airflow, which is still widely used by data engineering teams. Deployment: Cache To produce business value, all our Metaflow projects are deployed to work with other production systems.

Systems, Media, Cache, Open Source

Pushy to the Limit: Evolving Netflix’s WebSocket proxy for the future

The Netflix TechBlog

SEPTEMBER 10, 2024

The folks on the Cloud Data Engineering (CDE) team, the ones building the paved path for internal data at Netflix, graciously helped us scale it up and make adjustments, but it ended up being an involved process as we kept growing. As Pushy’s portfolio grew, we experienced some pain points with Dynomite.

Latency, Cache, Tuning, Efficiency

How Data Inspires Building a Scalable, Resilient and Secure Cloud Infrastructure At Netflix

The Netflix TechBlog

MARCH 5, 2019

the order of the rows on your Netflix home page, issuing content licenses when you click play, finding the Open Connect cache closest to you with the content you requested, and many more). Can we leverage this lineage solution to help forecast SLA misses and address Data Lifecycle Management questions (job cost, table cost, and retention)?

Infrastructure, Cloud, Scalability, AWS

How LinkedIn Serves Over 4.8 Million Member Profiles per Second

InfoQ

JULY 3, 2023

LinkedIn introduced Couchbase as a centralized caching tier for scaling member profile reads to handle increasing traffic that had outgrown its existing database cluster. The new solution achieved a hit rate of over 99%, helped reduce tail latencies by more than 60%, and cut costs by 10% annually. By Rafal Gancarz

Cache, Latency, Traffic, Database

5 data integration trends that will define the future of ETL in 2018

Abhishek Tiwari

DECEMBER 27, 2017

Apache Arrow's in-memory columnar layout is specifically optimized for data locality for better performance on modern hardware like CPUs and GPUs. In contrast, Alluxio is a middleware for data access; think of the Alluxio storage layer as a fast cache. It generally improves performance by placing frequently accessed data in memory.

Big Data, Artificial Intelligence, Storage, Hardware

How Doris SQL Cache Saved My Daily Morning Meetings

DZone

MARCH 21, 2025

As a DBA or data engineer, you've likely experienced the awkward moments of being "swarmed" by users. "I checked this data yesterday; why does it take so long today?" "The morning meeting is about to start, and the report is still loading." Do these complaints sound familiar?

Cache, Data Engineering, Engineering, Systems

Part 3: A Survey of Analytics Engineering Work at Netflix

The Netflix TechBlog

JANUARY 6, 2025

Batch processing data may provide a similar impact and take significantly less time. It's easier to develop and maintain, and tends to be more familiar for analytics engineers, data scientists, and data engineers. Additionally, we set up a nightly batch job to pre-cache results.

Analytics, Engineering, Cache, Entertainment
