Data Engineering and Tuning - Technology Performance Pulse

A Recap of the Data Engineering Open Forum at Netflix

The Netflix TechBlog

JUNE 20, 2024

A summary of sessions at the first Data Engineering Open Forum at Netflix on April 18th, 2024. At Netflix, we aspire to entertain the world, and our data engineering teams play a crucial role in this mission by enabling data-driven decision-making at scale.

Data Engineering

Data Engineering Engineering Entertainment Software Engineering

Optimizing Vector Search Performance With Elasticsearch

DZone

NOVEMBER 4, 2024

For instance, a streaming service can employ vector search to recommend films tailored to individual viewing histories and ratings, while a retail brand can analyze customer sentiments to fine-tune marketing strategies.
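To make the recommendation idea concrete, here is a minimal sketch of vector search as a brute-force cosine-similarity ranking. It is not Elasticsearch's kNN API; the catalog, the three-dimensional embeddings, and the function names are hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def recommend(user_vector, catalog, k=2):
    """Return the k titles whose embeddings are closest to the user vector."""
    ranked = sorted(
        catalog.items(),
        key=lambda item: cosine_similarity(user_vector, item[1]),
        reverse=True,
    )
    return [title for title, _ in ranked[:k]]


# Hypothetical film embeddings (assumed, not taken from the article).
catalog = {
    "space_drama": [0.9, 0.1, 0.0],
    "romcom": [0.1, 0.9, 0.2],
    "sci_fi_thriller": [0.8, 0.0, 0.3],
}

# Hypothetical vector summarizing one viewer's history and ratings.
user_vector = [1.0, 0.0, 0.1]

top_titles = recommend(user_vector, catalog, k=2)
```

A production system would typically replace the full scan with an approximate nearest-neighbor index, which is the kind of tuning the article's title refers to.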

Retail

Retail Performance Best Practices Tuning

Introducing Impressions at Netflix

The Netflix TechBlog

FEBRUARY 14, 2025

Without a defined schema, it can be difficult to determine whether missing data was intentional or due to a logging error. Automating Performance Tuning with Autoscalers: Tuning the performance of our Apache Flink jobs is currently a manual process.

Tuning

Tuning Latency Efficiency Storage

What is IT automation?

Dynatrace

JULY 6, 2022

Expect to spend time fine-tuning automation scripts as you find the right balance between automated and manual processing. This requires significant data engineering efforts, as well as work to build machine-learning models. By tuning workflows, you can increase their efficiency and effectiveness.

Artificial Intelligence

Artificial Intelligence Tuning Strategy Big Data

Analytics at Netflix: Who we are and what we do

The Netflix TechBlog

SEPTEMBER 18, 2020

The Engineer enjoys making data available by piping it in from new sources in optimal ways, building robust data models, prototyping systems, and doing project-specific engineering.

Analytics

Analytics Engineering Data Engineering Tuning

Evolving from Rule-based Classifier: Machine Learning Powered Auto Remediation in Netflix Data…

The Netflix TechBlog

MARCH 4, 2024

Operational automation (including but not limited to auto diagnosis, auto remediation, auto configuration, auto tuning, auto scaling, auto debugging, and auto testing) is key to the success of modern data platforms. We have also noted a great potential for further improvement by model tuning (see the Rollout in Production section).

Tuning

Tuning Efficiency Big Data Engineering

3. Psyberg: Automated end to end catch up

The Netflix TechBlog

NOVEMBER 14, 2023

This helps overwrite data only when required and minimizes unnecessary reprocessing. As seen above, by chaining these Psyberg workflows, we could automate the catchup for late-arriving data from hours 2 and 6. The Data Engineer does not need to perform any manual intervention in this case and can thus focus on more important things!

Processing

Processing Tuning C++ Efficiency

Optimizing data warehouse storage

The Netflix TechBlog

DECEMBER 21, 2020

Some of the optimizations are prerequisites for a high-performance data warehouse. Sometimes Data Engineers write downstream ETLs on ingested data to optimize the data/metadata layouts to make other ETL processes cheaper and faster. Orient: Gather tuning parameters for a particular table that changed.

Storage

Storage Latency Efficiency Data Engineering

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and…

The Netflix TechBlog

MARCH 25, 2019

We adopted the following mission statement to guide our investments: “Provide a complete and accurate data lineage system enabling decision-makers to win moments of truth.” Please share your experience by adding your comments below, and stay tuned for more on data lineage at Netflix in the follow-up blog posts.

Infrastructure

Infrastructure Big Data Transportation Architecture

Hyper Scale VPC Flow Logs enrichment to provide Network Insight

The Netflix TechBlog

MAY 26, 2020

And in order to gain visibility into these logs, we need to somehow ingest and enrich this data. It is easier to tune a large Spark job for a consistent volume of data. In other words, we are able to ensure that our Spark app does not “eat” more data than it was tuned to handle. We named this library Sqooby.
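The idea of never feeding a job more data than it was tuned for can be sketched as a size-capped batch selector. This is an illustrative assumption, not how Sqooby or the Netflix pipeline is implemented; the function name, the `(name, size)` tuples, and the byte budget are all hypothetical.

```python
def take_bounded_batch(pending, max_bytes):
    """Split pending (name, size_in_bytes) items into a batch that stays
    within max_bytes and the remainder left for a later run.

    At least one item is always taken so a single oversized file
    cannot stall the pipeline.
    """
    batch = []
    total = 0
    for name, size in pending:
        if batch and total + size > max_bytes:
            break  # budget reached; leave the rest for the next run
        batch.append(name)
        total += size
    return batch, pending[len(batch):]


# Hypothetical backlog of log files waiting to be enriched.
pending_files = [("flow-0001", 40), ("flow-0002", 50), ("flow-0003", 30)]

batch, remaining = take_bounded_batch(pending_files, max_bytes=100)
```

Each run then processes a roughly constant volume, and any backlog is drained over several runs instead of overloading a single one.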

Network

Network Tuning AWS Traffic

Pushy to the Limit: Evolving Netflix’s WebSocket proxy for the future

The Netflix TechBlog

SEPTEMBER 10, 2024

The folks on the Cloud Data Engineering (CDE) team, the ones building the paved path for internal data at Netflix, graciously helped us scale it up and make adjustments, but it ended up being an involved process as we kept growing. We’ll be writing about those new features as well — stay tuned for future posts.

Latency

Latency Cache Tuning Efficiency

Formulating ‘Out of Memory Kill’ Prediction on the Netflix App as a Machine Learning Problem

The Netflix TechBlog

JULY 21, 2022

Since memory management is not something one usually associates with classification problems, this blog focuses on formulating the problem as an ML problem and the data engineering that goes along with it. Some nuances while creating this dataset come from the on-field domain knowledge of our engineers.

Big Data

Big Data Cache Engineering Data Engineering

Incremental Processing using Netflix Maestro and Apache Iceberg

The Netflix TechBlog

NOVEMBER 20, 2023

These challenges are currently addressed in suboptimal and less cost efficient ways by individual local teams to fulfill the needs, such as Lookback: This is a generic and simple approach that data engineers use to solve the data accuracy problem. Users configure the workflow to read the data in a window (e.g.

Processing

Processing Big Data Efficiency Engineering

Orchestrating Data/ML Workflows at Scale With Netflix Maestro

The Netflix TechBlog

OCTOBER 18, 2022

It is a general-purpose workflow orchestrator that provides a fully managed workflow-as-a-service (WAAS) to the data platform at Netflix. It serves thousands of users, including data scientists, data engineers, machine learning engineers, software engineers, content producers, and business analysts, for various use cases.

Java

Java Scalability Traffic Architecture

Presentation: Modern Compute Stack for Scaling Large AI/ML/LLM Workloads

InfoQ

MAY 8, 2024

Jules Damji discusses which infrastructure should be used for distributed fine-tuning and training, how to scale ML workloads, how to accommodate large models, and how CPUs and GPUs can be utilized. By Jules Damji

Tuning

Tuning Infrastructure Artificial Intelligence Data Engineering

Friends don't let friends build data pipelines

Abhishek Tiwari

JULY 12, 2018

Unfortunately, building data pipelines remains a daunting, time-consuming, and costly activity. Not everyone operates a data engineering function at Netflix or Spotify scale. Companies often underestimate the effort and cost involved in building and maintaining data pipelines.

Latency

Latency Analytics Scalability Engineering

QCon London: Lessons Learned From Building LinkedIn’s AI/ML Data Platform

InfoQ

APRIL 15, 2024

He specifically delved into Venice DB, the NoSQL data store used for feature persistence. At the QCon London 2024 conference, Félix GV from LinkedIn discussed the AI/ML platform powering the company’s products. By Rafal Gancarz

Big Data

Big Data Artificial Intelligence Data Engineering Latency

Organise your engineering teams around the work by reteaming

Abhishek Tiwari

JULY 20, 2019

Because you are changing team composition, you need robust norms of conduct and engineering practices in place. Secondly, fine-tune team composition based on work. Thirdly, let engineers themselves choose the delivery teams and organise them around the initiative.

Engineering

Engineering Retail Airlines Healthcare
