Remove Data Engineering Remove Efficiency Remove Training
article thumbnail

Supporting Diverse ML Systems at Netflix

The Netflix TechBlog

For ETL and other heavy lifting of data, we mainly rely on Apache Spark. In addition to Spark, we want to support last-mile data processing in Python, addressing use cases such as feature transformations, batch inference, and training. Correspondingly, each application brings its own bespoke set of dependencies.

Systems 238
article thumbnail

Evolving from Rule-based Classifier: Machine Learning Powered Auto Remediation in Netflix Data…

The Netflix TechBlog

Operational automation–including but not limited to, auto diagnosis, auto remediation, auto configuration, auto tuning, auto scaling, auto debugging, and auto testing–is key to the success of modern data platforms. the retry success probability) and compute cost efficiency (i.e., Multi-objective optimizations.

Tuning 219
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

What is IT automation?

Dynatrace

Ultimately, IT automation can deliver consistency, efficiency, and better business outcomes for modern enterprises. Automating IT practices offers enterprises faster data centers and cloud operations, as well as increased flexibility and accuracy. AI that is based on machine learning needs to be trained. Read eBook now!

article thumbnail

Incremental Processing using Netflix Maestro and Apache Iceberg

The Netflix TechBlog

It also improves the engineering productivity by simplifying the existing pipelines and unlocking the new patterns. We will show how we are building a clean and efficient incremental processing solution (IPS) by using Netflix Maestro and Apache Iceberg. Introduction Netflix relies on data to power its business in all phases.

article thumbnail

How Our Paths Brought Us to Data and Netflix

The Netflix TechBlog

A role in data science eventually seemed like a natural transition, but it wasn’t without its hurdles: With my consulting background, I had to go through a few other roles first while learning how to code on the side. Julie] Chris and I have the same primary stakeholders (or engineering team that we support): Encoding Technologies.

Analytics 232
article thumbnail

Orchestrating Data/ML Workflows at Scale With Netflix Maestro

The Netflix TechBlog

by Jun He , Akash Dwivedi , Natallia Dzenisenka , Snehal Chennuru , Praneeth Yenugutala , Pawan Dixit At Netflix, Data and Machine Learning (ML) pipelines are widely used and have become central for the business, representing diverse use cases that go beyond recommendations, predictions and data transformations.

Java 214
article thumbnail

Post: Essilen Research, Fauna, Sisu, Educative, PA File Sight, Etleap, Triplebyte, Stream

High Scalability

Etleap is analyst-friendly , enterprise-grade ETL-as-a-service , built for Redshift and Snowflake data warehouses and S3/Glue data lakes. Our intuitive software allows data engineers to maintain pipelines without writing code, and lets analysts gain access to data in minutes instead of months.

Education 114