Foundation Model for Personalized Recommendation

The Netflix TechBlog

By Ko-Jen Hsiao, Yesu Feng, and Sudarshan Lamkhede. Motivation: Netflix's personalized recommender system is a complex system, comprising a variety of specialized machine-learned models, each catering to distinct needs, including Continue Watching and Today's Top Picks for You. (Refer to our recent overview for more details.)

Supporting Diverse ML Systems at Netflix

The Netflix TechBlog

The Machine Learning Platform (MLP) team at Netflix provides an entire ecosystem of tools around Metaflow, an open source machine learning infrastructure framework we started, to empower data scientists and machine learning practitioners to build and manage a variety of ML systems. ... ETL workflows), as well as downstream (e.g. ...

Netflix at AWS re:Invent 2019

The Netflix TechBlog

Netflix shares how Amazon EC2 Auto Scaling allows its infrastructure to automatically adapt to changing traffic patterns in order to keep its audience entertained and its costs on target. In this talk, we share how Netflix deploys systems to meet its demands, Ceph’s design for high availability, and results from our benchmarking.

Migrating Critical Traffic At Scale with No Downtime - Part 2

The Netflix TechBlog

Behind these perfect moments of entertainment is a complex mechanism, with numerous gears and cogs working in harmony. This is where large-scale system migrations come into play. By tracking metrics only at the level of service being updated, we might miss capturing deviations in broader end-to-end system functionality.

Telltale: Netflix Application Monitoring Simplified

The Netflix TechBlog

Our streaming teams need a monitoring system that enables them to quickly diagnose and remediate problems; seconds count! Our Node team needs a system that empowers a small group to operate a large fleet. For example, a latency increase is less critical than error rate increase and some error codes are less critical than others.
