Introducing Impressions at Netflix

The Netflix TechBlog

Architecture Overview: The first pivotal step in managing impressions is the creation of a Source-of-Truth (SOT) dataset. The enriched data is accessible both to real-time applications via Kafka and for historical analysis through storage in an Apache Iceberg table.

Orchestrating Data/ML Workflows at Scale With Netflix Maestro

The Netflix TechBlog

We started seeing signs of scale issues, such as slowness during peak traffic moments like 12 AM UTC, leading to increased operational burden. Meson was based on a single leader architecture with high availability. At Netflix, the peak traffic load can be a few orders of magnitude higher than the average load.

How Data Inspires Building a Scalable, Resilient and Secure Cloud Infrastructure At Netflix

The Netflix TechBlog

While our engineering teams have built and continue to build solutions to lighten this cognitive load (better guardrails, improved tooling, …), data and its derived products are critical elements to understanding, optimizing and abstracting our infrastructure. What will be the cost of rolling out the winning cell of an AB test to all users?

Hyper Scale VPC Flow Logs enrichment to provide Network Insight

The Netflix TechBlog

VPC Flow Logs: VPC Flow Logs is an AWS feature that captures information about the IP traffic going to and from network interfaces in a VPC. At Netflix, we publish the Flow Log data to Amazon S3. We also adjusted the frequency at which Spark app instances are spun up such that any backlog would burn off during a trough in traffic.

Experimentation is a major focus of Data Science across Netflix

The Netflix TechBlog

To learn about Analytics and Viz Engineering, have a look at Analytics at Netflix: Who We Are and What We Do by Molly Jackman & Meghana Reddy and How Our Paths Brought Us to Data and Netflix by Julie Beckley & Chris Pham. Curious to learn about what it’s like to be a Data Engineer at Netflix?

Spice up your Analytics: Amazon QuickSight Now Generally Available in N. Virginia, Oregon, and Ireland.

All Things Distributed

The reality is that many traditional BI solutions are built on top of legacy desktop and on-premises architectures that are decades old. They require teams of data engineers to spend months building complex data models and synthesizing the data before they can generate their first report.

How HubSpot Uses Apache Kafka Swimlanes for Timely Processing of Workflow Actions

InfoQ

HubSpot adopted routing messages from the same producer over multiple Kafka topics (called swimlanes) to avoid a build-up of consumer group lag and to prioritize the processing of real-time traffic. By Rafal Gancarz