Architecture, Big Data and Tuning - Technology Performance Pulse

Migrating Critical Traffic At Scale with No Downtime — Part 1

The Netflix TechBlog

MAY 4, 2023

By Shyam Gala, Javier Fernandez-Ivern, Anup Rokkam Pratap, Devang Shah. Hundreds of millions of customers tune into Netflix every day, expecting an uninterrupted and immersive streaming experience. This technique facilitates validation on multiple fronts.

Traffic

Traffic Latency Tuning Systems

What is IT automation?

Dynatrace

JULY 6, 2022

As organizations continue to adopt multicloud strategies, the complexity of these environments grows, increasing the need to automate cloud engineering operations to ensure organizations can enforce their policies and architecture principles. Big data automation tools. How organizations benefit from automating IT practices.

Artificial Intelligence

Artificial Intelligence Tuning Strategy Big Data

Auto-Diagnosis and Remediation in Netflix Data Platform

The Netflix TechBlog

JANUARY 13, 2022

This blog will explore these two systems and how they perform auto-diagnosis and remediation across our Big Data Platform and Real-time infrastructure. This has led to a dramatic reduction in the time it takes to detect issues in hardware or bugs in recently rolled out data platform software.

Big Data

Big Data Infrastructure Metrics Games

How Netflix uses eBPF flow logs at scale for network insight

The Netflix TechBlog

JUNE 7, 2021

By collecting, accessing and analyzing network data from a variety of sources like VPC Flow Logs, ELB Access Logs, eBPF flow logs on the instances, etc., we can provide network insight to users and central teams through multiple data visualization techniques like Lumen, Atlas, etc. What is BPF?

Network

Network Transportation AWS Cloud

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and…

The Netflix TechBlog

MARCH 25, 2019

We adopted the following mission statement to guide our investments: “Provide a complete and accurate data lineage system enabling decision-makers to win moments of truth.” Nonetheless, Netflix data landscape (see below) is complex and many teams collaborate effectively for sharing the responsibility of our data system management.

Infrastructure

Infrastructure Big Data Transportation Architecture

Optimizing anomaly detection and noise

Dynatrace

MARCH 11, 2021

I took a big-data-analysis approach, which started with another problem visualization. I wanted to understand how I could tune Dynatrace’s problem detection, but to do that I needed to understand the situation first. To achieve that I took two approaches: Visualizing historic problem data via a “Swimlane Visualization”.

Tuning

Tuning Architecture Monitoring Big Data

Conducting log analysis with an observability platform and full data context

Dynatrace

APRIL 20, 2023

Logs highlight observability challenges Ingesting, storing, and processing the unprecedented explosion of data from sources such as software as a service, multicloud environments, containers, and serverless architectures can be overwhelming for today’s organizations. Seamless integration.

Analytics

Analytics Infrastructure Storage Architecture

Bulldozer: Batch Data Moving from Data Warehouse to Online Key-Value Stores

The Netflix TechBlog

OCTOBER 27, 2020

Data scientists and engineers collect this data from our subscribers and videos, and implement data analytics models to discover customer behaviour with the goal of maximizing user joy. The processed data is typically stored as data warehouse tables in AWS S3.

Latency

Latency Storage Big Data Tuning

Data lakehouse innovations advance the three pillars of observability for more collaborative analytics

Dynatrace

FEBRUARY 16, 2023

As teams try to gain insight into this data deluge, they have to balance the need for speed, data fidelity, and scale with capacity constraints and cost. To solve this problem, Dynatrace launched Grail, its causational data lakehouse, in 2022. Logs on Grail: Log data is foundational for any IT analytics.

Analytics

Analytics Innovation Metrics Database

Hyper Scale VPC Flow Logs enrichment to provide Network Insight

The Netflix TechBlog

MAY 26, 2020

And in order to gain visibility into these logs, we need to somehow ingest and enrich this data. It is easier to tune a large Spark job for a consistent volume of data. In other words, we are able to ensure that our Spark app does not “eat” more data than it was tuned to handle. We named this library Sqooby.
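The idea of keeping each run within the volume a job was tuned for can be sketched as a simple byte budget over pending input files. This is a minimal illustration only, not the Sqooby library's implementation; the function name `take_batch`, the `(path, size)` tuples, and the `max_bytes` parameter are all assumptions made for the example.

```python
def take_batch(pending, max_bytes):
    """Split pending files into (batch, remaining) under a byte budget.

    pending: list of (path, size_in_bytes) tuples in arrival order (hypothetical shape).
    max_bytes: the input volume the job is assumed to be tuned for.
    The first file is always taken, even if it alone exceeds the budget,
    so an oversized file cannot stall processing forever.
    """
    batch, total = [], 0
    for index, (path, size) in enumerate(pending):
        if batch and total + size > max_bytes:
            # Budget reached: leave this file and everything after it for the next run.
            return batch, pending[index:]
        batch.append((path, size))
        total += size
    return batch, []


# Example: a 100-byte budget takes the first two files and defers the third.
files = [("a.log", 40), ("b.log", 50), ("c.log", 30)]
batch, remaining = take_batch(files, max_bytes=100)
```

Each run would then process only `batch`, and the deferred files are picked up by the next run.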

Network

Network Tuning AWS Big Data

Evolving from Rule-based Classifier: Machine Learning Powered Auto Remediation in Netflix Data…

The Netflix TechBlog

MARCH 4, 2024

Operational automation (including but not limited to auto diagnosis, auto remediation, auto configuration, auto tuning, auto scaling, auto debugging, and auto testing) is key to the success of modern data platforms. We have also noted great potential for further improvement through model tuning (see the section of Rollout in Production).

Tuning

Tuning Efficiency Big Data Engineering

A Recap of the Data Engineering Open Forum at Netflix

The Netflix TechBlog

JUNE 20, 2024

To address this, we propose developing an intelligent agent that can automatically discover, map, and query all data within an enterprise. This “Enterprise Data Model/Architect Agent” employs generative AI techniques for autonomous enterprise data modeling and architecture. Until next time!

Data Engineering

Data Engineering Engineering Entertainment Software Engineering

Data Movement in Netflix Studio via Data Mesh

The Netflix TechBlog

JULY 26, 2021

This happens at an unprecedented scale and introduces many interesting challenges; one of the challenges is how to provide visibility of Studio data across multiple phases and systems to facilitate operational excellence and empower decision making. Please stay tuned! The audits check for equality (i.e. Endnotes ¹ Inmon, Bill.

Big Data

Big Data Government Processing Analytics

Orchestrating Data/ML Workflows at Scale With Netflix Maestro

The Netflix TechBlog

OCTOBER 18, 2022

By Jun He, Akash Dwivedi, Natallia Dzenisenka, Snehal Chennuru, Praneeth Yenugutala, Pawan Dixit. At Netflix, Data and Machine Learning (ML) pipelines are widely used and have become central for the business, representing diverse use cases that go beyond recommendations, predictions and data transformations.

Java

Java Scalability Traffic Architecture

Streaming SQL in Data Mesh

The Netflix TechBlog

NOVEMBER 3, 2023

The Data Mesh SQL Processor is a platform-managed, parameterized Flink Job that takes schematized sources and a Flink SQL query that will be executed against those sources. By leveraging Flink SQL within a Data Mesh Processor, we were able to support the streaming SQL functionality without changing the architecture of Data Mesh.

Processing

Processing Engineering Infrastructure Latency

Optimizing data warehouse storage

The Netflix TechBlog

DECEMBER 21, 2020

We built AutoOptimize to efficiently and transparently optimize the data and metadata storage layout while maximizing their cost and performance benefits. This article will list some of the use cases of AutoOptimize, discuss the design principles that help enhance efficiency, and present the high-level architecture.

Storage

Storage Latency Efficiency Data Engineering

Mastering Distributed SQL™ Databases in 2025

Scalegrid

JANUARY 10, 2025

They keep the features that developers like but can handle much more data, similar to NoSQL systems. Notably, they simplify handling big data flows, offer consistent transactions, and sustain high performance even when they’re used for real-time data analysis and complex queries.

Database

Database Best Practices Scalability Blockchain

Incremental Processing using Netflix Maestro and Apache Iceberg

The Netflix TechBlog

NOVEMBER 20, 2023

For example, a job would reprocess aggregates for the past 3 days because it assumes that there would be late arriving data, but data prior to 3 days isn’t worth the cost of reprocessing. Backfill: Backfilling datasets is a common operation in big data processing.
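The fixed lookback described above can be sketched as a small helper that lists the daily partitions a run would recompute. This is an illustration under stated assumptions, not Maestro's or Iceberg's API: the function name `partitions_to_reprocess`, daily partitioning, and the choice to exclude the run date itself are all hypothetical.

```python
from datetime import date, timedelta


def partitions_to_reprocess(run_date: date, lookback_days: int = 3) -> list[date]:
    """Return the daily partitions a fixed-lookback job recomputes, oldest first.

    Assumes daily partitions and that the run date itself is not included;
    anything older than the lookback window is left untouched.
    """
    if lookback_days < 1:
        raise ValueError("lookback_days must be at least 1")
    return [run_date - timedelta(days=offset) for offset in range(lookback_days, 0, -1)]


# Example: a run on 2023-11-20 recomputes the three preceding days.
window = partitions_to_reprocess(date(2023, 11, 20))
```

The cost of this approach is that every partition in the window is recomputed on every run, whether or not late data actually arrived for it.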

Processing

Processing Big Data Efficiency Engineering

Music to my Ears - All Things Distributed

All Things Distributed

MARCH 28, 2011

We see that with our Amazon customers; when they hear a great tune on a radio they may identify it using the Shazam or Soundhound apps on their mobile phone and buy that song instantly from the Amazon MP3 store. Driving down the cost of Big-Data analytics. Introducing the AWS South America (Sao Paulo) Region.

AWS

AWS Cloud Storage Internet

Why MySQL Could Be Slow With Large Tables

Percona

JANUARY 19, 2023

ProxySQL: It is a feature-rich open-source MySQL proxy solution, that allows query routing for the most common MySQL architectures (PXC/Galera, Replication, Group Replication, etc.). It was developed for optimizing data storage and access for big data sets. It is available under a paid subscription.

Open Source

Open Source Storage Database Big Data

Web Performance Bookshelf

Rigor

JANUARY 13, 2020

Take, for example, The Web Almanac, the golden collection of Big Data combined with the collective intelligence from most of the authors listed below, brilliantly spearheaded by Google’s @rick_viscomi. Information Architecture. Web Performance Tuning. Web Performance Daybook-Volume-2. Website Optimization.

Performance

Performance Social Media Website Website Performance

USENIX LISA 2018: CFP Now Open

Brendan Gregg

APRIL 30, 2018

In this year's CFP we’re looking for topics covering the latest trends and best practices in cloud computing, containerization, machine learning, big data, infrastructure, scalability, DevOps, IT management, automation, reliability, monitoring, performance tuning, security, databases, programming, datacenters, and more.

DevOps

DevOps Network Best Practices Programming

USENIX LISA 2018: CFP Now Open

Brendan Gregg

APRIL 29, 2018

In this year's CFP we’re looking for topics covering the latest trends and best practices in cloud computing, containerization, machine learning, big data, infrastructure, scalability, DevOps, IT management, automation, reliability, monitoring, performance tuning, security, databases, programming, datacenters, and more.

DevOps

DevOps Network Best Practices Programming

QCon London: Lessons Learned From Building LinkedIn’s AI/ML Data Platform

InfoQ

APRIL 15, 2024

At the QCon London 2024 conference, Félix GV from LinkedIn discussed the AI/ML platform powering the company’s products. He specifically delved into Venice DB, the NoSQL data store used for feature persistence. By Rafal Gancarz

Artificial Intelligence

Artificial Intelligence Big Data Data Engineering Latency

Microsoft Engineering loves SQLBits

SQL Server According to Bob

FEBRUARY 15, 2018

Best practices on Building a Big Data Analytics Solution – Michael Rys. If you want to learn about Azure Data Lake, there is no one better. Friday Sessions: SQL Intelligence excels your tuning and security expertise – Veljko Vasic, Ron Matchoro and Frans Lytzen. I’ve known Michael for a very long time.

Engineering

Engineering Azure Best Practices Servers

Delta: A Data Synchronization and Enrichment Platform

The Netflix TechBlog

OCTOBER 15, 2019

In Netflix the microservice architecture is widely adopted and each microservice typically handles only one type of data. The core movie data resides in a microservice called Movie Service, and related data such as movie deals, talents, vendors and so on are managed by multiple other microservices (e.g Please stay tuned.

Transportation

Transportation Architecture Processing Storage

Will AWS Have Anything New To Say About Sustainability at re:Invent 2024?

Adrian Cockcroft

NOVEMBER 18, 2024

Discover data sources to gain insights into your resource efficiency and environmental impact, including the AWS Customer Carbon Footprint Tool and proxy metrics from the AWS Cost & Usage Reports. This lightning talk explores how companies can cut costs and carbon emissions through architectural best practices and workload optimization.

AWS

AWS Energy Lambda Government
