Analytics, Data and Data Engineering - Technology Performance Pulse

Ensuring Data Integrity Through Anomaly Detection: Essential Tools for Data Engineers

DZone

JULY 31, 2024

However, amidst this rapid evolution, ensuring a robust data universe characterized by high quality and integrity is indispensable. While much emphasis is often placed on refining AI models, the significance of pristine datasets can sometimes be overshadowed.

Data Engineering

Data Engineering FinTech Engineering Analytics

1. Streamlining Membership Data Engineering at Netflix with Psyberg

The Netflix TechBlog

NOVEMBER 14, 2023

By Abhinaya Shetty, Bharath Mummadisetty. At Netflix, our Membership and Finance Data Engineering team harnesses diverse data related to plans, pricing, membership life cycle, and revenue to fuel analytics, power various dashboards, and make data-informed decisions. What is late-arriving data? Let’s dive in!

Data Engineering

Data Engineering Engineering Processing Games

A Recap of the Data Engineering Open Forum at Netflix

The Netflix TechBlog

JUNE 20, 2024

A summary of sessions at the first Data Engineering Open Forum at Netflix on April 18th, 2024. At Netflix, we aspire to entertain the world, and our data engineering teams play a crucial role in this mission by enabling data-driven decision-making at scale.

Data Engineering

Data Engineering Engineering Entertainment Software Engineering

Data Engineers of Netflix - Interview with Kevin Wylie

The Netflix TechBlog

JULY 15, 2021

Data Engineers of Netflix - Interview with Kevin Wylie. This post is part of our “Data Engineers of Netflix” series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix. Kevin, what drew you to data engineering?

Data Engineering

Data Engineering Engineering Entertainment Big Data

Data Engineers of Netflix - Interview with Pallavi Phadnis

The Netflix TechBlog

OCTOBER 28, 2021

Data Engineers of Netflix - Interview with Pallavi Phadnis. This post is part of our “Data Engineers of Netflix” series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix. Pallavi Phadnis is a Senior Software Engineer at Netflix.

Data Engineering

Data Engineering Engineering Big Data Software Engineering

Analytics at Netflix: Who we are and what we do

The Netflix TechBlog

SEPTEMBER 18, 2020

An introduction to Analytics and Visualization Engineering at Netflix, by Molly Jackman & Meghana Reddy. Across nearly every industry, there is recognition that data analytics is key to driving informed business decision-making.

Analytics

Analytics Engineering Data Engineering Tuning

Ready-to-go sample data pipelines with Dataflow

The Netflix TechBlog

DECEMBER 3, 2022

By Jasmine Omeke, Obi-Ike Nwoke, Olek Gorajek. This post is for all data practitioners who are interested in learning about bootstrapping, standardization and automation of batch data pipelines at Netflix. You may remember Dataflow from the post we wrote last year titled Data pipeline asset management with Dataflow.

Best Practices

Best Practices Code Testing Data Engineering

How Uber Sped Up SQL-based Data Analytics with Presto and Express Queries

InfoQ

NOVEMBER 18, 2024

Uber uses Presto, an open-source distributed SQL query engine, to provide analytics across several data sources, including Apache Hive, Apache Pinot, MySQL, and Apache Kafka. To improve its performance, Uber engineers explored the advantages of dealing with quick queries, a.k.a.

Analytics

Analytics Open Source Engineering Performance

Introducing Impressions at Netflix

The Netflix TechBlog

FEBRUARY 14, 2025

Every image you hover over isn’t just a visual placeholder; it’s a critical data point that fuels our sophisticated personalization engine. This nuanced integration of data and technology empowers us to offer bespoke content recommendations. This queue ensures we are consistently capturing raw events from our global userbase.

Tuning

Tuning Latency Efficiency Storage

How Our Paths Brought Us to Data and Netflix

The Netflix TechBlog

SEPTEMBER 18, 2020

Part of our series on who works in Analytics at Netflix - and what the role entails, by Julie Beckley & Chris Pham. This Q&A provides insights into the diverse set of skills, projects, and culture within Data Science and Engineering (DSE) at Netflix through the eyes of two team members: Chris Pham and Julie Beckley.

Analytics

Analytics Education Innovation Engineering

2. Diving Deeper into Psyberg: Stateless vs Stateful Data Processing

The Netflix TechBlog

NOVEMBER 14, 2023

By Abhinaya Shetty, Bharath Mummadisetty. In the inaugural blog post of this series, we introduced you to the state of our pipelines before Psyberg and the challenges with incremental processing that led us to create the Psyberg framework within Netflix’s Membership and Finance data engineering team.

Processing

Processing Data Engineering Efficiency Analytics

Mythbusting the Analytics Journey

The Netflix TechBlog

DECEMBER 18, 2020

Part of our series on who works in Analytics at Netflix - and what the role entails, by Alex Diamond. This Q&A aims to mythbust some common misconceptions about succeeding in analytics at a big tech company. Working in Studio Data Science & Engineering (“Studio DSE”) was basically a dream come true.

Analytics

Analytics Retail Education Engineering

Experimentation is a major focus of Data Science across Netflix

The Netflix TechBlog

JANUARY 11, 2022

Here we describe the role of Experimentation and A/B testing within the larger Data Science and Engineering organization at Netflix, including how our platform investments support running tests at scale while enabling innovation. Curious to learn more about other Data Science and Engineering functions at Netflix?

Innovation

Innovation Metrics Engineering Testing

How Data Inspires Building a Scalable, Resilient and Secure Cloud Infrastructure At Netflix

The Netflix TechBlog

MARCH 5, 2019

While our engineering teams have built and continue to build solutions to lighten this cognitive load (better guardrails, improved tooling, …), data and its derived products are critical elements to understanding, optimizing and abstracting our infrastructure. In the Reliability space, our data teams focus on two main approaches.

Infrastructure

Infrastructure Cloud Scalability AWS

SQL Extensions for Time-Series Data in QuestDB

DZone

JANUARY 13, 2023

In this tutorial, you are going to learn about QuestDB SQL extensions which prove to be very useful with time-series data. Using some sample data sets, you will learn how designated timestamps work and how to use extended SQL syntax to write queries on time-series data.

IoT

IoT Analytics Database Metrics

A Day in the Life of a Content Analytics Engineer

The Netflix TechBlog

OCTOBER 30, 2020

Part of our series on who works in Analytics at Netflix - and what the role entails. I’m a Senior Analytics Engineer on the Content and Marketing Analytics Research team. Being an Analytics Engineer is like being a hybrid of a librarian … One of my favorite things about being an Analytics Engineer is the variety.

Analytics

Analytics Engineering Innovation Metrics

A Day in the Life of an Experimentation and Causal Inference Scientist @ Netflix

The Netflix TechBlog

MARCH 2, 2021

By Stephanie Lane, Wenjing Zheng, Mihir Tendulkar. Within the rapid expansion of data-related roles in the last decade, the title Data Scientist has emerged as an umbrella term for myriad skills and areas of business focus. Learning through data is in Netflix’s DNA. It can be hard to know from the outside.

Analytics

Analytics C++ Innovation Engineering

Spice up your Analytics: Amazon QuickSight Now Generally Available in N. Virginia, Oregon, and Ireland.

All Things Distributed

NOVEMBER 15, 2016

Previously, I wrote about Amazon QuickSight, a new service targeted at business users that aims to simplify the process of deriving insights from a wide variety of data sources quickly, easily, and at a low cost. Put simply, data is not always readily available and accessible to organizational end users. Enter Amazon QuickSight.

Analytics

Analytics Availability Media Social Media

It's 2025: How Do You Choose Between Doris and ClickHouse?

DZone

APRIL 9, 2025

Database selection is a challenge every data engineer faces. Among the many databases available, Apache Doris and ClickHouse, as two mainstream analytical databases, are often compared. Each has its strengths and is suited to different scenarios, making the choice difficult.

Data Engineering

Data Engineering Analytics Database Engineering

2021 Data/AI Salary Survey

O'Reilly

SEPTEMBER 15, 2021

In June 2021, we asked the recipients of our Data & AI Newsletter to respond to a survey about compensation. The average salary for data and AI professionals who responded to the survey was $146,000. We didn’t use the data from these respondents; in practice, discarding this data had no effect on the results.

Azure

Azure Programming AWS Social Media

What is IT automation?

Dynatrace

JULY 6, 2022

Scripts and procedures usually focus on a particular task, such as deploying a new microservice to a Kubernetes cluster, implementing data retention policies on archived files in the cloud, or running a vulnerability scanner over code before it’s deployed. Big data automation tools. Batch process automation.

Artificial Intelligence

Artificial Intelligence Tuning Strategy Big Data

Incremental Processing using Netflix Maestro and Apache Iceberg

The Netflix TechBlog

NOVEMBER 20, 2023

By Jun He, Yingyi Zhang, and Pawan Dixit. Incremental processing is an approach to process new or changed data in workflows. The key advantage is that it only incrementally processes data that are newly added or updated to a dataset, instead of re-processing the complete dataset.
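The idea in this excerpt can be illustrated with a minimal sketch. It is not the Maestro or Iceberg API: the `Row` type, the `updated_at` field, and the watermark-based `process_incrementally` function are all hypothetical names chosen for illustration, and the sketch assumes every change carries a monotonically increasing timestamp. A real system would also avoid scanning unchanged rows, which this toy loop does not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    key: str
    value: int
    updated_at: int  # hypothetical change timestamp, assumed to only increase


def process_incrementally(source, target, watermark):
    """Apply only rows changed after `watermark` to `target`.

    Returns the new watermark, which the next run passes back in.
    """
    new_watermark = watermark
    for row in source:
        if row.updated_at > watermark:  # skip rows handled by an earlier run
            target[row.key] = row.value
            new_watermark = max(new_watermark, row.updated_at)
    return new_watermark


# First run: everything is new, so both rows are applied.
source = [Row("a", 1, 10), Row("b", 2, 20)]
target = {}
wm = process_incrementally(source, target, watermark=0)

# Second run: only the updated "a" and the new "c" are applied.
source.append(Row("a", 5, 30))
source.append(Row("c", 7, 40))
wm = process_incrementally(source, target, wm)
```

After the second run, `target` holds the latest value per key and `wm` records how far processing has advanced, so a full re-processing of the dataset is never needed.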

Processing

Processing Big Data Efficiency Engineering

Hyper Scale VPC Flow Logs enrichment to provide Network Insight

The Netflix TechBlog

MAY 26, 2020

Cloud Network Insight is a suite of solutions that provides both operational and analytical insight into the Cloud Network Infrastructure to address the identified problems. At Netflix we publish the Flow Log data to Amazon S3. And in order to gain visibility into these logs, we need to somehow ingest and enrich this data.

Network

Network Tuning AWS Traffic

Data Ingestion: The First Step Towards a Flawless Data Pipeline

Simform

JANUARY 8, 2023

Data ingestion is the foremost layer in a data engineering pipeline, acting as a vital pillar in the overall analytics architecture. Thus, it is essential to implement data ingestion just right. Here is everything you need to know to take the first step toward a flawless data pipeline.

Data Engineering

Data Engineering Analytics Architecture Engineering

Expanding the Cloud: Introducing Amazon QuickSight

All Things Distributed

OCTOBER 7, 2015

We live in a world where massive volumes of data are generated from websites, connected devices and mobile apps. In such a data intensive environment, making key business decisions such as running marketing and sales campaigns, logistic planning, financial analysis and ad targeting requires deriving insights from these data.

Cloud

Cloud Big Data AWS Analytics

Friends don't let friends build data pipelines

Abhishek Tiwari

JULY 12, 2018

Building data pipelines can offer strategic advantages to the business. It can be used to power new analytics, insight, and product features. Often companies underestimate the necessary effort and cost involved to build and maintain data pipelines. Data pipeline initiatives are generally unfinished projects.

Latency

Latency Analytics Scalability Engineering

Optimizing dbt and Google’s BigQuery

DZone

DECEMBER 21, 2020

Setting up a data warehouse is the first step towards fully utilizing big data analysis. Still, it is one of many that need to be taken before you can generate value from the data you gather. An important step in that chain of the process is data modeling and transformation.

Big Data

Big Data Google Scalability Processing

Canva Opts for Amazon KDS over SNS+SQS to Save 85% with 25 Billion Events per Day

InfoQ

AUGUST 7, 2024

Canva evaluated different data messaging solutions for its Product Analytics Platform, including the combination of AWS SNS and SQS, MSK, and Amazon KDS, and eventually chose the latter, primarily based on its much lower costs. The company compared many aspects of these solutions, like performance, maintenance effort, and cost.

AWS

AWS Analytics Performance Data Engineering

5 data integration trends that will define the future of ETL in 2018

Abhishek Tiwari

DECEMBER 27, 2017

ETL refers to extract, transform, load and it is generally used for data warehousing and data integration. There are several emerging data trends that will define the future of ETL in 2018. A common theme across all these trends is to remove the complexity by simplifying data management as a whole.

Big Data

Big Data Artificial Intelligence Storage Hardware

Sustainability at AWS re:Invent 2022 All the talks and videos I could find…

Adrian Cockcroft

FEBRUARY 13, 2023

STP213 Scaling global carbon footprint management — Blake Blackwell Persefoni Manager Data Engineering and Michael Floyd AWS Head of Sustainability Solutions. SUS312 How innovators are driving more sustainable manufacturing — Marcus Ulmefors Northvolt Director Data and ML Platforms and Muhammad Sajid AWS SA. AWS Inferentia 2.6x

AWS

AWS Energy Architecture Programming

Part 3: A Survey of Analytics Engineering Work at Netflix

The Netflix TechBlog

JANUARY 6, 2025

This article is the last in a multi-part series sharing a breadth of Analytics Engineering work at Netflix, recently presented as part of our annual internal Analytics Engineering conference. This solution enabled data to be used more effectively for real-time decision-making. Need to catch up?

Analytics

Analytics Engineering Cache Entertainment

Cloud Efficiency at Netflix

The Netflix TechBlog

DECEMBER 17, 2024

This diverse technological landscape generates extensive and rich data from various infrastructure entities, from which, data engineers and analysts collaborate to provide actionable insights to the engineering organization in a continuous feedback loop that ultimately enhances the business.

Efficiency

Efficiency Cloud Analytics Infrastructure
