Architecture, Data and Latency - Technology Performance Pulse

Cut costs and complexity: 5 strategies for reducing tool sprawl with Dynatrace

Dynatrace

APRIL 10, 2025

As an executive, I am always seeking simplicity and efficiency to make sure the architecture of the business is as streamlined as possible. Simplify data ingestion and up-level storage for better, faster querying : With Dynatrace, petabytes of data are always hot for real-time insights, at a cold cost.

Strategy

Strategy Storage Network Architecture

Efficient Multimodal Data Processing: A Technical Deep Dive

DZone

FEBRUARY 27, 2025

Multimodal data processing is the evolving need of the latest data platforms powering applications like recommendation systems, autonomous vehicles, and medical diagnostics. Handling multimodal data spanning text, images, videos, and sensor inputs requires resilient architecture to manage the diversity of formats and scale.

Efficiency

Efficiency Processing Latency Storage

Optimizing Database Performance in Middleware Applications

DZone

FEBRUARY 14, 2025

In the realm of modern software architecture, middleware plays a pivotal role in connecting various components of distributed systems. This is crucial because middleware often serves as the bridge between client applications and backend databases, handling a high volume of requests and data processing tasks.

Database

Database Performance Software Architecture Latency

Netflix’s Distributed Counter Abstraction

The Netflix TechBlog

NOVEMBER 12, 2024

By: Rajiv Shringi , Oleksii Tkachuk , Kartik Sathyanarayanan Introduction In our previous blog post, we introduced Netflix’s TimeSeries Abstraction , a distributed service designed to store and query large volumes of temporal event data with low millisecond latencies. Today, we’re excited to present the Distributed Counter Abstraction.

Latency

Latency Cache Infrastructure Strategy

RabbitMQ vs. Kafka: Key Differences

Scalegrid

FEBRUARY 6, 2025

RabbitMQ is designed for flexible routing and message reliability, while Kafka handles high-throughput event streaming and real-time data processing. Both serve distinct purposes, from managing message queues to ingesting large data volumes.

Latency

Latency Analytics Architecture Storage

How to Scale Elasticsearch to Solve Your Scalability Issues

DZone

FEBRUARY 26, 2025

With the evolution of modern applications serving increasing needs for real-time data processing and retrieval, scalability does, too. One such open-source, distributed search and analytics engine is Elasticsearch, which is very efficient at handling data in large sets and high-velocity queries.

Scalability

Scalability Open Source Latency Architecture

Foundation Model for Personalized Recommendation

The Netflix TechBlog

MARCH 28, 2025

Furthermore, it was difficult to transfer innovations from one model to another, given that most are independently trained despite using common data sources. This scenario underscored the need for a new recommender system architecture where member preference learning is centralized, enhancing accessibility and utility across different models.

Tuning

Tuning Efficiency Latency Strategy

Introducing Netflix’s Key-Value Data Abstraction Layer

The Netflix TechBlog

SEPTEMBER 18, 2024

Second, developers had to constantly re-learn new data modeling practices and common yet critical data access patterns. These include challenges with tail latency and idempotency, managing “wide” partitions with many rows, handling single large “fat” columns, and slow response pagination.

Latency

Latency Storage Cache Servers

Introducing Impressions at Netflix

The Netflix TechBlog

FEBRUARY 14, 2025

Every image you hover over isnt just a visual placeholder; its a critical data point that fuels our sophisticated personalization engine. This nuanced integration of data and technology empowers us to offer bespoke content recommendations.

Tuning

Tuning Latency Efficiency Storage

Single-core memory bandwidth: Latency, Bandwidth, and Concurrency

John McCalpin

FEBRUARY 17, 2025

In my previous post , I reviewed historical data on single-core/single-thread memory bandwidth in multicore processors from Intel and AMD from 2010 to the present. “Concurrency” is the amount of data that must be “in flight” between the core and the memory in order to maintain a steady-state system.

Latency

Latency Hardware Cache Systems

Data ingestion pipeline with Operation Management

The Netflix TechBlog

MARCH 7, 2023

These media focused machine learning algorithms as well as other teams generate a lot of data from the media files, which we described in our previous blog , are stored as annotations in Marken. But we cannot search or present low latency retrievals from files Etc. Marken Architecture Marken’s architecture diagram is as follows.

Media

Media Latency Architecture Database

Why applying chaos engineering to data-intensive applications matters

Dynatrace

MAY 23, 2024

The jobs executing such workloads are usually required to operate indefinitely on unbounded streams of continuous data and exhibit heterogeneous modes of failure as they run over long periods. This significantly increases event latency. Performance is usually a primary concern when using stream processing frameworks.

Engineering

Engineering Tuning Latency Open Source

Introducing Netflix TimeSeries Data Abstraction Layer

The Netflix TechBlog

OCTOBER 8, 2024

Rajiv Shringi Vinay Chella Kaidan Fullerton Oleksii Tkachuk Joey Lynch Introduction As Netflix continues to expand and diversify into various sectors like Video on Demand and Gaming , the ability to ingest and store vast amounts of temporal data — often reaching petabytes — with millisecond access latency has become increasingly vital.

Latency

Latency Storage Traffic Tuning

Analyze OpenTelemetry traces and log data at scale: Accelerate troubleshooting and optimize application performance

Dynatrace

OCTOBER 3, 2024

Considering the latest State of Observability 2024 report, it’s evident that multicloud environments not only come with an explosion of data beyond humans’ ability to manage it. It’s increasingly difficult to ingest, manage, store, and sort through this amount of data. You can find the list of use cases here.

Performance

Performance Architecture Innovation Latency

Bulldozer: Batch Data Moving from Data Warehouse to Online Key-Value Stores

The Netflix TechBlog

OCTOBER 27, 2020

By Tianlong Chen and Ioannis Papapanagiotou Netflix has more than 195 million subscribers that generate petabytes of data everyday. Data scientists and engineers collect this data from our subscribers and videos, and implement data analytics models to discover customer behaviour with the goal of maximizing user joy.

Latency

Latency Storage Big Data Tuning

Data Movement in Netflix Studio via Data Mesh

The Netflix TechBlog

JULY 26, 2021

This happens at an unprecedented scale and introduces many interesting challenges; one of the challenges is how to provide visibility of Studio data across multiple phases and systems to facilitate operational excellence and empower decision making. With the latest Data Mesh Platform, data movement in Netflix Studio reaches a new stage.

Big Data

Big Data Government Processing Analytics

What is a data lakehouse? Combining data lakes and warehouses for the best of both worlds

Dynatrace

OCTOBER 4, 2022

While data lakes and data warehousing architectures are commonly used modes for storing and analyzing data, a data lakehouse is an efficient third way to store and analyze data that unifies the two architectures while preserving the benefits of both. What is a data lakehouse?

Artificial Intelligence

Artificial Intelligence Storage Analytics Government

Streaming SQL in Data Mesh

The Netflix TechBlog

NOVEMBER 3, 2023

Democratizing Stream Processing @ Netflix By Guil Pires , Mark Cho , Mingliang Liu , Sujay Jain Data powers much of what we do at Netflix. On the Data Platform team, we build the infrastructure used across the company to process data at scale. The existing Data Mesh Processors have a lot of overlap with SQL.

Processing

Processing Engineering Infrastructure Latency

Timestone: Netflix’s High-Throughput, Low-Latency Priority Queueing System with Built-in Support…

The Netflix TechBlog

SEPTEMBER 29, 2022

Timestone: Netflix’s High-Throughput, Low-Latency Priority Queueing System with Built-in Support for Non-Parallelizable Workloads by Kostas Christidis Introduction Timestone is a high-throughput, low-latency priority queueing system we built in-house to support the needs of Cosmos , our media encoding platform. Over the past 2.5

Latency

Latency Systems Media Serverless

Best Practices for Scaling RabbitMQ

Scalegrid

FEBRUARY 24, 2025

Youll also learn strategies for maintaining data safety and managing node failures so your RabbitMQ setup is always up to the task. Implementing clustering and quorum queues in RabbitMQ significantly improves load distribution and data redundancy, ensuring high availability and fault tolerance for messaging services.

Best Practices

Best Practices Traffic Strategy Efficiency

Scalable Annotation Service?—?Marken

The Netflix TechBlog

JANUARY 25, 2023

Scalable Annotation Service — Marken by Varun Sekhri , Meenakshi Jindal Introduction At Netflix, we have hundreds of micro services each with its own data models or entities. But there are more interesting cases where users want to store temporal (time-based) data or spatial data. Movie Entity with id 1234 has violence.

Scalability

Scalability Latency Media Architecture

Optimizing data warehouse storage

The Netflix TechBlog

DECEMBER 21, 2020

By Anupom Syam Background At Netflix, our current data warehouse contains hundreds of Petabytes of data stored in AWS S3 , and each day we ingest and create additional Petabytes. We built AutoOptimize to efficiently and transparently optimize the data and metadata storage layout while maximizing their cost and performance benefits.

Storage

Storage Latency Efficiency Data Engineering

Architectural Insights: Designing Efficient Multi-Layered Caching With Instagram Example

DZone

FEBRUARY 27, 2024

Caching is a critical technique for optimizing application performance by temporarily storing frequently accessed data, allowing for faster retrieval during subsequent requests. Multi-layered caching involves using multiple levels of cache to store and retrieve data.

Cache

Cache Efficiency Architecture Design

Migrating Critical Traffic At Scale with No Downtime?—?Part 1

The Netflix TechBlog

MAY 4, 2023

When undertaking system migrations, one of the main challenges is establishing confidence and seamlessly transitioning the traffic to the upgraded architecture without adversely impacting the customer experience. It provides a good read on the availability and latency ranges under different production conditions.

Traffic

Traffic Latency Tuning Systems

Uber’s Big Data Platform: 100+ Petabytes with Minute Latency

Uber Engineering

OCTOBER 17, 2018

To accomplish this, Uber relies heavily on making data-driven decisions at every level, from forecasting rider demand during high traffic events to identifying and addressing bottlenecks … The post Uber’s Big Data Platform: 100+ Petabytes with Minute Latency appeared first on Uber Engineering Blog.

Big Data

Big Data Latency Transportation Traffic

Implementing service-level objectives to improve software quality

Dynatrace

DECEMBER 27, 2022

As more organizations embrace microservices-based architecture to deliver goods and services digitally, maintaining customer satisfaction has become exponentially more challenging. First, it helps to understand that applications and all the services and infrastructure that support them generate telemetry data based on traffic from real users.

Software

Software Software Benchmarking Latency

Solve hybrid Kubernetes performance and reliability problems with unified observability

Dynatrace

APRIL 10, 2025

Integrating data at an OS-agnostic cluster level is another hurdle, often leading to data silos and incomplete visibility. While this hybrid architectural approach offers flexibility, it also introduces the need for unified observability.

Performance

Performance Java Operating System Infrastructure

Comparing PostgreSQL DigitalOcean Performance & Pricing – ScaleGrid vs. DigitalOcean Managed Databases

Scalegrid

JUNE 4, 2020

Compare Latency. lower latency compared to DigitalOcean for PostgreSQL. Now, let’s take a look at the throughput and latency performance of our comparison. Next, we are going to test and compare the latency performance between ScaleGrid and DigitalOcean for PostgreSQL. PostgreSQL DigitalOcean Latency Averages (ms).

Database

Database Latency Benchmarking Performance

These 7 Edge Data Challenges Will Test Companies the Most in 2025

VoltDB

DECEMBER 11, 2024

Edge computing has transformed how businesses and industries process and manage data. By bringing computation closer to the data source, edge-based deployments reduce latency, enhance real-time capabilities, and optimize network bandwidth. Data interception during transit. Redundancy and inefficiency in data aggregation.

IoT

IoT Energy Logistics Latency

DBLog: A Generic Change-Data-Capture Framework

The Netflix TechBlog

DECEMBER 17, 2019

Andreas Andreakis , Ioannis Papapanagiotou Overview Change-Data-Capture (CDC) allows capturing committed changes from a database in real-time and propagating those changes to downstream consumers [1][2]. Requirements In a previous blog post, we discussed Delta , a data enrichment and synchronization platform.

Database

Database Traffic Transportation Open Source

Optimize your observability pipeline for AWS Lambda serverless functions

Dynatrace

NOVEMBER 10, 2022

As companies accelerate digital transformation, cloud services such as AWS Lambda help companies to modernize their application architectures to quickly adapt to the needs of their customers while offloading the operational complexity to their cloud vendor. The need for a simplified approach to capture telemetry.

Lambda

Lambda Serverless AWS Latency

What is observability? Not just logs, metrics and traces

Dynatrace

OCTOBER 1, 2021

As dynamic systems architectures increase in complexity and scale, IT teams face mounting pressure to track and respond to conditions and issues across their multi-cloud environments. As teams begin collecting and working with observability data, they are also realizing its benefits to the business, not just IT. Dynatrace news.

Metrics

Metrics Open Source Monitoring Cloud

Designing Instagram

High Scalability

JANUARY 11, 2022

Architecture. from a client it performs two parallel operations: i) persisting the action in the data store ii) publish the action in a streaming data store for a pub-sub model. User Feed Service, Media Counter Service) read the actions from the streaming data store and performs their specific tasks. High Level Design.

Design

Design Media Storage Logistics

Consistent caching mechanism in Titus Gateway

The Netflix TechBlog

NOVEMBER 3, 2022

The original assumptions and architectural choices were no longer viable. We introduce a caching mechanism in the API gateway layer, allowing us to offload processing from singleton leader elected controllers without giving up strict data consistency and guarantees clients observe. Titus Gateway handles user requests.

Cache

Cache Latency Traffic Systems

DBLog: A Generic Change-Data-Capture Framework

The Netflix TechBlog

DECEMBER 17, 2019

Andreas Andreakis , Ioannis Papapanagiotou Overview Change-Data-Capture (CDC) allows capturing committed changes from a database in real-time and propagating those changes to downstream consumers [1][2]. Requirements In a previous blog post, we discussed Delta , a data enrichment and synchronization platform.

Database

Database Traffic Transportation Open Source

Netflix Cloud Packaging in the Terabyte Era

The Netflix TechBlog

SEPTEMBER 24, 2021

As described by the white paper Apple ProRes ( link ), the target data rate of the Apple ProRes HQ for 1920x1080 at 29.97 Table 1: Movie and File Size Examples Initial Architecture A simplified view of our initial cloud video processing pipeline is illustrated in the following diagram. is 220 Mbps.

Cloud

Cloud Media Storage Cache

Dynatrace supports Azure Managed Instance for Apache Cassandra

Dynatrace

MAY 13, 2022

Because of its scalability and distributed architecture, thousands of companies trust it to run their cloud and hybrid-based workloads at high availability without compromising performance. From there, you can dive deeper into infrastructure metrics (cluster, datacenter, racks, and nodes) and data metrics (keyspaces and tables).

Azure

Azure Latency Metrics Infrastructure

Lessons learned from enterprise service-level objective management

Dynatrace

MAY 19, 2022

Example 1: Architecture boundaries. Due to the massive amount of data, no one knew what action to take if a number went red. First, they took a big step back and looked at their end-to-end architecture (Figure 2). SLO dashboard defined by architectural boundary. The “Four Golden Signals” include the following: Latency.

Automotive

Automotive Latency Architecture Azure

What is serverless computing? Driving efficiency without sacrificing observability

Dynatrace

JANUARY 26, 2021

Within this paradigm, it is possible to run entire architectures without touching a traditional virtual server, either locally or in the cloud. In a serverless architecture, applications are distributed to meet demand and scale requirements efficiently. When an application is triggered, it can cause latency as the application starts.

Serverless

Serverless Efficiency Lambda Azure

How to maximize serverless benefits and overcome its challenges

Dynatrace

OCTOBER 10, 2022

Reduced latency. Serverless architecture makes it possible to host code anywhere, rather than relying on an origin server. By using cloud providers with multiple server sites, organizations can reduce function latency for end users. Architectural complexity. Optimizes resources. Difficult to monitor.

Serverless

Serverless Infrastructure Lambda Latency

Dynatrace accelerates business transformation with new AI observability solution

Dynatrace

JANUARY 31, 2024

This blog post explores how AI observability enables organizations to predict and control costs, performance, and data reliability. It also shows how data observability relates to business outcomes as organizations embrace generative AI. GenAI is prone to erratic behavior due to unforeseen data scenarios or underlying system issues.

Cache

Cache Azure Infrastructure Monitoring

Seamlessly Swapping the API backend of the Netflix Android app

The Netflix TechBlog

SEPTEMBER 8, 2020

This allowed Android engineers to have much more control and observability over how we get our data. Background The Netflix Android app uses the falcor data model and query protocol. For example, the artwork service is separate from the video metadata service, but we need the data from both in the detail key. It was a Node.js

Latency

Latency Cache Java Traffic

Improved Alerting with Atlas Streaming Eval

The Netflix TechBlog

APRIL 27, 2023

Atlas is an in-memory time-series database that ingests multiple billions of time-series per day and retains the last two weeks of data. Moreover, common database optimizations like caching recently queried data don’t really work for alerting queries because, generally speaking, the last received datapoint is required for correctness.

Storage

Storage Cache Metrics Database

Rebuilding Netflix Video Processing Pipeline with Microservices

The Netflix TechBlog

JANUARY 10, 2024

This architecture shift greatly reduced the processing latency and increased system resiliency. We expanded pipeline support to serve our studio/content-development use cases, which had different latency and resiliency requirements as compared to the traditional streaming use case. divide the input video into small chunks 2.

Processing

Processing Media Latency Innovation

Cut costs and complexity: 5 strategies for reducing tool sprawl with Dynatrace

Efficient Multimodal Data Processing: A Technical Deep Dive

Trending Sources

Optimizing Database Performance in Middleware Applications

Netflix’s Distributed Counter Abstraction

RabbitMQ vs. Kafka: Key Differences

How to Scale Elasticsearch to Solve Your Scalability Issues

Foundation Model for Personalized Recommendation

Introducing Netflix’s Key-Value Data Abstraction Layer

Introducing Impressions at Netflix

Single-core memory bandwidth: Latency, Bandwidth, and Concurrency

Data ingestion pipeline with Operation Management

Why applying chaos engineering to data-intensive applications matters

Introducing Netflix TimeSeries Data Abstraction Layer

Analyze OpenTelemetry traces and log data at scale: Accelerate troubleshooting and optimize application performance

Bulldozer: Batch Data Moving from Data Warehouse to Online Key-Value Stores

Data Movement in Netflix Studio via Data Mesh

What is a data lakehouse? Combining data lakes and warehouses for the best of both worlds

Streaming SQL in Data Mesh

Timestone: Netflix’s High-Throughput, Low-Latency Priority Queueing System with Built-in Support…

Best Practices for Scaling RabbitMQ

Scalable Annotation Service?—?Marken

Optimizing data warehouse storage

Architectural Insights: Designing Efficient Multi-Layered Caching With Instagram Example

Migrating Critical Traffic At Scale with No Downtime?—?Part 1

Uber’s Big Data Platform: 100+ Petabytes with Minute Latency

Implementing service-level objectives to improve software quality

Solve hybrid Kubernetes performance and reliability problems with unified observability

Comparing PostgreSQL DigitalOcean Performance & Pricing – ScaleGrid vs. DigitalOcean Managed Databases

These 7 Edge Data Challenges Will Test Companies the Most in 2025

DBLog: A Generic Change-Data-Capture Framework

Optimize your observability pipeline for AWS Lambda serverless functions

What is observability? Not just logs, metrics and traces

Designing Instagram

Consistent caching mechanism in Titus Gateway

DBLog: A Generic Change-Data-Capture Framework

Netflix Cloud Packaging in the Terabyte Era

Dynatrace supports Azure Managed Instance for Apache Cassandra

Lessons learned from enterprise service-level objective management

What is serverless computing? Driving efficiency without sacrificing observability

How to maximize serverless benefits and overcome its challenges

Dynatrace accelerates business transformation with new AI observability solution

Seamlessly Swapping the API backend of the Netflix Android app

Improved Alerting with Atlas Streaming Eval

Rebuilding Netflix Video Processing Pipeline with Microservices

Stay Connected