Partitioning each customer's data at the storage level and encrypting it with a unique encryption key adds a further layer of protection against unauthorized data access. A unique encryption key is applied to each tenant's storage and automatically rotated every 365 days.
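To illustrate the idea, here is a minimal Python sketch of per-tenant encryption, assuming the cryptography package; the key store, function names, and rotation mechanics are invented for illustration and are not the actual implementation.

    from cryptography.fernet import Fernet

    # Hypothetical in-memory key store; a real system would use a KMS with
    # automatic 365-day rotation and per-tenant access controls.
    tenant_keys: dict[str, bytes] = {}

    def key_for(tenant_id: str) -> Fernet:
        # One unique key per tenant: a leaked key exposes one partition only.
        if tenant_id not in tenant_keys:
            tenant_keys[tenant_id] = Fernet.generate_key()
        return Fernet(tenant_keys[tenant_id])

    def write_record(tenant_id: str, payload: bytes) -> bytes:
        return key_for(tenant_id).encrypt(payload)

    def read_record(tenant_id: str, blob: bytes) -> bytes:
        return key_for(tenant_id).decrypt(blob)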
Multimodal data processing is an evolving need of modern data platforms powering applications like recommendation systems, autonomous vehicles, and medical diagnostics. Handling multimodal data spanning text, images, videos, and sensor inputs requires a resilient architecture that can manage the diversity of formats at scale.
Simplify data ingestion and up-level storage for better, faster querying: with Dynatrace, petabytes of data are always hot for real-time insights, at a cold cost. The problem is worsened by separate tools for tracking metrics, logs, traces, and user behavior: crucial, interconnected details end up scattered across different storage.
Business processes support virtually all aspects of an organization's operations. They're often categorized by their function: core processes directly create customer value, support processes increase departmental efficiency, and management processes drive strategic goals and compliance.
Efficient data processing is crucial for businesses and organizations that rely on big data analytics to make informed decisions. One key factor that significantly affects the performance of data processing is the storage format of the data.
In today's data-driven world, efficient data processing plays a pivotal role in the success of any project. Apache Spark, a robust open-source data processing framework, has emerged as a game-changer in this domain. Optimizing data input starts with making use of the data format: in most cases, the data being processed is stored in a columnar format.
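As a rough PySpark illustration (paths and column names are placeholders), a columnar source such as Parquet lets Spark read only the columns and row groups a query actually needs:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("columnar-input").getOrCreate()

    # Column pruning and predicate pushdown: only the selected columns and
    # matching row groups are read from the columnar files.
    df = (spark.read.parquet("s3://example-bucket/events/")
              .select("user_id", "event_type")
              .filter("event_type = 'purchase'"))
    df.write.parquet("s3://example-bucket/purchases/")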
At this scale, we can gain significant performance and cost benefits by optimizing the storage layout (records, objects, partitions) as the data lands in our warehouse. On the other hand, these optimizations must themselves be inexpensive enough that their processing cost is justified by the gains they bring.
By ensuring that all processes, from data collection to storage and usage, comply with regulatory requirements, organizations can better manage potential threats. Streamlined audits: end-to-end compliance simplifies the audit process. Retention periods and access controls must be properly configured to protect personally identifiable information (PII).
Using existing storage resources optimally is key to being able to capture the right data over time. Increased storage space availability: compressing transaction data older than three days can free up to 50% more storage space in your Dynatrace Managed Cluster. Data compression is completed on June 12.
Dynatrace OpenPipeline is a new stream processing technology that ingests and contextualizes data from any source. Business process monitoring and optimization. All of these steps are critical components of the process, likely to be implemented using different systems. Reduced storage and query overhead for business use cases.
After selecting a mode, users can interact with APIs without needing to worry about the underlying storage mechanisms and counting methods. Let's examine some of the drawbacks of this approach. Lack of idempotency: there is no idempotency key baked into the storage data model, which prevents users from safely retrying requests.
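A minimal sketch of what such an idempotency key could look like, with invented names and in-memory stores standing in for the service's real data model:

    # Hypothetical idempotent counter API: storing the client-supplied key
    # alongside the update makes retries safe.
    seen_requests: set[str] = set()   # in production, part of the storage data model
    counters: dict[str, int] = {}

    def increment(counter_id: str, delta: int, idempotency_key: str) -> int:
        if idempotency_key in seen_requests:       # retry of an already-applied request
            return counters.get(counter_id, 0)     # no double count
        counters[counter_id] = counters.get(counter_id, 0) + delta
        seen_requests.add(idempotency_key)
        return counters[counter_id]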
Finally, it empowers automated systems to process and analyze OpenTelemetry data, without requiring adaptations for every framework. Acting as the middlemen, Collectors hide all the pesky little details, allowing OpenTelemetry exporters to focus on generating data, and OpenTel backends to focus on storage and analysis.
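A minimal Python sketch of that division of labor, assuming the standard opentelemetry-sdk and OTLP exporter packages and a Collector listening on the default gRPC port:

    from opentelemetry import trace
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor
    from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

    # The app only points an exporter at the Collector; batching, retries,
    # and backend-specific wire formats are the Collector's concern.
    provider = TracerProvider()
    provider.add_span_processor(
        BatchSpanProcessor(OTLPSpanExporter(endpoint="localhost:4317", insecure=True)))
    trace.set_tracer_provider(provider)

    with trace.get_tracer(__name__).start_as_current_span("demo-span"):
        pass  # application work goes here; export is handled downstream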
Carefully planning and integrating new processes and tools is critical to ensuring compliance without disrupting daily operations. Visibility into all business processes, from the back end through to the customer experience, is perhaps the biggest challenge. We're challenging these preconceptions.
In today's data-driven world, organizations need efficient and scalable data pipelines to process and analyze large volumes of data. Medallion Architecture provides a framework for organizing data processing workflows into different zones, enabling optimized batch and stream processing.
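For example, a schematic PySpark pass through the three Medallion zones (paths and columns are illustrative, not a prescribed layout):

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("medallion").getOrCreate()

    # Bronze: raw data exactly as ingested.
    bronze = spark.read.json("s3://lake/bronze/orders/")
    # Silver: cleaned and deduplicated.
    silver = bronze.dropDuplicates(["order_id"]).filter(F.col("amount").isNotNull())
    # Gold: business-level aggregates ready for consumption.
    gold = silver.groupBy("customer_id").agg(F.sum("amount").alias("lifetime_value"))
    gold.write.mode("overwrite").parquet("s3://lake/gold/customer_value/")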
Organizations choose data-driven approaches to maximize the value of their data, achieve better business outcomes, and realize cost savings by improving their products, services, and processes. Data is then dynamically routed into pipelines for further processing.
If you vastly increase the number of PurePaths processed by a Dynatrace Managed cluster, however, your initial sizing considerations for Dynatrace Managed nodes and clusters may end up being inadequate to support such volume. A Dynatrace Managed cluster may lack the necessary hardware to process all the additional incoming data.
A lack of automation and standardization often results in a labour-intensive process across post-production and VFX with a lot of dependencies that introduce potential human errors and security risks. Besides the need for robust cloud storage for their media, artists need access to powerful workstations and real-time playback.
The Grail™ data lakehouse provides fast, auto-indexed, schema-on-read storage with massively parallel processing (MPP) to deliver immediate, contextualized answers from all data at scale. Through Azure Native Dynatrace Service, customers can seamlessly adopt these technologies to modernize and enhance their cloud operations.
This article analyzes the correlation between block sizes and their impact on storage performance. It covers definitions of structured vs. unstructured data, how various storage segments react to block-size changes, and the differences between I/O-driven and throughput-driven workloads.
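As a toy illustration of the block-size effect (a real benchmark would use a tool like fio and control for caching), one can read the same file at different request sizes:

    import time

    def read_throughput(path: str, block_size: int) -> float:
        """Read the whole file in block_size chunks; return bytes per second."""
        start, total = time.perf_counter(), 0
        with open(path, "rb") as f:
            while chunk := f.read(block_size):
                total += len(chunk)
        return total / (time.perf_counter() - start)

    # Small blocks stress IOPS; large blocks stress sequential throughput.
    for bs in (4 * 1024, 64 * 1024, 1024 * 1024):
        print(f"{bs:>8} B blocks: {read_throughput('/tmp/sample.bin', bs):,.0f} B/s")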
RabbitMQ is designed for flexible routing and message reliability, while Kafka handles high-throughput event streaming and real-time data processing. RabbitMQ follows a message broker model with advanced routing, while Kafka's event streaming architecture uses partitioned logs for distributed processing. What is RabbitMQ?
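The contrast shows up even in minimal producer code; this sketch assumes local brokers and the pika and kafka-python client libraries:

    import pika                      # RabbitMQ client
    from kafka import KafkaProducer  # Kafka client

    # RabbitMQ: publish to an exchange; the broker routes by routing key.
    conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = conn.channel()
    channel.exchange_declare(exchange="orders", exchange_type="topic")
    channel.basic_publish(exchange="orders", routing_key="order.created", body=b"{}")

    # Kafka: append to a partitioned topic log; the key selects the partition.
    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    producer.send("orders", key=b"order-42", value=b"{}")
    producer.flush()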
Integration with existing systems and processes: integration with existing IT infrastructure, observability solutions, and workflows often requires significant investment and customization. Actions resulting from the evaluation: the certification process surfaced a few recommendations for improving the app.
Greenplum Database is a massively parallel processing (MPP) SQL database built on PostgreSQL. High performance, query optimization, open source, and polymorphic data storage are the major Greenplum advantages. Greenplum's MPP database design can help you develop a scalable, high-performance deployment.
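A hedged example of what that MPP design looks like in practice (connection details and table names are placeholders): Greenplum's DISTRIBUTED BY clause spreads a table's rows across segments so scans run in parallel.

    import psycopg2

    conn = psycopg2.connect("dbname=analytics host=gp-master")  # placeholder DSN
    with conn, conn.cursor() as cur:
        # Rows are hash-distributed across segment hosts by sale_id,
        # so each segment scans only its own slice, in parallel.
        cur.execute("""
            CREATE TABLE sales (
                sale_id bigint,
                region  text,
                amount  numeric
            ) DISTRIBUTED BY (sale_id)
        """)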
It requires a state-of-the-art system that can track and process these impressions while maintaining a detailed history of each profile's exposure. In this multi-part blog series, we take you behind the scenes of our system that processes billions of impressions daily.
Use cases include identifying misconfigurations: continuously scanning cloud environments to detect misconfigurations (such as open network ports, missing security patches, and exposed storage buckets) helps maintain a secure, stable infrastructure. What is Kubernetes Security Posture Management?
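In the same spirit, here is an illustrative single-rule check with boto3 that flags S3 buckets lacking a full public-access block; a real posture-management tool evaluates many more resource types and policies.

    import boto3
    from botocore.exceptions import ClientError

    s3 = boto3.client("s3")
    for bucket in s3.list_buckets()["Buckets"]:
        name = bucket["Name"]
        try:
            cfg = s3.get_public_access_block(Bucket=name)["PublicAccessBlockConfiguration"]
            exposed = not all(cfg.values())  # any of the four flags disabled
        except ClientError:
            exposed = True  # no public-access block configured at all
        if exposed:
            print(f"possible exposed storage bucket: {name}")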
Say hello to advanced trace analytics and new data storage and capture options. Site reliability engineers, performance architects, and developers can now leverage dynamic analysis tools like dashboards and workflows to explore trends, automate processes, and maintain control at an unprecedented level. But why stop there?
A distributed storage system is foundational in today’s data-driven landscape, ensuring data spread over multiple servers is reliable, accessible, and manageable. Understanding distributed storage is imperative as data volumes and the need for robust storage solutions rise.
Two major processes are executed when a user posts a photo on Instagram. First, a synchronous process uploads the image content to file storage, persists the media metadata in graph data storage, returns a confirmation message to the user, and triggers the process that updates the user's activity.
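Schematically, and with every name invented for illustration, the synchronous path might look like this, with the activity update merely enqueued for asynchronous processing:

    import queue
    import uuid

    activity_queue: queue.Queue = queue.Queue()  # stand-in for a real message broker

    def save_to_file_storage(media_id: str, image: bytes) -> None:
        ...  # hypothetical blob/file storage write

    def save_metadata_to_graph_store(user_id: str, media_id: str) -> None:
        ...  # hypothetical graph data-storage write

    def post_photo(user_id: str, image: bytes) -> str:
        media_id = str(uuid.uuid4())
        save_to_file_storage(media_id, image)              # 1. upload image content
        save_metadata_to_graph_store(user_id, media_id)    # 2. persist media metadata
        activity_queue.put({"user": user_id, "media": media_id})  # 3. trigger async update
        return f"posted {media_id}"                        # 4. confirmation to the user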
This means you no longer have to provision, scale, and maintain servers to run your applications, databases, and storage systems. Speed is next; serverless solutions are quick to spin up or down as needed, and there are no delays due to limited storage or resource access. AWS offers four serverless offerings for storage.
As Netflix expanded globally and the volume of title launches skyrocketed, the operational challenges of maintaining this manual process became undeniable. Metadata and assets must be correctly configured, data must flow seamlessly, microservices must process titles without error, and algorithms must function as intended.
Here's what stands out. Key takeaways: Better performance: faster write operations and improved vacuum processes help handle high-concurrency workloads more smoothly. Improved vacuuming: a redesigned memory structure lowers resource use and speeds up the vacuum process.
MongoDB offers several storage engines that cater to various use cases. The default storage engine in earlier versions was MMAPv1, which utilized memory-mapped files and collection-level locking. The newer, pluggable storage engine, WiredTiger, addresses this with document-level locking, prefix compression, and row-based storage.
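A quick pymongo check (connection string is a placeholder) confirms which engine a deployment is running:

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    # serverStatus reports the active storage engine.
    engine = client.admin.command("serverStatus")["storageEngine"]["name"]
    print(engine)  # "wiredTiger" on modern deployments; MMAPv1 only on legacy ones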
Leverage three masking layers: masking at capture and masking at storage exclude targeted sensitive data points. With masking at storage, data is persistently masked upon ingestion into Dynatrace. The selected rules can be configured for a whole environment or, more granularly, for specific process groups.
JSON is faster to ingest than JSONB; however, if you do any further processing, JSONB will be faster. JSONB storage has some drawbacks vs. traditional columns: PostgreSQL does not store column statistics for JSONB columns, JSONB has a larger storage footprint, and JSONB does not preserve the original formatting (e.g., whitespace) or the ordering of keys.
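A small psycopg2 demonstration of that last point (table name and DSN are invented): jsonb normalizes its input on write, while json keeps the text verbatim.

    import psycopg2

    conn = psycopg2.connect("dbname=test")  # placeholder DSN
    with conn, conn.cursor() as cur:
        cur.execute("CREATE TABLE IF NOT EXISTS docs (raw json, parsed jsonb)")
        payload = '{"b": 1,  "a": 2}'
        cur.execute("INSERT INTO docs VALUES (%s, %s)", (payload, payload))
        cur.execute("SELECT raw::text, parsed::text FROM docs")
        # raw keeps whitespace and key order; parsed comes back normalized.
        print(cur.fetchone())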
Data migration is the process of moving data from one location to another and is an essential aspect of cloud migration, typically involving the transfer of data from on-premises storage to the cloud. With the rapid adoption of cloud computing, businesses are moving their IT infrastructure to the cloud.
With Azure Event Hubs, a big data streaming platform and event ingestion service, millions of events can be received and processed in a single second. Any real-time analytics provider or batching/storage adapter can transform and store data supplied to an event hub.
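On the send side, a minimal producer with the azure-eventhub SDK might look like this (connection string and hub name are placeholders):

    from azure.eventhub import EventData, EventHubProducerClient

    producer = EventHubProducerClient.from_connection_string(
        conn_str="<EVENT_HUBS_CONNECTION_STRING>", eventhub_name="telemetry")
    with producer:
        batch = producer.create_batch()
        batch.add(EventData('{"sensor": "t1", "reading": 21.5}'))
        producer.send_batch(batch)  # consumers or capture adapters take it from here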
With the latest advances from Dynatrace, this process is instantaneous. That's because it does not require any pre-prepared schemas, and access to cold/hot storage is fully automatic, with zero latency. Moreover, it is fast, powered by its massively parallel processing data lakehouse.
Our goal was to build a versatile and efficient data storage solution that could handle a wide variety of use cases, ranging from the simplest hashmaps to more complex data structures, all while ensuring high availability, tunable consistency, and low latency. Developers just provide their data problem rather than a database solution!
This architecture offers rich data management and analytics features (taken from the data warehouse model) on top of low-cost cloud storage systems (which are used by data lakes). This decoupling ensures the openness of data and storage formats, while also preserving data in context. Ingest and process with Grail. Retain data.
If country_iso_code doesn't already exist in the fact table, the metric owner only needs to tell DJ that account_id is the foreign key to a `users_dimension_table` (we call this process dimension linking). DJ can then perform the joins to bring in any requested dimensions from `users_dimension_table`.
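Conceptually, the join DJ would generate after the link is registered might look like the sketch below; the SQL shape and the fact-table column names are assumptions, with only the identifiers from the example above reused.

    # Illustrative SQL a semantic layer could generate once the link is known.
    generated_sql = """
        SELECT f.metric_value,              -- hypothetical measure column
               d.country_iso_code           -- requested dimension via the link
        FROM   fact_table f                 -- hypothetical fact table name
        JOIN   users_dimension_table d
          ON   f.account_id = d.account_id  -- the registered foreign key
    """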
When dealing with IoT, one of the first things that comes to mind is the limited processing, networking, and storage capabilities these devices operate with. A messaging protocol is a set of rules and formats agreed upon among entities that want to communicate with each other.
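Protocols like MQTT were designed around exactly these constraints; a minimal publish with paho-mqtt (broker address and topic are placeholders) shows how small the footprint is:

    import paho.mqtt.client as mqtt

    client = mqtt.Client()
    client.connect("broker.example.com", 1883)  # lightweight TCP connection
    # Tiny, topic-addressed message; QoS 1 gives at-least-once delivery.
    client.publish("sensors/greenhouse/temperature", payload="21.5", qos=1)
    client.disconnect()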
By Xiaomei Liu , Rosanna Lee , Cyril Concolato Introduction Behind the scenes of the beloved Netflix streaming service and content, there are many technology innovations in media processing. Packaging has always been an important step in media processing. Uploading and downloading data always come with a penalty, namely latency.
FlowCollector, a backend service, collects flow logs from FlowExporter instances across the fleet, attributes the IP addresses, and sends these attributed flows to Netflix's Data Mesh for subsequent stream and batch processing. With 30 c7i.2xlarge instances, we can process 5 million flows per second across the entire Netflix fleet.
Data warehouses offer a single storage repository for structured data and provide a source of truth for organizations. However, organizations must structure and store data inputs in a specific format to enable extract, transform, and load (ETL) processes and efficient querying of this data. Massively parallel processing. Query language.
We accomplish this by paving the path to accessing and processing media data (e.g., either a movie or an episode within a show). To streamline this process, we standardized media assets with pre-processing steps that create and store dedicated quality-controlled derivatives (e.g., clip1.mp4, …) with associated snapshotted metadata.