Observability is essential for shaping how we design smarter, more resilient systems for the future. As an open-source project, OpenTelemetry sets standards for telemetry data and works with a wide range of systems and platforms to collect and export that data to backend systems.
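To make that concrete, here is a minimal sketch of instrumenting code with the OpenTelemetry Python SDK; the tracer and span names are illustrative, and the console exporter stands in for whatever backend exporter you actually use.

```python
# Minimal tracing setup with the OpenTelemetry Python SDK (opentelemetry-sdk).
# The exporter here prints spans to the console; swap in an OTLP exporter to
# ship data to a backend of your choice.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("example.instrumentation")  # illustrative tracer name

with tracer.start_as_current_span("handle-request"):
    # Application work goes here; the span records timing and attributes.
    pass
```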
by David Berg, Ravi Kiran Chirravuri, Romain Cledat, Savin Goyal, Ferras Hamad, Ville Tuulos. tl;dr: Metaflow is now open source! By design, Metaflow is a deceptively simple Python library: data scientists can structure their workflow as a directed acyclic graph of steps, and Metaflow integrates with the cloud both for compute and storage.
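As a rough illustration of the DAG-of-steps idea, here is a minimal Metaflow flow; the flow name, step names, and the artifact passed between steps are made up for the example.

```python
# A minimal Metaflow flow: each @step is a node in the DAG, and self.next()
# defines the edges. Run locally with: python hello_flow.py run
from metaflow import FlowSpec, step

class HelloFlow(FlowSpec):

    @step
    def start(self):
        self.message = "hello, metaflow"  # artifacts persist between steps
        self.next(self.shout)

    @step
    def shout(self):
        self.message = self.message.upper()
        self.next(self.end)

    @step
    def end(self):
        print(self.message)

if __name__ == "__main__":
    HelloFlow()
```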
Greenplum Database is an open-source, hardware-agnostic MPP database for analytics, based on PostgreSQL and developed by Pivotal, which was later acquired by VMware. Greenplum's MPP database design can help you develop a scalable, high-performance deployment. Greenplum Architectural Design.
RabbitMQ is designed for flexible routing and message reliability, while Kafka handles high-throughput event streaming and real-time data processing. Message brokers handle validation, routing, storage, and delivery, ensuring efficient and reliable communication. What is RabbitMQ? What is Apache Kafka?
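As a small illustration of broker-based messaging, the sketch below publishes a persistent message to a durable RabbitMQ queue with the pika client; the host, queue name, and payload are assumptions, not a prescribed setup.

```python
# Publish a persistent message to a durable RabbitMQ queue using pika.
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()

channel.queue_declare(queue="tasks", durable=True)  # queue survives broker restarts
channel.basic_publish(
    exchange="",               # default exchange routes by queue name
    routing_key="tasks",
    body=b"process order 42",
    properties=pika.BasicProperties(delivery_mode=2),  # mark message persistent
)
connection.close()
```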
Data storage and distribution through HollowFeeds: Netflix Hollow is an open-source Java library and toolset for disseminating in-memory datasets from a single producer to many consumers for high-performance read-only access.
The use of open-source databases has increased steadily in recent years. Past trepidation about perceived vulnerabilities and performance issues has faded as decision makers realize what an "open-source database" really is and what it offers. What is an open-source database?
Migrating a proprietary database to open source is a major decision that can significantly affect your organization. Advantages of migrating to open source: For many of the reasons mentioned earlier, organizations are increasingly shifting toward open-source databases for their data management needs.
Kubernetes has become the leading container orchestration platform for organizations adopting open-source solutions to manage, scale, and automate application deployment. Kubernetes is an open-source container orchestration platform for managing, automating, and scaling containerized applications. What is Kubernetes?
Here we present a list of 10 open-source Kubernetes tools to make your SRE and Ops teams more effective at achieving their service-level objectives. Interactive mode is designed to let you discover your cluster's components and manually break things to see what happens. Kube-ops-view. Telepresence.
Why haven't cash-strapped American schools embraced open source? Simpler UI Testing with CasperJS (Architects Zone – Architectural Design Patterns & Best Practices). Using MongoDB as a cache store (Architects Zone – Architectural Design Patterns & Best Practices). (Hacker News).
Security analytics solutions are designed to handle modern applications that rely on dynamic code and microservices. This includes everything from multicloud deployments to microservices to Kubernetes instances and the use of open-source software. Infrastructure type: In most cases, legacy SIEM tools are on-premises.
PostgreSQL graphical user interface (GUI) tools help these open-source database users manage, manipulate, and visualize their data. Let's start with the first and most popular one: pgAdmin (cost: free, open source). It supports all PostgreSQL operations and features.
A data lakehouse addresses these limitations and introduces an entirely new architectural design. This architecture offers rich data management and analytics features (taken from the data warehouse model) on top of low-cost cloud storage systems (which are used by data lakes). Grail is built for such analytics, not storage.
Kubernetes is an open-source container orchestration platform that enables organizations to automatically scale, manage, and deploy containerized applications in distributed environments. Like Kubernetes, OpenShift is an open-source Kubernetes-based container platform. What is Kubernetes? What is OpenShift? Ease of use.
Now let’s look at how we designed the tracing infrastructure that powers Edgar. Troubleshooting a session in Edgar When we started building Edgar four years ago, there were very few open-source distributed tracing systems that satisfied our needs. Storage: don’t break the bank!
Traditionally, though, to gain true business insight, organizations had to make tradeoffs between accessing quality, real-time data and factors such as data storage costs. It should be open by design to accelerate innovation, enable powerful integration with other tools, and purposefully unify data and analytics.
Fluent Bit is a telemetry agent designed to receive data (logs, traces, and metrics), process or modify it, and export it to a destination. Fluent Bit was designed to help you adjust your data and add the proper context, which can be helpful in the observability backend. What’s the difference between Fluent Bit and Fluentd?
This entertaining romp through the tech stack serves as an introduction to how we think about and design systems, the Netflix approach to operational challenges, and how other organizations can apply our thought processes and technologies. Technology advancements in content creation and consumption have also increased its data footprint.
JSONB storage has some drawbacks vs. traditional columns: PostgreSQL does not store column statistics for JSONB columns. JSONB storage results in a larger storage footprint. JSONB storage does not deduplicate the key names in the JSON. If that doesn’t work, the data is moved to out-of-line storage.
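For context, here is a small sketch (using psycopg2, a hypothetical events table, and a local connection string) of the JSONB access patterns these drawbacks apply to.

```python
# Sketch: storing and querying JSONB with psycopg2. Table name and connection
# settings are hypothetical.
import json
import psycopg2

conn = psycopg2.connect("dbname=demo")
cur = conn.cursor()

cur.execute("CREATE TABLE IF NOT EXISTS events (id serial PRIMARY KEY, payload jsonb)")
cur.execute(
    "INSERT INTO events (payload) VALUES (%s)",
    (json.dumps({"user": "alice", "action": "login"}),),
)
# ->> extracts a key as text; @> tests containment and can use a GIN index.
cur.execute(
    "SELECT payload->>'user' FROM events WHERE payload @> %s::jsonb",
    (json.dumps({"action": "login"}),),
)
print(cur.fetchall())
conn.commit()
```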
During earlier years of my career, I primarily worked as a backend software engineer, designing and building the backend systems that enable big data analytics. I developed many batch and real-time data pipelines using open-source technologies for AOL Advertising and eBay.
Introducing Dynatrace OpenPipeline: OpenPipeline is a stream-processing technology that transforms how the Dynatrace platform ingests data from any source, at any scale, and in any format. With OpenPipeline, you can easily collect data from Dynatrace OneAgent®, open-source collectors such as OpenTelemetry, or other third-party tools.
Many AWS services and third-party solutions use AWS S3 for log storage. To date, some customers have used open-source or community-backed components to forward logs from S3 to Dynatrace. Logs complement out-of-the-box metrics and enable automated actions for responding to availability, security, and other service events.
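For a rough sense of what the read side of such a forwarder might look like, here is a boto3 sketch; the bucket, prefix, and forward_logs() helper are hypothetical, not any particular component's actual implementation.

```python
# Read log objects from an S3 bucket so they can be forwarded to a log backend.
# Bucket name, prefix, and forward_logs() are placeholders.
import gzip
import boto3

s3 = boto3.client("s3")

def forward_logs(data: bytes) -> None:
    # Placeholder: send the raw log payload to your ingest endpoint here.
    print(f"forwarding {len(data)} bytes")

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="my-log-bucket", Prefix="alb-logs/"):
    for obj in page.get("Contents", []):
        body = s3.get_object(Bucket="my-log-bucket", Key=obj["Key"])["Body"].read()
        if obj["Key"].endswith(".gz"):
            body = gzip.decompress(body)  # many AWS services gzip their logs
        forward_logs(body)
```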
They've posted about Anna's new superpowers in Going Fast and Cheap: How We Made Anna Autoscale: Using Anna v0 as an in-memory storage engine, we set out to address the cloud storage problems described above. Each storage server collects statistics about the requests it serves, the data it stores, etc.
From May 17 to May 18, 2021, the Open-Source Engineering team at Dynatrace attended the virtual observability conference, o11yfest. Trace-based sampling can help you save on storage costs in the long run. OTel is designed to ensure flexibility and stability.
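Trace sampling can happen in the SDK (head-based, by trace ID) or in the Collector (tail-based). Below is a minimal head-based sketch with the OpenTelemetry Python SDK; the 10% ratio is an arbitrary example, not a recommendation.

```python
# Keep roughly 10% of traces (and respect the parent's sampling decision)
# to reduce the volume of spans sent to storage.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import ParentBased, TraceIdRatioBased

sampler = ParentBased(root=TraceIdRatioBased(0.1))  # example ratio
trace.set_tracer_provider(TracerProvider(sampler=sampler))
```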
Nevertheless, there are related components and processes, for example, virtualization infrastructure and storage systems, that can lead to problems in your Kubernetes infrastructure. Configuring storage in Kubernetes is more complex than using a file system on your host. What does observability mean for Kubernetes?
Bitrate versus quality comparison: HDR-VMAF is designed to be format-agnostic; it measures the perceptual quality of an HDR video signal regardless of its container format, for example, Dolby Vision or HDR10. Yes, we are committed to supporting the open-source community.
Consider selecting platform-based solutions, whether open source or from a commercial vendor, that support open ecosystems. Virtualization has revolutionized system administration by making it possible for software to manage systems, storage, and networks. Design, implement, and tune effective SLOs.
EC2 is Amazon's Infrastructure-as-a-Service (IaaS) compute platform designed to handle any workload at scale. With EC2, Amazon manages the basic compute, storage, and networking infrastructure and the virtualization layer, and leaves the rest for you to manage: OS, middleware, runtime environment, data, and applications. Amazon EC2.
Introduction: Percona Operator for MongoDB is an open-source solution designed to streamline and automate database operations within Kubernetes. When running database workloads on Amazon EKS (Elastic Kubernetes Service), backup and restore processes often require access to AWS services like S3 for storage.
This post will look at using The Oversized-Attribute Storage Technique (TOAST) to improve performance and scalability. Therefore, TOAST is a storage technique used in PostgreSQL to handle large data objects such as images, videos, and audio files.
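As a quick way to see TOAST at work, the sketch below compares a table's main heap size with its TOAST table size via psycopg2; the table name and connection string are hypothetical.

```python
# Compare a table's main heap size with its TOAST table size in PostgreSQL.
# "documents" and the connection string are placeholders.
import psycopg2

conn = psycopg2.connect("dbname=demo")
cur = conn.cursor()
cur.execute(
    """
    SELECT relname,
           pg_size_pretty(pg_relation_size(oid))           AS heap_size,
           pg_size_pretty(pg_relation_size(reltoastrelid)) AS toast_size
    FROM pg_class
    WHERE relname = %s AND reltoastrelid <> 0
    """,
    ("documents",),
)
print(cur.fetchall())
```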
Instead, it seems PostgreSQL developers are of the opinion that encryption is a storage-level problem, better solved at the filesystem or block-device level. For example, physical backups stored in open S3 buckets are relatively common exposure vectors. Encryption adds defense in depth.
They would love to migrate to an open-source database like MySQL or PostgreSQL if only such a database could meet the enterprise-grade reliability and performance these high-scale applications required. In this paper, we describe the architecture of Aurora and the design considerations leading to that architecture.
Unbundling the Data Warehouse: The Case for Independent Storage. Speaker: Jason Reid (Co-founder & Head of Product at Tabular). Summary: Unbundling a data warehouse means splitting it into constituent and modular components that interact via open standard interfaces.
Compression in any database is valuable because it brings many advantages, like reduced storage footprint and data transmission time. Storage reduction alone results in significant cost savings, and we can save more data in the same space. By default, MongoDB provides the snappy block compression method for storage and network communication.
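As a sketch of how this might look with pymongo, the example below enables wire-protocol compression on the client and creates a collection whose WiredTiger storage uses zstd block compression; database, collection, and connection details are illustrative, and zstd/snappy wire compression require their optional Python packages.

```python
# Enable wire-protocol compression and create a collection with zstd block
# compression in WiredTiger. Names and connection string are illustrative.
from pymongo import MongoClient

# zstd/snappy need the optional zstandard / python-snappy packages; zlib is built in.
client = MongoClient("mongodb://localhost:27017", compressors="zstd,snappy,zlib")
db = client["demo"]

db.create_collection(
    "events",
    storageEngine={"wiredTiger": {"configString": "block_compressor=zstd"}},
)
```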
are stored in secure storage layers. Amsterdam is built on top of three storage layers. The first layer, Cassandra, is the source of truth for us. One of the first decisions when integrating with Elasticsearch is designing the indices, their settings, and mappings.
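As a sketch of that first decision, here is an index created with explicit settings and mappings using the elasticsearch Python client (8.x-style API); the index name, shard counts, and fields are illustrative, not Amsterdam's actual schema.

```python
# Create an index with explicit settings and mappings. Field names, shard and
# replica counts, and the cluster URL are placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

es.indices.create(
    index="assets",
    settings={"number_of_shards": 3, "number_of_replicas": 1},
    mappings={
        "properties": {
            "title": {"type": "text"},
            "created_at": {"type": "date"},
            "tags": {"type": "keyword"},
        }
    },
)
```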
Percona is a leading provider of unbiased open-source database solutions that allow organizations to easily, securely, and affordably maintain business agility, minimize risks, and stay competitive. As such, you save on storage space and data transfer in the case of cloud setups. It’s time for the release roundup!
Typically, the servers are configured in a primary/replica configuration, with one server designated as the primary server that handles all incoming requests and the others designated as replica servers that monitor the primary and take over its workload if it fails. This flexibility can be crucial in designing a scalable architecture.
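As a toy sketch of the client-side view of such a setup (real deployments typically rely on managed failover tooling rather than ad hoc probes), the example below prefers the primary and falls back to the first healthy replica; the endpoints and the is_healthy() probe are hypothetical.

```python
# Toy failover sketch: prefer the primary, fall back to the first healthy replica.
import socket

PRIMARY = ("db-primary.internal", 5432)
REPLICAS = [("db-replica-1.internal", 5432), ("db-replica-2.internal", 5432)]

def is_healthy(host: str, port: int, timeout: float = 1.0) -> bool:
    # Crude TCP reachability probe; real setups use protocol-level health checks.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def pick_endpoint():
    for host, port in [PRIMARY, *REPLICAS]:
        if is_healthy(host, port):
            return host, port
    raise RuntimeError("no healthy database endpoint available")
```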
The bad: On the other hand, Kubernetes is not designed to handle large databases or mainframes. Joe Brockmeier gave a talk on this topic earlier this year at Open Source Summit North America in Vancouver, "Are Containers Ready for Production Databases?" This talk covers the topic exactly.
It has become a de facto standard for perceptual quality measurements within Netflix and, thanks to its open-source nature, throughout the video industry. For example, when we design a new version of VMAF, we need to effectively roll it out throughout the entire Netflix catalog of movies and TV shows.
This article will explore how they handle data storage and scalability, perform in different scenarios, and, most importantly, how these factors influence your choice. Redis Revealed: An Overview Redis, a renowned open-source, in-memory remote dictionary server, stands out for its diverse data structures and advanced features.
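Before that comparison, here is a quick tour of a few Redis data structures with redis-py; key names are illustrative and a local Redis server is assumed.

```python
# A quick tour of a few Redis data structures using redis-py.
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

r.set("session:42", "alice", ex=3600)                        # string with a TTL
r.lpush("queue:jobs", "job-1", "job-2")                      # list as a simple queue
r.hset("user:42", mapping={"name": "alice", "plan": "pro"})  # hash
r.zadd("leaderboard", {"alice": 120, "bob": 95})             # sorted set

print(r.get("session:42"), r.hgetall("user:42"), r.zrange("leaderboard", 0, -1))
```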
File systems unfit as distributed storage backends: lessons from 10 years of Ceph evolution, Aghayev et al., SOSP’19. In this case, the assumption in question is that a distributed storage backend should clearly be layered on top of a local file system. This is not surprising in hindsight. What is a distributed storage backend?