As modern applications evolve to serve growing needs for real-time data processing and retrieval, the demand for scalability grows with them. One such open-source, distributed search and analytics engine is Elasticsearch, which is very efficient at handling large data sets and high-velocity queries.
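A minimal sketch of indexing and querying documents with the official Elasticsearch Python client, assuming a recent (8.x) client and a cluster reachable at localhost:9200; the "products" index and its fields are hypothetical.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Index a document into a hypothetical "products" index
es.index(index="products", id=1,
         document={"name": "running shoes", "price": 89.99})

# Full-text search over the indexed documents
resp = es.search(index="products", query={"match": {"name": "shoes"}})
for hit in resp["hits"]["hits"]:
    print(hit["_source"])
```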
by David Berg, Ravi Kiran Chirravuri, Romain Cledat, Savin Goyal, Ferras Hamad, Ville Tuulos. tl;dr Metaflow is now open-source! Netflix applies data science to hundreds of use cases across the company, including optimizing content delivery and video encoding. How could we improve the quality of life for data scientists?
Open-Sourcing a Monitoring GUI for Metaflow, Netflix’s ML Platform tl;dr Today, we are open-sourcing a long-awaited GUI for Metaflow. The Metaflow GUI allows data scientists to monitor their workflows in real-time, track experiments, and see detailed logs and results for every executed task.
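For context, a minimal Metaflow flow of the kind whose runs, logs, and artifacts the GUI tracks; the flow name and steps (TitleStatsFlow) are a hypothetical example, not taken from the post.

```python
from metaflow import FlowSpec, step

class TitleStatsFlow(FlowSpec):

    @step
    def start(self):
        # Artifacts assigned to self are versioned and visible per task
        self.titles = ["A", "B", "C"]
        self.next(self.count)

    @step
    def count(self):
        self.total = len(self.titles)
        self.next(self.end)

    @step
    def end(self):
        print(f"Processed {self.total} titles")

if __name__ == "__main__":
    TitleStatsFlow()
```

Running `python title_stats_flow.py run` executes the steps, and each task's logs and results then appear in the GUI.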
Today’s organizations are constantly enhancing their systems and services as new opportunities arise, inspiring new forms of collaboration while relying on open ecosystems and open-source software. To realize the benefits of open ecosystems, organizations must plan for ecosystem-level observability.
It can scale to multi-petabyte data workloads without issue, giving you a cluster of powerful servers that work together behind a single SQL interface through which you can view all of the data. This feature-packed database provides powerful and rapid analytics on data that scales up to petabyte volumes.
OpenTelemetry standardizes how organizations instrument, generate, and collect telemetry data for analysis and provides community-based support. Because of its flexibility, this open-source approach to instrumenting and collecting telemetry data is becoming increasingly important in large organizations.
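A minimal tracing sketch with the OpenTelemetry Python SDK; the service and span names are hypothetical, and the console exporter stands in for whatever backend an organization actually ships telemetry to.

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter

# Configure a tracer provider that prints finished spans to stdout
provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("checkout-service")  # hypothetical service name

with tracer.start_as_current_span("process-order"):
    # application work happens here; the span records its duration and status
    pass
```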
The newly introduced step-by-step guidance streamlines the process, while quick data flow validation accelerates the onboarding experience even for power users. After successfully installing OneAgent, the log ingestion wizard provides a host selector drop-down to validate the data flow. The API-based approach is the most flexible.
Metric definitions are often scattered across various databases, documentation sites, and code repositories, making it difficult for analysts and data scientists to find reliable information quickly. DJ stands out as an open-source solution that is actively developed and stress-tested at Netflix.
Structured Query Language (SQL) is a simple declarative programming language utilized by various technology and business professionals to extract and transform data. Visualization & Reporting: Can the tool generate reports or visual representations like ER diagrams, data charts, or query execution plans?
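A small, self-contained illustration of SQL's declarative extract-and-transform style, using Python's built-in sqlite3 module so it runs without a server; the "orders" table and its rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(1, "EU", 120.0), (2, "US", 75.5), (3, "EU", 42.0)])

# Declarative: describe the result you want, not how to compute it
for region, total in conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"):
    print(region, total)
```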
Redis, short for Remote Dictionary Server, is a BSD-licensed, open-source, in-memory key-value data structure store written in C by Salvatore Sanfilippo and first released on May 10, 2009. Rather than tables of rows, Redis stores data in native data structures, which makes it very flexible to use. Data Structures in Redis.
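A short sketch of working with several of Redis's native data structures through the redis-py client; it assumes a local Redis server on the default port, and the key names are illustrative.

```python
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

r.set("page:home:hits", 0)          # string
r.incr("page:home:hits")
r.lpush("recent:logins", "alice")   # list
r.sadd("tags:post:1", "db", "oss")  # set
r.hset("user:42", mapping={"name": "Ada", "plan": "pro"})  # hash
r.zadd("leaderboard", {"alice": 1200, "bob": 950})         # sorted set

print(r.zrange("leaderboard", 0, -1, withscores=True))
```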
Ready to transition from a commercial database to open source, and want to know which databases are most popular in 2019? Wondering whether an on-premise vs. public cloud vs. hybrid cloud infrastructure is best for your database strategy?
The jobs executing such workloads are usually required to operate indefinitely on unbounded streams of continuous data and exhibit heterogeneous modes of failure as they run over long periods. Failures are injected using Chaos Mesh, an open-source chaos engineering platform integrated with Kubernetes deployment.
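A hedged sketch of injecting one such failure by submitting a Chaos Mesh PodChaos object through the Kubernetes Python client; it assumes Chaos Mesh is installed in the cluster, the `app: stream-worker` label is hypothetical, and manifest fields can differ between Chaos Mesh versions.

```python
from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

# Kill one pod matching a (hypothetical) label, chosen at random
pod_kill = {
    "apiVersion": "chaos-mesh.org/v1alpha1",
    "kind": "PodChaos",
    "metadata": {"name": "kill-one-worker", "namespace": "default"},
    "spec": {
        "action": "pod-kill",
        "mode": "one",
        "selector": {"labelSelectors": {"app": "stream-worker"}},
    },
}

api.create_namespaced_custom_object(
    group="chaos-mesh.org", version="v1alpha1",
    namespace="default", plural="podchaos", body=pod_kill,
)
```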
We are a company of open source proponents. We’re also dedicated and active participants in the global open source community. Both open source and proprietary options have advantages. With open source database software, anyone in the general public can access the source code, read it, and modify it.
RabbitMQ is designed for flexible routing and message reliability, while Kafka handles high-throughput event streaming and real-time data processing. Both serve distinct purposes, from managing message queues to ingesting large data volumes. What is RabbitMQ? What is Apache Kafka?
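A side-by-side sketch of publishing a message to RabbitMQ (via pika) and appending an event to Kafka (via kafka-python); both brokers are assumed to run locally on default ports, and the queue/topic names are hypothetical.

```python
import pika
from kafka import KafkaProducer

# RabbitMQ: declare a queue and publish through the default exchange
connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="orders")
channel.basic_publish(exchange="", routing_key="orders", body=b"order-123")
connection.close()

# Kafka: append an event to a topic for stream consumers to replay
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("order-events", b"order-123")
producer.flush()
```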
IT operations analytics is the process of unifying, storing, and contextually analyzing operational data to understand the health of applications, infrastructure, and environments and streamline everyday operations. ITOA collects operational data to identify patterns and anomalies for faster incident management and near-real-time insights.
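A toy sketch of the "identify patterns and anomalies" part of ITOA: a rolling z-score over an operational metric flags points that deviate sharply from the recent baseline. The window size, threshold, and simulated latency series are arbitrary assumptions for illustration.

```python
from collections import deque
from statistics import mean, pstdev

def anomalies(values, window=20, threshold=3.0):
    """Yield (index, value) pairs that deviate strongly from the rolling baseline."""
    recent = deque(maxlen=window)
    for i, v in enumerate(values):
        if len(recent) == window:
            mu, sigma = mean(recent), pstdev(recent)
            if sigma > 0 and abs(v - mu) / sigma > threshold:
                yield i, v
        recent.append(v)

latency_ms = [50, 52, 49, 51, 50] * 5 + [400]  # simulated spike at the end
print(list(anomalies(latency_ms)))
```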
Some time ago, at a restaurant near Boston, three Dynatrace colleagues dined and discussed the growing data challenge for enterprises. At its core, this challenge involves a rapid increase in both the amount and the complexity of data collected within a company, and the need to work with different and independent data types. Thus, Grail was born.
Organizations choose data-driven approaches to maximize the value of their data, achieve better business outcomes, and realize cost savings by improving their products, services, and processes. However, there are many obstacles and limitations along the way to becoming a data-driven organization. Understanding the context.
The study analyzes factual Kubernetes production data from thousands of organizations worldwide that are using the Dynatrace Software Intelligence Platform to keep their Kubernetes clusters secure, healthy, and high performing. Open-source software drives a vibrant Kubernetes ecosystem. Java, Go, and Node.js
The use of open source databases has increased steadily in recent years. Past trepidation — about perceived vulnerabilities and performance issues — has faded as decision makers realize what an “open source database” really is and what it offers. What is an open source database?
In this article, I’m going to demonstrate how you can migrate a comprehensive web application from MySQL to YugabyteDB using the open-source data migration engine YugabyteDB Voyager. This helps improve availability, scalability, and performance.
Track real-time title impressions from the Netflix UI. This data is processed from a real-time impressions stream into a Kafka queue, which our title health system regularly polls; additionally, some collectors instead poll our Kafka queue for impressions data. Store the data in an optimized, highly distributed datastore.
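A minimal sketch of a collector that polls impressions from a Kafka topic, in the spirit of the pipeline described above; the topic name and message format are assumptions, not Netflix's actual schema.

```python
import json
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "title-impressions",                  # hypothetical topic name
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="latest",
)

for message in consumer:
    impression = message.value
    # e.g. {"title_id": "tt123", "row": 2, "ts": 1690000000}
    print(impression["title_id"], impression.get("row"))
```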
While our engineering teams have built, and continue to build, solutions to lighten this cognitive load (better guardrails, improved tooling, …), data and its derived products are critical elements to understanding, optimizing, and abstracting our infrastructure. In the Reliability space, our data teams focus on two main approaches.
In IT and cloud computing, observability is the ability to measure a system’s current state based on the data it generates, such as logs, metrics, and traces. Organizations usually implement observability using a combination of instrumentation methods including open-source instrumentation tools, such as OpenTelemetry.
The unstoppable rise of open source databases. One database in particular is causing a huge dent in Oracle’s market share – open-source PostgreSQL. See how open-source PostgreSQL Community version costs compare to Oracle Standard Edition and Oracle Enterprise Edition. What’s causing this massive shift?
Migrating a proprietary database to open source is a major decision that can significantly affect your organization. Advantages of migrating to open source: for many reasons mentioned earlier, organizations are increasingly shifting towards open source databases for their data management needs.
This opens the door to auto-scalable applications, which effortlessly match the demands of rapidly growing and varying user traffic. Docker Engine is built on top of containerd, the leading open-source container runtime and a project of the Cloud Native Computing Foundation (CNCF). What is Docker? What is Kubernetes?
A summary of sessions at the first Data Engineering Open Forum, held at Netflix on April 18th, 2024. At Netflix, we aspire to entertain the world, and our data engineering teams play a crucial role in this mission by enabling data-driven decision-making at scale.
Many organizations are turning to open source solutions to streamline their operations and reduce costs. Open source migration can be a game-changer, offering flexibility, scalability, and cost-effectiveness. Common pitfalls when migrating to open source: lack of proper planning. Proper is the operative word here.
by Jun He, Akash Dwivedi, Natallia Dzenisenka, Snehal Chennuru, Praneeth Yenugutala, Pawan Dixit. At Netflix, Data and Machine Learning (ML) pipelines are widely used and have become central for the business, representing diverse use cases that go beyond recommendations, predictions and data transformations.
With our latest enhancements, we’re transforming the way you work with trace data. For deeper exploration, our Distributed Tracing app empowers you to analyze raw trace data and uncover insights, whether troubleshooting errors, optimizing performance, or discovering the unknown unknowns. But why stop there?
I think that having a Ph.D. can be an advantage when you’re looking for data science jobs, but unless you really want to 1) do research or 2) be a professor, there’s really no benefit to getting a Ph.D. Strong technologies adapt the world to themselves. Consume Explain the Cloud Like I'm 10 (35 nearly 5-star reviews).
Apache Spark is a powerful open-source distributed computing framework that provides a variety of APIs to support big data processing. In addition, pySpark applications can be tuned to optimize performance and achieve better execution time, scalability, and resource utilization.
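A brief PySpark sketch showing a few common tuning levers of the kind referred to above: the shuffle partition count, caching a reused DataFrame, and repartitioning by key before wide aggregations. The specific values and column names are illustrative assumptions, not recommendations.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("tuning-sketch")
    .config("spark.sql.shuffle.partitions", "64")   # fewer partitions for a small job
    .getOrCreate()
)

# Synthetic data: 1M rows bucketed into 100 keys
df = spark.range(0, 1_000_000).withColumn("bucket", F.col("id") % 100)
df = df.repartition("bucket").cache()               # co-locate keys, reuse across actions

counts = df.groupBy("bucket").count()
totals = df.groupBy("bucket").agg(F.sum("id").alias("total"))
counts.show(5)
totals.show(5)
```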
Uber uses Presto, an open-source distributed SQL query engine, to provide analytics across several datasources, including Apache Hive, Apache Pinot, MySQL, and Apache Kafka. To improve its performance, Uber engineers explored the advantages of dealing with quick queries, a.k.a.
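A hedged sketch of running a query through Presto's Python DB-API client (the presto-python-client package, `prestodb`); the coordinator host, user, catalog, and table names are assumptions for illustration.

```python
import prestodb

conn = prestodb.dbapi.connect(
    host="presto-coordinator.example.com",  # hypothetical coordinator
    port=8080,
    user="analyst",
    catalog="hive",
    schema="default",
)
cur = conn.cursor()
cur.execute("SELECT city, count(*) FROM trips GROUP BY city LIMIT 10")
for row in cur.fetchall():
    print(row)
```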
Elasticsearch is an open-source search engine and analytics store used by a variety of applications from search in e-commerce stores, to internal log management tools using the ELK stack (short for “Elasticsearch, Logstash, Kibana”).
Open source has also become a fundamental building block of the entire cloud-native stack. While leveraging cloud-native platforms, open-source, and third-party libraries accelerates time to value significantly, it also creates new challenges for application security.
According to the 2019 Verizon Data Breach Investigations Report, web application attacks were the number one type of attack. Data from Forrester Research provides more detail, finding that 39% of all attacks were designed to exploit vulnerabilities in web applications. And open-source software is rife with vulnerabilities.
by Andreas Andreakis, Ioannis Papapanagiotou. Overview: Change-Data-Capture (CDC) allows capturing committed changes from a database in real-time and propagating those changes to downstream consumers [1][2]. Because CDC only sees changes made after it starts, dumps are needed to capture the full state of a source. Designed with High Availability in mind.
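A conceptual sketch of the dump-plus-stream pattern described above: seed a downstream copy from a full dump, then keep it in sync by applying captured change events. The event format here is a deliberate simplification, not the actual wire format of any CDC tool.

```python
def bootstrap(dump_rows):
    """Build the initial downstream state from a full dump of the source."""
    return {row["id"]: row for row in dump_rows}

def apply_change(state, event):
    """Apply one committed change captured from the source database."""
    if event["op"] in ("insert", "update"):
        state[event["row"]["id"]] = event["row"]
    elif event["op"] == "delete":
        state.pop(event["row"]["id"], None)
    return state

state = bootstrap([{"id": 1, "name": "a"}, {"id": 2, "name": "b"}])
for event in [{"op": "update", "row": {"id": 2, "name": "b2"}},
              {"op": "delete", "row": {"id": 1}}]:
    state = apply_change(state, event)
print(state)  # {2: {'id': 2, 'name': 'b2'}}
```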
At the same time, you can now simply flip a switch to begin ingesting observability data from these tools into the Dynatrace platform, all while retaining your investment and increasing the value you get from Dynatrace. Open-source metric sources automatically map to our Smartscape model for AI analytics.
While not the first open source content management system (CMS), WordPress caught on like nothing before and helped spread open source to millions. It was simple, easy to deploy, and easy to use, and WordPress had the added benefit of being open source. MySQL was founded in 1995 and went open source in 2000.
2 billion: Pokémon GO revenue since launch; 10: say happy birthday to StackOverflow; $148 million: Uber data breach fine; 75%: streaming music industry revenue in the US. Then please recommend my well-reviewed book: Explain the Cloud Like I'm 10. They'll love it and you'll be their hero forever.
After a decade of helping companies manage container orchestration, Kubernetes, the open-source container platform, has established itself as a mature enterprise technology. The average deployment now spans 20 clusters running 10 or more software elements across clouds and data centers. Immediate entry. Automated notifications.
The complexity of such deployments has accelerated with the adoption of emerging, open-source technologies that generate telemetry data, which is exploding in terms of volume, speed, and cardinality. How can we optimize for performance and scalability? Contextualization is one of the biggest challenges of observability.