Migrating from Amazon RDS to DynamoDB can be a significant challenge, especially when transitioning from a relational database like RDS (PostgreSQL, MySQL, etc.) to DynamoDB, a NoSQL key-value store. One of the most effective strategies for migrating data incrementally is the Dual Write approach.
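A minimal sketch of the dual-write pattern, assuming a MySQL source and a DynamoDB target; the orders table, column names, and save_order() helper are hypothetical, and mysql-connector-python plus boto3 are just one plausible client pairing:

```python
import mysql.connector  # pip install mysql-connector-python
import boto3            # pip install boto3

# Hypothetical connections; host, credentials, and table names are illustrative.
mysql_conn = mysql.connector.connect(
    host="localhost", database="shop", user="app", password="secret"
)
orders_table = boto3.resource("dynamodb").Table("orders")

def save_order(order_id: str, customer_id: str, total: str) -> None:
    # 1) Write to the current system of record (RDS/MySQL) first.
    cur = mysql_conn.cursor()
    cur.execute(
        "INSERT INTO orders (order_id, customer_id, total) VALUES (%s, %s, %s)",
        (order_id, customer_id, total),
    )
    mysql_conn.commit()
    # 2) Mirror the same write into DynamoDB. Failures here are typically
    #    logged and reconciled by a backfill job, not rolled back.
    orders_table.put_item(
        Item={"order_id": order_id, "customer_id": customer_id, "total": total}
    )
```

Once reads are gradually shifted to DynamoDB and a backfill has copied historical rows, the MySQL write path can be retired.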
Greenplum Database is a massively parallel processing (MPP) SQL database built on PostgreSQL. It can scale to multi-petabyte data workloads, presenting a cluster of servers behind a single SQL interface through which you can query all of the data.
In this article, we’ll dive deep into the concept of database sharding, a critical technique for scaling databases to handle large volumes of data and high levels of traffic. We’ll start by defining what sharding is and why it’s essential for modern, high-performance databases.
Horizontally scalable data stores like Elasticsearch, Cassandra, and CockroachDB distribute their data across multiple nodes using techniques like consistent hashing. As nodes are added or removed, the data is reshuffled to ensure that the load is spread evenly across the new set of nodes.
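As an illustration of the technique, here is a minimal consistent hash ring in Python; the node names and replica count are illustrative, and real systems add data movement and failure handling on top:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Maps keys to nodes; adding/removing a node only remaps ~1/N of keys."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas   # virtual nodes per physical node
        self._keys = []            # sorted hash positions on the ring
        self._ring = {}            # hash position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add_node(self, node: str) -> None:
        for i in range(self.replicas):
            pos = self._hash(f"{node}:{i}")
            bisect.insort(self._keys, pos)
            self._ring[pos] = node

    def remove_node(self, node: str) -> None:
        for i in range(self.replicas):
            pos = self._hash(f"{node}:{i}")
            self._keys.remove(pos)
            del self._ring[pos]

    def get_node(self, key: str) -> str:
        # Walk clockwise to the first virtual node at or after the key's hash.
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._ring[self._keys[idx]]

ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
print(ring.get_node("user:42"))  # same key always routes to the same node
```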
As applications grow in complexity and user base, the demands on their underlying databases increase significantly. Efficient database scaling becomes crucial to maintain performance, ensure reliability, and manage large volumes of data. This cheatsheet provides an overview of essential techniques for database scaling.
Scalable Annotation Service — Marken, by Varun Sekhri and Meenakshi Jindal. At Netflix, we have hundreds of microservices, each with its own data models or entities. The more interesting cases are those where users want to store temporal (time-based) or spatial data, for example an annotation that the Movie Entity with id 1234 has violence.
Database sharding is a powerful technique employed to manage large databases more effectively. It involves partitioning a large database into smaller, more manageable parts, known as shards.
Having a distributed and scalable graph database system is highly sought after in many enterprise scenarios. But do not be misled: designing and implementing a scalable graph database system has never been a trivial task.
Data processing in the cloud has become increasingly popular due to its scalability, flexibility, and cost-effectiveness. This article explores how cloud technologies can be used together to create an optimized data pipeline for data processing in the cloud.
Redis, short for Remote Dictionary Server, is a BSD-licensed, open-source, in-memory key-value data structure store written in C by Salvatore Sanfilippo and first released on May 10, 2009. Depending on how it is configured, Redis can act as a database, a cache, or a message broker.
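A quick sketch of those three roles using the redis-py client; it assumes a local Redis server, and the key names are illustrative:

```python
import redis  # pip install redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# As a cache: a value that expires after 60 seconds.
r.set("session:abc123", "user-42", ex=60)

# As a database: persistent data structures such as lists and hashes.
r.lpush("recent:orders", "order-1001")
r.hset("user:42", mapping={"name": "Ada", "plan": "pro"})

# As a message broker: fire-and-forget pub/sub.
r.publish("events", "order-created:1001")
```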
Central to this infrastructure is our use of multiple online distributed databases such as Apache Cassandra, a NoSQL database known for its high availability and scalability. Over time, as new key-value databases were introduced and service owners launched new use cases, we encountered numerous challenges with datastore misuse.
Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability and Efficiency, by Di Lin, Girish Lingappa, and Jitender Aswani. Imagine yourself in the role of a data-inspired decision maker, staring at a metric on a dashboard, about to make a critical business decision, but pausing to ask a question: “Can…”
Structured Query Language (SQL) is a simple declarative programming language used by technology and business professionals to extract and transform data. GUIs, the article concludes, are a vital addition that eases the lives of database users and developers.
In a world driven by macroeconomic uncertainty, businesses increasingly turn to data-driven decision-making to stay agile. They’re unleashing the power of cloud-based analytics on large data sets to unlock the insights they and the business need to make smarter decisions. All of these factors challenge DevOps maturity.
Modern organizations ingest petabytes of data daily, but legacy approaches to log analysis and management cannot accommodate this volume of data. One financial services group discussed how the bank uses log monitoring on the Dynatrace platform, with an emphasis on observability and security data.
Incremental backups: speed up recovery and make data management more efficient for active databases. Improved JSON handling and security: improved logical replication and the new MAINTAIN privilege give database administrators more control and flexibility.
By Tianlong Chen and Ioannis Papapanagiotou. Netflix has more than 195 million subscribers that generate petabytes of data every day. Data scientists and engineers collect this data from our subscribers and videos and implement data analytics models to discover customer behaviour, with the goal of maximizing user joy.
IT operations analytics is the process of unifying, storing, and contextually analyzing operational data to understand the health of applications, infrastructure, and environments and streamline everyday operations. ITOA collects operational data to identify patterns and anomalies for faster incident management and near-real-time insights.
By Rajiv Shringi, Vinay Chella, Kaidan Fullerton, Oleksii Tkachuk, and Joey Lynch. As Netflix continues to expand and diversify into various sectors like Video on Demand and Gaming, the ability to ingest and store vast amounts of temporal data, often reaching petabytes, with millisecond access latency has become increasingly vital.
Metric definitions are often scattered across various databases, documentation sites, and code repositories, making it difficult for analysts and data scientists to find reliable information quickly. Besides providing the end user with an instant answer in a preferred data visualization, LORE instantly learns from the user's feedback.
While our engineering teams have built, and continue to build, solutions to lighten this cognitive load (better guardrails, improved tooling, …), data and its derived products are critical elements for understanding, optimizing, and abstracting our infrastructure. In the Reliability space, our data teams focus on two main approaches.
By Andreas Andreakis and Ioannis Papapanagiotou. Change-Data-Capture (CDC) allows capturing committed changes from a database in real time and propagating those changes to downstream consumers [1][2]. In databases like MySQL and PostgreSQL, transaction logs are the source of CDC events.
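A conceptual sketch of a CDC pipeline; read_transaction_log() is a hypothetical stand-in for a real log reader (for MySQL binlogs one option is the python-mysql-replication package; for PostgreSQL, a logical-decoding slot), and the event shape is illustrative:

```python
from dataclasses import dataclass
from typing import Iterator

@dataclass
class ChangeEvent:
    table: str
    op: str    # "insert" | "update" | "delete"
    row: dict

def read_transaction_log() -> Iterator[ChangeEvent]:
    # Hypothetical stand-in: a real reader would tail the database's
    # transaction log. Two canned events keep this sketch runnable.
    yield ChangeEvent("users", "insert", {"id": 1, "name": "Ada"})
    yield ChangeEvent("users", "update", {"id": 1, "name": "Ada L."})

def propagate(event: ChangeEvent) -> None:
    # Downstream consumers: a search index, cache, or data warehouse.
    print(f"{event.op} on {event.table}: {event.row}")

for event in read_transaction_log():
    propagate(event)  # committed changes flow out in commit order
```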
Prior to v1.14, the binding approach lacked durability and scalability and, more importantly, could not be combined with other Dapr APIs. The Scheduler service enables this and is designed to deliver the performance and scalability improvements for Actor reminders and the Workflow API.
The framework offers security, scalability, and simplicity of use, and existing extensions already address SNMP, WMI, SQL databases, and Prometheus technologies, serving the monitoring needs of hundreds of Dynatrace customers. Raw Python code, by contrast, carries limited scalability and the burden of governing its security in production environments and lifecycle management.
This means you no longer have to provision, scale, and maintain servers to run your applications, databases, and storage systems. Finally, there's scalability. Amazon EventBridge bridges the data gap between your applications and other services, such as Lambda or specific SaaS apps.
Nowadays, many people migrate their applications from traditional, single-server relational databases to distributed database clusters. In this article, I'm going to demonstrate how you can migrate a comprehensive web application from MySQL to YugabyteDB using the open-source data migration engine YugabyteDB Voyager.
When the service receives an action from a client, it performs two parallel operations: i) persisting the action in the data store, and ii) publishing the action to a streaming data store for a pub-sub model. After that, the various services (e.g., User Feed Service, Media Counter Service) read the actions from the streaming data store and perform their specific tasks.
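A sketch of that persist-then-publish pattern; the SQLite store stands in for the primary data store, the Kafka topic name is illustrative, and kafka-python is one possible client:

```python
import json
import sqlite3
from kafka import KafkaProducer  # pip install kafka-python

db = sqlite3.connect("actions.db")
db.execute("CREATE TABLE IF NOT EXISTS actions (user_id TEXT, action TEXT)")
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode(),
)

def handle_action(user_id: str, action: str) -> None:
    # i) persist the action in the data store
    db.execute("INSERT INTO actions VALUES (?, ?)", (user_id, action))
    db.commit()
    # ii) publish the action to the streaming store; pub-sub consumers
    #     (e.g. User Feed Service, Media Counter Service) pick it up.
    producer.send("user-actions", {"user_id": user_id, "action": action})

handle_action("user-42", "liked:media-7")
```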
By Anupom Syam Background At Netflix, our current data warehouse contains hundreds of Petabytes of data stored in AWS S3 , and each day we ingest and create additional Petabytes. We built AutoOptimize to efficiently and transparently optimize the data and metadata storage layout while maximizing their cost and performance benefits.
$2 billion: Pokémon GO revenue since launch; 10: say happy birthday to StackOverflow; $148 million: Uber data breach fine; 75%: streaming music industry revenue in the US. MrTonyD: "I was writing production code over 30 years ago (C, OS, database)… I acknowledge that."
@KevinBankston: Ford's CEO just said on NPR that the future of profitability for the company is all the data from its 100 million vehicles (and the people in them) they'll be able to monetize. Capitalism & surveillance capitalism are becoming increasingly indistinguishable (and frightening).
Ready to transition from a commercial database to open source, and want to know which databases are most popular in 2019? Wondering whether an on-premise vs. public cloud vs. hybrid cloud infrastructure is best for your database strategy?
By Jun He, Akash Dwivedi, Natallia Dzenisenka, Snehal Chennuru, Praneeth Yenugutala, and Pawan Dixit. At Netflix, Data and Machine Learning (ML) pipelines are widely used and have become central to the business, representing diverse use cases that go beyond recommendations, predictions, and data transformations.
OpenTelemetry , the open source observability tool, has become the go-to standard for instrumenting custom applications to collect observability telemetry data. For this third and final part of our series, we saved the best for last: How you can enhance telemetry data even more and with less effort on your end with Dynatrace OneAgent.
Thanks to its expressive query language (PromQL), scalability, and configurable data format, Prometheus remains one of the most popular tools for data collection. Paired with Prometheus exporters, the tool can adapt to a variety of surroundings, which is one of its strongest points.
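For a feel of the exporter side, here is a minimal custom exporter built with the official prometheus_client package; the metric names and port are illustrative:

```python
import random
import time
from prometheus_client import Counter, Gauge, start_http_server  # pip install prometheus-client

REQUESTS = Counter("app_requests_total", "Total requests handled")
QUEUE_DEPTH = Gauge("app_queue_depth", "Current work-queue depth")

if __name__ == "__main__":
    start_http_server(8000)  # metrics served at http://localhost:8000/metrics
    while True:
        REQUESTS.inc()                          # simulate a handled request
        QUEUE_DEPTH.set(random.randint(0, 50))  # simulate queue depth
        time.sleep(1)
```

Once Prometheus scrapes this target, a PromQL query such as rate(app_requests_total[5m]) charts the request rate.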
What is sharding? Sharding, a database architecture pattern, involves partitioning a database into smaller, faster, more manageable parts called shards. Each shard is a distinct database, and collectively, these shards make up the entire database.
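A minimal hash-based shard router illustrating the idea; the four shard DSNs are hypothetical, and in production each would be a separate database server:

```python
import hashlib

SHARDS = [
    "postgres://shard0.example.internal/app",
    "postgres://shard1.example.internal/app",
    "postgres://shard2.example.internal/app",
    "postgres://shard3.example.internal/app",
]

def shard_for(key: str) -> str:
    """Route a record to a shard by hashing its shard key."""
    digest = hashlib.sha1(key.encode()).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

print(shard_for("customer:42"))  # every access for this key hits the same shard
```

Note that plain modulo routing remaps most keys whenever the shard count changes, which is why elastic systems prefer consistent hashing for this step.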
Heading into 2024, SQL databases will remain essential in data management, increasingly using distributed systems to meet growing needs for scalability and reliability. According to 2023 statistics, 49% of web applications use an SQL-based database , with SQL having a 75% adoption rate in the IT industry.
By Ruchir Jha, Brian Harrington, and Yingwu Zhao. TL;DR: Streaming alert evaluation scales much better than the traditional approach of polling time-series databases. It allows us to overcome the high-dimensionality/cardinality limitations of the time-series database, and it opens doors to support more exciting use cases.
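A toy contrast of the streaming style: instead of periodically querying a time-series database, the alert condition is evaluated on each data point as it arrives. The window size and threshold are illustrative:

```python
from collections import deque

WINDOW = 60        # number of recent samples to consider
THRESHOLD = 0.95   # alert when the windowed average exceeds this

samples = deque(maxlen=WINDOW)

def on_data_point(cpu_utilization: float) -> None:
    # Called for every incoming sample; no TSDB polling loop needed.
    samples.append(cpu_utilization)
    if len(samples) == WINDOW:
        avg = sum(samples) / WINDOW
        if avg > THRESHOLD:
            fire_alert(avg)

def fire_alert(avg: float) -> None:
    print(f"ALERT: cpu avg {avg:.2f} over last {WINDOW} samples")
```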
The study analyzes factual Kubernetes production data from thousands of organizations worldwide that are using the Dynatrace Software Intelligence Platform to keep their Kubernetes clusters secure, healthy, and high-performing. The strongest Kubernetes growth areas are security, databases, and CI/CD technologies. Java, Go, and Node.js…
This is an article from DZone's 2023 Database Systems Trend Report. At the heart of this digital transformation lies the critical role of cloud databases, the backbone of modern data management.
Oracle Database is a commercial, proprietary multi-model database management system produced by Oracle Corporation, and the largest relational database management system (RDBMS) in the world. While Oracle remains the #1 database on the market, its popularity has steadily declined by over 18% since 2013.
Elasticsearch is a distributed database: your data is partitioned into “shards,” which are then allocated to one or more servers. Because of this sharding, a read or write request to an Elasticsearch cluster requires coordination between multiple nodes, as there is no “global view” of your data on a single server.
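Illustratively, Elasticsearch's documented routing rule is shard = hash(_routing) % number_of_primary_shards (the real implementation hashes the routing value with murmur3; md5 stands in here):

```python
import hashlib

NUM_PRIMARY_SHARDS = 5  # fixed when the index is created

def shard_for_document(doc_id: str) -> int:
    # By default, _routing is the document id.
    h = int(hashlib.md5(doc_id.encode()).hexdigest(), 16)
    return h % NUM_PRIMARY_SHARDS

print(shard_for_document("doc-123"))  # the coordinating node forwards the request here
```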
If you're considering a database management system, understanding its benefits is crucial: enhanced data security, better data integrity, and efficient access to information. A Database Management System (DBMS) assists users in creating and managing databases.
The ELK stack is an abbreviation for Elasticsearch, Logstash, and Kibana, which together offer the following capabilities. Elasticsearch: a scalable search and analytics engine with log analytics capabilities, well suited as a database for data-driven applications.