Architecture, Database and Storage - Technology Performance Pulse

What is Greenplum Database? Intro to the Big Data Database

Scalegrid

MAY 13, 2020

Greenplum Database is a massively parallel processing (MPP) SQL database built on PostgreSQL. In this blog post, we explain what Greenplum is and break down the Greenplum architecture, advantages, major use cases, and how to get started. The Greenplum Architecture.

Big Data

Big Data Database Artificial Intelligence Open Source

Ready for changes with Hexagonal Architecture

The Netflix TechBlog

MARCH 10, 2020

At one point, more than 30 developers were working on it, and it had well over 300 database tables. Leveraging Hexagonal Architecture: We needed to support the ability to swap data sources without impacting business logic, so we knew we needed to keep them decoupled. The dependency graph in Hexagonal Architecture goes inward.

Architecture

Architecture Transportation Java Strategy

Optimizing data warehouse storage

The Netflix TechBlog

DECEMBER 21, 2020

At this scale, we can gain a significant amount of performance and cost benefits by optimizing the storage layout (records, objects, partitions) as the data lands into our warehouse. We built AutoOptimize to efficiently and transparently optimize the data and metadata storage layout while maximizing their cost and performance benefits.

Storage

Storage Latency Efficiency Data Engineering

AWS serverless services: Exploring your options

Dynatrace

OCTOBER 7, 2021

To get a better understanding of AWS serverless, we’ll first explore the basics of serverless architectures, review AWS serverless offerings, and explore common use cases. Serverless architecture: A primer. Serverless architecture shifts application hosting functions away from local servers onto those managed by providers.

Serverless

Serverless AWS Lambda Storage

Comparing PostgreSQL DigitalOcean Performance & Pricing – ScaleGrid vs. DigitalOcean Managed Databases

Scalegrid

JUNE 4, 2020

ScaleGrid is a fully managed DBaaS that supports MySQL, PostgreSQL and Redis™, along with additional support for MongoDB® database and Greenplum® database. Along with many popular cloud providers, DigitalOcean also provides a Managed Databases service. So, which database service is right for your application? Single Node.

Database

Database Latency Benchmarking Performance

Improved Alerting with Atlas Streaming Eval

The Netflix TechBlog

APRIL 27, 2023

Ruchir Jha, Brian Harrington, Yingwu Zhao. TL;DR: Streaming alert evaluation scales much better than the traditional approach of polling time-series databases. It allows us to overcome high dimensionality/cardinality limitations of the time-series database. It opens doors to support more exciting use-cases.

Storage

Storage Cache Metrics Database

Designing Instagram

High Scalability

JANUARY 11, 2022

Architecture. We will use a graph database such as Neo4j to store the information. Additionally, we can use wide-column databases like Cassandra to store information like user feeds, activities, and counters. After that, the post gets added to the feed of all the followers in the wide-column data storage. High Level Design.

Design

Design Media Storage Logistics

Introducing Netflix’s Key-Value Data Abstraction Layer

The Netflix TechBlog

SEPTEMBER 18, 2024

Central to this infrastructure is our use of multiple online distributed databases such as Apache Cassandra, a NoSQL database known for its high availability and scalability. Over time as new key-value databases were introduced and service owners launched new use cases, we encountered numerous challenges with datastore misuse.

Latency

Latency Storage Cache Servers

Amazon DynamoDB – a Fast and Scalable NoSQL Database.

All Things Distributed

JANUARY 18, 2012

A Fast and Scalable NoSQL Database Service Designed for Internet Scale Applications. Today is a very exciting day as we release Amazon DynamoDB, a fast, highly reliable and cost-effective NoSQL database service designed for internet scale applications.

Scalability

Scalability Database Ecommerce Latency

What is a Distributed Storage System

Scalegrid

FEBRUARY 8, 2024

A distributed storage system is foundational in today’s data-driven landscape, ensuring data spread over multiple servers is reliable, accessible, and manageable. Understanding distributed storage is imperative as data volumes and the need for robust storage solutions rise.

Storage

Storage Systems Big Data Azure

What is cloud monitoring? How to improve your full-stack visibility

Dynatrace

JANUARY 11, 2023

Database monitoring. This ensures the database queries are performant, while also identifying host problems. For example, uptime detection can identify database instability and help to improve mean time to restoration. Cloud storage monitoring. Website monitoring. Virtual machine (VM) monitoring.

Cloud

Cloud Monitoring Best Practices Infrastructure

Kubernetes in the wild report 2023

Dynatrace

JANUARY 16, 2023

The strongest Kubernetes growth areas are security, databases, and CI/CD technologies. Of the organizations in the Kubernetes survey, 71% run databases and caches in Kubernetes, representing a +48% year-over-year increase. Java, Go, and Node.js

Open Source

Open Source Java Operating System Programming

What Should You Know About Graph Database’s Scalability?

DZone

JANUARY 20, 2023

Having a distributed and scalable graph database system is highly sought after in many enterprise scenarios. Do Not Be Misled Designing and implementing a scalable graph database system has never been a trivial task.

Scalability

Scalability Big Data Hardware Internet

Seamless AI-powered observability for multicloud serverless applications

Dynatrace

FEBRUARY 9, 2022

Cloud vendors such as Amazon Web Services (AWS), Microsoft, and Google provide a wide spectrum of serverless services for compute and event-driven workloads, databases, storage, messaging, and other purposes. AI-powered automation and deep, broad observability for serverless architectures. Dynatrace news. New to Dynatrace?

Serverless

Serverless Azure Lambda AWS

Conducting log analysis with an observability platform and full data context

Dynatrace

APRIL 20, 2023

Logs highlight observability challenges: Ingesting, storing, and processing the unprecedented explosion of data from sources such as software as a service, multicloud environments, containers, and serverless architectures can be overwhelming for today’s organizations. Grail enables 100% precision insights into all stored data.

Analytics

Analytics Infrastructure Storage Architecture

Boost DevOps maturity with observability and a data lakehouse

Dynatrace

JUNE 9, 2023

Research has found that 99% of organizations have embraced a multicloud architecture. The sheer number of permutations can break traditional databases. When data storage strategies become problematic to DevOps maturity: Data warehouse-based approaches add cost and time to analytics projects.

DevOps

DevOps Analytics Storage Metrics

How a data lakehouse brings data insights to life

Dynatrace

OCTOBER 4, 2022

Further, these resources support countless Kubernetes clusters and Java-based architectures. They can call on dozens of databases and deliver gigabytes of data across myriad devices. In most data storage models, indexing engines enable faster access to query logs. Cost-effective architecture.

Analytics

Analytics Storage Infrastructure Metrics

Amazon Aurora ascendant: How we designed a cloud-native relational database

All Things Distributed

MARCH 13, 2019

Relational databases have been around for a long time. The core technologies underpinning the major relational database management systems of today were developed in the 1980–1990s. Those fundamentals helped make relational databases immensely popular with users everywhere.

Database

Database Design Cloud AWS

The Ultimate Guide to Database High Availability

Percona

JUNE 22, 2023

To make data count and to ensure cloud computing is unabated, companies and organizations must have highly available databases. A basic high availability database system provides failover (preferably automatic) from a primary database node to redundant nodes within a cluster. HA is sometimes confused with “fault tolerance.”

Availability

Availability Database Open Source Hardware

What is infrastructure monitoring and why is it mission-critical in the new normal?

Dynatrace

NOVEMBER 2, 2020

IT infrastructure is the heart of your digital business and connects every area – physical and virtual servers, storage, databases, networks, cloud services. This shift requires infrastructure monitoring to ensure all your components work together across applications, operating systems, storage, servers, virtualization, and more.

Infrastructure

Infrastructure Monitoring Virtualization Serverless

Observability platform vs. observability tools

Dynatrace

DECEMBER 22, 2021

Metrics are measures of critical system values, such as CPU utilization or average write latency to persistent storage. They are particularly important in distributed systems, such as microservices architectures. Observability platforms are becoming essential as the complexity of cloud-native architectures increases.

Artificial Intelligence

Artificial Intelligence Metrics Architecture DevOps

Data ingestion pipeline with Operation Management

The Netflix TechBlog

MARCH 7, 2023

There are many naive solutions possible for this problem, for example: write different runs in different databases. Instead, our challenge was to implement this feature on top of Cassandra and ElasticSearch databases because that’s what Marken uses. Marken Architecture: Marken’s architecture diagram is as follows.

Media

Media Latency Architecture Database

DBaaS vs Self-Managed Cloud Databases

Scalegrid

DECEMBER 6, 2023

The choice of self-managed cloud databases vs DBaaS is a common debate among those who are looking for the best option that will cater to their particular needs. Database as a Service (DBaaS) and managed databases offer distinct advantages along with certain challenges.

Database

Database Cloud Hardware Storage

The Anna Key-Value Store Now Has 355x the Performance of DynamoDB for the Dollar

High Scalability

SEPTEMBER 8, 2018

New databases used to be announced seemingly every week. While database neogenesis has slowed down considerably, it has not gone necrotic. Each storage server collects statistics about the requests it serves, the data it stores, etc. Our monitoring engine automatically moves data between tiers based on access patterns.

Storage

Storage Performance AWS Cloud

Titan Graph Database Integration with DynamoDB: World-class Performance, Availability, and Scale for New Workloads

All Things Distributed

AUGUST 20, 2015

Today, we are releasing a plugin that allows customers to use the Titan graph engine with Amazon DynamoDB as the backend storage layer. It opens up the possibility to enjoy the value that graph databases bring to relationship-centric use cases, without worrying about managing the underlying storage. Enter graph databases.

Database

Database Logistics Availability Social Media

Data lakehouse innovations advance the three pillars of observability for more collaborative analytics

Dynatrace

FEBRUARY 16, 2023

Grail combines the big-data storage of a data warehouse with the analytical flexibility of a data lake. This unified approach enables Grail to vault past the limitations of traditional databases. And without the encumbrances of traditional databases, Grail performs fast.

Analytics

Analytics Innovation Metrics Database

Introducing Netflix TimeSeries Data Abstraction Layer

The Netflix TechBlog

OCTOBER 8, 2024

In previous blog posts, we introduced the Key-Value Data Abstraction Layer and the Data Gateway Platform, both of which are integral to Netflix’s data architecture. Note: Contrary to what the name may suggest, this system is not built as a general-purpose time series database.

Latency

Latency Storage Traffic Tuning

How unified data and analytics offers a new approach to software intelligence

Dynatrace

OCTOBER 4, 2022

Traditionally, though, to gain true business insight, organizations had to make tradeoffs between accessing quality, real-time data and factors such as data storage costs. Additionally, it provides index-free storage and direct analytics access to source data without requiring data rehydration. Don’t reinvent the wheel.

Analytics

Analytics Software Software Storage

Unlock the power of contextual log analytics

Dynatrace

OCTOBER 2, 2024

There is no need to think about schema and indexes, re-hydration, or hot/cold storage. This architecture also means you are not required to determine your log data use cases beforehand or while analyzing logs within the new logs app. Keep in mind that Dynatrace Grail is schema-on-read and indexless, built with scaling in mind.

Analytics

Analytics AWS DevOps Cloud

The Ultimate Guide to Open Source Databases

Percona

MARCH 30, 2023

The use of open source databases has increased steadily in recent years. Past trepidation — about perceived vulnerabilities and performance issues — has faded as decision makers realize what an “open source database” really is and what it offers. What is an open source database?

Open Source

Open Source Database Storage Scalability

MySQL vs MongoDB: Best Choice for You

Scalegrid

FEBRUARY 11, 2025

Choosing the right database often comes down to MongoDB vs MySQL. Whether you need a relational database for complex transactions or a NoSQL database for flexible data storage, we’ve got you covered. Data modeling is a critical skill for developers to manage and analyze data within these database systems effectively.

Scalability

Scalability Database Storage IoT

Article: Magic Pocket: Dropbox’s Exabyte-Scale Blob Storage System

InfoQ

MAY 15, 2023

A horizontally scalable exabyte-scale blob storage system which operates out of multiple regions, Magic Pocket is used to store all of Dropbox’s data. Adopting SMR technology and erasure codes, the system has extremely high durability guarantees but is cheaper than operating in the cloud. By Facundo Agriel

Storage

Storage Systems Scalability Cloud

Google Improves Cloud Spanner: More Compute and Storage without Price Increase

InfoQ

OCTOBER 14, 2023

Google recently announced various improvements to Cloud Spanner, its distributed, decoupled relational database service with a “50% increase in throughput and 2.5 times the storage per node than before” without a price change. By Steef-Jan Wiggers

Google

Google Storage Cloud Database

Expanding the Cloud – Announcing Amazon Redshift, a Petabyte.

All Things Distributed

NOVEMBER 28, 2012

Unlike traditional row-based relational databases, which store data for each row sequentially on disk, Amazon Redshift stores each column sequentially. This means that Redshift performs much less wasted IO than a row-based database because it doesn’t read data from columns it doesn’t need when executing a given query.

Cloud

Cloud Storage Architecture Scalability
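
The row-versus-column layout described in the Redshift excerpt above can be illustrated with a toy sketch. This is not Redshift code: Python lists stand in for sequential disk storage, and the table, column names, and helper functions are all hypothetical. It only shows why summing one column touches fewer stored values in a column layout.

```python
# Toy illustration only: plain Python lists stand in for data laid out on disk.
# The table contents and all function names are made up for this sketch.
rows = [
    {"order_id": 1, "customer": "ann", "amount": 30.0},
    {"order_id": 2, "customer": "bob", "amount": 12.5},
    {"order_id": 3, "customer": "cat", "amount": 7.5},
]

# Row layout: all values of one row sit next to each other.
row_store = [value for row in rows for value in row.values()]

# Column layout: all values of one column sit next to each other.
column_store = {name: [row[name] for row in rows] for name in rows[0]}


def sum_from_row_store(store, n_columns, column_index):
    """Scan every stored value and keep only those of the requested column."""
    total, values_read = 0.0, 0
    for i, value in enumerate(store):
        values_read += 1
        if i % n_columns == column_index:
            total += value
    return total, values_read


def sum_from_column_store(store, column):
    """Read only the requested column's values."""
    values = store[column]
    return sum(values), len(values)


row_total, row_reads = sum_from_row_store(row_store, n_columns=3, column_index=2)
col_total, col_reads = sum_from_column_store(column_store, "amount")
print(f"row layout:    total={row_total}, values read={row_reads}")
print(f"column layout: total={col_total}, values read={col_reads}")
```

Both scans return the same total, but the row-layout scan reads every value in the table while the column-layout scan reads only the `amount` values.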

Weekend Reading: Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases.

All Things Distributed

MAY 19, 2017

In many high-throughput, OLTP-style applications, the database plays a crucial role in achieving scale, reliability, high performance, and cost efficiency. For a long time, these requirements were almost exclusively served by commercial, proprietary databases.

Database

Database Design Cloud Storage

Exploring MySQL 8 New Transaction Data Dictionary: Storing Information About Database Objects

Percona

AUGUST 9, 2023

MySQL 8 brought a significant architectural transformation by replacing the traditional MyISAM-based system tables with the Transaction Data Dictionary (TDD), a more efficient and reliable approach. The primary goal of the Transaction Data Dictionary (TDD) is to enhance the overall performance, stability, and scalability of MySQL databases.

Database

Database Storage Scalability Games

Backup and Recovery for Databases: What You Should Know

Percona

AUGUST 30, 2023

Data powers everything, and unlike coal and coal combustion, data and databases aren’t going away. In this blog, we’ll focus on the elements of database backup and disaster recovery, and we’ll introduce proven solutions for maintaining business continuity, even amid otherwise dire circumstances.

Database

Database Hardware Open Source Strategy

Bulldozer: Batch Data Moving from Data Warehouse to Online Key-Value Stores

The Netflix TechBlog

OCTOBER 27, 2020

At Netflix, we also heavily embrace a microservice architecture that emphasizes separation of concerns. The KV DAL allows applications to use a well-defined and storage engine agnostic HTTP/gRPC key-value data interface that in turn decouples applications from hard to maintain and backwards-incompatible datastore APIs.

Latency

Latency Storage Big Data Tuning

Improving the Efficiency of Goku Time-Series Database at Pinterest

InfoQ

NOVEMBER 6, 2024

Pinterest has modernized and enhanced its Goku time-series database. The recent updates focus on optimizing storage and resource usage without compromising service quality. By Mohit Palriwal

Database

Database Efficiency Storage Scalability

Introducing Percona Builds for Serverless PostgreSQL

Percona

MARCH 7, 2023

The goal is to simplify the provisioning and management of database capacity. One approach is to separate compute and storage to allow for independent scaling. It’s an open source alternative to AWS Aurora Postgres that utilizes a serverless architecture. This can now be achieved with ease using Serverless PostgreSQL.

Serverless

Serverless Storage Open Source Architecture

Simplicity meets power: Introducing the all-new Dynatrace Logs app

Dynatrace

OCTOBER 2, 2024

With Dynatrace, there is no need to think about schema and indexes, re-hydration, or hot/cold storage concepts. This architecture also means you’re not required to determine your log data use cases beforehand or while analyzing logs within the new Logs app.

Azure

Azure DevOps Infrastructure Cloud

How To Scale a Single-Host PostgreSQL Database With Citus

Percona

NOVEMBER 3, 2023

Rather than listing the concepts, function calls, etc, available in Citus, which frankly is a bit boring, I’m going to explore scaling out a database system starting with a single host. I won’t cover all the features but show just enough that you’ll want to see more of what you can learn to accomplish for yourself.

Database

Database Benchmarking Latency C++

Redis vs Memcached in 2024

Scalegrid

MARCH 28, 2024

Key Takeaways: Redis offers complex data structures and additional features for versatile data handling, while Memcached excels in simplicity with a fast, multi-threaded architecture for basic caching needs. It uses a hash table to manage these pairs, divided into fixed-size buckets with linked lists for key-value storage.

Cache

Cache Storage Scalability Architecture
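
The excerpt above mentions a hash table divided into buckets, with linked lists holding the key-value pairs. A minimal sketch of that general data structure follows; it is not taken from Memcached or Redis, and the class names, bucket count, and integer keys are assumptions chosen for illustration.

```python
# Minimal sketch of a hash table with separate chaining.
# Hypothetical classes for illustration; not the code of any real cache server.
class Node:
    """One key-value pair in a bucket's singly linked list."""

    def __init__(self, key, value, next_node=None):
        self.key = key
        self.value = value
        self.next = next_node


class ChainedHashTable:
    def __init__(self, n_buckets=4):
        # A fixed number of buckets; each holds the head of a linked list.
        self.buckets = [None] * n_buckets

    def _index(self, key):
        return hash(key) % len(self.buckets)

    def set(self, key, value):
        i = self._index(key)
        node = self.buckets[i]
        while node is not None:
            if node.key == key:
                node.value = value  # overwrite an existing key
                return
            node = node.next
        # Key not found: prepend a new node to this bucket's list.
        self.buckets[i] = Node(key, value, self.buckets[i])

    def get(self, key, default=None):
        node = self.buckets[self._index(key)]
        while node is not None:
            if node.key == key:
                return node.value
            node = node.next
        return default


cache = ChainedHashTable(n_buckets=2)
for i in range(5):
    cache.set(i, f"value-{i}")  # five keys in two buckets forces collisions
cache.set(3, "updated")
print(cache.get(3), cache.get(4), cache.get(99))
```

With only two buckets, several keys share a bucket, so lookups walk the list until the key matches; a missing key falls through to the default.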
