Database, Scalability and Storage - Technology Performance Pulse

What is Greenplum Database? Intro to the Big Data Database

Scalegrid

MAY 13, 2020

Greenplum Database is a massively parallel processing (MPP) SQL database built on PostgreSQL. Greenplum Database is an open-source, hardware-agnostic MPP database for analytics, based on PostgreSQL and developed by Pivotal, which was later acquired by VMware. What is an MPP Database?

Big Data

Big Data Database Artificial Intelligence Open Source

What Should You Know About Graph Database’s Scalability?

DZone

JANUARY 20, 2023

Having a distributed and scalable graph database system is highly sought after in many enterprise scenarios. Do Not Be Misled: Designing and implementing a scalable graph database system has never been a trivial task.

Scalability

Scalability Big Data Hardware Internet

Amazon DynamoDB - a Fast and Scalable NoSQL Database.

All Things Distributed

JANUARY 18, 2012

Werner Vogels' weblog on building scalable and robust distributed systems. A Fast and Scalable NoSQL Database Service Designed for Internet Scale Applications. Today is a very exciting day as we release Amazon DynamoDB, a fast, highly reliable and cost-effective NoSQL database service designed for internet scale applications.

Scalability

Scalability Database Ecommerce Latency

Optimizing data warehouse storage

The Netflix TechBlog

DECEMBER 21, 2020

At this scale, we can gain a significant amount of performance and cost benefits by optimizing the storage layout (records, objects, partitions) as the data lands into our warehouse. We built AutoOptimize to efficiently and transparently optimize the data and metadata storage layout while maximizing their cost and performance benefits.

Storage

Storage Latency Efficiency Data Engineering

AWS serverless services: Exploring your options

Dynatrace

OCTOBER 7, 2021

This means you no longer have to provision, scale, and maintain servers to run your applications, databases, and storage systems. Speed is next; serverless solutions are quick to spin up or down as needed, and there are no delays due to limited storage or resource access. Finally, there’s scalability.

Serverless

Serverless AWS Lambda Storage

Designing Instagram

High Scalability

JANUARY 11, 2022

We will use a graph database such as Neo4j to store the information. Additionally, we can use wide-column databases like Cassandra to store information like user feeds, activities, and counters. After that, the post gets added to the feed of all the followers in the wide-column data storage. Sample Queries supported by Graph Database.

Design

Design Media Storage Logistics

Choosing the Right Storage for PostgreSQL on Kubernetes: A Benchmark Analysis

Percona

APRIL 1, 2025

As more organizations move their PostgreSQL databases onto Kubernetes, a common question arises: Which storage solution best handles its demands? Picking the right option is critical, directly impacting performance, reliability, and scalability.

Storage

Storage Benchmarking Scalability Database

Improved Alerting with Atlas Streaming Eval

The Netflix TechBlog

APRIL 27, 2023

Ruchir Jha, Brian Harrington, Yingwu Zhao. TL;DR: Streaming alert evaluation scales much better than the traditional approach of polling time-series databases. It allows us to overcome high dimensionality/cardinality limitations of the time-series database. It opens doors to support more exciting use-cases.

Storage

Storage Cache Metrics Database

Mastering Disk Space Management with MongoDB® Storage Engines

Scalegrid

MAY 11, 2024

MongoDB offers several storage engines that cater to various use cases. The default storage engine in earlier versions was MMAPv1, which used memory-mapped files and collection-level locking. The newer, pluggable storage engine, WiredTiger, addresses this by using prefix compression, document-level locking, and row-based storage.

Storage

Storage Engineering Cache Database

How To Deploy the ELK Stack on Kubernetes

DZone

OCTOBER 24, 2023

The ELK stack is an abbreviation for Elasticsearch, Logstash, and Kibana, which offers the following capabilities: Elasticsearch: a scalable search and analytics engine with a log analytics tool and application-formed database, perfect for data-driven applications.

Analytics

Analytics Storage Infrastructure Scalability

What is a Distributed Storage System

Scalegrid

FEBRUARY 8, 2024

A distributed storage system is foundational in today’s data-driven landscape, ensuring data spread over multiple servers is reliable, accessible, and manageable. Understanding distributed storage is imperative as data volumes and the need for robust storage solutions rise.

Storage

Storage Systems Big Data Azure

Part 1: A Survey of Analytics Engineering Work at Netflix

The Netflix TechBlog

DECEMBER 17, 2024

Metric definitions are often scattered across various databases, documentation sites, and code repositories, making it difficult for analysts and data scientists to find reliable information quickly. Our ecosystem enables engineering teams to run applications and services at scale, utilizing a mix of open-source and proprietary solutions.

Analytics

Analytics Engineering Entertainment Metrics

Introducing Netflix’s Key-Value Data Abstraction Layer

The Netflix TechBlog

SEPTEMBER 18, 2024

Central to this infrastructure is our use of multiple online distributed databases such as Apache Cassandra, a NoSQL database known for its high availability and scalability. Over time as new key-value databases were introduced and service owners launched new use cases, we encountered numerous challenges with datastore misuse.

Latency

Latency Storage Cache Servers

Building an Optimized Data Pipeline on Azure Using Spark, Data Factory, Databricks, and Synapse Analytics

DZone

APRIL 11, 2023

Data processing in the cloud has become increasingly popular due to its scalability, flexibility, and cost-effectiveness. It provides built-in connectors for various data sources such as databases, file systems, cloud storage, and more.

Azure

Azure Analytics Storage Cloud

What is cloud monitoring? How to improve your full-stack visibility

Dynatrace

JANUARY 11, 2023

Database monitoring. This ensures the database queries are performant, while also identifying host problems. For example, uptime detection can identify database instability and help to improve mean time to restoration. Cloud storage monitoring. Website monitoring. Virtual machine (VM) monitoring.

Cloud

Cloud Monitoring Best Practices Infrastructure

A one size fits all database doesn't fit anyone

All Things Distributed

JUNE 21, 2018

A common question that I get is why do we offer so many database products? To do this, they need to be able to use multiple databases and data models within the same application. Seldom can one database fit the needs of multiple distinct use cases.

Database

Database AWS Games Latency

PostgreSQL vs. Oracle: Difference in Costs, Ease of Use & Functionality

Scalegrid

JULY 13, 2020

Oracle Database is a commercial, proprietary multi-model database management system produced by Oracle Corporation, and the largest relational database management system (RDBMS) in the world. While Oracle remains the #1 database on the market, its popularity has steadily declined by over 18% since 2013.

Open Source

Open Source Tuning C++ Database

Kubernetes in the wild report 2023

Dynatrace

JANUARY 16, 2023

The strongest Kubernetes growth areas are security, databases, and CI/CD technologies. Through effortless provisioning, a larger number of small hosts provide a cost-effective and scalable platform. Java, Go, and Node.js

Open Source

Open Source Java Operating System Programming

DBaaS vs Self-Managed Cloud Databases

Scalegrid

DECEMBER 6, 2023

The choice of self-managed cloud databases vs DBaaS is a common debate among those who are looking for the best option that will cater to their particular needs. Database as a Service (DBaaS) and managed databases offer distinct advantages along with certain challenges.

Database

Database Cloud Hardware Storage

Optimize your environment: Unveiling Dynatrace Hyper-V extension for enhanced performance and efficient troubleshooting

Dynatrace

OCTOBER 23, 2023

Firstly, managing virtual networks can be complex as networking in a virtual environment differs significantly from traditional networking. Secondly, determining the correct allocation of resources (CPU, memory, storage) to each virtual machine to ensure optimal performance without over-provisioning can be difficult.

Efficiency

Efficiency Virtualization Hardware Performance

How To Design For High-Traffic Events And Prevent Your Website From Crashing

Smashing Magazine

JANUARY 7, 2025

For example, you can switch to a scalable cloud-based web host, or compress/optimize images to save bandwidth. Choose A Scalable Web Host: The most convenient way to design a high-traffic website without worrying about website crashes is to upgrade your web hosting solution. Caching can help your website combat this issue.

Traffic

Traffic Website Design Cache

Article: Magic Pocket: Dropbox’s Exabyte-Scale Blob Storage System

InfoQ

MAY 15, 2023

A horizontally scalable exabyte-scale blob storage system which operates out of multiple regions, Magic Pocket is used to store all of Dropbox’s data. Adopting SMR technology and erasure codes, the system has extremely high durability guarantees but is cheaper than operating in the cloud. By Facundo Agriel

Storage

Storage Systems Scalability Cloud

Conducting log analysis with an observability platform and full data context

Dynatrace

APRIL 20, 2023

“Logs magnify these issues by far due to their volatile structure, the massive storage needed to process them, and due to potential gold hidden in their content,” Pawlowski said, highlighting the importance of log analysis. In many cases, indexed databases only provide access to a sample of statistical data summaries.

Analytics

Analytics Infrastructure Storage Architecture

Stuff The Internet Says On Scalability For December 21st, 2018

High Scalability

DECEMBER 21, 2018

It's HighScalability time: Have a very scalable Xmas everyone! Whether it’s database or message queues it’s a really weird combo of licenses and features for hostage. See you in the New Year. Do you like this sort of Stuff? Please support me on Patreon. I'd really appreciate it. Still looking for that perfect xmas gift?

Internet

Internet Scalability Serverless

Boost DevOps maturity with observability and a data lakehouse

Dynatrace

JUNE 9, 2023

Increasing an organization’s DevOps maturity is a key goal as teams adopt more cloud-native technologies, which simultaneously makes their environments more scalable and feature-rich but also more complex. The sheer number of permutations can break traditional databases.

DevOps

DevOps Analytics Storage Metrics

What is infrastructure monitoring and why is it mission-critical in the new normal?

Dynatrace

NOVEMBER 2, 2020

IT infrastructure is the heart of your digital business and connects every area – physical and virtual servers, storage, databases, networks, cloud services. This shift requires infrastructure monitoring to ensure all your components work together across applications, operating systems, storage, servers, virtualization, and more.

Infrastructure

Infrastructure Monitoring Virtualization Serverless

Building Netflix’s Distributed Tracing Infrastructure

The Netflix TechBlog

OCTOBER 19, 2020

Our distributed tracing infrastructure is grouped into three sections: tracer library instrumentation, stream processing, and storage. An additional implication of a lenient sampling policy is the need for scalable stream processing and storage infrastructure fleets to handle increased data volume. Storage: don’t break the bank!

Infrastructure

Infrastructure Transportation Storage Open Source

Titan Graph Database Integration with DynamoDB: World-class Performance, Availability, and Scale for New Workloads

All Things Distributed

AUGUST 20, 2015

Today, we are releasing a plugin that allows customers to use the Titan graph engine with Amazon DynamoDB as the backend storage layer. It opens up the possibility to enjoy the value that graph databases bring to relationship-centric use cases, without worrying about managing the underlying storage. Enter graph databases.

Database

Database Logistics Availability Social Media

2019 PostgreSQL Trends Report: Private vs. Public Cloud, Migrations, Database Combinations & Top Reasons Used

High Scalability

APRIL 3, 2019

PostgreSQL is an open source object-relational database system that has soared in popularity over the past 30 years from its active, loyal, and growing community. For the 2nd year in a row, PostgreSQL has kept the title of #1 fastest growing database in the world according to the DBMS of the Year report by the experts at DB-Engines.

Database

Database Cloud Open Source Systems

Introducing Netflix TimeSeries Data Abstraction Layer

The Netflix TechBlog

OCTOBER 8, 2024

The Key-Value Abstraction offers a flexible, scalable solution for storing and accessing structured key-value data, while the Data Gateway Platform provides essential infrastructure for protecting, configuring, and deploying the data tier. We do not use it for metrics, histograms, timers, or any such near-real time analytics use case.

Latency

Latency Storage Traffic Tuning

NoSQL Data Modeling Techniques

Highly Scalable

MARCH 1, 2012

NoSQL databases are often compared by various non-functional criteria, such as scalability, performance, and consistency. At the same time, NoSQL data modeling is not so well studied and lacks the systematic theory found in relational databases. Document databases advance the BigTable model offering two significant improvements.

Database

Database Ecommerce Efficiency Engineering

Expanding the Cloud - Managing Cold Storage with Amazon Glacier

All Things Distributed

AUGUST 20, 2012

Managing Cold Storage with Amazon Glacier. With the introduction of Amazon Glacier, IT organizations now have a solution that removes the headaches of digital archiving and provides extremely low-cost storage.

Storage

Storage Cloud AWS Media

MySQL vs MongoDB: Best Choice for You

Scalegrid

FEBRUARY 11, 2025

Choosing the right database often comes down to MongoDB vs MySQL. This article will help you understand the core differences in data structure, scalability, and use cases. Whether you need a relational database for complex transactions or a NoSQL database for flexible data storage, we’ve got you covered.

Scalability

Scalability Database Storage IoT

From syslog to AWS Firehose: Dynatrace log management innovations that enhance observability

Dynatrace

SEPTEMBER 5, 2024

Native support for Syslog messages: Syslog messages are generated by default in Linux and Unix operating systems, security devices, network devices, and applications such as web servers and databases. Dynatrace supports scalable data ingestion, ensuring your observability infrastructure grows with your cloud environment.

Innovation

Innovation AWS Analytics Storage

MySQL General Tablespaces: A Powerful Storage Option for Your Data

Percona

JANUARY 4, 2024

Managing storage and performance efficiently in your MySQL database is crucial, and general tablespaces offer flexibility in achieving this. In contrast to the single system tablespace that holds system tables by default, general tablespaces are user-defined storage containers for multiple InnoDB tables.

Storage

Storage Engineering Database Open Source

The Ultimate Guide to Open Source Databases

Percona

MARCH 30, 2023

The use of open source databases has increased steadily in recent years. Past trepidation — about perceived vulnerabilities and performance issues — has faded as decision makers realize what an “open source database” really is and what it offers. What is an open source database?

Open Source

Open Source Database Storage Scalability

The Ultimate Guide to Database High Availability

Percona

JUNE 22, 2023

To make data count and to ensure cloud computing is unabated, companies and organizations must have highly available databases. A basic high availability database system provides failover (preferably automatic) from a primary database node to redundant nodes within a cluster. HA is sometimes confused with “fault tolerance.”

Availability

Availability Database Open Source Hardware

Zendesk Moves from DynamoDB to MySQL and S3 to Save over 80% in Costs

InfoQ

DECEMBER 29, 2023

Zendesk reduced its data storage costs by over 80% by migrating from DynamoDB to a tiered storage solution using MySQL and S3. The company considered different storage technologies and decided to combine the relational database and the object store to strike a balance between queryability and scalability while keeping the costs down.

Storage

Storage Scalability Database Technology

Stuff The Internet Says On Scalability For June 15th, 2018

High Scalability

JUNE 15, 2018

1.6x : better deep learning cluster scheduling on k8s; 100,000 : Large-scale Diverse Driving Video Database; 3rd : reddit popularity in the US; 50% : increase in Neural Information Processing System papers, AI bubble? They'll love you even more. Domain Specific Architectures are getting 20x and 40x improvements, not just 5-10%.

Internet

Internet Scalability Blockchain

Key Advantages of DBMS for Efficient Data Management

Scalegrid

JANUARY 5, 2024

If you’re considering a database management system, understanding these benefits is crucial. DBMS enhances data security with encryption, implements various access controls, and enables improved data sharing and concurrent access, thus facilitating quick response to changes and maintaining consistent database accuracy.

Efficiency

Efficiency Storage Database Scalability

Process more with less using smarter cluster overload prevention for Dynatrace Managed

Dynatrace

MAY 14, 2020

The world’s most scalable, automatic distributed tracing pushes the boundary once again with enhanced Adaptive Load Management. Dynatrace PurePath technology captures and analyzes transactions end to end across every tier of your application technology stack, from the browser all the way down to the code and database level.

Processing

Processing Hardware Traffic Storage

Google Improves Cloud Spanner: More Compute and Storage without Price Increase

InfoQ

OCTOBER 14, 2023

Google recently announced various improvements to Cloud Spanner, its distributed, decoupled relational database service with a “50% increase in throughput and 2.5 times the storage per node than before” without a price change. By Steef-Jan Wiggers

Google

Google Storage Cloud Database

From Proprietary to Open Source: The Complete Guide to Database Migration

Percona

OCTOBER 18, 2023

Migrating a proprietary database to open source is a major decision that can significantly affect your organization. Today, we’ll be taking a deep dive into the intricacies of database migration, along with specific solutions to help make the process easier.

Open Source

Open Source Database Hardware Strategy
