Mounting object storage in Netflix’s media processing platform, by Barak Alon (on behalf of Netflix’s Media Cloud Engineering team). MezzFS (short for “Mezzanine File System”) is a tool we’ve developed at Netflix that mounts cloud objects as local files via FUSE. Our object storage service splits objects into many parts and stores them in S3.
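For a rough sense of the underlying idea (this is not MezzFS itself, just a minimal sketch): if a logical object is split into fixed-size parts stored in S3, an arbitrary byte range can be served by issuing ranged GETs against the parts that cover it. The bucket layout, the part-naming scheme, and the part size below are assumptions.

# Minimal sketch (not MezzFS): read an arbitrary byte range from a logical
# object that has been split into fixed-size parts stored in S3.
# The key layout ("<object>/part-<n>") and PART_SIZE are assumptions.
import boto3

PART_SIZE = 64 * 1024 * 1024  # hypothetical part size: 64 MiB
s3 = boto3.client("s3")

def read_range(bucket: str, object_name: str, offset: int, length: int) -> bytes:
    """Stitch a byte range together from the S3 parts that cover it."""
    out = bytearray()
    end = offset + length - 1
    part = offset // PART_SIZE
    while offset <= end:
        part_start = part * PART_SIZE
        lo = offset - part_start                    # range start inside this part
        hi = min(end - part_start, PART_SIZE - 1)   # range end inside this part
        resp = s3.get_object(
            Bucket=bucket,
            Key=f"{object_name}/part-{part}",
            Range=f"bytes={lo}-{hi}",
        )
        out += resp["Body"].read()
        offset = part_start + hi + 1
        part += 1
    return bytes(out)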
We will use a graph database such as Neo4j to store the information. Additionally, we can use columnar databases like Cassandra to store information like user feeds, activities, and counters. After that, the post gets added to the feeds of all the followers in the columnar data store. Sample queries supported by the graph database:
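The excerpt's actual sample queries are not included here; the following is only a hedged illustration of what one might look like, using the official neo4j Python driver and an assumed (User)-[:FOLLOWS]->(User) model with placeholder connection details.

# Illustrative only: a sample "who does this user follow?" query against a
# hypothetical (User)-[:FOLLOWS]->(User) graph, using local Neo4j defaults.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def get_followees(user_id: str) -> list[str]:
    with driver.session() as session:
        result = session.run(
            "MATCH (u:User {id: $id})-[:FOLLOWS]->(f:User) RETURN f.id AS id",
            id=user_id,
        )
        return [record["id"] for record in result]

print(get_followees("user-42"))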
Central to this infrastructure is our use of multiple online distributed databases such as Apache Cassandra , a NoSQL database known for its high availability and scalability. Over time as new key-value databases were introduced and service owners launched new use cases, we encountered numerous challenges with datastore misuse.
Ruchir Jha , Brian Harrington , Yingwu Zhao TL;DR Streaming alert evaluation scales much better than the traditional approach of polling time-series databases. It allows us to overcome high dimensionality/cardinality limitations of the time-series database. It opens doors to support more exciting use-cases.
The strongest Kubernetes growth areas are security, databases, and CI/CD technologies. Of the organizations in the Kubernetes survey, 71% run databases and caches in Kubernetes, representing a 48% year-over-year increase.
This means you no longer have to provision, scale, and maintain servers to run your applications, databases, and storage systems. Speed is next; serverless solutions are quick to spin up or down as needed, and there are no delays due to limited storage or resource access. AWS offers four serverless offerings for storage.
MongoDB offers several storage engines that cater to various use cases. The default storage engine in earlier versions was MMAPv1, which utilized memory-mapped files and collection-level locking. The newer, pluggable storage engine, WiredTiger, addresses this by using prefix compression, document-level concurrency control, and row-based storage.
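A quick, hedged way to confirm which storage engine a deployment is actually running, assuming pymongo and a local mongod:

# serverStatus reports the active storage engine, e.g. "wiredTiger" on modern versions.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
status = client.admin.command("serverStatus")
print(status["storageEngine"]["name"])            # e.g. "wiredTiger"
print(status["storageEngine"].get("persistent"))  # True for engines that persist data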
Interestingly, our partner Red Hat reported in 2021 that around 80% of deployed workloads are databases or data caches, storing data in persistent volume claims (PVCs). Suppose you also decide to run the database that stores user uploads – such as images or videos – directly in Kubernetes. However, you lack insights into your PVCs.
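As a starting point for that visibility, a small sketch using the official kubernetes Python client (assuming a working kubeconfig) can at least enumerate PVCs with their requested sizes and phases:

# List every PVC in the cluster with its requested size and phase.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod
v1 = client.CoreV1Api()

for pvc in v1.list_persistent_volume_claim_for_all_namespaces().items:
    size = (pvc.spec.resources.requests or {}).get("storage", "unknown")
    print(f"{pvc.metadata.namespace}/{pvc.metadata.name}: {size} ({pvc.status.phase})")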
In this article, we’ll discuss six ways to design websites for high-traffic events like product drops and sales: compress and optimize images, choose a scalable web host, use a CDN, leverage caching, stress test websites, and refine the backend. You can also find optimization plugins or caching solutions that give you access to a CDN.
A distributed storage system is foundational in today’s data-driven landscape, ensuring data spread over multiple servers is reliable, accessible, and manageable. Understanding distributed storage is imperative as data volumes and the need for robust storage solutions rise.
Amazon DynamoDB: a fast and scalable NoSQL database service designed for internet-scale applications. Today is a very exciting day as we release Amazon DynamoDB, a fast, highly reliable and cost-effective NoSQL database service designed for internet scale applications. (From Werner Vogels' weblog on building scalable and robust distributed systems.)
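For readers who have not used it, a minimal taste of the DynamoDB API via boto3; the table name and key schema below are made up for illustration, and the table is assumed to already exist:

# Write and read a single item through boto3's DynamoDB resource interface.
import boto3

table = boto3.resource("dynamodb").Table("Users")

# Write an item keyed by a partition key named "user_id" (an assumption).
table.put_item(Item={"user_id": "42", "name": "Ada", "plan": "pro"})

# Read it back with a strongly consistent GetItem.
resp = table.get_item(Key={"user_id": "42"}, ConsistentRead=True)
print(resp.get("Item"))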
A shared characteristic in most (if not all) databases, be they traditional relational databases like Oracle, MySQL, and PostgreSQL or NoSQL-style databases like MongoDB, is the use of a caching mechanism to keep (a copy of) part of the data in memory. MySQL is no exception: InnoDB caches hot data and index pages in its buffer pool.
A common question that I get is: why do we offer so many database products? Seldom can one database fit the needs of multiple distinct use cases; customers need to be able to use multiple databases and data models within the same application.
Note: Contrary to what the name may suggest, this system is not built as a general-purpose time series database. Flexible Storage : The service is designed to integrate with various storage backends, including Apache Cassandra and Elasticsearch , allowing Netflix to customize storage solutions based on specific use case requirements.
Learn how Uber serves over 40 million reads per second from its in-house, distributed database built on top of MySQL using an integrated caching solution: CacheFront.
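CacheFront's internals aren't shown in the excerpt; as a hedged sketch, the general read path of a cache in front of a primary database usually follows the cache-aside pattern, roughly like this (fetch_from_mysql is a placeholder):

# Generic cache-aside read path, not Uber's implementation.
import json
import redis

r = redis.Redis(host="localhost", port=6379)
TTL_SECONDS = 300

def fetch_from_mysql(user_id: str) -> dict:
    # Placeholder for the real primary-database read.
    return {"id": user_id, "name": "example"}

def get_user(user_id: str) -> dict:
    key = f"user:{user_id}"
    cached = r.get(key)
    if cached is not None:           # cache hit: skip the database entirely
        return json.loads(cached)
    row = fetch_from_mysql(user_id)  # cache miss: go to the primary store
    r.set(key, json.dumps(row), ex=TTL_SECONDS)  # populate the cache with a TTL
    return row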
There are many naive solutions possible for this problem, for example: write different runs into different databases, or write algo runs into files. This is obviously very expensive. Instead, our challenge was to implement this feature on top of the Cassandra and Elasticsearch databases, because that’s what Marken uses.
Key Takeaways Redis offers complex data structures and additional features for versatile data handling, while Memcached excels in simplicity with a fast, multi-threaded architecture for basic caching needs. Redis is better suited for complex data models, and Memcached is better suited for high-throughput, string-based caching scenarios.
To make data count and to ensure cloud computing is unabated, companies and organizations must have highly available databases. A basic high availability database system provides failover (preferably automatic) from a primary database node to redundant nodes within a cluster. HA is sometimes confused with “fault tolerance.”
This post is about PostgreSQL, but most of the problems also apply to other database systems. The more indexes there are, the more memory is required for effective caching. Indexes need more cache than tables: due to random writes and reads, indexes need more of their pages to be in the cache.
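One hedged way to see how well index pages are being served from cache is to query the pg_statio_user_indexes statistics view; the connection details below are placeholders:

# Show the indexes doing the most physical reads and their buffer-cache hit ratio.
import psycopg2

conn = psycopg2.connect("dbname=mydb user=postgres")
with conn, conn.cursor() as cur:
    cur.execute("""
        SELECT relname,
               indexrelname,
               idx_blks_hit,
               idx_blks_read,
               round(100.0 * idx_blks_hit /
                     NULLIF(idx_blks_hit + idx_blks_read, 0), 2) AS hit_pct
        FROM pg_statio_user_indexes
        ORDER BY idx_blks_read DESC
        LIMIT 10;
    """)
    for row in cur.fetchall():
        print(row)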
Bloom filters are an important feature for improving the performance of LSM storage engines, and they are an essential component of an LSM-based database engine like MyRocks. For good performance, the filter blocks are cached in the RocksDB block cache and normally stay there, since they are accessed frequently.
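To make the contract concrete, here is a toy Bloom filter, nothing like RocksDB's optimized implementation, but it shows the guarantee an LSM engine relies on: a negative answer means the key is definitely absent, so the file can be skipped.

# Toy Bloom filter: "definitely absent" vs "maybe present".
import hashlib

class BloomFilter:
    def __init__(self, size_bits: int = 8192, num_hashes: int = 7):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, key: bytes):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "big") + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        # False means the key is definitely not present; True means "maybe".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))

bf = BloomFilter()
bf.add(b"user:42")
print(bf.might_contain(b"user:42"))   # True
print(bf.might_contain(b"user:999"))  # almost certainly False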
“Towards multiverse databases,” Marzoev et al. The central idea behind multiverse databases is to push the data access and privacy rules into the database itself. With multiverse databases, each user sees a consistent “parallel universe” database containing only the data that user is allowed to see.
Since the DB is small (50% of the size of the Linux RAM), the database is mostly cached on the read side, so we only see writes going to the DB files. Despite the database flushes occurring in bursts with a decent amount of concurrency, the Nutanix CVM gives an average write response of 1.5 ms.
Hardware memory: the amount of RAM to be provisioned for database servers can vary greatly depending on the size of the database and the specific requirements of the company. By caching hot datasets, indexes, and ongoing changes, InnoDB can provide faster response times and utilize disk I/O in a much more optimal way.
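A quick, hedged check of how much memory the buffer pool is configured to use and how its pages are being used, via mysql-connector-python (credentials are placeholders):

# Inspect the configured InnoDB buffer pool size and its page counters.
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="root", password="secret")
cur = conn.cursor()

cur.execute("SHOW VARIABLES LIKE 'innodb_buffer_pool_size'")
name, value = cur.fetchone()
print(f"{name} = {int(value) / 1024**3:.1f} GiB")

cur.execute("SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_%'")
for status_name, status_value in cur.fetchall():
    print(status_name, status_value)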
Traditionally, records in a database were stored as rows: the data in a row was stored together for easy and fast retrieval. With the rise of data warehouse workloads, where there is often significant redundancy in the values stored in columns, database models based on column-oriented storage took off.
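A tiny illustration of the difference, with toy data: the same three records laid out row-wise and column-wise.

# Row-oriented vs column-oriented layout of the same toy records.
rows = [
    {"id": 1, "country": "US", "amount": 10},
    {"id": 2, "country": "US", "amount": 25},
    {"id": 3, "country": "DE", "amount": 10},
]

# Row-oriented: each record's values are stored together, good for point lookups.
row_store = [tuple(r.values()) for r in rows]

# Column-oriented: each column's values are stored together, good for scans and
# aggregation; repetitive columns (like "country") also compress very well.
column_store = {col: [r[col] for r in rows] for col in rows[0]}

print(row_store)      # [(1, 'US', 10), (2, 'US', 25), (3, 'DE', 10)]
print(column_store)   # {'id': [1, 2, 3], 'country': ['US', 'US', 'DE'], 'amount': [10, 25, 10]}
print(sum(column_store["amount"]))  # an aggregate touches a single column: 45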
Today AWS has launched Amazon ElastiCache, a new service that makes it easy to add distributed in-memory caching to any application. Amazon ElastiCache handles the complexity of creating, scaling and managing an in-memory cache to free up brainpower for more differentiating activities.
I’ve used a fourth instance to host a PMM server to monitor servers A and B, and used the data collected by the PMM agents installed on the database servers to compare performance. That’s a heritage of the LAMP model, when the same server would host both the database and the web server.
If you’re considering a database management system, understanding these benefits is crucial. DBMS enhances data security with encryption, implements various access controls, and enables improved data sharing and concurrent access, thus facilitating quick response to changes and maintaining consistent database accuracy.
In the world of databases, data management, and data platforms, this entropy usually takes the form of a simple database or data platform that might be ideal for early use cases evolving (or rather, devolving) into an expensive and unmanageable nightmare due to operational strain from use-case gluttony. How hard can it be?
As a MySQL database administrator, keeping a close eye on the performance of your MySQL server is crucial to ensure optimal database operations. This includes metrics such as query execution time, the number of queries executed per second, and the utilization of query cache and adaptive hash index.
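As a hedged sketch, many of these numbers can be derived from the counters the server itself exposes through SHOW GLOBAL STATUS; for example, a rough average queries-per-second figure (connection details are placeholders):

# Estimate average QPS from the server's own status counters.
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="root", password="secret")
cur = conn.cursor()

def global_status(name: str) -> int:
    cur.execute("SHOW GLOBAL STATUS LIKE %s", (name,))
    return int(cur.fetchone()[1])

qps = global_status("Questions") / max(global_status("Uptime"), 1)
print(f"average queries per second since startup: {qps:.1f}")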
Redis Monitoring Essentials Ensuring the performance, reliability, and safety of a Redis database requires active monitoring. With these essential support systems in place, you can effectively monitor your databases with up-to-date data about their health and functioning status at all times.
Redis® is an in-memory database that provides blazingly fast performance. This makes it a compelling alternative to disk-based databases when performance is a concern. Redis returns a big list of database metrics when you run the info command on the Redis shell. This blog post lists the important database metrics to monitor.
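As a small, hedged example of consuming that output programmatically, redis-py's info() returns the same data as the INFO command, parsed into a dict; the particular fields picked out below are just commonly watched ones.

# Pull a few commonly watched metrics from INFO via redis-py.
import redis

r = redis.Redis(host="localhost", port=6379)
info = r.info()  # same data as running INFO in redis-cli, parsed into a dict

hits = info["keyspace_hits"]
misses = info["keyspace_misses"]
hit_rate = hits / (hits + misses) if (hits + misses) else 0.0

print("used_memory_human:", info["used_memory_human"])
print("connected_clients:", info["connected_clients"])
print("cache hit rate:", f"{hit_rate:.2%}")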
A list of tables with the problem is in the file: tables_with_oids.txt
Failure, exiting
postgres@xx.xx.xx.xx:~$ more tables_with_oids.txt
In database: abs_test
public.craft
public.ports
In database: postgres
public.oid_test
We can use the SQL below to find tables with OIDs; it also generates the DDL (ALTER TABLE … SET WITHOUT OIDS;) to remove them.
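The referenced SQL is not reproduced in the excerpt; the following is only a hedged sketch of what such a query could look like on PostgreSQL 11 or earlier (where pg_class.relhasoids still exists; the column was removed in PostgreSQL 12), run here through psycopg2 with placeholder connection details.

# Find user tables that still have OIDs and generate the ALTER statements to drop them.
import psycopg2

conn = psycopg2.connect("dbname=abs_test user=postgres")
with conn, conn.cursor() as cur:
    cur.execute("""
        SELECT n.nspname, c.relname,
               format('ALTER TABLE %I.%I SET WITHOUT OIDS;', n.nspname, c.relname) AS ddl
        FROM pg_class c
        JOIN pg_namespace n ON n.oid = c.relnamespace
        WHERE c.relhasoids
          AND c.relkind = 'r'
          AND n.nspname NOT IN ('pg_catalog', 'information_schema');
    """)
    for schema, table, ddl in cur.fetchall():
        print(f"{schema}.{table}: {ddl}")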
These updates are designed to keep databases running at peak performance and simplify database operations. But as companies grow and see more demand for their databases, we need to ensure that PMM also remains scalable so you don’t need to worry about its performance while tending to the rest of your environment.
Maintaining rapidly changing data in back-end databases creates bottlenecks that impact responsiveness. In addition, repeatedly accessing back-end databases to serve up popular items, such as product descriptions and news stories, also adds to the bottleneck. The Solution: Distributed Caching.
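At its simplest, distributed caching means spreading keys across several cache nodes; the sketch below shows the idea with naive modulo sharding over a few assumed Redis instances (real deployments typically use consistent hashing or a clustered cache rather than this).

# Hash each key to one of several cache nodes; addresses are placeholders.
import hashlib
import redis

NODES = [
    redis.Redis(host="cache-1.internal", port=6379),
    redis.Redis(host="cache-2.internal", port=6379),
    redis.Redis(host="cache-3.internal", port=6379),
]

def node_for(key: str) -> redis.Redis:
    digest = hashlib.md5(key.encode()).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

def cache_set(key: str, value: str, ttl: int = 60) -> None:
    node_for(key).set(key, value, ex=ttl)

def cache_get(key: str):
    return node_for(key).get(key)

cache_set("product:123:description", "A very popular product")
print(cache_get("product:123:description"))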
Back then, the most common pattern I saw for service-based systems was sharing a database among multiple services. The rationale was simple: the data I need is already in this other database, and accessing a database is easy, so I’ll just reach in and grab what I need. And data was at the heart of the problem.
Every database system has to ensure durability and reliability. The same data, in the form of pages inside the WiredTiger cache, is also marked dirty. At every checkpoint interval (default: 60 seconds), MongoDB flushes the modified pages that are marked as dirty in the cache to their respective data files (both collection-*.wt …)
Last week we looked at a function-shipping solution to the problem; Cloudburst uses the more common data shipping to bring data to caches next to function runtimes (though you could also make a case that the scheduling algorithm placing function execution in locations where the data is cached is a flavour of function-shipping too).
As some of you may remember, I was pretty excited when Amazon Simple Storage Service (S3) released its website feature, such that I could serve this weblog completely from S3. My templates and blog posts are now located in Dropbox and thus locally cached at each machine I use.
The data is incredibly plentiful and difficult to store over long periods due to capacity limitations — a reason why private and public cloud storage services have been a boon to DevOps teams. Monitoring begins once data is safely stored within a local cache.
We use high-performance transaction systems, complex rendering and object caching, workflow and queuing systems, business intelligence and data analytics, machine learning and pattern recognition, neural networks and probabilistic decision making, and a wide variety of other techniques.
There are two main types of DNS servers: authoritative servers and caching resolvers. But the real robustness of the DNS system comes through the way lookups are handled, which is what caching resolvers do. Caching techniques ensure that the DNS system doesn’t get overloaded with queries.
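To make the caching-resolver idea concrete, here is a toy lookup that honors the TTL returned with each answer, built on dnspython (assumed installed); this is a sketch of the concept, not how production resolvers are implemented.

# Cache A-record lookups for as long as their TTL allows.
import time
import dns.resolver  # pip install dnspython

_cache: dict[str, tuple[float, list[str]]] = {}  # name -> (expiry, addresses)

def cached_lookup(name: str) -> list[str]:
    now = time.time()
    entry = _cache.get(name)
    if entry and entry[0] > now:          # still within the record's TTL
        return entry[1]
    answer = dns.resolver.resolve(name, "A")
    addresses = [rr.address for rr in answer]
    _cache[name] = (now + answer.rrset.ttl, addresses)
    return addresses

print(cached_lookup("example.com"))
print(cached_lookup("example.com"))  # second call is served from the local cache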
WeakMap can be used in two areas of web development: caching and additional data storage. With caching, a copy of the result from a request is saved locally, so that whenever the function is called again, the cached result can be reused. Let’s see this in action.
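The post's demonstration uses JavaScript's WeakMap, which is not reproduced here; as a rough Python analogue of the same weakly-keyed caching idea, weakref.WeakKeyDictionary lets cached entries be collected once the key object is no longer referenced.

# Weakly-keyed cache: entries go away with the objects that key them.
import weakref

_cache = weakref.WeakKeyDictionary()

class Request:
    def __init__(self, url: str):
        self.url = url

def process(request: Request) -> str:
    if request in _cache:                # reuse the cached result for this object
        return _cache[request]
    result = f"processed {request.url}"  # stand-in for expensive work
    _cache[request] = result
    return result

req = Request("https://example.com")
print(process(req))
print(len(_cache))  # 1 while `req` is alive
del req             # once the key object goes away, its cache entry can be collected too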