Cache, Efficiency and Storage - Technology Performance Pulse

Netflix’s Distributed Counter Abstraction

The Netflix TechBlog

NOVEMBER 12, 2024

After selecting a mode, users can interact with APIs without needing to worry about the underlying storage mechanisms and counting methods. Best Effort Regional Counter: This type of counter is powered by EVCache, Netflix’s distributed caching solution built on the widely popular Memcached.

Latency

Latency Cache Infrastructure Strategy

The Power of Caching: Boosting API Performance and Scalability

DZone

AUGUST 16, 2023

Caching is the process of storing frequently accessed data or resources in a temporary storage location, such as memory or disk, to improve retrieval speed and reduce the need for repetitive processing.
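The idea in this excerpt can be sketched in a few lines of Python. This is a minimal illustration rather than anything from the DZone article: the `TTLCache` class, its `get_or_load` method, and the `loader` callback are hypothetical names, and expiry by time-to-live is an assumed eviction policy chosen only to keep the example short.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds (hypothetical helper)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so the expiry logic can be tested
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[Hashable], Any]) -> Any:
        """Return the cached value for key, calling loader only on a miss or expiry."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: skip the expensive call
        value = loader(key)  # cache miss: do the repetitive work once
        self._store[key] = (now + self.ttl_seconds, value)
        return value
```

A caller would wrap an expensive lookup, for example `cache.get_or_load(user_id, fetch_user)`, where `fetch_user` stands in for whatever slow API or database call is being avoided.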

Cache

Cache Scalability Performance Latency

MezzFS - Mounting object storage in Netflix’s media processing platform

The Netflix TechBlog

MARCH 6, 2019

By Barak Alon (on behalf of Netflix’s Media Cloud Engineering team). MezzFS (short for “Mezzanine File System”) is a tool we’ve developed at Netflix that mounts cloud objects as local files via FUSE. Our object storage service splits objects into many parts and stores them in S3.

Media

Media Storage Processing Cache

MySQL Data Caching Efficiency

Percona

APRIL 14, 2023

A shared characteristic in most (if not all) databases, be they traditional relational databases like Oracle, MySQL, and PostgreSQL or some kind of NoSQL-style database like MongoDB, is the use of a caching mechanism to keep (a copy of) part of the data in memory. How do you know if your MySQL database caching is operating efficiently?

Cache

Cache Efficiency Database Monitoring

Introducing Netflix’s Key-Value Data Abstraction Layer

The Netflix TechBlog

SEPTEMBER 18, 2024

Our goal was to build a versatile and efficient data storage solution that could handle a wide variety of use cases, ranging from the simplest hashmaps to more complex data structures, all while ensuring high availability, tunable consistency, and low latency. Developers just provide their data problem rather than a database solution!

Latency

Latency Storage Cache Servers

Netflix Cloud Packaging in the Terabyte Era

The Netflix TechBlog

SEPTEMBER 24, 2021

Figure 1: A Simplified Video Processing Pipeline. With this architecture, chunk encoding is very efficient and processed in distributed cloud computing instances. From chunk encoding to assembly and packaging, the result of each previous processing step must be uploaded to cloud storage and then downloaded by the next processing step.

Cloud

Cloud Media Storage Cache

AWS serverless services: Exploring your options

Dynatrace

OCTOBER 7, 2021

This means you no longer have to provision, scale, and maintain servers to run your applications, databases, and storage systems. Speed is next; serverless solutions are quick to spin up or down as needed, and there are no delays due to limited storage or resource access. AWS offers four serverless offerings for storage.

Serverless

Serverless AWS Lambda Storage

Dynatrace Kubernetes Observability for Persistent Volume Claims

Dynatrace

AUGUST 1, 2022

Interestingly, our partner RedHat reported in 2021 that around 80% of deployed workloads are databases or data caches, storing data in persistent volume claims (PVCs). You quickly realize that it will take ages to fill up the overprovisioned database storage. Two days later, your database runs out of storage in the middle of the night.

Storage

Storage Database Network Metrics

Mastering Disk Space Management with MongoDB® Storage Engines

Scalegrid

MAY 11, 2024

MongoDB offers several storage engines that cater to various use cases. The default storage engine in earlier versions was MMAPv1, which utilized memory-mapped files and collection-level locking. The newer, pluggable storage engine, WiredTiger, addresses this by using prefix compression, document-level locking, and row-based storage.

Storage

Storage Engineering Cache Database

Key Advantages of DBMS for Efficient Data Management

Scalegrid

JANUARY 5, 2024

Enhanced data security, better data integrity, and efficient access to information. Despite initial investment costs, DBMS presents long-term savings and improved efficiency through automated processes, efficient query optimizations, and scalability, contributing to enhanced decision-making and end-user productivity.

Efficiency

Efficiency Storage Database Scalability

Remote Workstations for the Discerning Artists

The Netflix TechBlog

MARCH 8, 2021

They could need a GPU when doing graphics-intensive work or extra large storage to handle file management. Instead, we created a service to take the most popular configurations and cache them. We rely on our internal partner teams to support components installed on the workstation, such as storage and artist tools.

Entertainment

Entertainment Storage Open Source Hardware

What is a Distributed Storage System

Scalegrid

FEBRUARY 8, 2024

A distributed storage system is foundational in today’s data-driven landscape, ensuring data spread over multiple servers is reliable, accessible, and manageable. Understanding distributed storage is imperative as data volumes and the need for robust storage solutions rise.

Storage

Storage Systems Big Data Azure

Introducing Netflix TimeSeries Data Abstraction Layer

The Netflix TechBlog

OCTOBER 8, 2024

Building on these foundational abstractions, we developed the TimeSeries Abstraction — a versatile and scalable solution designed to efficiently store and query large volumes of temporal event data with low millisecond latencies, all in a cost-effective manner across various use cases. Let’s dive into the various aspects of this abstraction.

Latency

Latency Storage Traffic Tuning

What is a data lakehouse? Combining data lakes and warehouses for the best of both worlds

Dynatrace

OCTOBER 4, 2022

While data lakes and data warehousing architectures are commonly used modes for storing and analyzing data, a data lakehouse is an efficient third way to store and analyze data that unifies the two architectures while preserving the benefits of both. Unlike data warehouses, however, data is not transformed before landing in storage.

Artificial Intelligence

Artificial Intelligence Storage Analytics Government

Building an elastic query engine on disaggregated storage

The Morning Paper

MARCH 8, 2020

Building an elastic query engine on disaggregated storage, Vuppalapati, NSDI’20. Snowflake is a data warehouse designed to overcome these limitations, and the fundamental mechanism by which it achieves this is the decoupling (disaggregation) of compute and storage. joins) during query processing. Disaggregation (or not).

Storage

Storage Engineering Cache Serverless

Implementing AWS well-architected pillars with automated workflows

Dynatrace

SEPTEMBER 13, 2023

This is a set of best practices and guidelines that help you design and operate reliable, secure, efficient, cost-effective, and sustainable systems in the cloud. The framework comprises six pillars: Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimization, and Sustainability.

AWS

AWS Efficiency Azure Cloud

Elasticsearch Indexing Strategy in Asset Management Platform (AMP)

The Netflix TechBlog

MARCH 10, 2023

are stored in secure storage layers. Amsterdam is built on top of three storage layers. To avoid the ES query for the list of indices for every indexing request, we keep the list of indices in a distributed cache. It is also responsible for asset discovery, validation, sharing, and for triggering workflows.

Strategy

Strategy Cache Storage Analytics

Pushy to the Limit: Evolving Netflix’s WebSocket proxy for the future

The Netflix TechBlog

SEPTEMBER 10, 2024

With these clear benefits, we continued to build out this functionality for more devices, enabling the same efficiency wins. It was very efficient, but it had a set job size, requiring manual intervention if we wanted to horizontally scale it, and it required manual intervention when rolling out a new version.

Latency

Latency Cache Tuning Efficiency

Redis vs Memcached in 2024

Scalegrid

MARCH 28, 2024

Key Takeaways: Redis offers complex data structures and additional features for versatile data handling, while Memcached excels in simplicity with a fast, multi-threaded architecture for basic caching needs. Redis is better suited for complex data models, and Memcached is better suited for high-throughput, string-based caching scenarios.

Cache

Cache Storage Scalability Architecture

PostgreSQL Indexes Can Hurt You: Negative Effects and the Costs Involved

Percona

APRIL 24, 2023

The more indexes there are, the more memory is required for effective caching. Indexes need more cache than tables: due to random writes and reads, indexes need more pages to be in the cache. Cache requirements for indexes are generally much higher than for the associated tables.

Tuning

Tuning Cache Storage Database

Unlocking the Secrets of TOAST: How To Optimize Large Column Storage in PostgreSQL for Top Performance and Scalability

Percona

FEBRUARY 1, 2023

This post will look at using The Oversized-Attribute Storage Technique (TOAST) to improve performance and scalability. Therefore, TOAST is a storage technique used in PostgreSQL to handle large data objects such as images, videos, and audio files.

Storage

Storage Scalability Strategy Performance

How Bloom Filters Work in MyRocks

Percona

FEBRUARY 15, 2023

A bloom filter is a space-efficient way of storing information about a list of keys. For good performance, the filter blocks are cached in the RocksDB block cache and normally stay there since they are accessed frequently. Bloom filter is an important feature improving performance of LSM storage engines.
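To make the "space-efficient way of storing information about a list of keys" concrete, here is a minimal Bloom filter sketch in Python. It is not the RocksDB or MyRocks implementation: the `BloomFilter` class, its parameters, and the SHA-256-based double hashing are assumptions chosen for illustration. The property it demonstrates is the standard one: a lookup may return a false positive but never a false negative.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Illustrative Bloom filter over byte-string keys (not RocksDB's format)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # packed bit array

    def _positions(self, key: bytes) -> Iterator[int]:
        # Derive num_hashes bit positions from two 64-bit halves of one digest
        # (double hashing); forcing h2 odd avoids a zero stride.
        digest = hashlib.sha256(key).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))
```

In an LSM engine, a negative answer lets a read skip a data file entirely, which is why keeping the filter blocks cached pays off.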

Storage

Storage Tuning Cache Engineering

Five Data-Loading Patterns To Improve Frontend Performance

Smashing Magazine

SEPTEMBER 28, 2022

Active Memory Caching. When you want to get data that you already had quickly, you need to do caching — caching stores data that a user recently retrieved. Caching partially stores your data and is not used as permanent storage.

Cache

Cache Performance Servers Architecture

Crucial Redis Monitoring Metrics You Must Watch

Scalegrid

JANUARY 25, 2024

Effective management of memory stores with policies like LRU/LFU, proactive monitoring of the replication process, and advanced metrics such as cache hit ratio and persistence indicators are crucial for ensuring data integrity and optimizing Redis’s performance. Cache Hit Ratio: The cache hit ratio represents the efficiency of cache usage.
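The cache hit ratio is conventionally computed as hits divided by total lookups. The sketch below assumes the counters come from the `keyspace_hits` and `keyspace_misses` fields that Redis reports in its `INFO stats` section; the function name `cache_hit_ratio` and the plain-dictionary input are illustrative choices, not part of any Redis client API.

```python
from typing import Mapping, Optional


def cache_hit_ratio(stats: Mapping[str, int]) -> Optional[float]:
    """Return hits / (hits + misses), or None when no lookups were recorded."""
    hits = int(stats.get("keyspace_hits", 0))
    misses = int(stats.get("keyspace_misses", 0))
    total = hits + misses
    if total == 0:
        return None  # avoid dividing by zero on a freshly started instance
    return hits / total
```

Because these counters are cumulative since server start, a monitoring job would typically compute the ratio over the difference between two samples rather than over the raw totals.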

Metrics

Metrics Monitoring Latency Cache

MySQL Key Performance Indicators (KPI) With PMM

Percona

JUNE 22, 2023

We will also discuss related configuration variables to consider that can impact these KPIs, helping you gain a comprehensive understanding of your MySQL server’s performance and efficiency. Query performance: Query performance is a key performance indicator (KPI) in MySQL, as it measures the efficiency and speed of query execution.

Performance

Performance Monitoring Traffic Database

File systems unfit as distributed storage backends: lessons from ten years of Ceph evolution

The Morning Paper

NOVEMBER 5, 2019

File systems unfit as distributed storage backends: lessons from 10 years of Ceph evolution, Aghayev et al., SOSP’19. In this case, the assumption that a distributed storage backend should clearly be layered on top of a local file system. What is a distributed storage backend? This is not surprising in hindsight.

Storage

Storage Systems Hardware Efficiency

Redis® Monitoring Strategies for 2025

Scalegrid

JANUARY 21, 2025

To monitor Redis instances effectively, collect Redis metrics focusing on cache hit ratio, memory allocated, and latency threshold. This ensures each Redis instance optimally uses the in-memory data store and aligns with the operating system’s efficiency.

Strategy

Strategy Monitoring Latency DevOps

Redis® Monitoring Strategies for 2024

Scalegrid

DECEMBER 21, 2023

To monitor Redis® instances effectively, collect Redis metrics focusing on cache hit ratio, memory allocated, and latency threshold. This ensures each Redis® instance optimally uses the in-memory data store and aligns with the operating system’s efficiency.

Strategy

Strategy Monitoring Latency DevOps

Cloudburst: stateful functions-as-a-service

The Morning Paper

FEBRUARY 6, 2020

Last week we looked at a function shipping solution to the problem; Cloudburst uses the more common data shipping to bring data to caches next to function runtimes (though you could also make a case that the scheduling algorithm placing function execution in locations where the data is cached is a flavour of function-shipping too).

Serverless

Serverless Lambda Cache Latency

Hierarchical Navigation and Faceted Search on Top of Oracle Coherence

Highly Scalable

APRIL 2, 2012

Categories can contain thousands of products, and users cannot efficiently search through this array without powerful tools. The rationale behind these methods is that the frontend should be able to fetch transient information very efficiently and separately from fetching heavy-weight domain entities, because this information cannot be cached.

Ecommerce

Ecommerce Cache Storage Architecture

Distributed Algorithms in NoSQL Databases

Highly Scalable

SEPTEMBER 18, 2012

These developments gradually highlight a system of relevant database building blocks with proven practical efficiency. We use the term ReadOne-WriteOne for the combination of techniques A, B, C and D – none of them provides strict consistency guarantees, but they are efficient enough to be used in practice as a self-contained approach. (E,

Database

Database Latency C++ Scalability

Connect MySQL™ to Power BI – Step by Step

Scalegrid

JUNE 14, 2024

Benefits of Power BI: The advantages of Power BI are manifold, from its intuitive interface to its ability to handle large datasets efficiently. By utilizing query folding effectively, you can ensure that Power BI sends optimized queries to MySQL, which can take advantage of the database’s indexing, caching, and processing capabilities.

Database

Database Azure Efficiency Servers

Amazon DynamoDB - a Fast and Scalable NoSQL Database.

All Things Distributed

JANUARY 18, 2012

In response, we began to develop a collection of storage and database technologies to address the demanding scalability and reliability requirements of the Amazon.com ecommerce platform. s pricing is simple and predictable: Storage is $1 per GB per month. The growth of Amazon’s Domain scaling limitations. Amazon DynamoDB’s

Scalability

Scalability Database Ecommerce Latency

20X Faster Backup Preparation With Percona XtraBackup 8.0.33-28!

Percona

JULY 25, 2023

After the “data dictionary” (DD) engine and DD cache are initialized on a server, the Storage Engines can ask for a table definition. Initializing a DD engine and the cache adds complexity and other server dependencies. Essentially LRU cache is disabled by loading the tables as non-evictable. ibd > t1.sdi

Cache

Cache Servers Benchmarking Design

AppFabric Caching: Retry Later

ScaleOut Software

MAY 15, 2014

For example, the IMDG must be able to efficiently create millions of objects in each server to make use of its huge storage capacity. During load-balancing, the client gets the following exception when accessing the cache: ErrorCode<ERRCA0017>:SubStatus<ES0006>:There is a temporary failure. Please retry later.

Cache

Cache Servers Network Design

A one size fits all database doesn't fit anyone

All Things Distributed

JUNE 21, 2018

This consistent performance is a big part of why the Snapchat Stories feature, which includes Snapchat's largest storage write workload, moved to DynamoDB. Tinder is one example of a customer that is using the flexible schema model of DynamoDB to achieve developer efficiency.

Database

Database AWS Games Latency

Driving Compute Cost Down for AWS Customers - All Things.

All Things Distributed

MARCH 5, 2012

Amazon ElastiCache customers will see their prices drop by up to 10%, depending on their cache node types. We continuously apply all our innovative skills to the design of datacenters, servers, storage, network, etc. to drive new efficiencies and higher reliability.

AWS

AWS Innovation Scalability Cache

PostgreSQL Performance Tuning: Optimizing Database Parameters for Maximum Efficiency

Percona

MAY 1, 2023

PostgreSQL performance optimization aims to improve the efficiency of a PostgreSQL database system by adjusting configurations and implementing best practices to identify and resolve bottlenecks, improve query speed, and maximize database throughput and responsiveness. What is PostgreSQL performance tuning?

Tuning

Tuning Database Efficiency Performance

Accelerating Data: Faster and More Scalable ElastiCache for Redis

All Things Distributed

OCTOBER 12, 2016

While caching continues to be a dominant use of ElastiCache for Redis, we see customers increasingly use it as an in-memory NoSQL database. ElastiCache for Redis Multi-AZ capability is built to handle any failover case for Redis Cluster with robustness and efficiency. This does not happen with ElastiCache for Redis.

Scalability

Scalability Cache Analytics AWS

Procella: unifying serving and analytical data at YouTube

The Morning Paper

SEPTEMBER 10, 2019

When each of those use cases is powered by a dedicated back-end, investments in better performance, improved scalability and efficiency etc. Cache all the things. Procella achieves high scalability and efficiency by segregating storage (in Colossus) from compute (on Borg). are divided. Making Procella fast.

Analytics

Analytics Latency Cache Google

MICRO 2019 Trip Report

ACM Sigarch

NOVEMBER 4, 2019

Using genomics and deep learning as two motivating examples, he highlighted the importance of algorithm-architecture co-design to improve end-to-end efficiency. The best paper runner-up was “Dynamic Multi-Resolution Data Storage”. Albonesi.

Hardware

Hardware Architecture Programming Innovation

The Impacts of Fragmentation in MySQL

Percona

JULY 5, 2023

The principle of locality: While the principle of locality is usually related to processors and cache access patterns, it also applies to data access in general. If your data is organized following the principle of locality, data access will be more efficient. This type of fragmentation can improve performance and efficiency.

Open Source

Open Source Database Efficiency Testing

Optimizing Next.js Applications With Nx

Smashing Magazine

OCTOBER 26, 2021

Nx is an open-source build framework that helps you architect, test, and build at any scale — integrating seamlessly with modern technologies and libraries, while providing a robust command-line interface (CLI), caching, and dependency management. Nx uses distributed graph-based task execution and computation caching to speed up tasks.

Cache

Cache Servers Code Testing

A MyRocks Use Case

Percona

DECEMBER 23, 2022

I wrote this post on MyRocks because I believe it is the most interesting new MySQL storage engine to have appeared over the last few years. Although MyRocks is very efficient for writes, I chose a more generic workload that will provide a different MyRocks use case. RocksDB is a very efficient key-value store based on LSM trees.

Storage

Storage Cache C++ Virtualization
