After selecting a mode, users can interact with the APIs without worrying about the underlying storage mechanisms and counting methods. Best Effort Regional Counter: this type of counter is powered by EVCache, Netflix's distributed caching solution built on the widely popular Memcached.
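To make the idea concrete, here is a minimal sketch of a best-effort regional counter, assuming a plain Memcached node stands in for EVCache (which is Netflix-internal) and using the pymemcache client; the key name and setup are illustrative, not the actual Netflix implementation.

```python
# A minimal sketch: a best-effort regional counter on plain Memcached.
from pymemcache.client.base import Client

client = Client(("localhost", 11211))

def best_effort_increment(counter: str, delta: int = 1) -> int:
    """Atomically bump a counter on the regional cache node."""
    new_value = client.incr(counter, delta)
    if new_value is None:
        # Key is absent: initialize it. A race here can lose an update,
        # which is acceptable for a *best effort* counter.
        client.set(counter, str(delta))
        return delta
    return new_value

print(best_effort_increment("video_plays"))
```

The increment itself is atomic on the server, but a regional cache offers no cross-region consistency, which is exactly why this mode is labeled "best effort".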
Caching is the process of storing frequently accessed data or resources in a temporary storage location, such as memory or disk, to improve retrieval speed and reduce the need for repetitive processing.
We introduce a caching mechanism in the API gateway layer, allowing us to offload processing from singleton leader-elected controllers without giving up the strict data consistency and guarantees clients observe. Titus Job Coordinator is a leader-elected process managing the active state of the system.
There are two major processes that get executed when a user posts a photo on Instagram. First, the synchronous process, which is responsible for uploading the image content to file storage, persisting the media metadata in the graph data store, returning a confirmation message to the user, and triggering the process to update the user activity.
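The synchronous path can be sketched as a short sequence of steps. This is purely illustrative: the storage backends below are hypothetical in-memory stand-ins, not Instagram's actual systems.

```python
# An illustrative sketch of the synchronous photo-post path described above.
import uuid

file_storage = {}      # stand-in for blob/file storage
media_metadata = {}    # stand-in for the graph data store
activity_queue = []    # stand-in for the async activity pipeline

def post_photo(user_id: str, image_bytes: bytes, caption: str) -> dict:
    # 1. Upload the image content to file storage.
    media_id = str(uuid.uuid4())
    file_storage[media_id] = image_bytes
    # 2. Persist the media metadata in the graph data store.
    media_metadata[media_id] = {"owner": user_id, "caption": caption}
    # 3. Trigger the asynchronous user-activity update.
    activity_queue.append({"user": user_id, "media": media_id})
    # 4. Return the confirmation to the user.
    return {"status": "ok", "media_id": media_id}

print(post_photo("u42", b"...jpeg bytes...", "sunset"))
```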
This means you no longer have to provision, scale, and maintain servers to run your applications, databases, and storage systems. Speed is next; serverless solutions are quick to spin up or down as needed, and there are no delays due to limited storage or resource access. AWS offers four serverless offerings for storage.
Our goal was to build a versatile and efficient data storage solution that could handle a wide variety of use cases, ranging from the simplest hashmaps to more complex data structures, all while ensuring high availability, tunable consistency, and low latency. Developers just provide their data problem rather than a database solution!
By Xiaomei Liu, Rosanna Lee, Cyril Concolato. Behind the scenes of the beloved Netflix streaming service and content, there are many technology innovations in media processing. Packaging has always been an important step in media processing. Uploading and downloading data always come with a penalty, namely latency.
MongoDB offers several storage engines that cater to various use cases. The default storage engine in earlier versions was MMAPv1, which utilized memory-mapped files and collection-level locking. The newer, pluggable storage engine, WiredTiger, addresses those limitations by using prefix compression, document-level locking, and row-based storage.
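If you want to confirm which engine a deployment is running, a quick check is possible from any driver. A minimal sketch with pymongo, assuming a local mongod; the connection string is illustrative:

```python
# Query serverStatus to see the active storage engine.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
status = client.admin.command("serverStatus")
print(status["storageEngine"]["name"])  # "wiredTiger" on modern versions
```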
A distributed storage system is foundational in today’s data-driven landscape, ensuring data spread over multiple servers is reliable, accessible, and manageable. Understanding distributed storage is imperative as data volumes and the need for robust storage solutions rise.
Data warehouses offer a single storage repository for structured data and provide a source of truth for organizations. However, organizations must structure and store data inputs in a specific format to enable extract, transform, and load processes, and to efficiently query this data.
Of the organizations in the Kubernetes survey, 71% run databases and caches in Kubernetes, representing a +48% year-over-year increase. Together with messaging systems (+36% growth), organizations are increasingly using databases and caches to persist application workload states.
Building an elastic query engine on disaggregated storage, Vuppalapati et al., NSDI'20. Snowflake is a data warehouse designed to overcome the limitations of shared-nothing architectures, and the fundamental mechanism by which it achieves this is the decoupling (disaggregation) of compute and storage.
The shortcomings of batch-oriented data processing were widely recognized by the Big Data community quite a long time ago. It became clear that real-time query processing and in-stream processing are the immediate need in many practical applications.
A shared characteristic of most (if not all) databases, be they traditional relational databases like Oracle, MySQL, and PostgreSQL or NoSQL-style databases like MongoDB, is the use of a caching mechanism to keep (a copy of) part of the data in memory. How do you know if your MySQL database caching is operating efficiently?
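One common way to answer that question for InnoDB is to compare logical read requests against reads that had to touch disk. A hedged sketch, assuming the mysql-connector-python driver and illustrative credentials:

```python
# Gauge InnoDB buffer pool efficiency from server status counters.
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="root", password="secret")
cur = conn.cursor()
cur.execute("SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%'")
status = {name: int(value) for name, value in cur.fetchall()}

# Requests served from memory vs. those that had to go to disk.
requests = status["Innodb_buffer_pool_read_requests"]
disk_reads = status["Innodb_buffer_pool_reads"]
hit_ratio = 1 - disk_reads / requests
print(f"buffer pool hit ratio: {hit_ratio:.4f}")
```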
Flexible Storage: The service is designed to integrate with various storage backends, including Apache Cassandra and Elasticsearch, allowing Netflix to customize storage solutions based on specific use case requirements. Note: with Cassandra 4.x
By Rachel Kelley (AWS), Ranjit Raju (AWS), Peter Cioni (Netflix), Alex Schworer (Netflix), and Mac Moore (Conductor Tech.). Rendering is core to the VFX process: VFX studios around the world create amazing imagery for Netflix productions, and rendering on AWS provides the flexibility to control how quickly a project is completed.
Assets are stored in secure storage layers; Amsterdam is built on top of three storage layers. To avoid the ES query for the list of indices on every indexing request, we keep the list of indices in a distributed cache. We also utilize Kafka to process these assets asynchronously without impacting our real-time traffic.
Replays provide on-demand data about where conversion processes aren't working. Are customers losing interest? By analyzing sessions of new employees interacting with key tools, teams can provide detailed instructions that help streamline the onboarding process and get staff up to speed faster.
This post will look at using the Oversized-Attribute Storage Technique (TOAST) to improve performance and scalability. This process is done automatically and does not significantly impact how the database is used. Columns of TOAST-able types (e.g., text, bytea) can be assigned a "strategy", one of the four TOAST storage strategies (PLAIN, EXTENDED, EXTERNAL, MAIN).
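As a minimal sketch of what changing and inspecting a strategy looks like, assuming psycopg2 and a hypothetical table "documents" with a text column "body":

```python
# Change a column's TOAST strategy and inspect the result.
import psycopg2

conn = psycopg2.connect("dbname=test user=postgres")
cur = conn.cursor()

# Force out-of-line, uncompressed storage for a large text column.
cur.execute("ALTER TABLE documents ALTER COLUMN body SET STORAGE EXTERNAL")

# attstorage codes: p=PLAIN, e=EXTERNAL, x=EXTENDED, m=MAIN
cur.execute("""
    SELECT attname, attstorage FROM pg_attribute
    WHERE attrelid = 'documents'::regclass AND attnum > 0
""")
print(cur.fetchall())
conn.commit()
```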
Key takeaways: Redis offers complex data structures and additional features for versatile data handling, while Memcached excels in simplicity with a fast, multi-threaded architecture for basic caching needs. Redis is better suited for complex data models; Memcached is better suited for high-throughput, string-based caching scenarios.
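The contrast is easy to see side by side. A hedged sketch, assuming local Redis and Memcached instances and the redis and pymemcache client libraries:

```python
# Redis's rich structures vs. Memcached's simple key/value strings.
import redis
from pymemcache.client.base import Client as MemcacheClient

r = redis.Redis(host="localhost", port=6379)
mc = MemcacheClient(("localhost", 11211))

# Redis: server-side data structures, e.g. a sorted-set leaderboard.
r.zadd("leaderboard", {"alice": 120, "bob": 95})
print(r.zrevrange("leaderboard", 0, 1, withscores=True))

# Memcached: fast, plain get/set of opaque values.
mc.set("session:42", "opaque-blob")
print(mc.get("session:42"))
```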
The voice service then constructs a message for the device and places it on the message queue, which is then processed and sent to Pushy to deliver to the device. The previous version of the message processor was a Mantis stream-processing job that processed messages from the message queue.
This process enables you to continuously evaluate software against predefined quality criteria and service level objectives (SLOs) in pre-production environments. Storing frequently accessed data in faster storage, usually an in-memory cache, improves data retrieval speed and overall system performance.
Traffic duplication and correlation: the first step is to implement a mechanism that clones and forks production traffic to the newly established pathway, along with a process to record and correlate responses from the original and alternative routes.
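A toy sketch of the mechanism, assuming the requests library and hypothetical primary/shadow URLs (none of this comes from the source article):

```python
# Mirror a request to a shadow route and flag divergent responses.
import requests

PRIMARY = "https://api.example.com/v1/checkout"
SHADOW = "https://api-canary.example.com/v1/checkout"

def mirrored_post(payload: dict) -> requests.Response:
    primary_resp = requests.post(PRIMARY, json=payload, timeout=5)
    try:
        shadow_resp = requests.post(SHADOW, json=payload, timeout=5)
        if shadow_resp.status_code != primary_resp.status_code:
            print("divergence:", primary_resp.status_code, shadow_resp.status_code)
    except requests.RequestException:
        pass  # the shadow path must never affect the caller
    return primary_resp  # only the primary response is returned

```

In a real deployment the shadow call would be issued asynchronously (or at the proxy layer) so it adds no latency to the primary path.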
This information is gathered from remote, often inaccessible points within your ecosystem and processed by some sort of tool or equipment. The data is incredibly plentiful and difficult to store over long periods due to capacity limitations — a reason why private and public cloud storage services have been a boon to DevOps teams.
Dependency agent installation – maps connections between servers and processes. AI engine, Davis – automatically processes billions of dependencies to serve up precise answers; rather than processing simple time-series data, Davis uses high-fidelity metrics, traces, logs, and real user data that are mapped to a unified entity.
Every unnecessary bit of JavaScript code you bundle and serve is more code the client has to load and process. Active memory caching: when you want to quickly get data that you already fetched, caching stores data that a user recently retrieved.
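The article's context is JavaScript, but the memoization idea is language-agnostic; here is a minimal sketch of active memory caching using Python's functools.lru_cache, with a hypothetical lookup function:

```python
# Memoization: repeated calls with the same argument hit the in-memory cache.
from functools import lru_cache

@lru_cache(maxsize=256)
def fetch_profile(user_id: int) -> dict:
    print(f"expensive lookup for {user_id}")  # runs only on a cache miss
    return {"id": user_id, "name": f"user-{user_id}"}

fetch_profile(1)  # miss: performs the lookup
fetch_profile(1)  # hit: served from memory
```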
But since retrieving data from disk is slow, databases tend to work with a caching mechanism to keep as much hot data (the bits and pieces that are most often accessed) as possible in memory. In MySQL, with the standard storage engine InnoDB, the data cache is called the Buffer Pool; in PostgreSQL, it is called shared buffers.
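Both caches are easy to inspect. A small sketch reading the configured sizes on each system, assuming local servers and illustrative credentials:

```python
# Read the configured data-cache size on MySQL and PostgreSQL.
import mysql.connector
import psycopg2

myconn = mysql.connector.connect(host="localhost", user="root", password="secret")
mycur = myconn.cursor()
mycur.execute("SHOW VARIABLES LIKE 'innodb_buffer_pool_size'")
print(mycur.fetchone())   # value in bytes

pgconn = psycopg2.connect("dbname=test user=postgres")
pgcur = pgconn.cursor()
pgcur.execute("SHOW shared_buffers")
print(pgcur.fetchone())   # e.g. ('128MB',)
```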
As datasets continue to grow in size, the amount of RAM required to store and process them also increases. By caching hot datasets, indexes, and ongoing changes, InnoDB can provide faster response times and utilize disk IO in a much more optimal way. Related tunables include oom_score_adj (e.g., set to -800) and innodb_redo_log_capacity.
Journal: a process where every write operation gets written (appended) from memory to a journal file, a.k.a. the transaction log, that exists on disk, at a specific interval configured using "journalCommitIntervalMs". The same data, in the form of pages inside the WiredTiger cache, are also marked dirty and later flushed to the data files (collection-*.wt and index-*.wt).
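The commit interval governs how often the journal is flushed; a client can also demand journal acknowledgment per write. A minimal sketch with pymongo, assuming a local mongod and a hypothetical "events" collection:

```python
# j=True: the write is acknowledged only once it is in the on-disk journal.
from pymongo import MongoClient
from pymongo.write_concern import WriteConcern

client = MongoClient("mongodb://localhost:27017")
db = client.test
journaled = db.get_collection("events", write_concern=WriteConcern(j=True))
journaled.insert_one({"type": "page_view"})
```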
The Solution: Distributed Caching. The solution to this challenge is to use scalable, memory-based data storage for fast-changing data so that web sites can keep up with exploding workloads. It’s not enough simply to lash together a set of servers hosting a collection of in-memory caches.
Despite initial investment costs, a DBMS delivers long-term savings and improved efficiency through automated processes, efficient query optimization, and scalability, contributing to enhanced decision-making and end-user productivity. It provides tools for organizing and retrieving data efficiently.
Effective management of memory stores with policies like LRU/LFU, proactive monitoring of the replication process, and advanced metrics such as cache hit ratio and persistence indicators are crucial for ensuring data integrity and optimizing Redis's performance. If the cache hit ratio is lower than ~0.8, a significant share of requests is falling through to slower backing storage, which usually points to undersized memory or an ineffective eviction policy.
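A short sketch of both ideas with redis-py, assuming a local instance where CONFIG SET is permitted; the 0.8 threshold follows the excerpt above:

```python
# Compute the cache hit ratio and react with an eviction-policy change.
import redis

r = redis.Redis(host="localhost", port=6379)

stats = r.info("stats")
hits, misses = stats["keyspace_hits"], stats["keyspace_misses"]
ratio = hits / (hits + misses) if hits + misses else 1.0
print(f"cache hit ratio: {ratio:.2f}")

if ratio < 0.8:
    # Evict the least frequently used keys across the whole keyspace.
    r.config_set("maxmemory-policy", "allkeys-lfu")
```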
Last week we looked at a function shipping solution to the problem; Cloudburst uses the more common data shipping to bring data to caches next to function runtimes (though you could also make a case that the scheduling algorithm placing function execution in locations where the data is cached is a flavour of function-shipping too).
The process of clearing memory when objects are no longer being used is referred to as garbage collection. When the JavaScript engine runs a garbage-collection process, the man object will be removed from memory and from the WeakMap that we assigned it to. With caching, a copy of the result from a request is saved locally.
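The excerpt describes a JavaScript WeakMap; a hedged Python analog is weakref.WeakKeyDictionary, which likewise drops its entry once the key object is collected. The Man class here is a stand-in for the text's `man` object:

```python
# Weak-keyed cache entries vanish with their key objects.
import weakref

class Man:
    pass

cache = weakref.WeakKeyDictionary()
man = Man()
cache[man] = {"visits": 3}
print(len(cache))   # 1

del man             # drop the last strong reference
print(len(cache))   # 0: the entry disappeared with the object
```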
PostgreSQL & Elastic for data storage, Redis for caching, and NGINX as an API gateway. In order to filter these dashboards by tenant, all they need is a Dynatrace Management Zone, which is based on metadata extracted by Dynatrace from e.g. k8s tags, container labels, or process environment variables.
Extending relational query processing with ML inference, Karanasos et al., CIDR'20. Based on interactions with enterprise customers, we expect that storage and inference of ML models will be subject to the same scrutiny and performance requirements as sensitive/mission-critical operational data.
File systems unfit as distributed storage backends: lessons from 10 years of Ceph evolution, Aghayev et al., SOSP'19. What is a distributed storage backend? In this case, the questioned assumption is that a distributed storage backend should be layered on top of a local file system; in hindsight, it is not surprising that this assumption broke down.
To monitor Redis instances effectively, collect Redis metrics focusing on cache hit ratio, memory allocated, and latency thresholds. Advanced monitoring techniques let you identify potential issues, such as high latency, elevated CPU utilization, falling command throughput, or a dropping cache hit rate, before they become major problems.
Service workers enable offline usage of a PWA by fetching cached data or informing the user about the absence of an Internet connection. When developing a PWA, you can cache the application shell's resources and assets in the browser, serve content cache-first (then network), and keep cached content in IndexedDB.
The rationale behind these methods is that the frontend should be able to fetch transient information very efficiently and separately from the fetching of heavyweight domain entities, because this information cannot be cached. So the only way was to cache all the necessary data to minimize interaction with the RDBMS.
We use high-performance transaction systems, complex rendering and object caching, workflow and queuing systems, business intelligence and data analytics, machine learning and pattern recognition, neural networks and probabilistic decision making, and a wide variety of other techniques.
This includes metrics such as query execution time, the number of queries executed per second, and the utilization of the query cache and adaptive hash index. Query cache: disable it (query_cache_size = 0, query_cache_type = OFF). innodb_adaptive_hash_index: check adaptive hash index usage to determine its efficiency.