Teams often consider external caches when the existing database cannot meet the required service-level agreement (SLA). This is a clear performance-oriented decision. However, external caches are not as simple as they are often made out to be.
By Rajiv Shringi, Oleksii Tkachuk, and Kartik Sathyanarayanan. In our previous blog post, we introduced Netflix’s TimeSeries Abstraction, a distributed service designed to store and query large volumes of temporal event data with low millisecond latencies. Today, we’re excited to present the Distributed Counter Abstraction.
The original assumptions and architectural choices were no longer viable. We introduce a caching mechanism in the API gateway layer, allowing us to offload processing from singleton, leader-elected controllers without giving up the strict data-consistency guarantees that clients observe. How do I know that my cache is up to date?
Caching is a critical technique for optimizing application performance by temporarily storing frequently accessed data, allowing for faster retrieval during subsequent requests. Multi-layered caching involves using multiple levels of cache to store and retrieve data.
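To make the multi-layered idea concrete, here is a minimal sketch (the class and tier names are illustrative assumptions, not any particular library's API): a fast in-process L1 map sits in front of a slower shared L2 tier, and hits in L2 are promoted into L1.

    import java.util.Map;
    import java.util.Optional;
    import java.util.concurrent.ConcurrentHashMap;

    // Minimal multi-layer cache sketch: a per-instance L1 in front of a
    // shared L2. Both tiers are plain maps here; in practice L2 would be
    // a remote tier such as Redis or Memcached.
    class TwoLevelCache<K, V> {
        private final Map<K, V> l1 = new ConcurrentHashMap<>(); // local, fastest
        private final Map<K, V> l2 = new ConcurrentHashMap<>(); // stand-in for a shared tier

        Optional<V> get(K key) {
            V v = l1.get(key);                 // 1) check the local tier first
            if (v == null) {
                v = l2.get(key);               // 2) fall back to the shared tier
                if (v != null) l1.put(key, v); // 3) promote the hit into L1
            }
            return Optional.ofNullable(v);
        }

        void put(K key, V value) {             // write through both tiers
            l2.put(key, value);
            l1.put(key, value);
        }
    }

A real implementation would also bound each tier's size and give L1 a short TTL so stale entries age out between instances.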
“Latency” is the duration between the execution of a load instruction (to an address that misses in all the caches) and the completion of that load instruction when the data is returned from memory. The example below is for a 2005-era processor with 60 ns memory latency and 6.4
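As a back-of-the-envelope aside (the excerpt truncates at “6.4”, so assume it refers to 6.4 GB/s of peak memory bandwidth), Little’s Law gives the amount of data that must be in flight to fully hide that latency:

\[ \text{in-flight data} = \text{latency} \times \text{bandwidth} = 60\,\mathrm{ns} \times 6.4\,\mathrm{GB/s} = 384\ \mathrm{bytes} = 6 \times 64\,\mathrm{B}\ \text{cache lines} \]

In other words, roughly six cache-line misses must be outstanding at all times to saturate memory bandwidth on such a machine.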
This scenario underscored the need for a new recommender system architecture where member preference learning is centralized, enhancing accessibility and utility across different models. Yet, many are confined to a brief temporal window due to constraints in serving latency or training costs.
Caches are very useful software components that all engineers must know about. A cache is a transversal component that applies to all tech areas and architecture layers, such as operating systems, data platforms, backend, frontend, and other components. What Is a Cache?
These include challenges with tail latency and idempotency, managing “wide” partitions with many rows, handling single large “fat” columns, and slow response pagination. Data Model At its core, the KV abstraction is built around a two-level map architecture. Developers just provide their data problem rather than a database solution!
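A minimal sketch of that two-level map (names and types are illustrative assumptions; the post does not show its API): the outer level keys a record, and the sorted inner level keys items within it, which is what makes bounded range reads and pagination over “wide” partitions natural.

    import java.util.Map;
    import java.util.NavigableMap;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentSkipListMap;

    // Two-level map sketch: record id -> (sorted item key -> value).
    // The sorted inner map makes paginated range reads over a wide record cheap.
    class TwoLevelStore {
        private final Map<String, NavigableMap<String, byte[]>> records = new ConcurrentHashMap<>();

        void put(String id, String itemKey, byte[] value) {
            records.computeIfAbsent(id, k -> new ConcurrentSkipListMap<>()).put(itemKey, value);
        }

        // Range scan within one record: the basis for bounded, paginated reads.
        NavigableMap<String, byte[]> scan(String id, String fromKey, String toKey) {
            NavigableMap<String, byte[]> items =
                records.getOrDefault(id, new ConcurrentSkipListMap<>());
            return items.subMap(fromKey, true, toKey, false);
        }
    }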
Because microprocessors are so fast, computer architecture design has evolved towards adding various levels of caching between compute units and the main memory, in order to hide the latency of bringing the bits to the brains. This avoids thrashing caches too much for B and evens out the pressure on the L3 caches of the machine.
Table 1: Movie and File Size Examples Initial Architecture A simplified view of our initial cloud video processing pipeline is illustrated in the following diagram. Figure 1: A Simplified Video Processing Pipeline With this architecture, chunk encoding is very efficient and processed in distributed cloud computing instances.
Retrieval-augmented generation emerges as the standard architecture for LLM-based applications Given that LLMs can generate factually incorrect or nonsensical responses, retrieval-augmented generation (RAG) has emerged as an industry standard for building GenAI applications.
The new Amazon capability enables customers to improve the startup latency of their functions from several seconds to as low as sub-second (up to 10 times faster) at P99 (the 99th latency percentile). This can cause latency outliers and may lead to a poor end-user experience for latency-sensitive applications.
When undertaking system migrations, one of the main challenges is establishing confidence and seamlessly transitioning the traffic to the upgraded architecture without adversely impacting the customer experience. It provides a good read on the availability and latency ranges under different production conditions.
Architecture. When a user requests a feed, two parallel threads are involved in fetching the feed to optimize for latency. We will use a cache with an LRU-based eviction policy for caching the feeds of active users (a minimal sketch follows below). Sending and receiving messages from other users. High Level Design. Optimization.
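As a minimal sketch of that LRU eviction policy (names are illustrative; the post does not show code), Java's LinkedHashMap in access order makes this nearly a one-liner:

    import java.util.LinkedHashMap;
    import java.util.List;
    import java.util.Map;

    // LRU cache sketch: an access-order LinkedHashMap evicts the least
    // recently used entry once capacity is exceeded. Not thread-safe;
    // wrap with Collections.synchronizedMap for concurrent use.
    class LruFeedCache<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;

        LruFeedCache(int capacity) {
            super(16, 0.75f, true); // true = access order, the key to LRU behavior
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity; // evict once we exceed capacity
        }
    }

    // Usage: cache the most recent feeds of active users.
    // LruFeedCache<String, List<String>> feeds = new LruFeedCache<>(10_000);
    // feeds.put("user42", List.of("post1", "post2"));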
Moreover, common database optimizations like caching recently queried data don’t really work for alerting queries because, generally speaking, the last received datapoint is required for correctness. First and foremost, we have successfully alleviated our initial scalability problem with the polling-based architecture. OK, Results?
This allows the app to query a list of “paths” in each HTTP request, and get specially formatted JSON (jsonGraph) that we use to cache the data and hydrate the UI. For us, it means that we now need to have ~15 MDN tabs open when writing routes :) Let’s briefly discuss the architecture of this microservice. It was a Node.js
Because of its scalability and distributed architecture, thousands of companies trust it to run their cloud and hybrid-based workloads at high availability without compromising performance. With the Dynatrace Data Explorer, you can easily analyze metrics, such as client read/write latency by Cassandra nodes and disk space usage by keyspaces.
Rajiv Shringi, Vinay Chella, Kaidan Fullerton, Oleksii Tkachuk, Joey Lynch. As Netflix continues to expand and diversify into various sectors like Video on Demand and Gaming, the ability to ingest and store vast amounts of temporal data — often reaching petabytes — with millisecond access latency has become increasingly vital.
But we cannot search or present low-latency retrievals from files, etc. Marken Architecture: Marken’s architecture diagram is as follows. Using memcache allows us to keep latencies for our search low (most of our queries are less than 100ms).
The Tech Hollow, an OSS technology we released a few years ago, has been best described as a total high-density near cache. Total: the entire dataset is cached on each node; there is no eviction policy, and there are no cache misses. Near: the cache exists in RAM on any instance which requires access to the dataset.
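A minimal sketch of the “total near cache” idea described above (illustrative only; Hollow itself uses compact, versioned in-memory snapshots rather than a plain map): every node holds the whole dataset in RAM and swaps in full refreshes atomically, so reads never miss.

    import java.util.Map;
    import java.util.concurrent.atomic.AtomicReference;

    // Total near cache sketch: the full dataset lives in RAM on every node.
    // Reads are local and can never miss; a background refresh replaces the
    // whole snapshot atomically instead of evicting individual entries.
    class TotalNearCache<K, V> {
        private final AtomicReference<Map<K, V>> snapshot = new AtomicReference<>(Map.of());

        V get(K key) {
            return snapshot.get().get(key); // pure in-RAM read, no remote call
        }

        // Called by a periodic loader with a freshly built dataset.
        void refresh(Map<K, V> fullDataset) {
            snapshot.set(Map.copyOf(fullDataset)); // immutable swap, safe for readers
        }
    }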
While data lakes and data warehousing architectures are commonly used modes for storing and analyzing data, a data lakehouse is an efficient third way to store and analyze data that unifies the two architectures while preserving the benefits of both. Data lakehouses deliver the query response with minimal latency.
Organizations are depending more and more on distributed architectures to provide application services. This trend is prompting advances in both observability and monitoring. For example, when monitoring a database, you’ll want to know about any latency when writing data to a disk, or the average query response time.
Because Google offers its own Google Cloud Architecture Framework and Microsoft its Azure Well-Architected Framework , organizations that use a combination of these platforms triple the challenge of integrating their performance frameworks into a cohesive strategy. SRG validates the status of the resiliency SLOs for the experiment period.
Choosing your database architecture may be the most critical decision you’ll make and has a disproportionate impact on the performance, scalability, and availability of your app. No single database architecture or solution can meet all of Amazon.com’s or our customers’ needs.
RevenueCat extensively uses caching to improve the availability and performance of its product API while ensuring consistency. The company shared its techniques to deliver the platform, which can handle over 1.2 billion daily API requests. The team at RevenueCat created an open-source memcache client that provides several advanced features.
Key Takeaways: Redis offers complex data structures and additional features for versatile data handling, while Memcached excels in simplicity with a fast, multi-threaded architecture for basic caching needs. Caching serves a dual purpose in web development: speeding up client requests and reducing server load.
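To make the contrast concrete, here is a hedged sketch using the Jedis and spymemcached client libraries (an assumption; the excerpt names neither client, and both servers are presumed local on default ports):

    import java.net.InetSocketAddress;
    import net.spy.memcached.MemcachedClient;
    import redis.clients.jedis.Jedis;

    // Contrast sketch: Redis exposes rich data structures; Memcached is a
    // plain key/value store with TTLs.
    public class CacheContrast {
        public static void main(String[] args) throws Exception {
            try (Jedis redis = new Jedis("localhost", 6379)) {
                redis.set("user:42:name", "Ada");          // simple value
                redis.lpush("user:42:events", "login");    // list structure
                redis.zadd("leaderboard", 100, "user:42"); // sorted set
            }

            MemcachedClient memcached =
                new MemcachedClient(new InetSocketAddress("localhost", 11211));
            memcached.set("user:42:name", 3600, "Ada");    // value + TTL, nothing richer
            System.out.println(memcached.get("user:42:name"));
            memcached.shutdown();
        }
    }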
We are standing on the eve of the 5G era… 5G, as a monumental shift in cellular communication technology, holds tremendous potential for spurring innovations across many vertical industries, with its promised multi-Gbps speed, sub-10 ms low latency, and massive connectivity. Throughput and latency. Energy consumption.
Amazon DynamoDB offers low, predictable latencies at any scale. This architectural pattern was a response to the scaling challenges that had challenged Amazon.com through its first 5 years, when direct database access was one of the major bottlenecks in scaling and operating the business. This impacts the predictability of a Domain’s
Last week we looked at a function shipping solution to the problem; Cloudburst uses the more common data shipping to bring data to caches next to function runtimes (though you could also make a case that the scheduling algorithm placing function execution in locations where the data is cached is a flavour of function shipping too).
With its widespread use in modern application architectures, understanding the ins and outs of Redis monitoring is essential for any tech professional. Identifying key Redis metrics such as latency, CPU usage, and memory metrics is crucial for effective Redis monitoring. Redis, a powerful in-memory data store, is no exception.
Choosing a cloud DBMS: architectures and tradeoffs, Tan et al., VLDB’19. As it is infeasible to test every OLAP system runnable on AWS, we chose widely-used systems that represented a variety of architectures and cost models. Query performance is measured from both warm and cold caches.
When designing an architecture, many components need to be considered before deciding on the best solution. In this context, features like filtering, firewalling, or caching are redundant and may consume resources that could be allocated to scaling. MySQL Router is the one that has the higher latency no matter what.
In this blog post, I will explain how these three new capabilities empower you to build applications with distributed systems architecture and create responsive, reliable, and high-performance applications using DynamoDB that work at any scale. DynamoDB Cross-region Replication.
With its widespread use in modern application architectures, understanding the ins and outs of Redis® monitoring is essential for any tech professional. Identifying key Redis® metrics such as latency, CPU usage, and memory metrics is crucial for effective Redis monitoring. Redis®, a powerful in-memory data store, is no exception.
Here’s how the same test performed when running Percona Distribution for PostgreSQL 14 on these same servers:

                Queries: reads   Queries: writes   Queries: other   Queries: total   Transactions   Latency (95th)
    MySQL (A)   1584986          1645000           245322           3475308          122277         20137.61
    MySQL (B)   2517529          2610323           389048           5516900          194140         11523.48
CDNs cache content on edge servers distributed globally, reducing the distance between users and the content they want. CDNs use load-balancing techniques to distribute incoming traffic across multiple servers called Points of Presence (PoPs), which distribute content closer to end-users and improve overall performance.
What is CDN Architecture?
Helios also serves as a reference architecture for how Microsoft envisions its next generation of distributed big-data processing systems being built. These two narratives of reference architecture and ingestion/indexing system are interwoven throughout the paper. Why do we need a new reference architecture? Emphasis mine ).
While registrars manage the namespace in the DNS naming architecture, DNS servers are used to provide the mapping between names and the addresses used to identify an access point. There are two main types of DNS servers: authoritative servers and caching resolvers. Authoritative servers hold the definitive mappings.
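A minimal sketch of the caching-resolver side of that split (illustrative only; a real resolver honors the per-record TTL returned by the authoritative server rather than a fixed one):

    import java.net.InetAddress;
    import java.net.UnknownHostException;
    import java.time.Duration;
    import java.time.Instant;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Caching resolver sketch: answers from a local cache when fresh,
    // otherwise asks the system resolver (which ultimately consults
    // authoritative servers) and remembers the result.
    class CachingResolver {
        private record Entry(InetAddress addr, Instant expires) {}

        private final Map<String, Entry> cache = new ConcurrentHashMap<>();
        private final Duration ttl = Duration.ofSeconds(300); // fixed TTL for the sketch

        InetAddress resolve(String name) throws UnknownHostException {
            Entry e = cache.get(name);
            if (e != null && Instant.now().isBefore(e.expires)) {
                return e.addr; // cache hit: no network round trip
            }
            InetAddress fresh = InetAddress.getByName(name); // recursive lookup
            cache.put(name, new Entry(fresh, Instant.now().plus(ttl)));
            return fresh;
        }
    }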
Generally to cache data (including non-persistent data that never sees a backing store), to share non-persistent data across application services (e.g. …). Even re-reading that today, the letter of the law there is surprisingly strict to me: you can use the local memory space or filesystem as a brief single-transaction cache, but no more.
A message-based microservices architecture offers many advantages, making solutions easier to scale and expand with new services. The asynchronous nature of interservice interactions inherent to this architecture, however, poses challenges for user-initiated actions such as create-read-update-delete (CRUD) requests on an object.
The most obvious and common way this happens is when companies try to evolve their caches into a data platform that can, for example, be used as highly available enterprise key-value stores for volatile data. Let’s look at a typical scenario involving the javax cache API, also known as JSR107. How hard can it be?
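For reference, the JSR-107 (javax.cache) API the excerpt refers to looks roughly like this (assuming some JSR-107 provider, such as Ehcache or Hazelcast, is on the classpath; the cache name and values are illustrative):

    import javax.cache.Cache;
    import javax.cache.CacheManager;
    import javax.cache.Caching;
    import javax.cache.configuration.MutableConfiguration;
    import javax.cache.spi.CachingProvider;

    public class Jsr107Example {
        public static void main(String[] args) {
            // Resolve whichever JSR-107 provider is on the classpath.
            CachingProvider provider = Caching.getCachingProvider();
            CacheManager manager = provider.getCacheManager();

            MutableConfiguration<String, String> config =
                new MutableConfiguration<String, String>()
                    .setTypes(String.class, String.class)
                    .setStoreByValue(false); // keep references, no serialization

            Cache<String, String> cache = manager.createCache("sessions", config);
            cache.put("user:42", "token-abc");
            System.out.println(cache.get("user:42"));

            manager.close();
        }
    }

The trouble the excerpt alludes to begins when such a cache is treated as a durable, highly available store rather than a volatile one.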
LinkedIn introduced Couchbase as a centralized caching tier for scaling member profile reads to handle increasing traffic that has outgrown their existing database cluster. The new solution achieved over 99% hit rate, helped reduce tail latencies by more than 60% and costs by 10% annually. By Rafal Gancarz
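As a back-of-the-envelope illustration of why the hit rate dominates (a generic formula, not figures from the LinkedIn post): with hit rate h, cache latency t_c, and database latency t_d, the expected read latency is

\[ \mathbb{E}[t] = h\,t_c + (1-h)\,t_d \]

so at h = 0.99 only 1% of reads pay the database's (tail) latency, which is consistent with the large tail-latency reduction reported above.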