article thumbnail

Netflix’s Distributed Counter Abstraction

The Netflix TechBlog

By: Rajiv Shringi , Oleksii Tkachuk , Kartik Sathyanarayanan Introduction In our previous blog post, we introduced Netflix’s TimeSeries Abstraction , a distributed service designed to store and query large volumes of temporal event data with low millisecond latencies. Today, we’re excited to present the Distributed Counter Abstraction.

Latency 251
article thumbnail

Consistent caching mechanism in Titus Gateway

The Netflix TechBlog

We introduce a caching mechanism in the API gateway layer, allowing us to offload processing from singleton leader elected controllers without giving up strict data consistency and guarantees clients observe. The cache is kept in sync with the current leader process. How do I know that my cache is up to date? of the data.

Cache 233
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

How To Design For High-Traffic Events And Prevent Your Website From Crashing

Smashing Magazine

How To Design For High-Traffic Events And Prevent Your Website From Crashing How To Design For High-Traffic Events And Prevent Your Website From Crashing Saad Khan 2025-01-07T14:00:00+00:00 2025-01-07T22:04:48+00:00 This article is sponsored by Cloudways Product launches and sales typically attract large volumes of traffic.

Traffic 86
article thumbnail

PolyScale.ai – Scaling MySQL & PostgreSQL with Global Caching

Scalegrid

Data-driven applications span a wide breadth of complexity, from simple microservices to real-time event-driven systems under significant load. Guest post by Ben Hagan from PolyScale.ai However, as any development and/or DevOps team tasked with performance improvements will attest, making data-driven apps fast globally is “non-trivial”.

Cache 279
article thumbnail

Seeing through hardware counters: a journey to threefold performance increase

The Netflix TechBlog

We turned to JVM-specific profiling, starting with the basic hotspot stats, and then switching to more detailed JFR (Java Flight Recorder) captures to compare the distribution of the events. We also see much higher L1 cache activity combined with 4x higher count of MACHINE_CLEARS. Cache line is a concept similar to memory page?—?

Hardware 363
article thumbnail

Foundation Model for Personalized Recommendation

The Netflix TechBlog

To harness this data effectively, we employ a process of interaction tokenization, ensuring meaningful events are identified and redundancies are minimized. Even with such strategies, interaction histories from active users can span thousands of events, exceeding the capacity of transformer models with standard self attention layers.

Tuning 165
article thumbnail

Predictive CPU isolation of containers at Netflix

The Netflix TechBlog

Because microprocessors are so fast, computer architecture design has evolved towards adding various levels of caching between compute units and the main memory, in order to hide the latency of bringing the bits to the brains. This avoids thrashing caches too much for B and evens out the pressure on the L3 caches of the machine.

Cache 258