Remove Design Remove Presentation Remove Storage
article thumbnail

Optimizing data warehouse storage

The Netflix TechBlog

At this scale, we can gain a significant amount of performance and cost benefits by optimizing the storage layout (records, objects, partitions) as the data lands into our warehouse. We built AutoOptimize to efficiently and transparently optimize the data and metadata storage layout while maximizing their cost and performance benefits.

Storage 214
article thumbnail

What is a Distributed Storage System

Scalegrid

A distributed storage system is foundational in today’s data-driven landscape, ensuring data spread over multiple servers is reliable, accessible, and manageable. Understanding distributed storage is imperative as data volumes and the need for robust storage solutions rise.

Storage 130
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Scaling Media Machine Learning at Netflix

The Netflix TechBlog

We will then present a case study of using these components in order to optimize, scale, and solidify an existing pipeline. Media Feature Storage: Amber Storage Media feature computation tends to be expensive and time-consuming. We accomplish this by paving the path to: Accessing and processing media data (e.g.

Media 299
article thumbnail

Netflix’s Distributed Counter Abstraction

The Netflix TechBlog

By: Rajiv Shringi , Oleksii Tkachuk , Kartik Sathyanarayanan Introduction In our previous blog post, we introduced Netflix’s TimeSeries Abstraction , a distributed service designed to store and query large volumes of temporal event data with low millisecond latencies. Today, we’re excited to present the Distributed Counter Abstraction.

Latency 253
article thumbnail

Building an elastic query engine on disaggregated storage

The Morning Paper

Building an elastic query engine on disaggregated storage , Vuppalapati, NSDI’20. This paper describes the design decisions behind the Snowflake cloud-based data warehouse. have altered the many assumptions that guided the design and optimization of the Snowflake system. From shared-nothing to disaggregation.

Storage 112
article thumbnail

Using JSONB in PostgreSQL: How to Effectively Store & Index JSON Data in PostgreSQL

Scalegrid

Note: If a particular key is always present in your document, it might make sense to store it as a first class column. JSONB storage has some drawbacks vs. traditional columns: PostreSQL does not store column statistics for JSONB columns. JSONB storage results in a larger storage footprint.

Storage 321
article thumbnail

Introducing Netflix TimeSeries Data Abstraction Layer

The Netflix TechBlog

Building on these foundational abstractions, we developed the TimeSeries Abstraction  — a versatile and scalable solution designed to efficiently store and query large volumes of temporal event data with low millisecond latencies, all in a cost-effective manner across various use cases.

Latency 242