Greenplum Database is an open-source, hardware-agnostic MPP database for analytics, based on PostgreSQL and developed by Pivotal, which was later acquired by VMware. This feature-packed database provides powerful and rapid analytics on data that scales up to petabyte volumes.
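As a minimal sketch of what "MPP on PostgreSQL" means in practice, the snippet below creates a Greenplum table whose rows are spread across segment hosts with the DISTRIBUTED BY clause. It assumes a reachable cluster; the host, credentials, and table are hypothetical placeholders.

    # Minimal sketch: psycopg2 (the standard PostgreSQL driver) also speaks
    # to Greenplum. Host, database, and credentials are hypothetical.
    import psycopg2

    conn = psycopg2.connect(host="gp-master.example.com", dbname="analytics",
                            user="gpadmin", password="secret")
    with conn, conn.cursor() as cur:
        # DISTRIBUTED BY is the Greenplum-specific clause that spreads rows
        # across segments so queries run in parallel on every host.
        cur.execute("""
            CREATE TABLE IF NOT EXISTS page_views (
                view_id   bigint,
                user_id   bigint,
                viewed_at timestamp
            ) DISTRIBUTED BY (user_id);
        """)
    conn.close()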
The shortcomings and drawbacks of batch-oriented data processing were widely recognized by the big data community quite a long time ago. This system has been designed to supplement and succeed the existing Hadoop-based system, whose data-processing latency and maintenance costs were too high.
Driving down the cost of big data analytics: the Amazon Elastic MapReduce (EMR) team announced today the ability to seamlessly use Amazon EC2 Spot Instances with their service, significantly driving down the cost of data analytics in the cloud.
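A hedged boto3 sketch of requesting Spot capacity in an EMR cluster: the core instance group is marked Market="SPOT" while the master stays on-demand. The cluster name, release label, roles, and the log bucket are illustrative assumptions.

    import boto3

    emr = boto3.client("emr", region_name="us-east-1")
    response = emr.run_job_flow(
        Name="spot-analytics-demo",
        ReleaseLabel="emr-6.15.0",
        LogUri="s3://my-bucket/emr-logs/",  # hypothetical bucket
        Instances={
            "InstanceGroups": [
                {"Name": "master", "InstanceRole": "MASTER",
                 "Market": "ON_DEMAND", "InstanceType": "m5.xlarge",
                 "InstanceCount": 1},
                # Core nodes run on Spot capacity: this is where the savings come from.
                {"Name": "core-spot", "InstanceRole": "CORE",
                 "Market": "SPOT", "InstanceType": "m5.xlarge",
                 "InstanceCount": 4},
            ],
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        JobFlowRole="EMR_EC2_DefaultRole",
        ServiceRole="EMR_DefaultRole",
    )
    print(response["JobFlowId"])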
Statistical analysis and mining of huge multi-terabyte data sets is a common task nowadays, especially in areas like web analytics and Internet advertising. Analysis of such large data sets often requires powerful distributed data stores like Hadoop and heavy data processing with techniques like MapReduce.
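As a hedged sketch of the MapReduce pattern mentioned above, here is a Hadoop Streaming-style reducer in Python: it reads key-sorted "key<TAB>count" lines on stdin and emits per-key totals. The input format and job wiring (a matching mapper, the hadoop-streaming jar) are assumptions, not details from the excerpt.

    #!/usr/bin/env python3
    # Streaming reducer: aggregate counts per key from sorted stdin lines.
    import sys

    current_key, count = None, 0
    for line in sys.stdin:
        key, value = line.rstrip("\n").split("\t", 1)
        if key != current_key:
            if current_key is not None:
                print(f"{current_key}\t{count}")   # flush the previous key
            current_key, count = key, 0
        count += int(value)
    if current_key is not None:
        print(f"{current_key}\t{count}")           # flush the last key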
Interview with Kevin Wylie: this post is part of our "Data Engineers of Netflix" series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix. Kevin Wylie is a Data Engineer on the Content Data Science and Engineering team. What drew you to Netflix?
Generally, the storage technology categorizes data into landing, raw, and curated zones depending on its consumption readiness. The result is a framework that offers a single source of truth, enables companies to make the most of advanced analytics capabilities, and supports diverse analytics workloads.
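A toy sketch of that zone convention, using local directories as stand-ins for storage prefixes; the names and the promote step are illustrative assumptions.

    from pathlib import Path

    ZONES = {name: Path("datalake") / name for name in ("landing", "raw", "curated")}
    for zone in ZONES.values():
        zone.mkdir(parents=True, exist_ok=True)

    def promote(filename: str, src: str, dst: str) -> Path:
        """Move a file one zone closer to consumption readiness."""
        target = ZONES[dst] / filename
        (ZONES[src] / filename).rename(target)
        return target

    # Example: a validated landing file graduates to the raw zone.
    (ZONES["landing"] / "orders.csv").write_text("order_id,amount\n1,9.99\n")
    promote("orders.csv", "landing", "raw")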
Netflix’s unique work culture and petabyte-scale data problems are what drew me to Netflix. During the earlier years of my career, I primarily worked as a backend software engineer, designing and building the backend systems that enable big data analytics.
The introduction of innovative technologies has brought the newest updates in software testing, development, design, and delivery. Digital transformation is another significant focus for the sectors and enterprises that rank at the top in cloud and business analytics. Meanwhile, AI and ML seem to be reaching a new level.
Various software systems are needed to design, build, and operate this CDN infrastructure, and a significant number of them are written in Python. We are heavy users of Jupyter Notebooks and nteract to analyze operational data and prototype visualization tools that help us detect capacity regressions.
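A minimal sketch of the kind of capacity-regression check such notebooks might prototype, using pandas on synthetic data; the metric, the windows, and the 20% threshold are illustrative assumptions.

    import pandas as pd

    # Hourly CPU utilisation for one hypothetical CDN cluster.
    ts = pd.Series(
        [52, 54, 53, 55, 51, 72, 74, 73],
        index=pd.date_range("2024-01-01", periods=8, freq="h"),
    )

    baseline = ts.iloc[:5].mean()   # trailing "normal" window
    recent = ts.iloc[5:].mean()     # most recent window
    if recent > baseline * 1.2:     # a >20% jump flags a possible regression
        print(f"Possible capacity regression: {baseline:.1f} -> {recent:.1f}")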
NoOps is an advanced transformation of DevOps in which many of the functions needed to manage, optimize, and secure IT services and applications are automated within the design. The risk inherent in manual operations leads many to question the practicality of DevOps, which makes the idea of NoOps more attractive; the concept takes DevOps a step further.
To scale to a larger number of users and support the growth in data volume spurred by social media, web, mobile, IoT, ad-tech, and ecommerce workloads, these tools require customers to invest in even more infrastructure to maintain performance. Enter Amazon QuickSight, which supports multiple graph types.
AIOps combines big data and machine learning to automate key IT operations processes, including anomaly detection and identification, event correlation, and root-cause analysis. Like the development and design phases, these applications generate massive data volumes that offer relevant and actionable insights.
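As a toy illustration of the anomaly-detection step AIOps automates, the sketch below flags metric points far from a robust median/MAD baseline, so an outlier cannot skew its own detection threshold. The latency values are synthetic.

    import statistics

    latencies_ms = [120, 118, 125, 119, 122, 121, 480, 117, 123]
    median = statistics.median(latencies_ms)
    mad = statistics.median(abs(v - median) for v in latencies_ms)

    # Flag points more than 5 MADs from the median.
    anomalies = [(i, v) for i, v in enumerate(latencies_ms)
                 if abs(v - median) > 5 * mad]
    print(anomalies)  # [(6, 480)]: the spike an operator should investigate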
Go faster and deliver consistently better results, with less team friction than you ever thought possible, as Dynatrace combines a unified data platform with advanced analytics to provide a single source of truth for your Biz, Dev, and Ops teams. User Experience and Business Analytics: understand every user journey and maximize business KPIs.
In this talk, Jason Reid discusses the pros and cons of both data warehouse bundling and unbundling in terms of performance, governance, and flexibility, and he examines how the trend of data warehouse unbundling will impact the data engineering landscape in the next 5 years.
At Netflix, our data scientists span many areas of technical specialization, including experimentation, causal inference, machine learning, NLP, modeling, and optimization. Together with data analytics and data engineering, we comprise the larger, centralized Data Science and Engineering group.
Apache Spark is a leading platform in the field of big data processing, known for its speed, versatility, and ease of use. It is a unified computing engine designed for large-scale data processing. However, getting the most out of Spark often involves fine-tuning and optimization.
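A minimal PySpark sketch of the fine-tuning alluded to above: sizing shuffle partitions and caching a DataFrame that is reused. The values are illustrative assumptions, not universal recommendations.

    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("tuning-demo")
             # Fewer shuffle partitions than the 200 default suits small clusters.
             .config("spark.sql.shuffle.partitions", "64")
             .getOrCreate())

    df = spark.range(1_000_000).withColumnRenamed("id", "user_id").cache()
    print(df.count())                             # materialises the cache
    print(df.filter("user_id % 2 = 0").count())   # second pass reads from memory
    spark.stop()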
BPAY is in the midst of its digital transformation journey, in which it is discovering the critical importance of developing "contemporary ways of designing, operating, and using" its software. She dispelled the myth that more big data equals better decisions, higher profits, or more customers, no matter how much you collect.
ITOps refers to the process of acquiring, designing, deploying, configuring, and maintaining the equipment and services that support an organization's desired business outcomes. ITOps teams collect raw data in virtual and nonvirtual environments from multiple feeds, normalize and structure it, and aggregate it for alerts.
Data scientists and engineers collect this data from our subscribers and videos, and implement data analytics models to discover customer behaviour, with the goal of maximizing user joy. The data warehouse is not designed to serve point requests from microservices with low latency.
With the launch of the AWS Europe (London) Region, AWS can enable many more UK enterprise, public sector, and startup customers to reduce IT costs, address data locality needs, and embark on rapid transformations in critical new areas, such as big data analysis and the Internet of Things. Fraud.net is a good example of this.
Their design emphasizes increasing availability by spreading out files among different nodes or servers; this approach significantly reduces the risk of losing or corrupting data due to node failure. These distributed storage services also play a pivotal role in big data and analytics operations.
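A toy sketch of the availability idea described above: choose distinct nodes for each file's replicas so a single node failure cannot lose the data. The node list, replication factor, and hash-based placement are illustrative assumptions, not any particular product's algorithm.

    import hashlib

    NODES = ["node-a", "node-b", "node-c", "node-d", "node-e"]
    REPLICAS = 3

    def placement(filename: str) -> list[str]:
        """Pick REPLICAS distinct nodes, starting from a hash of the name."""
        start = int(hashlib.sha256(filename.encode()).hexdigest(), 16) % len(NODES)
        return [NODES[(start + i) % len(NODES)] for i in range(REPLICAS)]

    print(placement("video-0042.mp4"))  # three distinct nodes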
Cloud Network Insight is a suite of solutions that provides both operational and analytical insight into the Cloud Network Infrastructure to address the identified problems. As with any sustainable engineering design, focusing on simplicity is very important, and excellent logging is needed for debugging and supportability.
Backfill: backfilling datasets is a common operation in big data processing. For example, a job might reprocess aggregates for the past 3 days on the assumption that data arrives late, while data older than 3 days isn't worth the cost of reprocessing. ETL pipelines keep all the benefits of batch workflows.
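A minimal sketch of that 3-day lookback: each daily run recomputes the partitions for the previous one to three days so late-arriving events are folded in. reprocess_partition is a hypothetical job entry point, not a real API.

    from datetime import date, timedelta

    LOOKBACK_DAYS = 3  # late data older than this isn't worth reprocessing

    def partitions_to_reprocess(run_date: date) -> list[date]:
        return [run_date - timedelta(days=d) for d in range(1, LOOKBACK_DAYS + 1)]

    for partition in partitions_to_reprocess(date(2024, 1, 10)):
        print(f"reprocess_partition({partition.isoformat()})")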
Key takeaways: MySQL is a relational database management system ideal for structured data and complex relationships, ensuring data integrity and reliability. This structured approach makes relational databases ideal for applications requiring precise data management and complex transactions.
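A self-contained sketch of those relational guarantees (foreign keys and atomic transactions); it uses Python's built-in sqlite3 rather than a live MySQL server so it runs anywhere, and the schema is illustrative.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute("""CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        amount REAL NOT NULL)""")

    with conn:  # both inserts commit together or not at all
        conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
        conn.execute("INSERT INTO orders VALUES (1, 1, 9.99)")

    try:
        conn.execute("INSERT INTO orders VALUES (2, 99, 5.00)")  # no such customer
    except sqlite3.IntegrityError as exc:
        print("rejected:", exc)  # the foreign key preserves data integrity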
Gartner defines AIOps as the combination of "big data and machine learning to automate IT operations processes, including event correlation, anomaly detection, and causality determination." A comprehensive, modern approach to AIOps is a unified platform that encompasses observability, AI, and analytics.
The Cloud First strategy is most visible with new Federal IT programs, which are all designed to be "Cloud Ready." One particular early use case for AWS GovCloud (US) will be massive data processing and analytics.
Amazon S3 is used by enterprises of all sizes and is designed to handle scaling extremely well; it stores hundreds of billions of objects and easily performs several hundred thousand storage transactions a second.
We use high-performance transaction systems, complex rendering and object caching, workflow and queuing systems, business intelligence and data analytics, machine learning and pattern recognition, neural networks and probabilistic decision making, and a wide variety of other techniques.
It is simple and elegant, as you would expect from someone who has won several design awards. I am still using the same layout and CSS I used with MT, as I prefer to make one change at a time: design comes next.
There are many books on data mining in general, and on its applications to marketing and customer relationship management in particular [BE11, AS14, PR13, etc.]. The rest of the article is organized as follows: we first introduce a simple framework that ties together a retailer's actions, profits, and data, and then turn to specific problems such as propensity to churn.
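As a hedged sketch of a propensity-to-churn model, the snippet below fits a logistic regression on two behavioural features. The features and data points are fabricated purely to show the modelling pattern; they are not from the article or its references.

    from sklearn.linear_model import LogisticRegression

    # features: [days_since_last_purchase, purchases_last_90d]
    X = [[5, 8], [40, 1], [3, 12], [60, 0], [10, 6], [75, 1]]
    y = [0, 1, 0, 1, 0, 1]  # 1 = churned

    model = LogisticRegression().fit(X, y)
    # Churn propensity for a customer inactive for 50 days with 2 purchases.
    print(model.predict_proba([[50, 2]])[0][1])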
Workloads from web content, big data analytics, and artificial intelligence stand out as particularly well-suited for hybrid cloud infrastructure, owing to their fluctuating computational needs and scalability demands.
We have designed Route 53 to propagate updates very quickly and give the customer the tools to find out when all changes have been propagated.
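A hedged boto3 sketch of those tools: Route 53 returns a change ID whose status flips from PENDING to INSYNC once the update has reached all authoritative locations, and boto3 ships a waiter that polls it. The hosted zone ID and record values are placeholders.

    import boto3

    route53 = boto3.client("route53")
    resp = route53.change_resource_record_sets(
        HostedZoneId="Z123EXAMPLE",  # placeholder zone
        ChangeBatch={"Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "www.example.com.", "Type": "A", "TTL": 300,
                "ResourceRecords": [{"Value": "192.0.2.44"}],
            },
        }]},
    )
    change_id = resp["ChangeInfo"]["Id"]

    # The built-in waiter polls get_change until the status is INSYNC.
    route53.get_waiter("resource_record_sets_changed").wait(Id=change_id)
    print("change propagated")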
Redis Data Types and Structures: the design of Redis's data structures emphasizes versatility. The simple string type is designed to cache plain text values, offering fast read and write access to frequently accessed data.
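A minimal redis-py sketch of that versatility: one server holding a plain string, a hash, and a list. It assumes a local Redis on the default port; the keys and values are illustrative.

    import redis

    r = redis.Redis(host="localhost", port=6379, decode_responses=True)
    r.set("page:home:title", "Welcome")                        # string cache
    r.hset("user:42", mapping={"name": "Ada", "plan": "pro"})  # object fields
    r.rpush("recent:logins", "ada", "grace")                   # ordered events

    print(r.get("page:home:title"))
    print(r.hgetall("user:42"))
    print(r.lrange("recent:logins", 0, -1))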
Up to 200 developers and designers will get together to hack up interesting applications using the Internet's APIs and SDKs. If you want to be one of those teams of 5-7 developers/designers, you can sign up using this form.
Cluster Compute Instances for Amazon EC2 are a new instance type specifically designed for High Performance Computing applications. Other industries using Amazon EC2 for HPC-style workloads include pharmaceuticals, oil exploration, industrial and automotive design, media and entertainment, and more.
These trade-offs have even impacted the way the lowest-level building blocks in our computer architectures have been designed. Good insight into the work needed to convert certain algorithms to run efficiently on GPUs can be found in the UCB/NVIDIA paper "Designing Efficient Sorting Algorithms for Manycore GPUs".