Big Data, Efficiency and Infrastructure - Technology Performance Pulse

What is Greenplum Database? Intro to the Big Data Database

Scalegrid

MAY 13, 2020

When handling large amounts of complex data, or big data, chances are that your main machine might start getting crushed by all of the data it has to process in order to produce your analytics results. Greenplum features a cost-based query optimizer for large-scale, big data workloads. Greenplum Advantages.

Big Data

Big Data Database Artificial Intelligence Open Source

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and…

The Netflix TechBlog

MARCH 25, 2019

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and Efficiency. By Di Lin, Girish Lingappa, Jitender Aswani. Imagine yourself in the role of a data-inspired decision maker staring at a metric on a dashboard, about to make a critical business decision but pausing to ask a question: “Can

Infrastructure

Infrastructure Big Data Transportation Architecture

Sustainability: Thoughts from a software engineer

Dynatrace

MARCH 17, 2025

Until recently, improvements in data center power efficiency compensated almost entirely for the increasing demand for computing resources. However, this trend is now reversing. The rise of big data, cryptocurrencies, and AI means the IT sector contributes significantly to global greenhouse gas emissions.

Software Engineering

Software Engineering Engineering Software Software

What is IT operations analytics? Extract more data insights from more sources

Dynatrace

MAY 1, 2023

IT operations analytics is the process of unifying, storing, and contextually analyzing operational data to understand the health of applications, infrastructure, and environments and streamline everyday operations. Here are the six steps of a typical ITOA process: Define the data infrastructure strategy. Apache Spark.

Analytics

Analytics Artificial Intelligence Big Data Open Source

What is cloud monitoring? How to improve your full-stack visibility

Dynatrace

JANUARY 11, 2023

With more organizations taking the multicloud plunge, monitoring cloud infrastructure is critical to ensure all components of the cloud computing stack are available, high-performing, and secure. Cloud monitoring is a set of solutions and practices used to observe, measure, analyze, and manage the health of cloud-based IT infrastructure.

Cloud

Cloud Monitoring Best Practices Infrastructure

What is software automation? Optimize the software lifecycle with intelligent automation

Dynatrace

JUNE 26, 2023

Software analytics offers the ability to gain and share insights from data emitted by software systems and related operational processes to develop higher-quality software faster while operating it efficiently and securely. This involves big data analytics and applying advanced AI and machine learning techniques, such as causal AI.

Software

Software Software Analytics Big Data

What is IT automation?

Dynatrace

JULY 6, 2022

With ever-evolving infrastructure, services, and business objectives, IT teams can’t keep up with routine tasks that require human intervention. Ultimately, IT automation can deliver consistency, efficiency, and better business outcomes for modern enterprises. IT automation tools can achieve enterprise-wide efficiency.

Artificial Intelligence

Artificial Intelligence Tuning Strategy Big Data

What is a data lakehouse? Combining data lakes and warehouses for the best of both worlds

Dynatrace

OCTOBER 4, 2022

While data lakes and data warehousing architectures are commonly used modes for storing and analyzing data, a data lakehouse is an efficient third way to store and analyze data that unifies the two architectures while preserving the benefits of both. What is a data lakehouse? Data warehouses.

Artificial Intelligence

Artificial Intelligence Storage Analytics Government

Conducting log analysis with an observability platform and full data context

Dynatrace

APRIL 20, 2023

With more automated approaches to log monitoring and log analysis, however, organizations can gain visibility into their applications and infrastructure efficiently and with greater precision—even as cloud environments grow. They enable IT teams to identify and address the precise cause of application and infrastructure issues.

Analytics

Analytics Infrastructure Storage Architecture

Any analysis, any time: Dynatrace Log Management and Analytics powered by Grail

Dynatrace

OCTOBER 4, 2022

Log management and analytics is an essential part of any organization’s infrastructure, and it’s no secret the industry has suffered from a shortage of innovation for several years. Several pain points have made it difficult for organizations to manage their data efficiently and create actual value.

Analytics

Analytics Artificial Intelligence Storage Serverless

How Netflix uses eBPF flow logs at scale for network insight

The Netflix TechBlog

JUNE 7, 2021

At much less than 1% of CPU and memory on the instance, this highly performant sidecar provides flow data at scale for network insight. Challenges: The cloud network infrastructure that Netflix utilizes today consists of AWS services such as VPC, DirectConnect, VPC Peering, Transit Gateways, NAT Gateways, etc., and Netflix-owned devices.

Network

Network Transportation AWS Cloud

Python at Netflix

The Netflix TechBlog

APRIL 29, 2019

An easy, though imprecise, way of thinking about Netflix infrastructure is that everything that happens before you press Play on your remote control (e.g., are you logged in?) ... Various software systems are needed to design, build, and operate this CDN infrastructure, and a significant number of them are written in Python.

Open Source

Open Source Network Infrastructure Big Data

Seven benefits of AIOps to transform your business operations

Dynatrace

JULY 5, 2022

AIOps combines big data and machine learning to automate key IT operations processes, including anomaly detection and identification, event correlation, and root-cause analysis. However, 58% of IT leaders say infrastructure management drains resources as cloud use increases. For example: Greater IT staff efficiency.

Artificial Intelligence

Artificial Intelligence Cloud Innovation Strategy

Path to NoOps part 1: How modern AIOps brings NoOps within reach

Dynatrace

OCTOBER 25, 2022

Organizations adopt DevOps, where developers and operations work together in a continuous loop, so they can develop software and resolve issues efficiently before they affect users. DevOps requires infrastructure experts and software experts to work hand in hand. In ideal circumstances, an organization evolves together.

DevOps

DevOps Big Data Cloud Innovation

What is container orchestration?

Dynatrace

MARCH 24, 2023

Containers enable developers to package microservices or applications with the libraries, configuration files, and dependencies needed to run on any infrastructure, regardless of the target system environment. And organizations use Kubernetes to run on an increasing array of workloads.

Infrastructure

Infrastructure Open Source Operating System Cloud

What is ITOps? Why IT operations is more crucial than ever in a multicloud world

Dynatrace

DECEMBER 15, 2022

ITOps is an IT discipline involving actions and decisions made by the operations team responsible for an organization’s IT infrastructure. Besides the traditional system hardware, storage, routers, and software, ITOps also includes virtual components of the network and cloud infrastructure. What is ITOps? ITOps vs. AIOps.

Artificial Intelligence

Artificial Intelligence DevOps Hardware Virtualization

How observability analytics helps teams uncover answers

Dynatrace

JUNE 26, 2024

Automation: Based on the sheer volume and variety of data available to observability tools, IT automation is critical to ensure efficient operations. While human oversight is required to ensure outputs meet expectations, relying on manual processes to collect and correlate data is no longer feasible. Predictive analysis.

Analytics

Analytics Infrastructure Metrics Efficiency

Bulldozer: Batch Data Moving from Data Warehouse to Online Key-Value Stores

The Netflix TechBlog

OCTOBER 27, 2020

Data scientists and engineers collect this data from our subscribers and videos, and implement data analytics models to discover customer behaviour with the goal of maximizing user joy. The processed data is typically stored as data warehouse tables in AWS S3. Moving data with Bulldozer at Netflix.

Latency

Latency Storage Big Data Tuning

Data Engineers of Netflix - Interview with Pallavi Phadnis

The Netflix TechBlog

OCTOBER 28, 2021

Netflix’s unique work culture and petabyte-scale data problems are what drew me to Netflix. During earlier years of my career, I primarily worked as a backend software engineer, designing and building the backend systems that enable big data analytics. You can learn more about it from my talk at the Flink forward conference.

Data Engineering

Data Engineering Engineering Big Data Software Engineering

Snowflake Workload Optimization

DZone

AUGUST 23, 2023

In the era of big data, efficient data management and query performance are critical for organizations that want to get the best operational performance from their data investments.

Big Data

Big Data Analytics Innovation Scalability

What is APM?

Dynatrace

JUNE 1, 2020

There are many different types of monitoring from APM to Infrastructure Monitoring, Network Monitoring, Database Monitoring, Log Monitoring, Container Monitoring, Cloud Monitoring, Synthetic Monitoring, and End User monitoring. From APM to full-stack monitoring. This is something Dynatrace offers users to make sure monitoring is made easy.

Artificial Intelligence

Artificial Intelligence Social Media Monitoring IoT

Ensuring Performance, Efficiency, and Scalability of Digital Transformation

Alex Podelko

DECEMBER 19, 2019

How to select appropriate IT Infrastructure to support Digital Transformation by Boris Zibitsker, BEZNext. – Optimizing IT infrastructure – with specific use cases. Boris has unique expertise in that area – especially in Big Data applications. Something we all struggle with.

Efficiency

Efficiency Artificial Intelligence Scalability Performance

Less is More: Engineering Data Warehouse Efficiency with Minimalist Design

Uber Engineering

AUGUST 14, 2019

Maintaining Uber’s large-scale data warehouse comes with an operational cost in terms of ETL functions and storage. In our experience, optimizing for operational efficiency requires answering one key question: for which tables does the maintenance cost supersede utility?

Efficiency

Efficiency Engineering Design Storage

Building a Rule-Based Platform to Manage Netflix Membership SKUs at Scale

The Netflix TechBlog

FEBRUARY 16, 2021

Operational Efficiency: The majority of the changes require metadata configuration files and library code changes, usually taking days of testing and service release to adopt the updates. The changes are administered by the regular git pull request flow and guarded by the validation infrastructure.

Mobile

Mobile Engineering Infrastructure Scalability

Data lakehouse innovations advance the three pillars of observability for more collaborative analytics

Dynatrace

FEBRUARY 16, 2023

As teams try to gain insight into this data deluge, they have to balance the need for speed, data fidelity, and scale with capacity constraints and cost. To solve this problem, Dynatrace launched Grail, its causational data lakehouse, in 2022. But logs are just one pillar of the observability triumvirate.

Analytics

Analytics Innovation Metrics Database

AIOps observability adoption ascends in healthcare

Dynatrace

MARCH 14, 2022

The healthcare industry is embracing cloud technology to improve the efficiency, quality, and security of patient care, and this year’s HIMSS Conference in Orlando, Fla., AIOps (or “AI for IT operations”) uses artificial intelligence so that big data can help IT teams work faster and more effectively.

Healthcare

Healthcare Artificial Intelligence Innovation Strategy

Data Movement in Netflix Studio via Data Mesh

The Netflix TechBlog

JULY 26, 2021

At Netflix Studio, teams build various views of business data to provide visibility for day-to-day decision making. With dependable near real-time data, Studio teams are able to track and react better to the ever-changing pace of productions and improve efficiency of global business operations using the most up-to-date information.

Big Data

Big Data Government Processing Analytics

Reimagining Experimentation Analysis at Netflix

The Netflix TechBlog

SEPTEMBER 10, 2019

Our A/B tests range across UI, algorithms, messaging, marketing, operations, and infrastructure changes. Instead of relying on engineers to productionize scientific contributions, we’ve made a strategic bet to build an architecture that enables data scientists to easily contribute. Getting Data with the Metrics Repo 2.

Metrics

Metrics Architecture Infrastructure Innovation

Expanding the AWS Cloud: Introducing the AWS Europe (London) Region

All Things Distributed

DECEMBER 13, 2016

In November 2015, Amazon Web Services announced that it would launch a new AWS infrastructure region in the United Kingdom. Today, I'm happy to announce that the AWS Europe (London) Region, our 16th technology infrastructure region globally, is now generally available for use by customers worldwide.

AWS

AWS Cloud Artificial Intelligence IoT

Streaming SQL in Data Mesh

The Netflix TechBlog

NOVEMBER 3, 2023

Democratizing Stream Processing @ Netflix. By Guil Pires, Mark Cho, Mingliang Liu, Sujay Jain. Data powers much of what we do at Netflix. On the Data Platform team, we build the infrastructure used across the company to process data at scale.

Processing

Processing Engineering Infrastructure Latency

What is a Distributed Storage System

Scalegrid

FEBRUARY 8, 2024

It utilizes methodologies like DStore, which takes advantage of underused hard drive space by using it for storing vast amounts of collected datasets while enabling efficient recovery processes. These systems enable vast amounts of data to be spread over multiple nodes, allowing for simultaneous access and boosting processing efficiency.

Storage

Storage Systems Big Data Azure

Optimizing anomaly detection and noise

Dynatrace

MARCH 11, 2021

I took a big-data-analysis approach, which started with another problem visualization. This is required for understanding how I intend to improve the efficiency of (manual) alert ticket handling. The color of the line reflects the impact of the problem: infrastructure, service or application. But that didn’t work for me.

Tuning

Tuning Architecture Monitoring Big Data

What is Application Performance Monitoring?

Dynatrace

JUNE 1, 2020

There are many different types of monitoring from APM to Infrastructure Monitoring, Network Monitoring, Database Monitoring, Log Monitoring, Container Monitoring, Cloud Monitoring, Synthetic Monitoring and End User monitoring. From APM to full-stack monitoring. This is something Dynatrace offers users, to make sure monitoring is made easy.

Monitoring

Monitoring Performance Social Media Artificial Intelligence

Mastering Hybrid Cloud Strategy

Scalegrid

MARCH 14, 2024

Key Takeaways: A hybrid cloud platform combines private and public cloud providers with on-premises infrastructure to create a flexible, secure, cost-effective IT environment that supports scalability, innovation, and rapid market response. The architecture usually integrates several private, public, and on-premises infrastructures.

Strategy

Strategy Cloud Artificial Intelligence Infrastructure

Optimizing data warehouse storage

The Netflix TechBlog

DECEMBER 21, 2020

We built AutoOptimize to efficiently and transparently optimize the data and metadata storage layout while maximizing their cost and performance benefits. This article will list some of the use cases of AutoOptimize, discuss the design principles that help enhance efficiency, and present the high-level architecture.

Storage

Storage Latency Efficiency Data Engineering

What is RabbitMQ Used For

Scalegrid

JUNE 28, 2024

Key features of RabbitMQ include message persistence to prevent data loss, flexible routing capabilities, and support for multiple messaging protocols such as AMQP, MQTT, and STOMP, enhancing its adaptability and reliability. Businesses can maintain a reliable and efficient communication system by utilizing message queues.

IoT

IoT Healthcare Programming Open Source

What is AIOps? Everything you wanted to know

Dynatrace

OCTOBER 14, 2021

Gartner defines AIOps as the combination of “big data and machine learning to automate IT operations processes, including event correlation, anomaly detection, and causality determination.” The second challenge with traditional AIOps centers around the data processing cycle. But what is AIOps, exactly? What is AIOps?

Artificial Intelligence

Artificial Intelligence DevOps Innovation Metrics

Mastering Distributed SQL™ Databases in 2025

Scalegrid

JANUARY 10, 2025

They keep the features that developers like but can handle much more data, similar to NoSQL systems. Notably, they simplify handling big data flows, offer consistent transactions, and sustain high performance even when they’re used for real-time data analysis and complex queries.

Database

Database Scalability Best Practices Blockchain

Expanding the AWS Cloud – Introducing the AWS Europe (Stockholm) Region

All Things Distributed

DECEMBER 12, 2018

In April 2017, Amazon Web Services announced that it would launch a new AWS infrastructure Region in Sweden. Today, we add to that presence with an infrastructure Region in Stockholm with three Availability Zones. They rely on the AWS Cloud for their entire infrastructure and use almost every AWS service available.

AWS

AWS Cloud Games Serverless

A Day in the Life of an Experimentation and Causal Inference Scientist @ Netflix

The Netflix TechBlog

MARCH 2, 2021

I started working at a local payment processing company after graduation, where I built survival models to calculate lifetime value and experimented with them on our brand new big data stack. I was doing data science without realizing it. Each company has their own spin on data scientist responsibilities.

Analytics

Analytics C++ Innovation Engineering

Expanding the Cloud: Introducing the AWS Asia Pacific (Mumbai) Region

All Things Distributed

JUNE 26, 2016

In June 2015, Amazon Web Services announced that it would launch a new AWS infrastructure region in India. Market innovators and change agents need a comprehensive infrastructure platform that can reliably scale on-demand. Advanced problem solving that connects big data with machine learning.

AWS

AWS Cloud Healthcare Blockchain

The Need for Real-Time Device Tracking

ScaleOut Software

JULY 19, 2021

We are increasingly surrounded by intelligent IoT devices, which have become an essential part of our lives and an integral component of business and industrial infrastructures. Real-Time Device Tracking with In-Memory Computing Can Fill an Important Gap in Today’s Streaming Analytics Platforms. The list goes on.

IoT

IoT Big Data Analytics Architecture

Seer: leveraging big data to navigate the complexity of performance debugging in cloud microservices

The Morning Paper

MAY 14, 2019

Seer: leveraging big data to navigate the complexity of performance debugging in cloud microservices, Gan et al. When a QoS violation is predicted to occur and a culprit microservice is located, Seer uses a lower-level tracing infrastructure with hardware monitoring primitives to identify the reason behind the QoS violation.

Big Data

Big Data Cloud Performance Hardware

Amazon EC2 Cluster GPU Instances - All Things Distributed

All Things Distributed

NOVEMBER 15, 2010

Now that our ability to generate higher and higher clock rates has stalled and CPU architectural improvements have shifted focus towards multiple cores, we see that it is becoming harder to efficiently use these computer systems. Driving down the cost of Big-Data analytics. Cluster Computer, Cluster GPU and Amazon EMR.

AWS

AWS Programming Latency Architecture
