This article is the second in a multi-part series sharing a breadth of Analytics Engineering work at Netflix, recently presented as part of our annual internal Analytics Engineering conference. Need to catch up? Check out Part 1. Because games differ from series/films, it's crucial to validate this estimation method for games.
You can now kickstart your creation journey using ready-made dashboards, accelerate your data exploration with seamless integration between apps, start from scratch with the new Explore interface, and search for known metrics from anywhere. Let’s look at each of these paths through an end-to-end use case focused on Kubernetes monitoring.
Thanks to the power of Grail, those details are available for all executions stored within the synthetic-results retention period. It now fully supports not only Network Availability Monitors but also HTTP synthetic monitors. Details of requests sent during each monitor execution are also available.
This lets you build your SLOs around the indicators that matter to you and your customers—critical metrics related to availability, failure rates, request response times, or select logs and business events. While the SLO management web UI and API are already available, the dashboard tile will be released in the coming weeks.
Back during Perform 2019, we introduced the next generation of the Dynatrace AI causation engine, also known as Davis. Davis now becomes the default causation engine, replacing the previous version as the default AI engine for all new environments. All existing Davis 1.0
For busy site reliability engineers, ensuring system reliability, scalability, and overall health is an imperative that’s getting harder to achieve in ever-expanding, cloud-native, container-based environments. To get a more granular look into telemetry data, many analysts rely on custom metrics using Prometheus. What is Prometheus?
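For readers new to Prometheus instrumentation, here is a minimal sketch of exposing custom metrics with the official prometheus_client Python library; the metric names, label, and port are illustrative choices, not anything prescribed by the snippet above.

```python
# Minimal custom-metric sketch using the official prometheus_client library.
# Metric names, the label, and the port are illustrative.
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

REQUESTS = Counter("app_requests_total", "Total requests handled", ["endpoint"])
LATENCY = Histogram("app_request_duration_seconds", "Request duration in seconds")

def handle_request(endpoint: str) -> None:
    REQUESTS.labels(endpoint=endpoint).inc()   # count the request
    with LATENCY.time():                       # observe its duration
        time.sleep(random.uniform(0.01, 0.1))  # simulated work

if __name__ == "__main__":
    start_http_server(8000)  # exposes /metrics for any Prometheus-compatible scraper
    while True:
        handle_request("/api/orders")
```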
For quite some time already, Dynatrace has provided full observability into AWS services by ingesting CloudWatch metrics that are published by AWS services. Amazon CloudWatch gathers metric data from various services that run on AWS. Dynatrace ingests this data to perform root-cause analysis using the Dynatrace Davis® AI engine.
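To make the CloudWatch side concrete, here is a sketch of publishing a custom metric with boto3; the namespace, metric name, and dimension are invented for the example and are not anything Dynatrace requires.

```python
# Publish a custom CloudWatch metric with boto3.
# Namespace, metric name, and dimension are illustrative.
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

cloudwatch.put_metric_data(
    Namespace="MyApp/Checkout",
    MetricData=[
        {
            "MetricName": "OrdersProcessed",
            "Dimensions": [{"Name": "Environment", "Value": "production"}],
            "Value": 42,
            "Unit": "Count",
        }
    ],
)
```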
The release candidate of OpenTelemetry metrics was announced earlier this year at Kubecon in Valencia, Spain. Since then, organizations have embraced OTLP as an all-in-one protocol for observability signals, including metrics, traces, and logs, which will also gain Dynatrace support in early 2023.
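As a minimal sketch of what exporting metrics over OTLP looks like with the OpenTelemetry Python SDK, the example below sends a counter to an OTLP/HTTP endpoint; the endpoint URL, meter name, and metric name are placeholders.

```python
# Export a counter over OTLP/HTTP with the OpenTelemetry Python SDK.
# Endpoint, meter name, and metric name are placeholders; point the
# exporter at your own collector or backend.
from opentelemetry import metrics
from opentelemetry.exporter.otlp.proto.http.metric_exporter import OTLPMetricExporter
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader

exporter = OTLPMetricExporter(endpoint="http://localhost:4318/v1/metrics")
reader = PeriodicExportingMetricReader(exporter, export_interval_millis=10_000)
metrics.set_meter_provider(MeterProvider(metric_readers=[reader]))

meter = metrics.get_meter("checkout-service")
orders_counter = meter.create_counter(
    "orders_processed", unit="1", description="Orders processed by the service"
)
orders_counter.add(1, {"region": "eu-west-1"})
```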
Dynatrace, available as an Azure-native service, has a longstanding partnership with Microsoft, deeply rooted in a strong “build with” approach to deliver a seamless user experience. The Davis AI engine automatically and continuously delivers actionable insights based on an environment’s current state.
From a cost perspective, internal customers waste valuable time sending tickets to operations teams asking for metrics, logs, and traces to be enabled. This approach is costly and error-prone. Now, a team looking for metrics, traces, and logs no longer needs to file a ticket to get their app monitored in their own environments.
This approach enhances key DORA metrics and enables early detection of failures in the release process, allowing SREs more time for innovation. This blog post explores the Reliability metric, which measures modern operational practices and forms the cornerstone of chaos engineering experiments. Why reliability?
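As a rough illustration of how DORA-style indicators can be derived from deployment records, here is a sketch computing deployment frequency and change failure rate; the data shape and numbers are invented for the example.

```python
# Illustrative DORA-style calculations over a made-up list of deployment records.
from datetime import date

deployments = [
    {"day": date(2024, 5, 1), "caused_incident": False},
    {"day": date(2024, 5, 2), "caused_incident": True},
    {"day": date(2024, 5, 4), "caused_incident": False},
]

days_in_window = 7
deployment_frequency = len(deployments) / days_in_window  # deployments per day
change_failure_rate = sum(d["caused_incident"] for d in deployments) / len(deployments)

print(f"Deployment frequency: {deployment_frequency:.2f}/day")
print(f"Change failure rate:  {change_failure_rate:.0%}")
```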
Dynatrace full stack observability for Red Hat OpenShift Dynatrace enhances software quality and operational efficiency, which drives innovation by unifying application, operation, and platform engineering teams on a single platform. You can automatically detect and analyze performance issues across your entire tech stack with Davis® AI.
As HTTP and browser monitors cover the application level of the ISO/OSI model, successful executions of synthetic tests indicate that availability and performance meet the expected thresholds of your entire technological stack. Our script, available on GitHub, provides details.
In response to this shift, platform engineering is growing in popularity. The practice of platform engineering has evolved alongside the increasing complexity of cloud environments. Platform engineers design and implement these platforms, as well as ensure their security, scalability, and reliability.
For years, logs have been the dominant approach many observability vendors have taken to report business metrics on dashboards. Most of the use cases in these two broad categories benefit from the flexibility that comes from multiple available sources of business data.
Platform engineering is on the rise. According to leading analyst firm Gartner, “80% of software engineering organizations will establish platform teams as internal providers of reusable services, components, and tools for application delivery…” by 2026.
Whether you’re a seasoned IT expert or a marketing professional looking to improve business performance, understanding the data available to you is essential. Even if infrastructure metrics aren’t your thing, you’re welcome to join us on this creative journey; simply swap out the suggested metrics for ones that interest you.
A Dynatrace API token with the following permissions: Ingest OpenTelemetry traces (openTelemetryTrace.ingest), Ingest metrics (metrics.ingest), and Ingest logs (logs.ingest). To set up the token, see Dynatrace API – Tokens and authentication in Dynatrace documentation. If you don’t have one, you can use a trial account.
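If you prefer to create such a token programmatically, a minimal sketch against the Dynatrace Tokens API v2 might look like the following; the tenant URL and admin token are placeholders, the admin token is assumed to carry the apiTokens.write scope, and the request shape should be verified against the Dynatrace API documentation.

```python
# Sketch: create an ingest token via the Dynatrace Tokens API v2.
# ENV_URL and ADMIN_TOKEN are placeholders; verify field names and the
# response shape against the Dynatrace API docs before relying on this.
import requests

ENV_URL = "https://abc12345.live.dynatrace.com"  # placeholder tenant URL
ADMIN_TOKEN = "dt0c01.XXXX"                      # placeholder token with apiTokens.write

response = requests.post(
    f"{ENV_URL}/api/v2/apiTokens",
    headers={"Authorization": f"Api-Token {ADMIN_TOKEN}"},
    json={
        "name": "otel-ingest-token",
        "scopes": ["openTelemetryTrace.ingest", "metrics.ingest", "logs.ingest"],
    },
    timeout=10,
)
response.raise_for_status()
print(response.json()["token"])  # newly created token value (assumed response field)
```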
As organizations look to expand DevOps maturity, improve operational efficiency, and increase developer velocity, they are embracing platform engineering as a key driver. The pair showed how to track factors including developer velocity, platform adoption, DevOps research and assessment metrics, security, and operational costs.
Dynatrace industry-leading tracing, metrics, and log ingestion provide the level of high fidelity data that teams need to make accurate predictions about capacity. This is important because manual tracing is super costly and there is a lack of information available on this topic to assist developers.
In this blog, I will be going through a step-by-step guide on how to automate SRE-driven performance engineering. These tags will allow us to create dashboards, request attributes, or calculated service metrics specifically for our application under test. This allows us to analyze metrics (SLIs) for each individual endpoint URL.
Dynatrace has recently extended its Kubernetes operator by adding a new feature, the Prometheus OpenMetrics Ingest , which enables you to import Prometheus metrics in Dynatrace and build SLO and anomaly detection dashboards with Prometheus data. Here we’ll explore how to collect Prometheus metrics and what you can achieve with them.
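As a sketch of how a workload can be marked for scraping, the kubernetes Python client can patch scrape annotations onto a deployment's pod template; the annotation keys below follow Dynatrace's Prometheus ingest convention as I recall it and should be verified against the current documentation, and the deployment name and namespace are invented.

```python
# Sketch: annotate a deployment's pod template so its Prometheus-format
# endpoint gets scraped. Annotation keys are assumptions -- verify them
# against the Dynatrace Prometheus ingest documentation.
from kubernetes import client, config

config.load_kube_config()
apps = client.AppsV1Api()

annotations = {
    "metrics.dynatrace.com/scrape": "true",    # assumed key, please verify
    "metrics.dynatrace.com/port": "9090",      # assumed key, please verify
    "metrics.dynatrace.com/path": "/metrics",  # assumed key, please verify
}

patch = {"spec": {"template": {"metadata": {"annotations": annotations}}}}
apps.patch_namespaced_deployment(name="checkout", namespace="shop", body=patch)
```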
The end goal, of course, is to optimize the availability of organizations’ software. Dynatrace is widely recognized for its AI’s ability to predict and prevent issues and automatically identify root causes, maximizing availability. Eventually, the goal is to arrive at self-healing through autonomous cloud operations.
The nirvana state of system uptime at peak loads is known as “five-nines availability.” In its pursuit, IT teams hover over system performance dashboards hoping their preparations will deliver five nines—or even four nines—availability. But is five nines availability attainable? Downtime per year. 90% (one nine).
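The arithmetic behind that downtime-per-year table is simple; here is a quick sketch working up from 90% (one nine) to five nines.

```python
# Allowed downtime per year for common availability targets.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes

levels = [
    ("one nine", 0.90),
    ("two nines", 0.99),
    ("three nines", 0.999),
    ("four nines", 0.9999),
    ("five nines", 0.99999),
]

for name, availability in levels:
    downtime_min = MINUTES_PER_YEAR * (1 - availability)
    print(f"{availability:.5%} ({name}): {downtime_min:,.1f} minutes of downtime per year")
```

At five nines this works out to roughly 5.3 minutes of allowable downtime per year, which is why the snippet above asks whether it is attainable at all.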
Stream processing enables software engineers to model their applications’ business logic as high-level representations in a directed acyclic graph without explicitly defining a physical execution plan. We designed experimental scenarios inspired by chaos engineering. Recovery time of the throughput metric.
Micrometer is used for instrumenting both out-of-the-box and custom metrics from Spring Boot applications. This brings Davis topology-aware anomaly detection and alerting for your Micrometer metrics, plus topology-related custom metrics for seamless reports and alerts. Micrometer uses a registry to export metrics to monitoring systems.
OpenTelemetry metrics are useful for augmenting the fully automatic observability that can be achieved with Dynatrace OneAgent. OpenTelemetry metrics add domain-specific data such as business KPIs and license-relevant consumption details. Enterprise-grade observability for custom OpenTelemetry metrics from AWS.
Now, Dynatrace has the ability to turn numerical values from logs into metrics, which unlocks AI-powered answers, context, and automation for your apps and infrastructure, at scale. The parameter Billed Duration is only available in logs, so it makes sense to extract it from your logs so that you can keep an eye on your cloud costs.
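For context, Billed Duration appears in every AWS Lambda REPORT log line; a quick sketch of pulling the numeric value out with a regular expression follows, with the sample line shortened and invented for illustration.

```python
# Extract "Billed Duration" from an AWS Lambda REPORT log line.
# The sample line is illustrative and abbreviated.
import re

log_line = (
    "REPORT RequestId: 8f5d0cf4-0000-0000-0000-000000000000 Duration: 102.25 ms "
    "Billed Duration: 103 ms Memory Size: 512 MB Max Memory Used: 131 MB"
)

match = re.search(r"Billed Duration:\s+(\d+(?:\.\d+)?)\s+ms", log_line)
if match:
    billed_duration_ms = float(match.group(1))
    print(f"Billed Duration: {billed_duration_ms} ms")
```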
Chances are, you’re a seasoned expert who visualizes meticulously identified key metrics across several sophisticated charts. Activate Davis AI to analyze charts within seconds. Davis AI can help you expand your dashboards and dive deeper into your available data to extract additional information.
Welcome back to the blog series where we provide you with deeper dives into the latest observability awesomeness from Dynatrace , demonstrating how we bring scale, zero configuration, automatic AI-driven alerting, and root cause analysis to all your custom metrics, including open source observability frameworks like StatsD, Telegraf, and Prometheus.
A Kubernetes SLO that continuously evaluates CPU, memory usage, and capacity and compares these available resources to the requested and utilized memory of Kubernetes workloads makes potential resource waste visible, revealing opportunities for countermeasures.
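A rough sketch of the underlying arithmetic, with made-up workload names and memory numbers, shows how requested versus actually used resources translate into a waste figure.

```python
# Illustrative resource-waste calculation: requested vs. actually used memory.
# Workload names and numbers are invented for the example.
workloads = {
    "checkout": {"requested_mib": 1024, "used_mib": 310},
    "catalog":  {"requested_mib": 512,  "used_mib": 470},
    "frontend": {"requested_mib": 2048, "used_mib": 650},
}

for name, w in workloads.items():
    waste_mib = w["requested_mib"] - w["used_mib"]
    waste_pct = waste_mib / w["requested_mib"]
    print(f"{name:10s} requested={w['requested_mib']} MiB "
          f"used={w['used_mib']} MiB waste={waste_pct:.0%}")
```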
Dynatrace Managed is intrinsically highly available as it stores three copies of all events, user sessions, and metrics across its cluster nodes. Our Premium High Availability comes with the following features: Active-active deployment model for optimum hardware utilization.
Extensions 2.0 addresses these limitations and brings new monitoring and analytical capabilities that weren’t available in Extensions 1.0, including comprehensive metrics support. These bundles ensure the provisioning of pre-configured dashboards, alerts, unified analysis views, and a topology model that relates metrics and entities.
When it comes to platform engineering, not only does observability play a vital role in the success of organizations’ transformation journeys—it’s key to successful platform engineering initiatives. The various presenters in this session aligned platform engineering use cases with the software development lifecycle.
While Fluentd solves the challenges of collecting and normalizing Kubernetes events and logs, Kubernetes performance and availability problems can rarely be solved by investigating logs in isolation. All metrics, traces, and real user data are also surfaced in the context of specific events.
By Abhinaya Shetty , Bharath Mummadisetty At Netflix, our Membership and Finance Data Engineering team harnesses diverse data related to plans, pricing, membership life cycle, and revenue to fuel analytics, power various dashboards, and make data-informed decisions. Our audits would detect this and alert the on-call data engineer (DE).
Every image you hover over isn’t just a visual placeholder; it’s a critical data point that fuels our sophisticated personalization engine. This dual availability ensures immediate processing capabilities alongside comprehensive long-term data retention.
At the heart of Dynatrace Digital Experience Monitoring (DEM) is Davis, the state-of-the-art AI engine that accurately prioritizes the severity of each detected performance anomaly in terms of its potential impact on real users and business KPIs. Leverage AI assistance to deliver better customer experience. How to get started.
The configuration also includes an optional span metrics connector, which generates Request, Error, and Duration (R.E.D.) metrics from span data.
The standard dictionary subscript notation is also available. Imagine an ML practitioner on the Netflix Content ML team, sourcing features from hundreds of columns in our data warehouse, and creating a multitude of models against a growing suite of metrics. You can access Configs of any past runs easily through the Client API.
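For readers unfamiliar with the feature, here is a minimal sketch of that dictionary-style access based on Metaflow's Config objects; the flow name, config file, and keys are invented for illustration, so treat it as a shape rather than a definitive API reference.

```python
# Sketch of Metaflow Config access: attribute style and dictionary subscript style.
# Flow name, config file, and keys are invented for illustration.
from metaflow import FlowSpec, Config, step

class TrainFlow(FlowSpec):
    config = Config("config", default="config.json")

    @step
    def start(self):
        lr = self.config.learning_rate         # attribute-style access
        depth = self.config["model"]["depth"]  # standard dictionary subscript notation
        print(f"learning_rate={lr}, depth={depth}")
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == "__main__":
    TrainFlow()
```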
By the summer of 2020, many UI engineers were ready to move to GraphQL. The GraphQL shim enabled client engineers to move quickly onto GraphQL, figure out client-side concerns like cache normalization, experiment with different GraphQL clients, and investigate client performance without being blocked by server-side migrations.
I am available to help you find and fix your site-speed issues through performance audits, training and workshops, consultancy, and more; you should get in touch. I still feel that site owners who are serious about web performance should augment Core Web Vitals with their own custom metrics (e.g.
Open-source metric sources automatically map to our Smartscape model for AI analytics. With this announcement, Dynatrace brings the value of its AI engine, the scale, security, and automation of Dynatrace OneAgent and the scale of our platform (which can handle 50,000 hosts) to open source technologies so that you get the best of both worlds.
Build an umbrella for Development and Operations: In modern software engineering, the discipline of platform engineering delivers DevSecOps practices to developers to bridge the gaps between development, security, and operations and enhance the developer experience. However, other data formats, like logs, can also be employed.