By: Rajiv Shringi, Oleksii Tkachuk, Kartik Sathyanarayanan. Introduction: In our previous blog post, we introduced Netflix's TimeSeries Abstraction, a distributed service designed to store and query large volumes of temporal event data with low millisecond latencies. This process can also be used to track the provenance of increments.
This integration simplifies the process of embedding Dynatrace full-stack observability directly into custom Amazon Machine Images (AMIs). VMware migration support for seamless transitions: For enterprises transitioning VMware-based workloads to the cloud, the process can be complex and resource-intensive.
The Dynatrace platform automatically captures and maps metrics, logs, traces, events, user experience data, and security signals into a single datastore, performing contextual analytics through a “power of three AI”—combining causal, predictive, and generative AI. What’s behind it all? With over 2.5
Figure 4: Set up an anomaly detector for peak cost events. You can also create individual reports using Notebooks —or export your data as CSV—and share it with your financial teams for further processing.
It requires a state-of-the-art system that can track and process these impressions while maintaining a detailed history of each profile's exposure. In this multi-part blog series, we take you behind the scenes of our system that processes billions of impressions daily.
Today, organizations must adopt solid modernization strategies to stay competitive in the market. According to a recent IDC report , IT organizations need to create a modernization and rationalization plan that aligns with their overall digital transformation strategy. Crafting an application modernization strategy.
Davis is the causal AI from Dynatrace that processes billions of events and dependencies and constantly analyzes your IT infrastructure. Dynatrace metric events offer the flexibility needed to customize your anomaly detection configuration. Configure anomaly detection based on your business needs.
Ensuring smooth operations is no small feat, whether you’re in charge of application performance, IT infrastructure, or business processes. Static Threshold: This approach defines a fixed threshold suitable for well-known processes or when specific threshold values are critical.
I spoke with Martin Spier, PicPay’s VP of Engineering, about the challenges PicPay experienced and the Kubernetes platform engineering strategy his team adopted in response. The company receives tens of thousands of requests per second on its edge layer and sees hundreds of millions of events per hour on its analytics layer.
To manage these complexities, organizations are turning to AIOps, an approach to IT operations that uses artificial intelligence (AI) to optimize operations, streamline processes, and deliver efficiency. One Dynatrace customer, TD Bank, placed Dynatrace at the center of its AIOps strategy to deliver seamless user experiences.
This article includes key takeaways on AIOps strategy: Manual, error-prone approaches have made it nearly impossible for organizations to keep pace with the complexity of modern, multicloud environments. AIOps strategy at the core of multicloud observability and management. Exploring keys to a better AIOps strategy at Perform 2022.
Technology and business leaders express increasing interest in integrating business data into their IT observability strategies, citing the value of effective collaboration between business and IT. To close these critical gaps, Dynatrace has defined a new class of events called business events.
The impetus for constructing a foundational recommendation model comes from the paradigm shift in natural language processing (NLP) to large language models (LLMs). Key insights from this shift include: A Data-Centric Approach: shifting focus from model-centric strategies, which heavily rely on feature engineering, to a data-centric one.
For IT teams seeking agility, cost savings, and a faster on-ramp to innovation, a cloud migration strategy is critical. Define the strategy, assess the environment, and perform migration-readiness assessments and workshops. The pilot cloud migration helps uncover risks related to process, operational, and technology changes.
Part 3: System Strategies and Architecture. By: Varun Khaitan. With special thanks to my stunning colleagues: Mallika Rao, Esmir Mesic, Hugo Marques. This blog post is a continuation of Part 2, where we cleared the ambiguity around title launch observability at Netflix. The request schema for the observability endpoint.
Over the last week, our Dynatrace team has been busy delivering three star-studded Dynatrace Amplify Sales Kickoff events to our Partner community across the globe. If you couldn't make the event, not to worry – we've wrapped up all the best bits for you below. Hope to see you at our next event, which we hope will be a hybrid one!
Cloud providers, such as AWS, Azure, and GCP, help to automate the process of upscaling or downscaling compute power by providing autoscaling groups. As the expected behavior of spot instances is that they are shut down within 5 minutes of their creation, the traditional strategy of availability alerting isn’t viable.
As Netflix expanded globally and the volume of title launches skyrocketed, the operational challenges of maintaining this manual process became undeniable. Metadata and assets must be correctly configured, data must flow seamlessly, microservices must process titles without error, and algorithms must function as intended.
RabbitMQ is designed for flexible routing and message reliability, while Kafka handles high-throughput event streaming and real-time data processing. Kafka is optimized for high-throughput event streaming , excelling in real-time analytics and large-scale data ingestion. What is Apache Kafka?
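To make the throughput-oriented side of that comparison concrete, here is a minimal sketch of a Kafka producer using the Java client; the broker address, topic name, key, and payload are placeholders, and the linger and compression settings simply illustrate the batching knobs that favor throughput over per-record latency.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ImpressionProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        // Batch and compress records to favor throughput over per-record latency.
        props.put("linger.ms", "20");
        props.put("compression.type", "lz4");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Keying by an entity id keeps related events on the same partition.
            producer.send(new ProducerRecord<>("events", "profile-123",
                    "{\"type\":\"impression\"}"));
        }
    }
}
```

RabbitMQ's flexible routing, by contrast, is configured on the broker side through exchanges and bindings rather than through producer-side batching.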
You'll also learn strategies for maintaining data safety and managing node failures so your RabbitMQ setup is always up to the task. Queues can be mirrored and configured for either availability or consistency, providing different strategies for managing network partitions. Erlang is the backbone of RabbitMQ clustering.
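As a hedged illustration of that availability-versus-consistency choice, the sketch below declares a replicated quorum queue with the RabbitMQ Java client; the host name and queue name are placeholders, and classic mirrored queues would instead be configured through broker policies.

```java
import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import java.util.Map;

public class QuorumQueueSetup {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("rabbitmq.internal"); // placeholder host

        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel()) {
            // Quorum queues replicate across cluster nodes via Raft,
            // trading some throughput for consistency during network partitions.
            channel.queueDeclare(
                "orders",                          // queue name (placeholder)
                true,                              // durable
                false,                             // not exclusive
                false,                             // not auto-delete
                Map.of("x-queue-type", "quorum")); // request a quorum queue
        }
    }
}
```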
The company did a postmortem on its monitoring strategy and realized it came up short. We’ve automated many of our ops processes to ensure proactive responses to issues like increases in demand, degradations in user experience, and unexpected changes in behavior,” one customer indicated. It was the longest 90 seconds of my life.
I recently joined two industry veterans and Dynatrace partners, Syed Husain of Orasi and Paul Bruce of Neotys, as panelists to discuss how performance engineering and test strategies have evolved as they pertain to customer experience. Rethinking the process means digital transformation. What trends are you seeing in the industry?
A key learning from the outage caused by the faulty CrowdStrike “Rapid Response” update is how critical it is to understand your vendors’ quality control and release processes. A variety of events and circumstances can cause an outage. What is your testing process? Questions to ask a vendor: How frequently do you release?
As recent events have demonstrated, major software outages are an ever-present threat in our increasingly digital world. They may stem from software bugs, cyberattacks, surges in demand, issues with backup processes, network problems, or human errors. This often occurs during major events, promotions, or unexpected surges in usage.
by Jun He, Yingyi Zhang, and Pawan Dixit. Incremental processing is an approach to processing only new or changed data in workflows. The key advantage is that it incrementally processes data that is newly added or updated in a dataset, instead of re-processing the complete dataset.
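The following sketch is not Psyberg itself, only the core idea behind incremental processing: a watermark from the last successful run is used to select and transform only rows added or updated since then. The Row type, field names, and timestamps are illustrative.

```java
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

// Minimal illustration of incremental processing (not Netflix's Psyberg):
// only rows added or updated after the watermark are reprocessed.
public class IncrementalRun {

    record Row(String id, Instant updatedAt, String payload) {}

    static List<Row> processIncrement(List<Row> source, Instant watermark) {
        return source.stream()
                .filter(r -> r.updatedAt().isAfter(watermark))   // skip already-processed data
                .map(r -> new Row(r.id(), r.updatedAt(), r.payload().toUpperCase()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Instant lastRun = Instant.parse("2024-01-01T00:00:00Z"); // watermark from previous run
        List<Row> source = List.of(
                new Row("a", Instant.parse("2023-12-31T23:00:00Z"), "old"),
                new Row("b", Instant.parse("2024-01-02T08:00:00Z"), "new"));
        // Only row "b" is processed; the watermark then advances to its timestamp.
        System.out.println(processIncrement(source, lastRun));
    }
}
```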
We recently attended the PostgresConf event in San Jose to hear from the most active PostgreSQL user base about their database management strategies. Most Popular PostgreSQL VACUUM Strategies. VACUUM is an important maintenance process to run, especially for frequently updated tables, before table bloat starts affecting your PostgreSQL performance.
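As a simple illustration of triggering a manual VACUUM from application code, the sketch below runs VACUUM (ANALYZE, VERBOSE) over JDBC; the connection string, credentials, and table name are placeholders. VACUUM cannot run inside a transaction block, so auto-commit is left enabled.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ManualVacuum {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details; auto-commit stays on because
        // VACUUM cannot run inside a transaction block.
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://localhost:5432/appdb", "app", "secret");
             Statement stmt = conn.createStatement()) {
            // Reclaims dead tuples and refreshes planner statistics for a hot table.
            stmt.execute("VACUUM (ANALYZE, VERBOSE) orders");
        }
    }
}
```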
By implementing these strategies, organizations can minimize the impact of potential failures and ensure a smoother transition for users. Dynatrace can monitor production environments for performance degradations and outage events that may cause customers to lose access.
This article explores how Chronicle’s Pausers — an open-source product — can be used to automatically apply a back-off strategy when there is no data to be processed, providing balance between resource usage and responsive, low-latency, low-jitter applications. Description of the Problem.
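The sketch below is not Chronicle's actual Pauser API, just a generic version of the back-off idea it describes: spin briefly while work may still arrive, then yield, then park for progressively longer so an idle polling loop stops burning a full CPU core. The iteration thresholds and park interval are arbitrary.

```java
import java.util.concurrent.locks.LockSupport;

// Simplified back-off pauser in the spirit of the article (not Chronicle's API).
public class BackoffPauser {
    private int idleCount = 0;

    public void reset() {                   // call when work was found
        idleCount = 0;
    }

    public void pause() {                   // call when no work was found
        idleCount++;
        if (idleCount < 100) {
            Thread.onSpinWait();            // busy-spin: lowest latency
        } else if (idleCount < 200) {
            Thread.yield();                 // give up the time slice
        } else {
            LockSupport.parkNanos(100_000); // back off ~0.1 ms per iteration
        }
    }
}
```

A polling loop would call reset() each time it finds an item to process and pause() otherwise, keeping latency low when busy and CPU usage low when idle.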
Additionally, predictions based on historical data are reactive, solely relying on past information to anticipate future events, and can’t prevent all new or emerging issues. This limitation highlights the importance of continuous innovation and adaptation in IT operations and AIOps strategies.
These strategies can play a vital role in the early detection of issues, helping you identify potential performance bottlenecks and application issues during deployment for staging. Whenever a change is detected, Dynatrace automatically generates a Deployment change event for the corresponding process and the host on which the process runs.
Organizations that have transitioned to agile software development strategies (including the adoption of a DevOps culture and continuous delivery automation) enforce automated solutions for such decision making—or at the very least, use automation in the gathering of release-quality metrics. Each entry represents a process group instance.
AIOps combines big data and machine learning to automate key IT operations processes, including anomaly detection and identification, event correlation, and root-cause analysis. To achieve these AIOps benefits, comprehensive AIOps tools incorporate four key stages of data processing: Collection. Aggregation. Enhanced automation.
And what are the best strategies to reduce manual labor so your team can focus on more mission-critical issues? At its most basic, automating IT processes works by executing scripts or procedures either on a schedule or in response to particular events, such as checking a file into a code repository. So, what is IT automation?
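As a minimal sketch of the schedule-driven case, the example below runs an operational script at a fixed interval; the script path and interval are placeholders, and an event-driven variant would trigger the same runnable from a webhook or message consumer instead of a timer.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class ScheduledHealthCheck {
    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Run an operational script every 5 minutes; path and interval are placeholders.
        scheduler.scheduleAtFixedRate(() -> {
            try {
                new ProcessBuilder("/opt/ops/check_disk_usage.sh").inheritIO().start();
            } catch (Exception e) {
                e.printStackTrace(); // in practice, raise an alert or event here
            }
        }, 0, 5, TimeUnit.MINUTES);
    }
}
```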
New content or national events may drive brief spikes, but, by and large, traffic is usually smoothly increasing or decreasing. Next, we launched a Mantis job that processed all requests in the stream and replayed them in a duplicate production environment created for replay traffic. We used Elasticsearch dashboards to analyze results.
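This is not the Mantis job itself, but a heavily simplified sketch of the replay idea: recorded request paths are re-issued against a duplicate environment so the responses can be analyzed offline. The shadow host name and sample paths are placeholders.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;

// Simplified replay sketch: re-issue recorded requests against a shadow environment.
public class TrafficReplayer {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        List<String> recordedPaths = List.of("/titles/123", "/titles/456"); // placeholder sample
        for (String path : recordedPaths) {
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(URI.create("https://replay-env.internal" + path)) // shadow environment host
                    .GET()
                    .build();
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(path + " -> " + response.statusCode());
        }
    }
}
```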
In the last blog post of this series, we delved into how Dynatrace, functioning as a deploy-stage orchestrator, solves the challenges confronted by Site Reliability Engineers (SREs) during the early stages of automating CI/CD processes. This slow feedback and time spent rerunning tests can hinder the overall software deployment process.
Additionally, we’ll delve into actionable strategies to improve GC throughput, unlocking its benefits for modern software development. Whenever an automatic garbage collection event runs, it pauses the application to identify unreferenced objects from memory and evict them. What Is Garbage Collection Throughput?
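GC throughput is commonly defined as the share of run time the application spends doing useful work rather than collecting garbage. The sketch below estimates it for the current JVM from the standard GarbageCollectorMXBeans; it is a rough approximation, since the reported collection time is cumulative and some collectors do part of their work concurrently.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Rough estimate of GC throughput: share of wall-clock uptime not spent in GC.
public class GcThroughput {
    public static void main(String[] args) {
        long gcTimeMs = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            gcTimeMs += Math.max(gc.getCollectionTime(), 0); // -1 means unsupported
        }
        long uptimeMs = ManagementFactory.getRuntimeMXBean().getUptime();
        double throughput = 100.0 * (uptimeMs - gcTimeMs) / uptimeMs;
        System.out.printf("GC time: %d ms of %d ms uptime (throughput ~%.2f%%)%n",
                gcTimeMs, uptimeMs, throughput);
    }
}
```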
This intricate allocation strategy can be categorized into two main domains. Process Improvements (50%): The allocation for process improvements is devoted to automation and continuous improvement; SREs help to ensure that systems are scalable, reliable, and efficient.
The mandate also requires that organizations disclose overall cybersecurity risk management, strategy, and governance. What constitutes a “material event” for application security? The mandate provides some room for interpretation on what should be considered a “material” cybersecurity event.
This release extends auto-adaptive baselines to the following generic metric sources, all in the context of Dynatrace Smartscape topology: Built-in OneAgent infrastructure monitoring metrics (host, process, network, etc.). Calculated service/DEM metrics (revenue numbers, conversions, event counts, etc.). Custom log metrics.
One issue that often complicates this process is the "noisy neighbor" problem. The sched_wakeup and sched_wakeup_new hooks are invoked when a process changes state from 'sleeping' to 'runnable.' They let us identify when a process is ready to run and is waiting for CPU time.
Define validation processes for releases? To avoid this scenario, your SRE’s first step should be to employ technologies and strategies that tame the complexity of multicloud DevOps environments. Dynatrace supports three release detection strategies , two of which require additional configurations.
Key Takeaways Enterprise cloud security is vital due to increased cloud adoption and the significant financial and reputational risks associated with security breaches; a multilayered security strategy that includes encryption, access management, and compliance is essential.
How to improve digital experience monitoring Implementing a successful DEM strategy can come with challenges. It can help understand the flow of user interactions, identify areas for improvement, and drive a user experience strategy that better engages customers to meet their needs. Load event start. Load event end.
In this three-part blog post series, we introduce you to Psyberg , our incremental data processing framework designed to tackle such challenges! We’ll discuss batch data processing, the limitations we faced, and how Psyberg emerged as a solution. Let’s dive in! What is late-arriving data? How does late-arriving data impact us?