Availability and Systems - Technology Performance Pulse

Build resilient IT systems and manage regulatory requirements with compliance and resilience capabilities from Dynatrace

Dynatrace

JANUARY 15, 2025

Here’s how Dynatrace can help automate up to 80% of technical tasks required to manage compliance and resilience: understand the complexity of IT systems in real time; proactively prevent, prioritize, and efficiently manage performance and security incidents; and automate manual and routine tasks to increase your productivity.

Systems

Systems DevOps Analytics Monitoring

Rapid Event Notification System at Netflix

The Netflix TechBlog

FEBRUARY 18, 2022

To this end, we developed a Rapid Event Notification System (RENO) to support use cases that require server initiated communication with devices in a scalable and extensible manner. In this blog post, we will give an overview of the Rapid Event Notification System at Netflix and share some of the learnings we gained along the way.

Systems

Systems Traffic Architecture Mobile

Achieving High Availability in CI/CD With Observability

DZone

MARCH 5, 2024

Since most application releases depend on cloud infrastructure, having good continuous integration and continuous delivery (CI/CD) pipelines and end-to-end observability becomes essential for ensuring highly available systems.

Availability

Availability DevOps Infrastructure Scalability

Dynatrace extends Synthetic Monitoring capabilities with Network Availability Monitors to validate the availability of infrastructure and services

Dynatrace

JULY 15, 2024

As HTTP and browser monitors cover the application level of the ISO/OSI model, successful executions of synthetic tests indicate that availability and performance meet the expected thresholds of your entire technological stack. Combined with Dynatrace OneAgent®, you gain a precise view of the status of your systems at a glance.

Availability

Availability Network Monitoring Infrastructure

Elevating System Management: The Role of Monitoring and Observability in DevOps

DZone

JUNE 21, 2023

In the ever-evolving world of DevOps, the ability to gain deep insights into system behavior, diagnose issues, and improve overall performance is one of the top priorities. Monitoring and observability are two key concepts that facilitate this process, offering valuable visibility into the health and performance of systems.

DevOps

DevOps Systems Monitoring Metrics

Don’t just react: How executives can predict and prevent outages to maximize availability

Dynatrace

OCTOBER 3, 2024

The end goal, of course, is to optimize the availability of organizations’ software. Dynatrace is widely recognized for AI capabilities that predict and prevent issues and automatically identify root causes, maximizing availability. That’s where observability from Dynatrace goes far beyond “observing systems.”

Availability

Availability DevOps Analytics Cloud

Managing PostgreSQL® High Availability – Part I: PostgreSQL Automatic Failover

Scalegrid

SEPTEMBER 5, 2024

Managing High Availability (HA) in your PostgreSQL hosting is very important to ensuring your database deployment clusters maintain exceptional uptime and strong operational performance so your data is always available to your application. Effective management of failover and switchover operations is crucial for high availability.

Availability

Availability Servers Database Open Source

Flexible, scalable, self-service Kubernetes native observability now in General Availability

Dynatrace

MAY 17, 2022

For years, enterprises managed observability data on a team-by-team basis, using a combination of ticketing systems and configuration management tools. The application consists of several microservices that are available as pod-backed services. Information about each of these topics will be available in upcoming announcements.

Availability

Availability Scalability Cloud Metrics

Dynatrace EdgeConnect securely connects your local systems to Dynatrace SaaS

Dynatrace

OCTOBER 10, 2023

EdgeConnect provides a secure bridge for SaaS-heavy companies like Dynatrace, which hosts numerous systems and data behind VPNs. In this hybrid world, IT and business processes often span across a blend of on-premises and SaaS systems, making standardization and automation necessary for efficiency.

Systems

Systems Efficiency Internet

OpenPipeline: Simplify access to critical business data

Dynatrace

NOVEMBER 4, 2024

There’s a goldmine of business data traversing your IT systems, yet most of it remains untapped. Other data sources, including APIs and log files, are used to expand access, often to external or proprietary systems. In fact, it’s likely that some of your critical business systems already write business data to log files.

Analytics

Analytics Airlines Metrics Monitoring

Build systems more reliably with Dynatrace: Chaos Engineering

Dynatrace

AUGUST 21, 2024

These releases often assumed ideal conditions such as zero latency, infinite bandwidth, and no network loss, as highlighted in Peter Deutsch’s eight fallacies of distributed systems. With Dynatrace, teams can seamlessly monitor the entire system, including network switches, database storage, and third-party dependencies.

Engineering

Engineering Systems Latency Metrics

Netflix’s Distributed Counter Abstraction

The Netflix TechBlog

NOVEMBER 12, 2024

Both categories share common requirements, such as high throughput and high availability. Failures in a distributed system are a given, and having the ability to safely retry requests enhances the reliability of the service. The table below provides a detailed overview of the diverse requirements across these two categories.

Latency

Latency Cache Infrastructure Strategy

How To Implement Specific Distributed System Patterns Using Spring Boot: Introduction

DZone

SEPTEMBER 11, 2024

Regarding contemporary software architecture, distributed systems have been widely recognized for quite some time as the foundation for applications with high availability, scalability, and reliability goals. Spring Boot Overview: One of the most popular Java EE frameworks for creating apps is Spring.

Systems

Systems Java Software Architecture Programming

Dynatrace observability now available for Red Hat OpenShift on IBM Z and LinuxONE mainframes

Dynatrace

JULY 24, 2024

IBM Z and LinuxONE mainframes running the Linux operating system enable you to respond faster to business demands, protect data from core to cloud, and streamline insights and automation. Dynatrace observability is available for Red Hat OpenShift on IBM Power. Learn more about the new Kubernetes Experience for Platform Engineering.

Availability

Availability Infrastructure Metrics Hardware

Supporting Diverse ML Systems at Netflix

The Netflix TechBlog

MARCH 7, 2024

The Machine Learning Platform (MLP) team at Netflix provides an entire ecosystem of tools around Metaflow, an open source machine learning infrastructure framework we started, to empower data scientists and machine learning practitioners to build and manage a variety of ML systems. ETL workflows), as well as downstream (e.g.

Systems

Systems Media Cache Open Source

Choreography Pattern: Optimizing Communication in Distributed Systems

DZone

SEPTEMBER 30, 2023

There are two popular methodologies available to tackle this challenge. While this architectural approach offers scalability, reusability, and adaptability, it also presents a unique challenge: effectively managing communication between these microservices. The first, Service Orchestration, was discussed in my previous article.

Systems

Systems Virtualization Architecture Scalability

Tailored access management, Part 3: Simplified setup for enterprise-scale access management

Dynatrace

OCTOBER 14, 2024

Manage the complexity of authorization systems: Most modern authorization systems provide access management using Attribute-Based Access Control (ABAC). The system demands significant effort to design, manage, and maintain, especially as an organization’s needs evolve.

Monitoring

Monitoring Metrics Systems Scalability

Reliability indicators that matter to your business: SLOs for all data types

Dynatrace

OCTOBER 31, 2024

This lets you build your SLOs around the indicators that matter to you and your customers: critical metrics related to availability, failure rates, request response times, or select logs and business events. While the SLO management web UI and API are already available, the dashboard tile will be released within the next few weeks.

Metrics

Metrics Availability Monitoring Scalability

New Distributed Tracing app provides effortless trace insights

Dynatrace

OCTOBER 23, 2024

Find what you’re looking for faster with: Enhanced charting and data visualization: Easily filter, group, search, and visualize trace data to gain deeper insights into your system’s behavior. The new Services app is already available to all DPS and non-DPS customers. You can even walk through the same example above.

Tuning

Tuning Website Availability Performance

Title Launch Observability at Netflix Scale

The Netflix TechBlog

MARCH 4, 2025

Part 3: System Strategies and Architecture. By: Varun Khaitan. With special thanks to my stunning colleagues: Mallika Rao, Esmir Mesic, Hugo Marques. This blog post is a continuation of Part 2, where we cleared the ambiguity around title launch observability at Netflix. The request schema for the observability endpoint.

Traffic

Traffic Strategy Entertainment Innovation

HTTP monitors on the latest Dynatrace platform extend insights into the health of your API endpoints and simplify test management

Dynatrace

DECEMBER 18, 2024

But nowadays, with complex and dynamically changing modern IT systems, the last result details might not be enough in some cases. Thanks to the power of Grail, those details are available for all executions stored for the entire retention period during which synthetic results are kept.

Monitoring

Monitoring Testing Metrics Analytics

New continuous compliance requirements drive the need to converge observability and security

Dynatrace

DECEMBER 12, 2024

Boost your operational resilience: Combining availability and security is now essential. The Federal Reserve Regulation HH in the United States focuses on operational resilience requirements for systemically important financial market utilities. It’s time to adopt a unified observability and security approach.

Analytics

Analytics Government Efficiency Innovation

Enrich Tenable vulnerability findings with Dynatrace runtime context

Dynatrace

JANUARY 14, 2025

However, the challenge often lies in the fragmentation of vulnerability data across different systems and tools. Managing vulnerabilities in a fragmented world: In today’s complex digital landscape, managing vulnerabilities effectively is crucial for maintaining robust security.

Infrastructure

Infrastructure Monitoring Availability Systems

Chaos Engineering With Litmus: A CNCF Incubating Project

DZone

FEBRUARY 6, 2025

System resilience stands as the key requirement for e-commerce platforms during scaling operations to keep services operational and deliver performance excellence to users. We have developed a microservices architecture platform that encounters sporadic system failures when faced with heavy traffic events.

Engineering

Engineering Traffic Architecture Network

Demo: Transform OpenTelemetry data into actionable insights with the Dynatrace Distributed Tracing app

Dynatrace

OCTOBER 29, 2024

Note that the developers of the respective services need to make these metrics available by exposing them via, for example, a Prometheus endpoint that can be used by the OpenTelemetry collector to ingest them and forward them to your Dynatrace tenant.

Metrics

Metrics Tuning Monitoring Availability

Best Practices for Scaling RabbitMQ

Scalegrid

FEBRUARY 24, 2025

Scaling RabbitMQ ensures your system can handle growing traffic and maintain high performance. Key Takeaways: RabbitMQ improves scalability and fault tolerance in distributed systems by decoupling applications, enabling reliable message exchanges.

Best Practices

Best Practices Traffic Strategy Scalability

Dynatrace joins the Microsoft Intelligent Security Association

Dynatrace

NOVEMBER 20, 2024

This rising risk amplifies the need for reliable security solutions that integrate with existing systems. Dynatrace, available as an Azure-native service, has a longstanding partnership with Microsoft, deeply rooted in a strong “build with” approach to deliver seamless user experience.

Best Practices

Best Practices Innovation Azure Cloud

Log filtering made easy: Data segmentation and advanced filters in Dynatrace Logs

Dynatrace

MARCH 5, 2025

For enterprises managing complex systems and vast datasets using traditional log management tools, finding specific log entries quickly and efficiently can feel like searching for a needle in a haystack. While selecting a Kubernetes segment, the selector provides a dynamic list of available resources. What are Dynatrace Segments?

AWS

AWS Azure Best Practices Efficiency

Next-level interaction and customization of data visualizations in Dynatrace Dashboards and Notebooks

Dynatrace

OCTOBER 10, 2024

New: identify hotspots with the honeycomb visualization. Honeycombs are great for visualizing health in complex and distributed systems, enabling you to visualize countless entities effectively and at scale. To achieve the best visual outcome, we recommend experimenting with the available customization options.

Latency

Latency Infrastructure Monitoring Metrics

Foundation Model for Personalized Recommendation

The Netflix TechBlog

MARCH 28, 2025

By Ko-Jen Hsiao, Yesu Feng and Sudarshan Lamkhede. Motivation: Netflix’s personalized recommender system is a complex system, boasting a variety of specialized machine-learned models, each catering to distinct needs, including Continue Watching and Today’s Top Picks for You. (Refer to our recent overview for more details).

Tuning

Tuning Efficiency Latency Strategy

Introducing Impressions at Netflix

The Netflix TechBlog

FEBRUARY 14, 2025

It requires a state-of-the-art system that can track and process these impressions while maintaining a detailed history of each profile’s exposure. In this multi-part blog series, we take you behind the scenes of our system that processes billions of impressions daily.

Tuning

Tuning Latency Efficiency Storage

Power Dashboarding, Part I: Start your exploration journey with Dashboards

Dynatrace

FEBRUARY 6, 2025

Whether you’re a seasoned IT expert or a marketing professional looking to improve business performance, understanding the data available to you is essential. As you went through these steps, you likely noticed some of the chart options available. Also, explore additional dashboards available on the Dynatrace Playground.

Metrics

Metrics Infrastructure Monitoring Best Practices

Dare to debug production with Dynatrace Live Debugger

Dynatrace

FEBRUARY 4, 2025

(Figures: live snapshot in the Visual Studio Code IDE; live snapshot in the IntelliJ IDE.) Get even more context and telemetry from Dynatrace into your IDE: Having real-time code-level data available within the IDE is only the beginning. Dynatrace Live Debugger is currently in preview and will be generally available within the next 90 days.

Open Source

Open Source Code Engineering Best Practices

Microsoft Ignite 2024 guide: Cloud observability for AI transformation

Dynatrace

NOVEMBER 18, 2024

The power of cloud observability: Modernizing legacy systems can be challenging, and it’s important to do so with purpose, not just to modernize for its own sake. By prioritizing observability, organizations can ensure the availability, performance, and security of business-critical applications.

Cloud

Cloud Azure Artificial Intelligence Innovation

Globalizing Productions with Netflix’s Media Production Suite

The Netflix TechBlog

MARCH 31, 2025

As file sizes grow and workflows become more complex, these issues are magnified, leading to inefficiencies that slow down post-production and reduce the available time spent on creative work. Depending on the market, or production budget, cutting-edge technology might not be available or affordable. So what is it?

Media

Media Logistics Innovation Cloud

My code::dive talk video is available: New Q&A

Sutter's Mill

DECEMBER 11, 2024

Our language and our community continue to grow, and it’s great to see us addressing the problems we most need to address, so we have an answer for safety, we have an answer for simpler build systems and reducing the number of side languages to make C++ work in practice. I use this world’s banking system.

Code

Code C++ Availability Energy

Simplify log onboarding: From zero to observability in minutes

Dynatrace

MARCH 5, 2025

The log ingestion wizard offers support for all log ingestion methods available in Dynatrace Hub. Get started with Logs: The OneAgent advantage. For most scenarios, Dynatrace OneAgent is your best friend for getting started with Dynatrace log ingestion. Different log ingestion methods are available to address various needs.

Open Source

Open Source IoT Cloud Azure

Create simple workflows to automate alerts during development

Dynatrace

JANUARY 22, 2025

You can select any trigger that’s available for standard workflows, including schedules, problem triggers, customer event triggers, or on-demand triggers. To orchestrate the different logging services, you use Fluent Bit to forward these logs to your centralized logging system, like Dynatrace.

Development

Development Processing Monitoring Code

SLOs for Kubernetes clusters: Optimize resource utilization of Kubernetes clusters with service-level objectives

Dynatrace

NOVEMBER 11, 2024

Kubernetes is a widely used open source system for container orchestration. A Kubernetes SLO that continuously evaluates CPU, memory usage, and capacity and compares these available resources to the requested and utilized memory of Kubernetes workloads makes potential resource waste visible, revealing opportunities for countermeasures.

Efficiency

Efficiency Best Practices Monitoring Cloud

Dynatrace Cost & Carbon Optimization certified for accuracy and transparency

Dynatrace

MARCH 5, 2025

Integration with existing systems and processes: Integration with existing IT infrastructure, observability solutions, and workflows often requires significant investment and customization. The certification results are now publicly available.

Energy

Energy Analytics Traffic Cloud

RabbitMQ vs. Kafka: Key Differences

Scalegrid

FEBRUARY 6, 2025

Introduction to Message Brokers: Message brokers enable applications, services, and systems to communicate by acting as intermediaries between senders and receivers. This decoupling simplifies system architecture and supports scalability in distributed environments.

Latency

Latency Analytics Architecture Storage

Dynatrace Observability for Developers saves time with real-time data

Dynatrace

FEBRUARY 4, 2025

In a single view, developers get an instant overview of application performance, system health, logs, problems, deployment status, user interactions, and much more. Using the trace ID, we can inspect the captured trace directly in the Dynatrace Tracing app, giving us a full view of the end-to-end transaction as it flows through our systems.

Development

Development Analytics Code Architecture

Implementing a Self-Healing Infrastructure With Kubernetes and Prometheus

DZone

MAY 3, 2023

In today's world, the need for highly available and fault-tolerant systems is more important than ever. It includes features such as automatic scaling, rolling updates, and self-healing, making it an ideal choice for building highly available systems.

Infrastructure

Infrastructure Open Source Scalability Monitoring

How to observe logs with Journald and Dynatrace

Dynatrace

APRIL 4, 2025

Journald provides unified structured logging for systems, services, and applications, eliminating the need for custom parsing for severity or details. System health, performance troubleshooting, and debugging situations no longer require manual correlation of logs across multiple disconnected tools or servers.

Analytics

Analytics Operating System Scalability Infrastructure
