Infrastructure, Processing and Tuning - Technology Performance Pulse

New integrations announced at AWS re:Invent enhance cloud performance, security, and automation

Dynatrace

DECEMBER 3, 2024

This integration simplifies the process of embedding Dynatrace full-stack observability directly into custom Amazon Machine Images (AMIs). This seamless integration accelerates cloud adoption, allowing enterprises to maximize the value of their AWS infrastructure and focus on innovation rather than managing observability configurations.

AWS

AWS Cloud Performance Innovation

Netflix’s Distributed Counter Abstraction

The Netflix TechBlog

NOVEMBER 12, 2024

However, this category requires near-immediate access to the current count at low latencies, all while keeping infrastructure costs to a minimum. Eventually Consistent: This category needs accurate and durable counts, and is willing to tolerate a slight delay in accuracy and a slightly higher infrastructure cost as a trade-off.

Latency

Latency Cache Infrastructure Strategy

Building Netflix’s Distributed Tracing Infrastructure

The Netflix TechBlog

OCTOBER 19, 2020

Now let’s look at how we designed the tracing infrastructure that powers Edgar. Reconstructing a streaming session was a tedious and time-consuming process that involved tracing all interactions (requests) between the Netflix app, our Content Delivery Network (CDN), and backend microservices.

Infrastructure

Infrastructure Transportation Storage Open Source

Monitoring of Kubernetes Infrastructure for day 2 operations

Dynatrace

JULY 8, 2020

One of the promises of container orchestration platforms is to make it easier for developers to accelerate the deployment of their applications without having to worry about scalability and infrastructure dependencies. It is important to understand the impact infrastructure can have on the platform and the application it runs.

Infrastructure

Infrastructure Monitoring Cloud Metrics

Foundation Model for Personalized Recommendation

The Netflix TechBlog

MARCH 28, 2025

It facilitates the distribution of these learnings to other models, either through shared model weights for fine tuning or directly through embeddings. The impetus for constructing a foundational recommendation model is based on the paradigm shift in natural language processing (NLP) to large language models (LLMs).

Tuning

Tuning Efficiency Latency Strategy

RabbitMQ vs. Kafka: Key Differences

Scalegrid

FEBRUARY 6, 2025

RabbitMQ is designed for flexible routing and message reliability, while Kafka handles high-throughput event streaming and real-time data processing. RabbitMQ follows a message broker model with advanced routing, while Kafka’s event streaming architecture uses partitioned logs for distributed processing. What is Apache Kafka?
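To make the "partitioned logs" idea concrete, here is a minimal Python sketch. It is not Kafka's client or broker code: the `PartitionedLog` class is hypothetical, and `zlib.crc32` is only a stand-in for the hash a real partitioner would use. The point it illustrates is that records with the same key land in the same append-only partition, which keeps them in order for that key.

```python
from __future__ import annotations

import zlib


class PartitionedLog:
    """Toy in-memory model of a topic split into append-only partitions (hypothetical)."""

    def __init__(self, num_partitions: int) -> None:
        if num_partitions <= 0:
            raise ValueError("num_partitions must be positive")
        self.partitions: list[list[tuple[str, str]]] = [[] for _ in range(num_partitions)]

    def partition_for(self, key: str) -> int:
        # A stable hash means the same key always maps to the same partition.
        # crc32 is an illustrative stand-in, not the hash a real client uses.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key: str, value: str) -> tuple[int, int]:
        """Append a record and return its (partition, offset)."""
        partition = self.partition_for(key)
        self.partitions[partition].append((key, value))
        return partition, len(self.partitions[partition]) - 1

    def read(self, partition: int, offset: int) -> list[tuple[str, str]]:
        """Return every record in a partition from `offset` onward."""
        return self.partitions[partition][offset:]


log = PartitionedLog(num_partitions=3)
first = log.append("user-42", "login")
second = log.append("user-42", "play")
```

Because each partition is an independent log, separate consumers could read different partitions in parallel while still seeing each key's records in the order they were appended.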

Latency

Latency Analytics Architecture Storage

Extend infrastructure observability with JMX Extensions and additional full-stack metrics

Dynatrace

APRIL 20, 2020

Infrastructure exists to support the backing services that are collectively perceived by users to be your web application. Issues that manifest themselves as performance degradation on a user’s device can often be traced back to underlying infrastructure issues. Dynatrace news. Monitor additional metrics.

Infrastructure

Infrastructure Metrics Java Virtualization

Rebuilding Netflix Video Processing Pipeline with Microservices

The Netflix TechBlog

JANUARY 10, 2024

Future blogs will provide deeper dives into each service, sharing insights and lessons learned from this process. The Netflix video processing pipeline went live with the launch of our streaming service in 2007.

Processing

Processing Media Latency Innovation

Title Launch Observability at Netflix Scale

The Netflix TechBlog

DECEMBER 17, 2024

As Netflix expanded globally and the volume of title launches skyrocketed, the operational challenges of maintaining this manual process became undeniable. Metadata and assets must be correctly configured, data must flow seamlessly, microservices must process titles without error, and algorithms must function as intended.

Traffic

Traffic Scalability Strategy Monitoring

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and…

The Netflix TechBlog

MARCH 25, 2019

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and Efficiency. By: Di Lin, Girish Lingappa, Jitender Aswani. Imagine yourself in the role of a data-inspired decision maker staring at a metric on a dashboard about to make a critical business decision but pausing to ask a question: “Can

Infrastructure

Infrastructure Big Data Transportation Architecture

Why applying chaos engineering to data-intensive applications matters

Dynatrace

MAY 23, 2024

Stream processing: One approach to such a challenging scenario is stream processing, a computing paradigm and software architectural style for data-intensive software systems that emerged to cope with requirements for near real-time processing of massive amounts of data.

Engineering

Engineering Tuning Latency Open Source

How Netflix uses eBPF flow logs at scale for network insight

The Netflix TechBlog

JUNE 7, 2021

Challenges: The cloud network infrastructure that Netflix utilizes today consists of AWS services such as VPC, DirectConnect, VPC Peering, Transit Gateways, NAT Gateways, etc., and Netflix-owned devices. These metrics are visualized using Lumen, a self-service dashboarding infrastructure. What is BPF?

Network

Network Transportation AWS Cloud

What is IT automation?

Dynatrace

JULY 6, 2022

At its most basic, automating IT processes works by executing scripts or procedures either on a schedule or in response to particular events, such as checking a file into a code repository. Adding AIOps to automation processes makes the volume of data that applications and multicloud environments generate much less overwhelming.
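The two triggers described above, a schedule and an event, can be sketched in a few lines of Python. This is an illustrative toy, not any vendor's automation engine: the `Automation` class, the event name `code_checked_in`, and the use of integer "ticks" in place of real clock time are all assumptions made for the example.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Callable

Action = Callable[[], str]


class Automation:
    """Toy runner: actions fire on named events or every N ticks (hypothetical)."""

    def __init__(self) -> None:
        self._on_event: dict[str, list[Action]] = defaultdict(list)
        self._scheduled: list[tuple[int, Action]] = []

    def on(self, event: str, action: Action) -> None:
        """Register an action to run whenever `event` is emitted."""
        self._on_event[event].append(action)

    def every(self, ticks: int, action: Action) -> None:
        """Register an action to run each time the tick count is a multiple of `ticks`."""
        if ticks <= 0:
            raise ValueError("ticks must be positive")
        self._scheduled.append((ticks, action))

    def emit(self, event: str) -> list[str]:
        """Run the actions registered for `event` and collect their results."""
        return [action() for action in self._on_event.get(event, [])]

    def tick(self, now: int) -> list[str]:
        """Run the scheduled actions that are due at tick `now`."""
        return [action() for interval, action in self._scheduled if now % interval == 0]


runner = Automation()
runner.on("code_checked_in", lambda: "run tests")
runner.every(3, lambda: "rotate logs")

event_results = runner.emit("code_checked_in")
tick_results = [runner.tick(t) for t in range(1, 7)]
```

In this sketch the check-in event triggers the test action immediately, while the scheduled action fires only on ticks 3 and 6.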

Artificial Intelligence

Artificial Intelligence Tuning Strategy Big Data

Running the Astronomy Shop OpenTelemetry demo application with Dynatrace

Dynatrace

MARCH 13, 2025

OpenTelemetry provides a common set of tools, APIs, and SDKs to help collect observability signals from applications and infrastructure endpoints. Traces, metrics, and logs are already well covered, but interesting enhancements are being made frequently, so stay tuned.

Open Source

Open Source Metrics Architecture Infrastructure

Unlock log analytics: Seamless insights without writing queries

Dynatrace

MAY 28, 2024

You can easily pivot between a hot Kubernetes cluster and the log file related to the issue in 2-3 clicks in these Dynatrace® Apps: Infrastructure & Observability (I&O), Databases, Clouds, and Kubernetes. A sudden drop in received log data? For a single log record found, you can easily see the surrounding logs.

Analytics

Analytics Infrastructure Database Cloud

Observability throughout the software development lifecycle increases delivery performance

Dynatrace

OCTOBER 4, 2024

Today, development teams suffer from a lack of automation for time-consuming tasks, the absence of standardization due to an overabundance of tool options, and insufficiently mature DevSecOps processes. This process begins when the developer merges a code change and ends when it is running in a production environment.

Software

Software Software Development Performance

What is DevSecOps?

Dynatrace

JANUARY 26, 2021

DevSecOps is a cross-team collaboration framework that integrates security into DevOps processes from the start rather than waiting to address security in a separate silo. DevOps has gained ground in recent years as a way to combine key operational principles with development cycles, recognizing that these two processes must coexist.

DevOps

DevOps Best Practices Open Source Tuning

Kubernetes made simple? Kelsey Hightower and Andreas Grabner discuss the future of cloud-native technologies

Dynatrace

FEBRUARY 17, 2022

So when people hear me explain things, it’s this process of convincing myself I completely understand it.” There are stakeholders, dependencies, and end-users throughout the process. Infrastructure as code vs infrastructure as data. As a result, teams can automate more processes in their software development lifecycle.

Technology

Technology Technology Cloud Infrastructure

DevSecOps: Recent experiences in field of Federal & Government

Dynatrace

MAY 15, 2020

This is especially true when we consider the explosive growth of cloud and container environments, where containers are orchestrated and infrastructure is software defined, meaning even the simplest of environments move at speeds beyond manual control, and beyond the speed of legacy security practices. And this poses a significant risk.

Government

Government DevOps Infrastructure Network

Why log monitoring and log analytics matter in a hyperscale world

Dynatrace

NOVEMBER 15, 2021

Logs can include data about user inputs, system processes, and hardware states. Log monitoring is a process by which developers and administrators continuously observe logs as they’re being recorded. Log analytics is the process of evaluating and interpreting log data so teams can quickly detect and resolve issues.

Analytics

Analytics Monitoring DevOps Artificial Intelligence

Flexible, scalable, self-service Kubernetes native observability now in General Availability

Dynatrace

MAY 17, 2022

To solve this problem, Dynatrace offers a fully automated approach to infrastructure and application observability including Kubernetes control plane, deployments, pods, nodes, and a wide array of cloud-native technologies. None of this complexity is exposed to application and infrastructure teams. A look to the future.

Availability

Availability Scalability Cloud Metrics

Conducting log analysis with an observability platform and full data context

Dynatrace

APRIL 20, 2023

With more automated approaches to log monitoring and log analysis, however, organizations can gain visibility into their applications and infrastructure efficiently and with greater precision—even as cloud environments grow. They enable IT teams to identify and address the precise cause of application and infrastructure issues.

Analytics

Analytics Infrastructure Storage Architecture

Using Dynatrace to master the 5 pillars of the AWS Well-Architected Framework (Part 1)

Dynatrace

NOVEMBER 24, 2020

Tracking changes to automated processes, including auditing impacts to the system, and reverting to the previous environment states seamlessly. The ultimate goal of each of these reviews is to identify gaps, quantify risk, and develop recommendations for improving the team, processes, and architecture with each of the five pillars.

AWS

AWS Artificial Intelligence Best Practices Lambda

Applying Netflix DevOps Patterns to Windows

The Netflix TechBlog

AUGUST 22, 2019

Baking Windows with Packer. By Justin Phelps and Manuel Correa. Customizing Windows images at Netflix was a manual, error-prone, and time-consuming process. We looked at our process for creating a Windows AMI and discovered it was error-prone and full of toil. Last year, we decided to improve the AMI baking process.

DevOps

DevOps AWS Tuning Infrastructure

Auto-Diagnosis and Remediation in Netflix Data Platform

The Netflix TechBlog

JANUARY 13, 2022

Pensive infrastructure comprises two separate systems to support batch and streaming workloads. This blog will explore these two systems and how they perform auto-diagnosis and remediation across our Big Data Platform and Real-time infrastructure. In the future, we are looking to automate this process.

Big Data

Big Data Infrastructure Metrics Games

Software intelligence as code enables tailored observability, AIOps, and application security at scale

Dynatrace

FEBRUARY 9, 2022

More recently, teams have begun to apply DevOps best practices to infrastructure automation, giving developers a more active role with GitOps as an operational framework. Key components of GitOps are declarative infrastructure as code, orchestration, and observability.

Code

Code Software Software DevOps

OneAgent for Linux on IBM Z (General Availability)

Dynatrace

NOVEMBER 20, 2019

At Dynatrace, where we provide a software intelligence platform for hybrid environments (from infrastructure to cloud) we see a growing need to measure how mainframe architecture and the services running on it contribute to the overall performance and availability of applications. Network metrics are also collected for detected processes.

Availability

Availability Hardware Java Tuning

Comparing PostgreSQL DigitalOcean Performance & Pricing – ScaleGrid vs. DigitalOcean Managed Databases

Scalegrid

JUNE 4, 2020

As an open source database, it’s a highly popular choice for enterprise applications looking to modernize their infrastructure and reduce their total cost of ownership, along with startup and developer applications looking for a powerful, flexible and cost-effective database to work with. PostgreSQL Configuration Management & Tuning.

Database

Database Latency Benchmarking Performance

Best Practices for Scaling RabbitMQ

Scalegrid

FEBRUARY 24, 2025

Proper setup involves creating a configuration process that accounts for hostname changes, which could prevent nodes from rejoining the cluster. Message load balancing guarantees that messages are processed evenly across different queues and nodes within the RabbitMQ system. Erlang is the backbone of RabbitMQ clustering.

Best Practices

Best Practices Traffic Strategy Efficiency

Beyond DORA compliance: How Dynatrace helps the financial sector stay resilient

Dynatrace

JULY 30, 2024

Complexity of digital ecosystems. Pain point: Financial services operate in complex environments with numerous applications, hybrid cloud infrastructures, and third-party vendors. This complexity increases cybersecurity risks and complicates governance.

Government

Government Innovation Analytics Open Source

Introducing Netflix’s Key-Value Data Abstraction Layer

The Netflix TechBlog

SEPTEMBER 18, 2024

Vidhya Arvind, Rajasekhar Ummadisetty, Joey Lynch, Vinay Chella. Introduction: At Netflix, our ability to deliver seamless, high-quality streaming experiences to millions of users hinges on robust, global backend infrastructure. As the consumer processes these results, the system tracks the number of items consumed and the total size used.

Latency

Latency Storage Cache Efficiency

Automated observability, security, and reliability at scale

Dynatrace

JULY 18, 2023

As software development grows more complex, managing components using an automated onboarding process becomes increasingly important. The validation process is automated based on events that occur, while the objectives’ configuration, which is validated by the Site Reliability Guardian, is stored in a separate file.

Best Practices

Best Practices Code Infrastructure Latency

PostgreSQL vs. Oracle: Difference in Costs, Ease of Use & Functionality

Scalegrid

JULY 13, 2020

Compare ease of use across compatibility, extensions, tuning, operating systems, languages and support providers. of PostgreSQL users are currently in the process of migrating to the RDBMS, according to the 2019 PostgreSQL Trends Report, an astounding percentage considering this is the 4th most popular database in the world.

Open Source

Open Source Tuning C++ Database

5 SRE best practices you can implement today

Dynatrace

JULY 6, 2022

Organizations that have achieved SRE maturity have a better handle on the state of their infrastructure, the ability to tie reliability metrics more tightly to business objectives, and the means to ensure a consistent and responsive customer experience. Design, implement, and tune effective SLOs.

Best Practices

Best Practices Open Source Tuning Infrastructure

Dynatrace joins the OpenTelemetry project

Dynatrace

JUNE 13, 2019

The Dynatrace platform automatically integrates OpenTelemetry data, thereby providing the highest possible scalability, enterprise manageability, seamless processing of data, and, most importantly, the best analytics through Davis (our AI-driven analytics engine), and automation support available. What Dynatrace will contribute.

IoT

IoT C++ Analytics Java

OpenTelemetry 101: A nontechnical guide for IT leaders and enthusiasts

Dynatrace

JULY 22, 2024

Using OpenTelemetry, developers can collect and process telemetry data from applications, services, and systems. Text-based records of events and activities generated by applications and infrastructure components. To understand what this means, let’s first look at two of the core concepts: observability and telemetry.

Latency

Latency Best Practices Metrics Open Source

New Prometheus-based extensions enable intelligent observability for more than 200 additional technologies

Dynatrace

FEBRUARY 14, 2022

Among these, you can find essential elements of application and infrastructure stacks, from app gateways (like HAProxy), through app fabric (like RabbitMQ), to databases (like MongoDB) and storage systems (like NetApp, Consul, Memcached, and InfluxDB, just to name a few). Many technologies expose their metrics in the Prometheus data format.

Technology

Technology Technology Metrics Infrastructure

Auto-adaptive thresholds for AI-driven quality gating

Dynatrace

JUNE 4, 2024

This process, known as auto-adaptive thresholding, eliminates the need to define a static threshold upfront. While platform engineers can build and prepare the necessary infrastructure and templates for self-adoption, developers must still provide some customization.

Metrics

Metrics Engineering Code Tuning

What is serverless computing? Driving efficiency without sacrificing observability

Dynatrace

JANUARY 26, 2021

REST APIs, authentication, databases, email, and video processing all have a home on serverless platforms. The Serverless Process. Cloud-hosted managed services eliminate the minute day-to-day tasks associated with hosting IT infrastructure on-premises. The average request is handled, processed, and returned quickly.

Serverless

Serverless Efficiency Lambda AWS

Keeping data in India with AI-powered observability operated on AWS Mumbai

Dynatrace

NOVEMBER 6, 2023

Recently, the Parliament of India released the Digital Personal Data Protection Act 2023, which regulates the processing of digital personal data in India and recognizes the right of individuals to protect their data in India. Dynatrace is already supported in 17 local regions on three hyperscalers (AWS, Azure, and GCP).

AWS

AWS Azure Tuning Analytics

Ensure safe and secure releases at scale by providing Golden Paths

Dynatrace

NOVEMBER 14, 2023

Golden Paths for rapid product development. Modern software development aims to streamline development and delivery processes to ensure fast releases to the market without violating quality and security standards. After completing this two-step process, a ready-to-use guardian is created.

Best Practices

Best Practices Government DevOps Engineering

Introducing Netflix TimeSeries Data Abstraction Layer

The Netflix TechBlog

OCTOBER 8, 2024

The Key-Value Abstraction offers a flexible, scalable solution for storing and accessing structured key-value data, while the Data Gateway Platform provides essential infrastructure for protecting, configuring, and deploying the data tier. Let’s dive into the various aspects of this abstraction.

Latency

Latency Storage Traffic Tuning

Build and operate multicloud FaaS with enhanced, intelligent end-to-end observability

Dynatrace

APRIL 25, 2023

These functions are executed by a serverless platform or provider (such as AWS Lambda, Azure Functions or Google Cloud Functions) that manages the underlying infrastructure, scaling and billing. Enable faster development and deployment cycles by abstracting away the infrastructure complexity.

Serverless

Serverless Lambda Azure AWS

Dynatrace innovates again with the release of topology-driven auto-adaptive metric baselines

Dynatrace

JULY 23, 2020

Today we’re happy to announce that, with the release of Dynatrace version 1.198 (SaaS and Managed), auto-adaptive baseline extends beyond application performance (APM) metrics to include thousands of infrastructure and cloud metrics as well. AI-powered root cause analysis via auto-adaptive baselines for more than classic APM metrics.

Metrics

Metrics Innovation Strategy Monitoring
