Engineering, Monitoring and Software Engineering

Error Monitoring vs Defect Monitoring: Key Differences

DZone

AUGUST 30, 2021

Identifying defects and troubleshooting their root causes is one of the important but painful tasks in software engineering, and it is essential to maintaining good-quality software. To improve MTTR, software developers use application monitoring tools.

Monitoring

Monitoring Software Engineering Engineering Software

SRE Best Practices for Java Applications

DZone

MARCH 12, 2025

Site reliability engineering (SRE) plays a vital role in ensuring Java applications' high availability, performance, and scalability. This discipline merges software engineering and operations, aiming to create a robust infrastructure that supports seamless user experiences.

Best Practices

Best Practices Java Software Engineering Scalability

Unlock the Power of DevSecOps with Newly Released Kubernetes Experience for Platform Engineering

Dynatrace

NOVEMBER 7, 2023

Platform engineering is on the rise. According to leading analyst firm Gartner, “80% of software engineering organizations will establish platform teams as internal providers of reusable services, components, and tools for application delivery…” by 2026. All important health signals are highlighted.

Engineering

Engineering DevOps Best Practices Infrastructure

How platform engineering and IDP observability can accelerate developer velocity

Dynatrace

MARCH 6, 2024

As organizations look to expand DevOps maturity, improve operational efficiency, and increase developer velocity, they are embracing platform engineering as a key driver. Platform engineering: Build for self-service. Self-service deployment is a key attribute of platform engineering. “It makes them more productive.”

Engineering

Engineering Development DevOps Infrastructure

Site reliability engineering: 5 things you need to know

Dynatrace

FEBRUARY 4, 2021

What is site reliability engineering? Site reliability engineering (SRE) is the practice of applying software engineering principles to operations and infrastructure processes to help organizations create highly reliable and scalable software systems. SRE focuses on automation.

Engineering

Engineering DevOps Government Latency

Automating Success: Building a better developer experience with platform engineering

Dynatrace

FEBRUARY 12, 2024

When it comes to platform engineering, not only does observability play a vital role in the success of organizations’ transformation journeys—it’s key to successful platform engineering initiatives. The various presenters in this session aligned platform engineering use cases with the software development lifecycle.

Engineering

Engineering Development Infrastructure Cloud

Site Reliability Engineering

DZone

JANUARY 19, 2024

In the dynamic world of online services, the concept of site reliability engineering (SRE) has risen as a pivotal discipline, ensuring that large-scale systems maintain their performance and reliability.

Engineering

Engineering Tuning Software Engineering Internet

Site reliability engineering: 5 things you need to know

Dynatrace

FEBRUARY 4, 2021

Site reliability engineering (SRE) is the practice of applying software engineering principles to operations and infrastructure processes to help organizations create highly reliable and scalable software systems. According to Google, “SRE is what you get when you treat operations as a software problem.”

Engineering

Engineering DevOps Government Latency

The state of site reliability engineering: SRE challenges and best practices in 2023

Dynatrace

NOVEMBER 14, 2023

Site reliability engineering (SRE) has become increasingly important to organizations looking to keep up with the rapid pace of digital transformation. Effective site reliability engineering requires enterprise-wide transformation. Without a unified understanding of SRE practices, organizational silos can quickly form between departments.

Best Practices

Best Practices Engineering DevOps Software Engineering

Kubernetes Observability: Lessons Learned From Running Kubernetes in Production

DZone

OCTOBER 1, 2024

In recent years, observability has re-emerged as a critical aspect of DevOps and software engineering in general, driven by the growing complexity and scale of modern, cloud-native applications.

Software Engineering

Software Engineering DevOps Cloud Architecture

Title Launch Observability at Netflix Scale

The Netflix TechBlog

MARCH 4, 2025

This standardization enhances adoption within the personalization stack, simplifies the system, and improves understanding and debuggability for engineers. They must also provide enough information for partner engineers to identify the problem with the underlying service in cases of system-level issues. there is a dedicated collector.

Traffic

Traffic Strategy Entertainment Innovation

AWS observability: AWS monitoring best practices for resiliency

Dynatrace

NOVEMBER 22, 2021

These resources generate vast amounts of data in various locations, including containers, which can be virtual and ephemeral, thus more difficult to monitor. These challenges make AWS observability a key practice for building and monitoring cloud-native applications. AWS monitoring best practices. Automate monitoring tasks.

Best Practices

Best Practices AWS Monitoring Serverless

5 powerful use cases beyond debugging for Dynatrace Live Debugger

Dynatrace

MARCH 25, 2025

Following are some of the coolest things we’ve seen engineers do with Live Debugger. Performance benchmarking: Performance benchmarking is one of the unresolved mysteries of software engineering. Maybe you want to monitor performance under different system loads. In many ways, it’s more of an art than a science.

Benchmarking

Benchmarking Code Open Source Engineering

SRE Opportunities Grow as Businesses Reopen and Reacclimate

DZone

AUGUST 4, 2021

Take one look at LinkedIn right now, and you’ll notice some of the most in-demand jobs include application developers and software engineers. After a deeper dive, you’ll find many companies across multiple industries are looking for site reliability engineers or SREs.

Software Engineering

Software Engineering Speed Engineering Monitoring

Open-Sourcing a Monitoring GUI for Metaflow

The Netflix TechBlog

OCTOBER 27, 2021

tl;dr: Today, we are open-sourcing a long-awaited GUI for Metaflow, Netflix’s ML platform. The Metaflow GUI allows data scientists to monitor their workflows in real time, track experiments, and see detailed logs and results for every executed task.

Open Source

Open Source Monitoring Scalability Code

Software engineering for machine learning: a case study

The Morning Paper

JULY 7, 2019

Software engineering for machine learning: a case study, Amershi et al., ICSE’19. More specifically, we’ll be looking at the results of an internal study with over 500 participants designed to figure out how product development and software engineering are changing at Microsoft with the rise of AI and ML.

Software Engineering

Software Engineering Engineering Software Software

A New Era Has Come, and So Must Your Database Observability

DZone

SEPTEMBER 28, 2023

Software engineers didn’t need to understand the database, and even if they owned it, it was just a single component of the system. Guaranteeing software quality was much easier because the deployment happened rarely, and things could be captured on time via automated tests. Reasoning about applications is now much harder.

Database

Database Software Engineering Software Software

Up your quality and agility factor – using automation to build “performance-as-a-self-service”

Dynatrace

MARCH 3, 2020

For software engineering teams, this demand means not only delivering new features faster but ensuring quality, performance, and scalability too. One way to apply improvements is transforming the way application performance engineering and testing is done. Here is the definition of this model: … Try it today using Keptn.

Performance

Performance Education Innovation Software Architecture

How to leverage mobile analytics to ensure crash-free, five-star mobile applications

Dynatrace

JUNE 28, 2022

After investigating, the software engineering team discovered that it wasn’t leveraging application performance monitoring (APM) tooling data to its full potential. First, the organization assembled a group of motivated SMEs, including members of the product team as well as iOS and Android engineers.

Mobile

Mobile Analytics Best Practices Software Engineering

Auto-adaptive thresholds for AI-driven quality gating

Dynatrace

JUNE 4, 2024

Build an umbrella for Development and Operations. In modern software engineering, the discipline of platform engineering delivers DevSecOps practices to developers to bridge the gaps between development, security, and operations and enhance the developer experience. However, other data formats, like logs, can also be employed.

Metrics

Metrics Engineering Code Tuning

What is DevOps orchestration? And why invest in orchestration tools?

Dynatrace

DECEMBER 5, 2022

Cloud providers enable faster delivery of new services but require new practices, including a need for closely monitoring costs. One key advantage of this integration is a single point of access to monitoring, logging, and other information needed to keep software development operations running efficiently.

DevOps

DevOps Virtualization Best Practices Innovation

How Red Hat and Dynatrace intelligently automate your production environment

Dynatrace

MAY 6, 2024

Problem remediation is too time-consuming. According to the DevOps Automation Pulse Survey 2023, on average, a software engineer takes nine hours to remediate a problem within a production application. With that, software engineers, SREs, and DevOps can define a broad automation and remediation mapping.

DevOps

DevOps Software Engineering Games Java

Bringing AV1 Streaming to Netflix Members’ TVs

The Netflix TechBlog

NOVEMBER 9, 2021

The Client and UI Engineering team built a certification test with these streams to analyze both the device logs as well as the pictures rendered on the screen. The Performance Engineering team specializes in optimizing resource utilization at Netflix. Challenge 4: How do we continuously monitor AV1 streaming?

Media

Media Open Source Software Engineering Efficiency

Architected for resiliency: How Dynatrace withstands data center outages

Dynatrace

JUNE 15, 2021

The email walked through how our Dynatrace self-monitoring notified users of the outage but automatically remediated the problem thanks to our platform’s architecture. There are several ways Dynatrace monitors and alerts on the impact of service disruption. Ready to learn more? Then read on! Fact #1: AWS EC2 outage properly documented.

AWS

AWS Traffic Architecture Azure

Connect your software with the right people: Ownership drives effective collaboration

Dynatrace

MARCH 28, 2023

To address this need, Dynatrace now provides automation for DevSecOps collaboration that associates ownership information with monitored services to further minimize mean-time-to-restore (MTTR). Associating ownership-team details with monitored services is flexible. team structure, or links to external resources such as a wiki.

Software

Software Software Monitoring Software Engineering

Site reliability done right: 5 SRE best practices that deliver on business objectives

Dynatrace

MAY 31, 2023

Site reliability engineering (SRE) has become a critical discipline in recent years as the world has shifted in favor of web-based interactions. This shift is leading more organizations to hire site reliability engineers to guarantee the reliability and resiliency of their services. Mobile retail e-commerce spending in the U.

Best Practices

Best Practices DevOps Latency Metrics

Revolutionizing Observability: How AI-Driven Observability Unlocks a New Era of Efficiency

DZone

FEBRUARY 12, 2024

It is a crucial aspect of distributed systems, as it allows stakeholders such as Software Engineers, Site Reliability Engineers, and Product Managers to troubleshoot issues with their service, monitor performance, and gain insights into the software system's behavior.

Efficiency

Efficiency Software Engineering Monitoring Metrics

Dynatrace extends distributed tracing for serverless on AWS Lambda (GA)

Dynatrace

JANUARY 19, 2021

The Dynatrace AI engine, Davis, automatically Other tools in the market for monitoring AWS Lambda traces can’t deliver real end-to-end visibility from the end-user perspective across all moving – Robert Trueman, Head of Software Engineering at CDL. Real User Monitoring. potential gaps and blind spots make

Lambda

Lambda Serverless AWS Mobile

Conducting log analysis with an observability platform and full data context

Dynatrace

APRIL 20, 2023

With more automated approaches to log monitoring and log analysis, however, organizations can gain visibility into their applications and infrastructure efficiently and with greater precision—even as cloud environments grow. “It’s quite a big scale,” said an engineer at the financial services group.

Analytics

Analytics Infrastructure Storage Architecture

The 737Max and Why Software Engineers Might Want to Pay Attention

J. Paul Reed

MARCH 14, 2019

As someone with a bit of a reputation for talking about aviation and software development and operations, I’ve been asked about the 737Max repeatedly over the past week. To cope, they added additional monitoring and control systems.

Software Engineering

Software Engineering Engineering Software Software

Extend the AI and automation core of Dynatrace with host extensions to resolve infrastructure problems

Dynatrace

MAY 13, 2020

A single instance of OneAgent can handle the monitoring of many types of entities, including servers, applications, services, databases, and more. But what if a particular metric is crucial for your monitoring needs and it isn’t there? Let the Davis AI causation engine analyze additional metrics.

Infrastructure

Infrastructure Metrics Monitoring Software Engineering

Protect your organization against zero-day vulnerabilities

Dynatrace

AUGUST 3, 2022

Techniques such as statistics-based monitoring and behavior-based monitoring are also possible. Statistics-based monitoring is when organizations take statistics from exploits that vendors have detected and feed them into a system to learn and identify these attacks. Application logs are a good data source for this method.

Java

Java Traffic Benchmarking Strategy

Scale DevOps and SRE with open source Keptn

Dynatrace

APRIL 18, 2022

When it comes to site reliability engineering (SRE) initiatives adopting DevOps practices, developers and operations teams frequently find themselves at odds with one another. Teams can plug in new tools for testing, deployment, or even monitoring without having to modify a lot of existing automation tools.

Open Source

Open Source DevOps Cloud Metrics

Dynatrace Perform 2024 Guide: Deriving business value from AI data analysis

Dynatrace

JANUARY 23, 2024

‘Composite’ AI, platform engineering, AI data analysis through custom apps. This focus on data reliability and data quality also highlights the need for organizations to bring a “composite AI” approach to IT operations, security, and DevOps. To learn more about platform engineering, explore the following resources.

Performance

Performance DevOps Innovation Energy

Automated observability, security, and reliability at scale

Dynatrace

JULY 18, 2023

To handle this challenge, enterprises need to automate and streamline the onboarding and lifecycle of tool configurations in the software development processes, including aspects of observability, security, alerting, and remediation. Development teams must set up tailored configurations for each tool and component they’re responsible for.

Best Practices

Best Practices Code Infrastructure Latency

Netflix at AWS re:Invent 2019

The Netflix TechBlog

NOVEMBER 22, 2019

By Shefali Vyas Dalal. AWS re:Invent is a couple of weeks away and our engineers & leaders are thrilled to be in attendance yet again this year! Over the years, this platform took on support for both elastic online services and fully featured batch workloads supporting use cases across Netflix engineering.

AWS

AWS Entertainment Open Source Benchmarking

Application observability meets developer observability: Unlock a 360º view of your environment

Dynatrace

NOVEMBER 6, 2023

In a recent webinar, Dynatrace DevOps activist Andi Grabner and senior software engineer Yarden Laifenfeld explored developer observability. Why is developer observability important for engineers? With traditional monitoring tools, the granular data that developers require typically involves manual preparation.

Development

Development DevOps Programming Cloud

Autonomous Cloud Enablement aka Scaling NoOps via Self-Service

Dynatrace

FEBRUARY 6, 2020

A key to success from the start was that not only did we build Dynatrace, but Anita’s team was also always “Customer 0” of Dynatrace, because clearly we were in need of a world-class monitoring platform that gave us visibility into our deployments in dev, staging, and production. Wave two: NoOps to ensure stability!

Cloud

Cloud DevOps Engineering Speed

What is application security? And why it needs a new approach

Dynatrace

MARCH 17, 2021

Application security is a software engineering term that refers to several different types of security practices designed to ensure applications do not contain vulnerabilities that could allow illicit access to sensitive data, unauthorized code modification, or resource hijacking. So, why is all this important?

Open Source

Open Source Cloud Games Java

OpenTelemetry observability and Dynatrace deliver actionable answers at scale

Dynatrace

AUGUST 21, 2023

If a microservice falls in the forest and all your monitoring solutions report it differently, can operators accurately trace what happened and automate a response? Different monitoring point solutions, such as Jaeger, Zipkin, Logstash, Fluentd, and StatsD, each have their own way of observing and recording such an event.

Open Source

Open Source Analytics Lambda Metrics

Escaping POC Purgatory: Evaluation-Driven Development for AI Systems

O'Reilly

MARCH 25, 2025

We’ve seen this across dozens of companies, and the teams that break out of this trap all adopt some version of Evaluation-Driven Development (EDD), where testing, monitoring, and evaluation drive every decision from the start. One of the great aspects of the AI Age is that more people will be able to build software.

Systems

Systems Development Tuning Monitoring

Best PostgreSQL GUI [2024]

Scalegrid

OCTOBER 18, 2024

A dashboard for monitoring activities such as database locks, connected sessions, and prepared transactions for multiple servers. They come with features such as query analysis, performance monitoring, and advanced SQL refactoring capabilities.

Open Source

Open Source Database Cloud Operating System

DevOps observability: A guide for DevOps and DevSecOps teams

Dynatrace

JANUARY 18, 2023

From site reliability engineering to service-level objectives and DevSecOps, these resources focus on how organizations are using these best practices to innovate at speed without sacrificing quality, reliability, or security. SRE applies software engineering principles to operations and infrastructure processes.

DevOps

DevOps Best Practices Innovation Strategy

The Show Must Go On: Securing Netflix Studios At Scale

The Netflix TechBlog

SEPTEMBER 13, 2021

Supporting developers through those checklists for edge cases, and then validating that each team’s choices resulted in an architecture with all the desired security properties, was similarly not scalable for our security engineers. Netflix engineers talk a lot about the concept of a “Paved Road”.

Internet

Internet Internet Cloud Traffic
