Availability, Infrastructure and Software Engineering

SRE Best Practices for Java Applications

DZone

MARCH 12, 2025

Site reliability engineering (SRE) plays a vital role in ensuring Java applications' high availability, performance, and scalability. This discipline merges software engineering and operations, aiming to create a robust infrastructure that supports seamless user experiences.

Best Practices

Best Practices Java Software Engineering Scalability

Extend the AI and automation core of Dynatrace with host extensions to resolve infrastructure problems

Dynatrace

MAY 13, 2020

All the data bound to hosts is analyzed by the Davis AI causation engine and made available on custom dashboards and events pages. One of our software engineers, Tomasz Gajger, has been involved in a research project related to GPU performance analysis. Example 1: Gain visibility into your NVIDIA GPUs. What’s next.

Infrastructure

Infrastructure Metrics Monitoring Software Engineering

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and…

The Netflix TechBlog

MARCH 25, 2019

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and Efficiency. By: Di Lin, Girish Lingappa, Jitender Aswani. Imagine yourself in the role of a data-inspired decision maker staring at a metric on a dashboard about to make a critical business decision but pausing to ask a question: “Can

Infrastructure

Infrastructure Big Data Transportation Architecture

SRE vs DevOps: What you need to know

Dynatrace

FEBRUARY 24, 2021

SRE is the transformation of traditional operations practices by using software engineering and DevOps principles to improve the availability, performance, and scalability of releases by building resiliency into apps and infrastructure. Investing in automation and tooling to avoid toil. SRE vs DevOps? Reduced latency.

DevOps

DevOps Software Engineering Speed Google

Unlock the Power of DevSecOps with Newly Released Kubernetes Experience for Platform Engineering

Dynatrace

NOVEMBER 7, 2023

Platform engineering is on the rise. According to leading analyst firm Gartner, “80% of software engineering organizations will establish platform teams as internal providers of reusable services, components, and tools for application delivery…” by 2026. Automation, automation, automation.

Engineering

Engineering DevOps Best Practices Infrastructure

What is platform engineering?

Dynatrace

NOVEMBER 3, 2023

Platform engineering is a practice that outlines how development teams build internal platforms to create self-service capabilities for software engineering teams. The result is a cloud-native approach to software delivery. Platform engineering cannot stand alone, however.

Engineering

Engineering DevOps Software Engineering Scalability

How platform engineering and IDP observability can accelerate developer velocity

Dynatrace

MARCH 6, 2024

Platform engineering creates and manages a shared infrastructure and set of tools, such as internal developer platforms (IDPs), to enable software developers to build, deploy, and operate applications more efficiently. “That means making it available, resilient, and secure,” Grabner said.

Engineering

Engineering Development DevOps Infrastructure

How Red Hat and Dynatrace intelligently automate your production environment

Dynatrace

MAY 6, 2024

Problem remediation is too time-consuming. According to the DevOps Automation Pulse Survey 2023, on average, a software engineer takes nine hours to remediate a problem within a production application. With that, software engineers, SREs, and DevOps can define a broad automation and remediation mapping.

DevOps

DevOps Software Engineering Games Java

Site reliability engineering: 5 things you need to know

Dynatrace

FEBRUARY 4, 2021

Site reliability engineering (SRE) is the practice of applying software engineering principles to operations and infrastructure processes to help organizations create highly reliable and scalable software systems. SRE focuses on automation.

Engineering

Engineering DevOps Government Latency

Software engineering for machine learning: a case study

The Morning Paper

JULY 7, 2019

Software engineering for machine learning: a case study Amershi et al., More specifically, we’ll be looking at the results of an internal study with over 500 participants designed to figure out how product development and software engineering is changing at Microsoft with the rise of AI and ML. ICSE’19.

Software Engineering

Software Engineering Engineering Software Software

Site reliability done right: 5 SRE best practices that deliver on business objectives

Dynatrace

MAY 31, 2023

Keeping pace with modern digital transformation requires ensuring that applications are responsive, resilient, and always available amid increased complexity. Site reliability engineering (SRE) has become a critical discipline in recent years as the world has shifted in favor of web-based interactions.

Best Practices

Best Practices DevOps Latency Metrics

Why applying chaos engineering to data-intensive applications matters

Dynatrace

MAY 23, 2024

Stream processing: One approach to such a challenging scenario is stream processing, a computing paradigm and software architectural style for data-intensive software systems that emerged to cope with requirements for near real-time processing of massive amounts of data.

Engineering

Engineering Tuning Latency Open Source

Auto-adaptive thresholds for AI-driven quality gating

Dynatrace

JUNE 4, 2024

Build an umbrella for Development and Operations: In modern software engineering, the discipline of platform engineering delivers DevSecOps practices to developers to bridge the gaps between development, security, and operations and enhance the developer experience. For full details, see Dynatrace Documentation.

Metrics

Metrics Engineering Code Tuning

Automated observability, security, and reliability at scale

Dynatrace

JULY 18, 2023

While infrastructure has historically been treated as a bottleneck where proper scaling and compute power are applied to improve performance, these aspects are now typically addressed by hyperscalers that offer cloud-based infrastructure and infrastructure as a service.

Best Practices

Best Practices Code Infrastructure Latency

Open-Sourcing Metaflow, a Human-Centric Framework for Data Science

The Netflix TechBlog

DECEMBER 3, 2019

About two years ago, we, at our newly formed Machine Learning Infrastructure team started asking our data scientists a question: “What is the hardest thing for you as a data scientist at Netflix?” mainly because of mundane reasons related to software engineering. like they would do in a Jupyter notebook.

Open Source

Open Source AWS Infrastructure Energy

Connect your software with the right people: Ownership drives effective collaboration

Dynatrace

MARCH 28, 2023

All such automation is available while your environment is continuously enriched with additional contextual information that connects the responsible teams with your software development process. Associated ownership information is available on each entity page. Assignment of vulnerabilities to the responsible team members.

Software

Software Software Monitoring Software Engineering

Automating Success: Building a better developer experience with platform engineering

Dynatrace

FEBRUARY 12, 2024

Check out the following use cases to learn how to drive innovation from development to production efficiently and securely with platform engineering observability. According to Deinhammer, once software is released into production, it’s usually deployed at a large scale—often in complex multicloud or hybrid setups.

Engineering

Engineering Development Infrastructure Cloud

Netflix at AWS re:Invent 2019

The Netflix TechBlog

NOVEMBER 22, 2019

December 2, 1pm-2pm. CMP 326-R: Capacity Management Made Easy with Amazon EC2 Auto Scaling. Vadim Filanovsky, Senior Performance Engineer & Anoop Kapoor, AWS. Abstract: Amazon EC2 Auto Scaling offers a hands-free capacity management experience to help customers maintain a healthy fleet, improve application availability, and reduce costs.

AWS

AWS Entertainment Open Source Benchmarking

Demystifying Interviewing for Backend Engineers @ Netflix

The Netflix TechBlog

FEBRUARY 1, 2022

It’s also a great opportunity for you to learn more about the available roles, the technical challenges the teams are facing and what it’s like to work on a backend engineering team at Netflix. If you want to practice, focus on medium-difficulty real-world problems you might encounter in a software engineering role.

Engineering

Engineering Games Entertainment Innovation

Data Engineers of Netflix: Interview with Pallavi Phadnis

The Netflix TechBlog

OCTOBER 28, 2021

Interview with Pallavi Phadnis. This post is part of our “Data Engineers of Netflix” series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix. Pallavi Phadnis is a Senior Software Engineer at Netflix. Pallavi, what’s your journey to data engineering at Netflix?

Data Engineering

Data Engineering Engineering Big Data Software Engineering

The State of DevOps Automation assessment: How automated are you?

Dynatrace

APRIL 22, 2024

In response to the scale and complexity of modern cloud-native technology, organizations are increasingly reliant on automation to properly manage their infrastructure and workflows. Operations automation: The operations section addresses the level of automation organizations use in maintaining and managing existing software.

DevOps

DevOps Artificial Intelligence Government Innovation

ConsoleMe: A Central Control Plane for AWS Permissions and Access

The Netflix TechBlog

MARCH 10, 2021

Motivation: Growth in the cloud has exploded, and it is now easier than ever to create infrastructure on the fly. Groups beyond software engineering teams are standing up their own systems and automation. At many companies, managing cloud hygiene and security usually falls under the infrastructure or security teams.

AWS

AWS Cloud Games Infrastructure

All of Netflix’s HDR video streaming is now dynamically optimized

The Netflix TechBlog

NOVEMBER 29, 2023

HDR was launched at Netflix in 2016 and the number of titles available in HDR has been growing ever since. Join us and be a part of the amazing team that brought you this tech-blog; open positions: Software Engineer, Cloud Gaming Software Engineer, Live Streaming References [1] L. Krasula, A. Choudhury, S.

Open Source

Open Source Software Engineering Internet Internet

Application observability meets developer observability: Unlock a 360º view of your environment

Dynatrace

NOVEMBER 6, 2023

In a recent webinar, Dynatrace DevOps activist Andi Grabner and senior software engineer Yarden Laifenfeld explored developer observability. They also care about infrastructure: SREs require system visibility and incident management. Dynatrace enables teams to specify SLOs, such as latency, uptime, availability, and more.

Development

Development DevOps Programming Cloud

MLOps and DevOps: Why Data Makes It Different

O'Reilly

OCTOBER 19, 2021

As with many burgeoning fields and disciplines, we don’t yet have a shared canonical infrastructure stack or best practices for developing and deploying data-intensive applications. In effect, the engineer designs and builds the world wherein the software operates. What: The Modern Stack of ML Infrastructure.

DevOps

DevOps Software Engineering Open Source Infrastructure

Dynatrace extends distributed tracing for serverless on AWS Lambda (GA)

Dynatrace

JANUARY 19, 2021

Serverless architectures help developers innovate more efficiently and effectively by removing the burden of managing underlying infrastructure. – Robert Trueman, Head of Software Engineering at CDL. available for quite some time already. available due to an attached OneAgent extension, we’ve added a dedicated

Lambda

Lambda Serverless AWS Mobile

Experimentation is a major focus of Data Science across Netflix

The Netflix TechBlog

JANUARY 11, 2022

In the absence of true A/B testing, our analyses rely on using the available data to adjust away spurious correlations between the treatment and the outcome metrics. But how well we can do so depends on whether the available data is sufficient to account for all such correlations. without having to become software engineers themselves.

Innovation

Innovation Metrics Engineering Testing

Dynatrace Perform 2024 Guide: Deriving business value from AI data analysis

Dynatrace

JANUARY 23, 2024

Another key theme at Dynatrace Perform 2024 is organizations’ growing adoption of platform engineering, which helps accelerate the delivery of software applications. Platform engineering improves developer productivity by providing self-service capabilities with automated infrastructure operations.

Performance

Performance DevOps Innovation Artificial Intelligence

Sponsored Post: Software Buyers Council, InMemory.Net, Triplebyte, Etleap, Stream, Scalyr

High Scalability

FEBRUARY 19, 2019

Triplebyte lets exceptional software engineers skip screening steps at hundreds of top tech companies like Apple, Dropbox, Mixpanel, and Instacart. Join Etleap, an Amazon Redshift ETL tool, to learn the latest trends in designing a modern analytics infrastructure. Who's Hiring? Make your job search O(1), not O(n).

Software

Software Software Analytics Infrastructure

Sponsored Post: Software Buyers Council, InMemory.Net, Triplebyte, Etleap, Stream, Scalyr

High Scalability

FEBRUARY 5, 2019

Triplebyte lets exceptional software engineers skip screening steps at hundreds of top tech companies like Apple, Dropbox, Mixpanel, and Instacart. Shape the future of software in your industry. Receive occasional invitations to chat with for 30 minutes about your area of expertise and software usage. Who's Hiring?

Software

Software Software Infrastructure Metrics

Sponsored Post: Software Buyers Council, InMemory.Net, Triplebyte, Etleap, Stream, Scalyr

High Scalability

MARCH 19, 2019

Triplebyte lets exceptional software engineers skip screening steps at hundreds of top tech companies like Apple, Dropbox, Mixpanel, and Instacart. Join Etleap, an Amazon Redshift ETL tool, to learn the latest trends in designing a modern analytics infrastructure. Who's Hiring? Make your job search O(1), not O(n).

Software

Software Software Analytics Infrastructure

DevOps observability: A guide for DevOps and DevSecOps teams

Dynatrace

JANUARY 18, 2023

Site reliability engineering (SRE) is a software operations methodology that enables organizations to create highly reliable and scalable applications. SRE applies software engineering principles to operations and infrastructure processes. Site reliability engineers, or SREs, lead these efforts.

DevOps

DevOps Best Practices Innovation Strategy
