Infrastructure and Software Engineering - Technology Performance Pulse

Sustainability: Thoughts from a software engineer

Dynatrace

MARCH 17, 2025

How to achieve sustainable IT practices: Use observability tools. The first step in driving improvements is to obtain a comprehensive view of your IT infrastructure’s climate impact. Platform engineers can set defaults for development teams, such as the number of replicas a service should have or whether it scales automatically.

Software Engineering

Software Engineering, Engineering, Software

Bringing Software Engineering Rigor to Data

DZone

FEBRUARY 20, 2023

In software engineering, we've learned that building robust and stable applications has a direct correlation with overall organization performance. The data community is striving to incorporate the core concepts of engineering rigor found in software communities but still has further to go. Posted with permission.

Software Engineering

Software Engineering, Engineering, Software

SRE Best Practices for Java Applications

DZone

MARCH 12, 2025

Site reliability engineering (SRE) plays a vital role in ensuring Java applications' high availability, performance, and scalability. This discipline merges software engineering and operations, aiming to create a robust infrastructure that supports seamless user experiences.

Best Practices

Best Practices, Java, Software Engineering, Scalability

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and…

The Netflix TechBlog

MARCH 25, 2019

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and Efficiency. By: Di Lin, Girish Lingappa, Jitender Aswani. Imagine yourself in the role of a data-inspired decision maker staring at a metric on a dashboard about to make a critical business decision but pausing to ask a question: “Can come join us.

Infrastructure

Infrastructure, Big Data, Transportation, Architecture

Extend the AI and automation core of Dynatrace with host extensions to resolve infrastructure problems

Dynatrace

MAY 13, 2020

All the data bound to hosts is analyzed by the Davis AI causation engine and made available on custom dashboards and events pages. One of our software engineers, Tomasz Gajger, has been involved in a research project related to GPU performance analysis. Looking for ways to solve some of your infrastructure-related problems?

Infrastructure

Infrastructure, Metrics, Monitoring, Software Engineering

How to Prepare for Your DevOps Interview

DZone

SEPTEMBER 5, 2019

Over the past decade, DevOps has emerged as a new tech culture and career that marries the rapid iteration desired by software development with the rock-solid stability of the infrastructure operations team. How do you ace your DevOps interview?

DevOps

DevOps, Software Engineering, Infrastructure, Engineering

Podcast: Interview with Software Engineering Daily

Sutter's Mill

JUNE 7, 2024

Also in April, I was interviewed by Jordi Mon Companys for Software Engineering Daily, and that interview was just published on the SE Daily podcast. government recently released a report calling on the technical community to proactively reduce the attack surface area of software infrastructure.

Software Engineering

Software Engineering, Engineering, Software

What is platform engineering?

Dynatrace

NOVEMBER 3, 2023

Platform engineering is a practice that outlines how development teams build internal platforms to create self-service capabilities for software engineering teams. The result is a cloud-native approach to software delivery. Platform engineering cannot stand alone, however.

Engineering

Engineering, DevOps, Software Engineering, Scalability

Key Elements of Site Reliability Engineering (SRE)

DZone

MARCH 14, 2023

Site Reliability Engineering (SRE) is a systematic and data-driven approach to improving the reliability, scalability, and efficiency of systems. It combines principles of software engineering, operations, and quality assurance to ensure that systems meet performance goals and business objectives.

Engineering

Engineering, Software Engineering, Scalability, Efficiency

SRE vs DevOps: What you need to know

Dynatrace

FEBRUARY 24, 2021

SRE is the transformation of traditional operations practices by using software engineering and DevOps principles to improve the availability, performance, and scalability of releases by building resiliency into apps and infrastructure. Investing in automation and tooling to avoid toil. SRE vs DevOps?

DevOps

DevOps, Software Engineering, Speed, Google

How Red Hat and Dynatrace intelligently automate your production environment

Dynatrace

MAY 6, 2024

Problem remediation is too time-consuming. According to the DevOps Automation Pulse Survey 2023, on average, a software engineer takes nine hours to remediate a problem within a production application. With that, software engineers, SREs, and DevOps can define a broad automation and remediation mapping.

DevOps

DevOps, Software Engineering, Games, Java

Unlock the Power of DevSecOps with Newly Released Kubernetes Experience for Platform Engineering

Dynatrace

NOVEMBER 7, 2023

Platform engineering is on the rise. According to leading analyst firm Gartner, “80% of software engineering organizations will establish platform teams as internal providers of reusable services, components, and tools for application delivery…” by 2026. Automation, automation, automation.

Engineering

Engineering, DevOps, Best Practices, Infrastructure

Software engineering for machine learning: a case study

The Morning Paper

JULY 7, 2019

Software engineering for machine learning: a case study, Amershi et al., ICSE’19. More specifically, we’ll be looking at the results of an internal study with over 500 participants designed to figure out how product development and software engineering are changing at Microsoft with the rise of AI and ML.

Software Engineering

Software Engineering, Engineering, Software

How platform engineering and IDP observability can accelerate developer velocity

Dynatrace

MARCH 6, 2024

Platform engineering creates and manages a shared infrastructure and set of tools, such as internal developer platforms (IDPs), to enable software developers to build, deploy, and operate applications more efficiently. “It makes them more productive.

Engineering

Engineering, Development, DevOps, Infrastructure

AWS observability: AWS monitoring best practices for resiliency

Dynatrace

NOVEMBER 22, 2021

Because of its matrix of cloud services across multiple environments, AWS and other multicloud environments can be more difficult to manage and monitor compared with traditional on-premises infrastructure. EC2 is Amazon’s Infrastructure-as-a-service (IaaS) compute platform designed to handle any workload at scale. Amazon EC2.

Best Practices

Best Practices, AWS, Monitoring, Serverless

Conducting log analysis with an observability platform and full data context

Dynatrace

APRIL 20, 2023

With more automated approaches to log monitoring and log analysis, however, organizations can gain visibility into their applications and infrastructure efficiently and with greater precision—even as cloud environments grow. They enable IT teams to identify and address the precise cause of application and infrastructure issues.

Analytics

Analytics, Infrastructure, Storage, Architecture

Site reliability engineering: 5 things you need to know

Dynatrace

FEBRUARY 4, 2021

What is site reliability engineering? Site reliability engineering (SRE) is the practice of applying software engineering principles to operations and infrastructure processes to help organizations create highly reliable and scalable software systems. Dynatrace news.

Engineering

Engineering, DevOps, Government, Latency

Why applying chaos engineering to data-intensive applications matters

Dynatrace

MAY 23, 2024

Stream processing: One approach to such a challenging scenario is stream processing, a computing paradigm and software architectural style for data-intensive software systems that emerged to cope with requirements for near real-time processing of massive amounts of data.

Engineering

Engineering, Tuning, Latency, Open Source

Automating Success: Building a better developer experience with platform engineering

Dynatrace

FEBRUARY 12, 2024

Check out the following use cases to learn how to drive innovation from development to production efficiently and securely with platform engineering observability. According to Deinhammer, once software is released into production, it’s usually deployed at a large scale—often in complex multicloud or hybrid setups.

Engineering

Engineering, Development, Infrastructure, Cloud

Nurturing Design in Your Software Engineering Culture

Strategic Tech

MARCH 16, 2021

There are a few qualities that differentiate average from high performing software engineering organisations. In my experience, the culture is better and the results are better in orgs where engineers and architects obsess over the design of code and architecture. My experience is the opposite.

Software Engineering

Software Engineering, Design, Engineering, Software

Automated observability, security, and reliability at scale

Dynatrace

JULY 18, 2023

While infrastructure has historically been treated as a bottleneck where proper scaling and compute power are applied to improve performance, these aspects are now typically addressed by hyperscalers that offer cloud-based infrastructure and infrastructure as a service.

Best Practices

Best Practices, Code, Infrastructure, Latency

Site Reliability Engineering

DZone

JANUARY 19, 2024

Site Reliability Engineering in Today’s World: Site reliability engineering is an engineering discipline devoted to maintaining and improving the reliability, durability, and performance of large-scale web services.

Engineering

Engineering, Tuning, Software Engineering, Internet

Data Engineers of Netflix: Interview with Pallavi Phadnis

The Netflix TechBlog

OCTOBER 28, 2021

Interview with Pallavi Phadnis. This post is part of our “Data Engineers of Netflix” series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix. Pallavi Phadnis is a Senior Software Engineer at Netflix. Pallavi, what’s your journey to data engineering at Netflix?

Data Engineering

Data Engineering, Engineering, Big Data, Software Engineering

Site reliability engineering: 5 things you need to know

Dynatrace

FEBRUARY 4, 2021

Site reliability engineering (SRE) is the practice of applying software engineering principles to operations and infrastructure processes to help organizations create highly reliable and scalable software systems. Dynatrace news.

Engineering

Engineering, DevOps, Government, Latency

What is DevOps orchestration? And why invest in orchestration tools?

Dynatrace

DECEMBER 5, 2022

Today, DevOps orchestration is necessary to gain a comprehensive view and means of control over infrastructure, services, and software development practices. Orchestration leverages DevOps tools that allow for rapid updates and releases, version control, and other best practices for software engineering.

DevOps

DevOps, Virtualization, Best Practices, Innovation

Auto-adaptive thresholds for AI-driven quality gating

Dynatrace

JUNE 4, 2024

Build an umbrella for Development and Operations: In modern software engineering, the discipline of platform engineering delivers DevSecOps practices to developers to bridge the gaps between development, security, and operations and enhance the developer experience.

Metrics

Metrics, Engineering, Code, Tuning

The State of DevOps Automation assessment: How automated are you?

Dynatrace

APRIL 22, 2024

In response to the scale and complexity of modern cloud-native technology, organizations are increasingly reliant on automation to properly manage their infrastructure and workflows. Operations automation: The operations section addresses the level of automation organizations use in maintaining and managing existing software.

DevOps

DevOps, Artificial Intelligence, Government, Innovation

Scaling Appsec at Netflix (Part 2)

The Netflix TechBlog

JUNE 6, 2022

This approach has also allowed us to build strong relationships with central engineering teams at Netflix (Data Platform, Developer Tools, Cloud Infrastructure, IAM Product Engineering) that will continue to serve as central points of leverage for security in the long term. However, it has not been all sunshine and rainbows.

Software Engineering

Software Engineering, Scalability, Education, Engineering

How and Why the Developer-First Approach Is Changing the Observability Landscape

DZone

DECEMBER 11, 2024

In our quest for greater scalability, resilience, and flexibility within the digital infrastructure of our organization, there has been a strategic pivot away from traditional monolithic application architectures towards embracing modern software engineering practices such as microservices architecture coupled with cloud-native applications.

Development

Development, Software Engineering, Architecture, Scalability

Netflix at AWS re:Invent 2019

The Netflix TechBlog

NOVEMBER 22, 2019

Netflix shares how Amazon EC2 Auto Scaling allows its infrastructure to automatically adapt to changing traffic patterns in order to keep its audience entertained and its costs on target. Instead, we provide them with delightfully usable ML infrastructure that they can use to manage a project’s lifecycle. Wednesday - December

AWS

AWS, Entertainment, Open Source, Benchmarking

Site reliability done right: 5 SRE best practices that deliver on business objectives

Dynatrace

MAY 31, 2023

How site reliability engineering affects organizations’ bottom line: SRE applies the disciplines of software engineering to infrastructure management, both on-premises and in the cloud. However, cloud complexity has made software delivery challenging. But the transition to SRE maturity is not always easy.

Best Practices

Best Practices, DevOps, Latency, Metrics

Demystifying Interviewing for Backend Engineers @ Netflix

The Netflix TechBlog

FEBRUARY 1, 2022

If you want to practice, focus on medium-difficulty real-world problems you might encounter in a software engineering role. Several of our backend engineering teams are searching for our next stunning colleagues. We recommend against interview coding practice puzzle-type exercises, as we don’t ask those types of questions.

Engineering

Engineering, Games, Entertainment, Innovation

MLOps and DevOps: Why Data Makes It Different

O'Reilly

OCTOBER 19, 2021

As with many burgeoning fields and disciplines, we don’t yet have a shared canonical infrastructure stack or best practices for developing and deploying data-intensive applications. In effect, the engineer designs and builds the world wherein the software operates. What: The Modern Stack of ML Infrastructure.

DevOps

DevOps, Software Engineering, Open Source, Infrastructure

The Show Must Go On: Securing Netflix Studios At Scale

The Netflix TechBlog

SEPTEMBER 13, 2021

During our initial consultations, it was clear that developers preferred prioritizing product work over security or infrastructure improvements. The automation of the infrastructure setup, combined with reducing risk enough to streamline security review saves developers days, if not weeks, on each application.

Internet

Internet, Cloud, Traffic

Open-Sourcing Metaflow, a Human-Centric Framework for Data Science

The Netflix TechBlog

DECEMBER 3, 2019

About two years ago, we, at our newly formed Machine Learning Infrastructure team started asking our data scientists a question: “What is the hardest thing for you as a data scientist at Netflix?” mainly because of mundane reasons related to software engineering. like they would do in a Jupyter notebook.

Open Source

Open Source, AWS, Infrastructure, Energy

All of Netflix’s HDR video streaming is now dynamically optimized

The Netflix TechBlog

NOVEMBER 29, 2023

Join us and be a part of the amazing team that brought you this tech-blog; open positions: Software Engineer, Cloud Gaming; Software Engineer, Live Streaming.

Open Source

Open Source, Software Engineering, Internet

Connect your software with the right people: Ownership drives effective collaboration

Dynatrace

MARCH 28, 2023

Any software engineer can search for monitored entities that relate to specific deployments and their respective teams. Infrastructure owners can easily see ownership information and identify areas that aren’t yet owned by a team.

Software

Software, Monitoring, Software Engineering

Designing Instagram

High Scalability

JANUARY 11, 2022

FUN FACT: In this talk, Rodrigo Schmidt, director of engineering at Instagram, talks about the different challenges they have faced in scaling the data infrastructure at Instagram. The streaming data store makes the system extensible to support other use-cases (e.g. media search index, locations search index, and so forth) in the future.

Design

Design, Media, Storage, Logistics

ConsoleMe: A Central Control Plane for AWS Permissions and Access

The Netflix TechBlog

MARCH 10, 2021

Motivation: Growth in the cloud has exploded, and it is now easier than ever to create infrastructure on the fly. Groups beyond software engineering teams are standing up their own systems and automation. At many companies, managing cloud hygiene and security usually falls under the infrastructure or security teams.

AWS

AWS, Cloud, Games, Infrastructure

Protect your organization against zero-day vulnerabilities

Dynatrace

AUGUST 3, 2022

Although IT teams are thorough in checking their code for any errors, an attacker can always discover a loophole to exploit and damage applications, infrastructure, and critical data. If a malicious attacker can identify a key software vulnerability, they can exploit the vulnerability, allowing them to gain access to your systems.

Java

Java, Traffic, Benchmarking, Strategy

Application observability meets developer observability: Unlock a 360º view of your environment

Dynatrace

NOVEMBER 6, 2023

In a recent webinar, Dynatrace DevOps activist Andi Grabner and senior software engineer Yarden Laifenfeld explored developer observability. They also care about infrastructure: SREs require system visibility and incident management. The DevOps people looking end-to-end.

Development

Development, DevOps, Programming, Cloud

Autonomous Cloud Enablement aka Scaling NoOps via Self-Service

Dynatrace

FEBRUARY 6, 2020

To do that, Anita’s team drove innovation around a common delivery pipeline to enable developers to automate operational tasks such as runbook execution to solve infrastructure problems. We have teams building cloud-native services for our billing, licensing, customer experience, proactive customer support, marketplace or Davis Assistant.

Cloud

Cloud, DevOps, Engineering, Speed

Experimentation is a major focus of Data Science across Netflix

The Netflix TechBlog

JANUARY 11, 2022

Behind the scenes, Netflix infrastructure has already kicked into gear, finding the fastest way to deliver your chosen content with great audio and video quality. The numerous engineering teams involved in delivering high quality audio and video use A/B tests to improve the experience we deliver to our members around the world.

Innovation

Innovation, Metrics, Engineering, Testing

What Is Load Testing? Ensuring Robust System Performance Under Pressure

DZone

JULY 5, 2023

While load testing may sound like an esoteric domain exclusive to software engineers or network administrators, it is, in fact, a silent superhero in our increasingly digital world. It's the silent force keeping the digital infrastructure wheel rotating smoothly, even during peak usage times.

Systems

Systems, Testing, Software Engineering, Performance
