Engineering, Software Engineering and Systems - Technology Performance Pulse

Sustainability: Thoughts from a software engineer

Dynatrace

MARCH 17, 2025

Scale to zero: Scaling systems to match current demand prevents underutilized machines from consuming significant energy while idling. While building production systems that can scale to zero and reliably restart can be challenging, it’s often simpler in test stages and build pipelines, making this a great place to start.

Software Engineering

Software Engineering Engineering Software Software

The Next Major Shift in Enterprise Software Engineering: From Platforms to “Platformless”

DZone

JANUARY 30, 2024

The evolution of enterprise software engineering has been marked by a series of "less" shifts — from client-server to web and mobile ("client-less"), data center to cloud ("data-center-less"), and app server to serverless.

Software Engineering

Software Engineering Engineering Software Software

Low-Maintenance Backend Architectures for Scalable Applications

DZone

JANUARY 10, 2025

After years of working in the intricate world of software engineering, I learned that the most beautiful solutions are often those unseen: backends that hum along, scaling with grace and requiring very little attention. Developers could understand and manage the entire system’s intricacies.

Architecture

Architecture Scalability Software Engineering Cloud

Our First Netflix Data Engineering Summit

The Netflix TechBlog

DECEMBER 14, 2023

Engineers from across the company came together to share best practices on everything from Data Processing Patterns to Building Reliable Data Pipelines. The result was a series of talks which we are now sharing with the rest of the Data Engineering community! Learn more about how batch and streaming data pipelines are built at Netflix.

Data Engineering

Data Engineering Engineering Software Engineering Best Practices

Showcasing engineering excellence at Dynatrace

Dynatrace

APRIL 29, 2021

Whilst our traditional Dynatrace website predominantly showcases Dynatrace content and product information for visitors, the idea behind the creation of our new Engineering website – engineering.dynatrace.com – was to set up a space to feature the results of our research and innovation efforts and aims to be a site made by engineers for engineers.

Engineering

Engineering Open Source Website Innovation

Key Elements of Site Reliability Engineering (SRE)

DZone

MARCH 14, 2023

Site Reliability Engineering (SRE) is a systematic and data-driven approach to improving the reliability, scalability, and efficiency of systems. It combines principles of software engineering, operations, and quality assurance to ensure that systems meet performance goals and business objectives.

Engineering

Engineering Software Engineering Scalability Efficiency

Why applying chaos engineering to data-intensive applications matters

Dynatrace

MAY 23, 2024

Stream processing: One approach to such a challenging scenario is stream processing, a computing paradigm and software architectural style for data-intensive software systems that emerged to cope with requirements for near real-time processing of massive amounts of data. Recovery time of the latency p90.

Engineering

Engineering Tuning Latency Open Source

Site reliability engineering: 5 things you need to know

Dynatrace

FEBRUARY 4, 2021

What is site reliability engineering? Site reliability engineering (SRE) is the practice of applying software engineering principles to operations and infrastructure processes to help organizations create highly reliable and scalable software systems. SRE focuses on automation.

Engineering

Engineering DevOps Government Latency

Unlock the Power of DevSecOps with Newly Released Kubernetes Experience for Platform Engineering

Dynatrace

NOVEMBER 7, 2023

Platform engineering is on the rise. According to leading analyst firm Gartner, “80% of software engineering organizations will establish platform teams as internal providers of reusable services, components, and tools for application delivery…” by 2026.

Engineering

Engineering DevOps Best Practices Infrastructure

Title Launch Observability at Netflix Scale

The Netflix TechBlog

MARCH 4, 2025

Part 3: System Strategies and Architecture. By: Varun Khaitan. With special thanks to my stunning colleagues: Mallika Rao, Esmir Mesic, Hugo Marques. This blog post is a continuation of Part 2, where we cleared the ambiguity around title launch observability at Netflix. The request schema for the observability endpoint.

Traffic

Traffic Strategy Entertainment Innovation

Site reliability engineering: 5 things you need to know

Dynatrace

FEBRUARY 4, 2021

Site reliability engineering (SRE) is the practice of applying software engineering principles to operations and infrastructure processes to help organizations create highly reliable and scalable software systems. SRE bridges the gap between Dev and Ops teams. SRE focuses on automation.

Engineering

Engineering DevOps Government Latency

Demystifying Interviewing for Backend Engineers @ Netflix

The Netflix TechBlog

FEBRUARY 1, 2022

By Karen Casella, Director of Engineering, Access & Identity Management Have you ever experienced one of the following scenarios while looking for your next role? Most backend engineering teams follow a process very similar to what is shown below. If so, we invite you to begin the interview process.

Engineering

Engineering Games Entertainment Innovation

Site Reliability Engineering

DZone

JANUARY 19, 2024

In the dynamic world of online services, the concept of site reliability engineering (SRE) has risen as a pivotal discipline, ensuring that large-scale systems maintain their performance and reliability.

Engineering

Engineering Tuning Software Engineering Internet

5 powerful use cases beyond debugging for Dynatrace Live Debugger

Dynatrace

MARCH 25, 2025

Following are some of the coolest things we’ve seen engineers do with Live Debugger. You can verify any system settings that might impact your tests and see them in action. Performance benchmarking Performance benchmarking is one of the unresolved mysteries of software engineering.

Benchmarking

Benchmarking Code Open Source Engineering

A Recap of the Data Engineering Open Forum at Netflix

The Netflix TechBlog

JUNE 20, 2024

A summary of sessions at the first Data Engineering Open Forum at Netflix on April 18th, 2024. At Netflix, we aspire to entertain the world, and our data engineering teams play a crucial role in this mission by enabling data-driven decision-making at scale.

Data Engineering

Data Engineering Engineering Entertainment Software Engineering

The state of site reliability engineering: SRE challenges and best practices in 2023

Dynatrace

NOVEMBER 14, 2023

Site reliability engineering (SRE) has become increasingly important to organizations looking to keep up with the rapid pace of digital transformation. Effective site reliability engineering requires enterprise-wide transformation. Without a unified understanding of SRE practices, organizational silos can quickly form between departments.

Best Practices

Best Practices Engineering DevOps Software Engineering

Mastering System Design: A Comprehensive Guide to System Scaling for Millions (Part 1)

DZone

JANUARY 19, 2024

A transformative journey into the realm of system design with our tutorial, tailored for software engineers aspiring to architect solutions that seamlessly scale to serve millions of users.

Systems

Systems Design Software Engineering Scalability

Data Engineers of Netflix - Interview with Pallavi Phadnis

The Netflix TechBlog

OCTOBER 28, 2021

Data Engineers of Netflix - Interview with Pallavi Phadnis. This post is part of our “Data Engineers of Netflix” series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix. Pallavi Phadnis is a Senior Software Engineer at Netflix.

Data Engineering

Data Engineering Engineering Big Data Software Engineering

Chaos Engineering: Building Resilient Systems, One Failure at a Time

DZone

JUNE 20, 2024

In the world of software engineering, where complex systems are the norm, ensuring reliability and resilience is paramount. However, traditional testing methods often fall short of uncovering hidden vulnerabilities and edge cases that could lead to system failures. What Is Chaos Engineering?

Engineering

Engineering Systems Software Engineering Software

Escaping POC Purgatory: Evaluation-Driven Development for AI Systems

O'Reilly

MARCH 25, 2025

The system is inconsistent, slow, hallucinating, and that amazing demo starts collecting digital dust. Traditional versus GenAI software: Excitement builds steadily, or crashes after the demo. Two big things: They bring the messiness of the real world into your system through unstructured data. Leadership gets excited. The way out?

Systems

Systems Development Tuning Monitoring

Beyond “Prompt and Pray”

O'Reilly

JANUARY 21, 2025

TL;DR: Enterprise AI teams are discovering that purely agentic approaches (dynamically chaining LLM calls) don’t deliver the reliability needed for production systems. The prompt-and-pray model, where business logic lives entirely in prompts, creates systems that are unreliable, inefficient, and impossible to maintain at scale.

Software Engineering

Software Engineering Efficiency Engineering Systems

LLMs Demand Observability-Driven Development

DZone

OCTOBER 13, 2023

Our industry is in the early days of an explosion in software using LLMs, as well as (separately, but relatedly) a revolution in how engineers write and run code, thanks to generative AI.

Development

Development Software Engineering Engineering Software

Podcast interview: Rust and C++

Sutter's Mill

OCTOBER 23, 2024

In early September I had a very enjoyable technical chat with Steve Klabnik of Rust fame and interviewer Kevin Ball of Software Engineering Daily, and the podcast is now available. Rust, while newer, is gaining traction in roles that demand safety and concurrency, particularly in systems programming.

C++

C++ Software Engineering Programming Games

How to Prepare for Your DevOps Interview

DZone

SEPTEMBER 5, 2019

Over the past decade, DevOps has emerged as a new tech culture and career that marries the rapid iteration desired by software development with the rock-solid stability of the infrastructure operations team. As of August 2019, there are currently over 50,000 LinkedIn DevOps job listings in the United States alone.

DevOps

DevOps Software Engineering Infrastructure Engineering

Scaling Is Not Just About Products – It’s About Teams, Too

DZone

SEPTEMBER 19, 2021

We are well aware of what is meant by system scalability. System scalability is about maintaining the SLA of the system as the user base continues to grow and as the user activity continues to rise. Software engineering team scalability is equally important. Software Engineering Team Scalability.

Scalability

Scalability Software Engineering Engineering Systems

How to Trace Linux System Calls in Production (Without Breaking Performance)

DZone

JANUARY 23, 2021

If you need to dynamically trace Linux process system calls, you might first consider strace. strace is simple to use and works well for issues such as "Why can't the software run on this machine?" So are there any tools that excel at tracing system calls in a production environment? The answer is YES.

Systems

Systems Performance Software Engineering Performance Testing

Software engineering for machine learning: a case study

The Morning Paper

JULY 7, 2019

Software engineering for machine learning: a case study, Amershi et al., ICSE’19. More specifically, we’ll be looking at the results of an internal study with over 500 participants designed to figure out how product development and software engineering is changing at Microsoft with the rise of AI and ML.

Software Engineering

Software Engineering Engineering Software Software

Up your quality and agility factor – using automation to build “performance-as-a-self-service”

Dynatrace

MARCH 3, 2020

For software engineering teams, this demand means not only delivering new features faster but ensuring quality, performance, and scalability too. One way to apply improvements is transforming the way application performance engineering and testing is done. Here is the definition of this model. Try it today using Keptn.

Performance

Performance Education Innovation Software Architecture

Architecture Patterns: The Circuit-Breaker

DZone

NOVEMBER 3, 2023

In the world of distributed systems, the likelihood of components failing or becoming unresponsive is higher compared to monolithic systems. Therefore, resilience — the ability of a system to handle and recover from failures — becomes critically important in distributed environments.

Architecture

Architecture Software Engineering Traffic Engineering

Computational Causal Inference at Netflix

The Netflix TechBlog

AUGUST 11, 2020

These methods can provide rich information for decision making, such as in experimentation platforms (“XP”) or in algorithmic policy engines. We want to amplify the effectiveness of our researchers by providing them software that can estimate causal effects models efficiently, and can integrate causal effects into large engineering systems.

Software Engineering

Software Engineering Scalability Engineering Strategy

A New Era Has Come, and So Must Your Database Observability

DZone

SEPTEMBER 28, 2023

Software engineers didn’t need to understand the database, and even if they owned it, it was just a single component of the system. Guaranteeing software quality was much easier because the deployment happened rarely, and things could be captured on time via automated tests.

Database

Database Software Engineering Software Software

A Note to Business Leaders on Software Engineering

Strategic Tech

MARCH 26, 2019

Software development is not an established discipline where there is a clear technique used to solve any given problem. In fact, there are near infinite ways to solve every software engineering challenge. However, as software systems age, the time it takes to add new features grows exponentially - and

Software Engineering

Software Engineering Engineering Software Software

Nurturing Design in Your Software Engineering Culture

Strategic Tech

MARCH 16, 2021

There are a few qualities that differentiate average from high performing software engineering organisations. In my experience, the culture is better and the results are better in orgs where engineers and architects obsess over the design of code and architecture. Regularly spending time with domain experts is important.

Software Engineering

Software Engineering Design Engineering Software

How Red Hat and Dynatrace intelligently automate your production environment

Dynatrace

MAY 6, 2024

Problem remediation is too time-consuming. According to the DevOps Automation Pulse Survey 2023, on average, a software engineer takes nine hours to remediate a problem within a production application. Context-rich tickets can be created in systems like Jira or ServiceNow for traceability and compliance.

DevOps

DevOps Software Engineering Games Java

Site reliability engineering: Six SRE trends to unleash DevOps innovation

Dynatrace

JUNE 2, 2022

Site reliability engineering (SRE) continues to gain popularity as organizations embrace hybrid cloud strategies and IT automation at scale. By applying software engineering principles to operations and infrastructure practices, SRE enables organizations to streamline and automate IT processes.

DevOps

DevOps Innovation Engineering Benchmarking

Growth Engineering at Netflix - Automated Imagery Generation

The Netflix TechBlog

FEBRUARY 9, 2021

Growth Engineering at Netflix - Automated Imagery Generation. In the Growth Engineering team, we refer to this as the top of the signup funnel. For more background on the signup funnel and Growth Engineering’s role in the signup funnel, please read our initial post on the topic.

Engineering

Engineering Storage Latency Entertainment

Designing Instagram

High Scalability

JANUARY 11, 2022

Machine Learning Engineer at Amazon and has led several machine-learning initiatives across the Amazon ecosystem. The streaming data store makes the system extensible to support other use-cases (e.g. System Components. The system will comprise several micro-services, each performing a separate task.

Design

Design Media Storage Logistics

The 737Max and Why Software Engineers Might Want to Pay Attention

J. Paul Reed

MARCH 14, 2019

The 737Max and Why Software Engineers Might Want to Pay Attention. As someone with a bit of a reputation for talking about aviation and software development and operations, I’ve been asked about the 737Max repeatedly over the past week. the part under control of the automatic system - can

Software Engineering

Software Engineering Engineering Software Software

Bringing AV1 Streaming to Netflix Members’ TVs

The Netflix TechBlog

NOVEMBER 9, 2021

The Client and UI Engineering team built a certification test with these streams to analyze both the device logs as well as the pictures rendered on the screen. The Performance Engineering team specializes in optimizing resource utilization at Netflix. Some titles (e.g., Challenge 4: How do we continuously monitor AV1 streaming?

Media

Media Open Source Software Engineering Efficiency

Growth Engineering at Netflix- Creating a Scalable Offers Platform

The Netflix TechBlog

FEBRUARY 9, 2021

The Growth Engineering team is responsible for executing growth initiatives that help us anticipate and adapt to this change. In particular, it’s our job to design and build the systems and protocols that enable customers from all over the world to sign up for Netflix with the plan features and incentives that best suit their needs.

Engineering

Engineering Scalability Architecture Innovation

Starting an SRE Team? Stay Away From Uptime.

DZone

DECEMBER 8, 2021

A good SRE engineer will tell you your service is never down. A great SRE engineer will tell you that’s not what you should be measuring. In fact, they’ll tell you their job is customer service.

Engineering

Engineering Scalability Systems Traffic

What Is Load Testing? Ensuring Robust System Performance Under Pressure

DZone

JULY 5, 2023

While load testing may sound like an esoteric domain exclusive to software engineers or network administrators, it is, in fact, a silent superhero in our increasingly digital world. Acting behind the scenes, load testing ensures the apps and websites we use daily are capable of withstanding the demands of their users without stumbling.

Systems

Systems Testing Software Engineering Performance

Architected for resiliency: How Dynatrace withstands data center outages

Dynatrace

JUNE 15, 2021

The fact is, Reliability and Resiliency must be rooted in the architecture of a distributed system. And the last sentence of the email was what made me want to share this story publicly, as it’s a testimonial to how modern software engineering and operations should make you feel. Let me start with the end-user impact.

AWS

AWS Traffic Architecture Azure

Site reliability done right: 5 SRE best practices that deliver on business objectives

Dynatrace

MAY 31, 2023

Site reliability engineering (SRE) has become a critical discipline in recent years as the world has shifted in favor of web-based interactions. This shift is leading more organizations to hire site reliability engineers to guarantee the reliability and resiliency of their services. Mobile retail e-commerce spending in the U.

Best Practices

Best Practices DevOps Latency Metrics
