Infrastructure, Processing and Software Engineering

Sustainability: Thoughts from a software engineer

Dynatrace

MARCH 17, 2025

How to achieve sustainable IT practices: Use observability tools. The first step in driving improvements is to obtain a comprehensive view of your IT infrastructure’s climate impact. For example, reporting jobs can process monthly data without running exactly at the end of the month.

Software Engineering

Software Engineering Engineering Software Software

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and…

The Netflix TechBlog

MARCH 25, 2019

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and Efficiency. By: Di Lin, Girish Lingappa, Jitender Aswani. Imagine yourself in the role of a data-inspired decision maker staring at a metric on a dashboard about to make a critical business decision but pausing to ask a question: “Can

Infrastructure

Infrastructure Big Data Transportation Architecture

Extend the AI and automation core of Dynatrace with host extensions to resolve infrastructure problems

Dynatrace

MAY 13, 2020

OneAgent gives you all the operational and business performance metrics you need, from the front end to the back end and everything in between—cloud instances, hosts, network health, processes, and services. All the data bound to hosts is analyzed by the Davis AI causation engine and made available on custom dashboards and events pages.

Infrastructure

Infrastructure Metrics Monitoring Software Engineering

What is platform engineering?

Dynatrace

NOVEMBER 3, 2023

Platform engineering is a practice that outlines how development teams build internal platforms to create self-service capabilities for software engineering teams. The result is a cloud-native approach to software delivery. Platform engineering cannot stand alone, however.

Engineering

Engineering DevOps Software Engineering Scalability

SRE vs DevOps: What you need to know

Dynatrace

FEBRUARY 24, 2021

DevOps is focused on optimizing software development and delivery, and SRE is focused on operations processes. DevOps is not a specific process, but rather a general collection of flexible software creation and delivery practices that looks to close the gap between software development and IT operations.

DevOps

DevOps Software Engineering Speed Google

How Red Hat and Dynatrace intelligently automate your production environment

Dynatrace

MAY 6, 2024

A tight integration between Red Hat Ansible Automation Platform, Dynatrace Davis® AI, and the Dynatrace observability and security platform enables closed-loop remediation to automate the process from: Detecting a problem. With DQL, the workflow trigger to initiate a required automation and remediation process can be defined.

DevOps

DevOps Software Engineering Games Java

Why applying chaos engineering to data-intensive applications matters

Dynatrace

MAY 23, 2024

Stream processing: One approach to such a challenging scenario is stream processing, a computing paradigm and software architectural style for data-intensive software systems that emerged to cope with requirements for near real-time processing of massive amounts of data.

Engineering

Engineering Tuning Latency Open Source

Unlock the Power of DevSecOps with Newly Released Kubernetes Experience for Platform Engineering

Dynatrace

NOVEMBER 7, 2023

Platform engineering is on the rise. According to leading analyst firm Gartner, “80% of software engineering organizations will establish platform teams as internal providers of reusable services, components, and tools for application delivery…” by 2026. Open source logs and metrics take precedence in the monitoring process.

Engineering

Engineering DevOps Best Practices Infrastructure

Site reliability engineering: 5 things you need to know

Dynatrace

FEBRUARY 4, 2021

What is site reliability engineering? Site reliability engineering (SRE) is the practice of applying software engineering principles to operations and infrastructure processes to help organizations create highly reliable and scalable software systems.

Engineering

Engineering DevOps Government Latency

Conducting log analysis with an observability platform and full data context

Dynatrace

APRIL 20, 2023

With more automated approaches to log monitoring and log analysis, however, organizations can gain visibility into their applications and infrastructure efficiently and with greater precision—even as cloud environments grow. They enable IT teams to identify and address the precise cause of application and infrastructure issues.

Analytics

Analytics Infrastructure Storage Architecture

How platform engineering and IDP observability can accelerate developer velocity

Dynatrace

MARCH 6, 2024

Platform engineering creates and manages a shared infrastructure and set of tools, such as internal developer platforms (IDPs), to enable software developers to build, deploy, and operate applications more efficiently. This process is vital to an IDP’s effectiveness. “It makes them more productive.

Engineering

Engineering Development DevOps Infrastructure

Software engineering for machine learning: a case study

The Morning Paper

JULY 7, 2019

Software engineering for machine learning: a case study, Amershi et al., ICSE’19. Previously on The Morning Paper we’ve looked at the spread of machine learning through Facebook and Google and some of the lessons learned together with processes and tools to address the challenges arising. A general process.

Software Engineering

Software Engineering Engineering Software Software

Automating Success: Building a better developer experience with platform engineering

Dynatrace

FEBRUARY 12, 2024

At the 2024 Dynatrace Perform conference in Las Vegas, Michael Winkler, senior principal product management at Dynatrace, ran a technical session exploring just some of the many ways in which Dynatrace helps to automate the processes around development, releases, and operation.

Engineering

Engineering Development Infrastructure Cloud

Site reliability engineering: 5 things you need to know

Dynatrace

FEBRUARY 4, 2021

Site reliability engineering (SRE) is the practice of applying software engineering principles to operations and infrastructure processes to help organizations create highly reliable and scalable software systems. SRE drives a “shift left” mindset.

Engineering

Engineering DevOps Government Latency

The State of DevOps Automation assessment: How automated are you?

Dynatrace

APRIL 22, 2024

In response to the scale and complexity of modern cloud-native technology, organizations are increasingly reliant on automation to properly manage their infrastructure and workflows. DevOps automation eliminates extraneous manual processes, enabling DevOps teams to develop, test, deliver, deploy, and execute other key processes at scale.

DevOps

DevOps Artificial Intelligence Government Innovation

Automated observability, security, and reliability at scale

Dynatrace

JULY 18, 2023

As software development grows more complex, managing components using an automated onboarding process becomes increasingly important. However, scaling up software development requires more tools along the software product lifecycle, which must be configured promptly and efficiently.

Best Practices

Best Practices Code Infrastructure Latency

Designing Instagram

High Scalability

JANUARY 11, 2022

Fun fact: In this talk, Rodrigo Schmidt, director of engineering at Instagram, talks about the different challenges they have faced in scaling the data infrastructure at Instagram. There are two major processes that get executed when a user posts a photo on Instagram. System Components. Streaming Data Model. Optimization.

Design

Design Media Storage Logistics

Auto-adaptive thresholds for AI-driven quality gating

Dynatrace

JUNE 4, 2024

This process, known as auto-adaptive thresholding, eliminates the need to define a static threshold upfront. To reduce developers’ cognitive load by providing timely information, platform engineers must create tools that allow validations to be run in the early phases of development, with direct and fast feedback loops.

Metrics

Metrics Engineering Code Tuning

What is DevOps orchestration? And why invest in orchestration tools?

Dynatrace

DECEMBER 5, 2022

DevOps is a widely practiced set of procedures and tools for streamlining the development, release, and updating of software. In their most basic form, DevOps procedures can result in complicated processes, data silos, and fragmented responsibilities. DevOps orchestration in practice. Automation versus orchestration.

DevOps

DevOps Virtualization Best Practices Innovation

AWS observability: AWS monitoring best practices for resiliency

Dynatrace

NOVEMBER 22, 2021

Because of its matrix of cloud services across multiple environments, AWS and other multicloud environments can be more difficult to manage and monitor compared with traditional on-premises infrastructure. EC2 is Amazon’s Infrastructure-as-a-service (IaaS) compute platform designed to handle any workload at scale. Amazon EC2.

Best Practices

Best Practices AWS Monitoring Serverless

Demystifying Interviewing for Backend Engineers @ Netflix

The Netflix TechBlog

FEBRUARY 1, 2022

You apply for multiple roles at the same company and proceed through the interview process with each hiring team separately, despite the fact that there is tremendous overlap in the roles. Interviewing can be a daunting endeavor and how companies, and teams, approach the process varies greatly.

Engineering

Engineering Games Entertainment Innovation

Data Engineers of Netflix: Interview with Pallavi Phadnis

The Netflix TechBlog

OCTOBER 28, 2021

Interview with Pallavi Phadnis. This post is part of our “Data Engineers of Netflix” series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix. Pallavi Phadnis is a Senior Software Engineer at Netflix. Pallavi, what’s your journey to data engineering at Netflix?

Data Engineering

Data Engineering Engineering Big Data Software Engineering

Site reliability done right: 5 SRE best practices that deliver on business objectives

Dynatrace

MAY 31, 2023

How site reliability engineering affects organizations’ bottom line: SRE applies the disciplines of software engineering to infrastructure management, both on-premises and in the cloud. However, cloud complexity has made software delivery challenging. But the transition to SRE maturity is not always easy.

Best Practices

Best Practices DevOps Latency Metrics

Open-Sourcing Metaflow, a Human-Centric Framework for Data Science

The Netflix TechBlog

DECEMBER 3, 2019

About two years ago, we, at our newly formed Machine Learning Infrastructure team, started asking our data scientists a question: “What is the hardest thing for you as a data scientist at Netflix?” mainly because of mundane reasons related to software engineering. like they would do in a Jupyter notebook.

Open Source

Open Source AWS Infrastructure Energy

Nurturing Design in Your Software Engineering Culture

Strategic Tech

MARCH 16, 2021

There are a few qualities that differentiate average from high performing software engineering organisations. In my experience, the culture is better and the results are better in orgs where engineers and architects obsess over the design of code and architecture. My experience is the opposite.

Software Engineering

Software Engineering Design Engineering Software

ConsoleMe: A Central Control Plane for AWS Permissions and Access

The Netflix TechBlog

MARCH 10, 2021

Motivation: Growth in the cloud has exploded, and it is now easier than ever to create infrastructure on the fly. Groups beyond software engineering teams are standing up their own systems and automation. At many companies, managing cloud hygiene and security usually falls under the infrastructure or security teams.

AWS

AWS Cloud Games Infrastructure

The Show Must Go On: Securing Netflix Studios At Scale

The Netflix TechBlog

SEPTEMBER 13, 2021

During our initial consultations, it was clear that developers preferred prioritizing product work over security or infrastructure improvements. We were under pressure to improve our adoption numbers and decided to focus first on the setup friction by improving the developer experience and automating the onboarding process.

Internet

Internet Internet Cloud Traffic

All of Netflix’s HDR video streaming is now dynamically optimized

The Netflix TechBlog

NOVEMBER 29, 2023

For a given rate-quality operating point, the DO process helps allocate bits among the various shots while maximizing an overall objective function. Additionally, the current version has some algorithmic limitations that we are in the process of improving before the official release.

Open Source

Open Source Software Engineering Internet Internet

MLOps and DevOps: Why Data Makes It Different

O'Reilly

OCTOBER 19, 2021

As with many burgeoning fields and disciplines, we don’t yet have a shared canonical infrastructure stack or best practices for developing and deploying data-intensive applications. What does a modern technology stack for streamlined ML processes look like? All ML projects are software projects. Why: Data Makes It Different.

DevOps

DevOps Software Engineering Infrastructure Open Source

Connect your software with the right people: Ownership drives effective collaboration

Dynatrace

MARCH 28, 2023

All such automation is available while your environment is continuously enriched with additional contextual information that connects the responsible teams with your software development process. Any software engineer can search for monitored entities that relate to specific deployments and their respective teams.

Software

Software Software Monitoring Software Engineering

Netflix at AWS re:Invent 2019

The Netflix TechBlog

NOVEMBER 22, 2019

Netflix shares how Amazon EC2 Auto Scaling allows its infrastructure to automatically adapt to changing traffic patterns in order to keep its audience entertained and its costs on target. Instead, we provide them with delightfully usable ML infrastructure that they can use to manage a project’s lifecycle. Wednesday, December

AWS

AWS Entertainment Open Source Benchmarking

Application observability meets developer observability: Unlock a 360º view of your environment

Dynatrace

NOVEMBER 6, 2023

In a recent webinar, Dynatrace DevOps activist Andi Grabner and senior software engineer Yarden Laifenfeld explored developer observability. They also care about infrastructure: SREs require system visibility and incident management. The DevOps people looking end-to-end. “Then, I add a breakpoint.

Development

Development DevOps Programming Cloud

Autonomous Cloud Enablement aka Scaling NoOps via Self-Service

Dynatrace

FEBRUARY 6, 2020

We talked about the cultural and process changes in order to follow through on Bernd’s requirement of “1 hour Code to Production!”. To do that, Anita’s team drove innovation around a common delivery pipeline to enable developers to automate operational tasks such as runbook execution to solve infrastructure problems.

Cloud

Cloud DevOps Engineering Speed

DevOps observability: A guide for DevOps and DevSecOps teams

Dynatrace

JANUARY 18, 2023

However, getting reliable answers from observability data so teams can automate more processes to ensure speed, quality, and reliability can be challenging. According to recent Dynatrace research, organizations expect to make software updates 58% more frequently in the coming year. What is SRE (site reliability engineering)?

DevOps

DevOps Best Practices Innovation Strategy

OpenTelemetry observability and Dynatrace deliver actionable answers at scale

Dynatrace

AUGUST 21, 2023

But, as Justin Scherer, senior software engineer from Northwestern Mutual found, OpenTelemetry by itself is not a panacea. The collector processes the data and exports it to a predetermined monitoring, tracing, or logging back end, such as Dynatrace, for analysis. What is OpenTelemetry?

Open Source

Open Source Analytics Lambda Metrics

Dynatrace Perform 2024 Guide: Deriving business value from AI data analysis

Dynatrace

JANUARY 23, 2024

Another key theme at Dynatrace Perform 2024 is organizations’ growing adoption of platform engineering, which helps accelerate the delivery of software applications. Platform engineering improves developer productivity by providing self-service capabilities with automated infrastructure operations.

Performance

Performance DevOps Innovation Energy

The End of Programming as We Know It

O'Reilly

FEBRUARY 4, 2025

In the early days of the personal computer, every computer manufacturer needed software engineers who could write low-level drivers that performed the work of reading and writing to memory boards, hard disks, and peripherals such as modems and printers. All of this happens through a process that Bessen calls learning by doing.

Programming

Programming Google Infrastructure Internet

Re-Architecting the Video Gatekeeper

The Netflix TechBlog

JULY 12, 2019

By Drew Koszewnik. This is the story about how the Content Setup Engineering team used Hollow, a Netflix OSS technology, to re-architect and simplify an essential component in our content pipeline, delivering a large amount of business value in the process. Improved debuggability and visibility into liveness processing.

Cache

Cache Architecture Engineering Latency

Site reliability engineering: Six SRE trends to unleash DevOps innovation

Dynatrace

JUNE 2, 2022

Site reliability engineering (SRE) continues to gain popularity as organizations embrace hybrid cloud strategies and IT automation at scale. By applying software engineering principles to operations and infrastructure practices, SRE enables organizations to streamline and automate IT processes.

DevOps

DevOps Innovation Engineering Benchmarking

Open-Sourcing a Monitoring GUI for Metaflow

The Netflix TechBlog

OCTOBER 27, 2021

Following our culture of freedom and responsibility, Metaflow grants data scientists the freedom to choose the right modeling approach, handle data and features flexibly, and construct workflows easily while ensuring that the resulting project executes responsibly and robustly on the production infrastructure.

Open Source

Open Source Monitoring Scalability Code

Post: Scrapinghub, Fauna, Sisu, Educative, PA File Sight, Etleap, Triplebyte, Stream

High Scalability

MARCH 24, 2020

Scrapinghub is hiring a Senior Software Engineer (Big Data/AI). This is going to be a challenging journey for any backend engineer! Triplebyte lets exceptional software engineers skip screening steps at hundreds of top tech companies like Apple, Dropbox, Mixpanel, and Instacart. Who's Hiring? Apply here.

Education

Education Software Engineering Engineering Big Data
