Trending Articles

Load Testing Essentials for High-Traffic Applications

DZone

Today's applications must simultaneously serve millions of users, so high performance is a hard requirement for this heavy load. When you consider marketing campaigns, seasonal spikes, or social media virality episodes, this demand can overshoot projections and bring systems to a grinding halt. To that end, performance monitoring and load testing have become an integral part of app development and deployment: load testing mimics real application performance under stress, and with this kind of testing, teams

Part 2: A Survey of Analytics Engineering Work at Netflix

The Netflix TechBlog

This article is the second in a multi-part series sharing a breadth of Analytics Engineering work at Netflix, recently presented as part of our annual internal Analytics Engineering conference. Need to catch up? Check out Part 1. In this article, we highlight a few exciting analytic business applications, and in our final article we'll go into aspects of the technical craft.

Seamless RDS to DynamoDB Migration: Unlocking Scalability With the Dual Write Strategy

DZone

Migrating from Amazon RDS to DynamoDB can be a significant challenge, especially when transitioning from a relational database like RDS (PostgreSQL, MySQL, etc.) to DynamoDB, a NoSQL, key-value store. One of the most effective strategies for migrating data incrementally is the Dual Write approach. This allows you to keep both databases in sync during the transition, minimizing downtime and reducing the risk of data inconsistency.
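
The dual-write idea in this excerpt can be sketched in a few lines. This is a minimal illustration, not the article's implementation: the `DualWriter` class, its `failed_keys` list, and the two plain dictionaries standing in for the RDS and DynamoDB tables are all assumptions made for the example, and no real AWS client is used.

```python
class DualWriter:
    """Hypothetical helper: writes each record to both stores.

    The primary store (standing in for RDS) remains the source of truth;
    the secondary store (standing in for DynamoDB) is kept in sync.
    """

    def __init__(self, primary, secondary):
        self.primary = primary      # dict standing in for the relational table
        self.secondary = secondary  # dict standing in for the key-value table
        self.failed_keys = []       # keys whose secondary write failed

    def put(self, key, item):
        # Write to the existing store first; an error here aborts the write.
        self.primary[key] = item
        try:
            # Mirror the write into the new store.
            self.secondary[key] = dict(item)
        except Exception:
            # Keep serving traffic and remember the key for a later backfill.
            self.failed_keys.append(key)

    def get(self, key):
        # Reads stay on the primary until the migration is cut over.
        return self.primary[key]


rds = {}
dynamo = {}
writer = DualWriter(rds, dynamo)
writer.put("user#1", {"name": "Ada"})
```

Recording failed keys rather than raising is one possible design choice; a real migration would also need a backfill of historical rows and a consistency check before cutover.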

New continuous compliance requirements drive the need to converge observability and security

Dynatrace

At the time when I was building the most innovative observability company, security seemed too distant. However, customers began approaching me, praising Dynatrace's deep end-to-end insights into even the most complex digital service deployments and asking how to use it for security compliance, exposure, and response use cases. I realized that our platform's unique ability to contextualize security events, metrics, logs, traces, and user behavior could revolutionize the security domain by convergi

Introducing Configurable Metaflow

The Netflix TechBlog

David J. Berg*, David Casler^, Romain Cledat*, Qian Huang*, Rui Lin*, Nissan Pow*, Nurcan Sonmez*, Shashank Srikanth*, Chaoying Wang*, Regina Wang*, Darin Yu* (*: Model Development Team, Machine Learning Platform; ^: Content Demand Modeling Team). A month ago at QConSF, we showcased how Netflix utilizes Metaflow to power a diverse set of ML and AI use cases, managing thousands of unique Metaflow flows.

MySQL Transaction ERROR 1412 and Isolation Levels

Percona

This blog post explains the cause of “ERROR 1412 (HY000): Table definition has changed, please retry transaction” with the specific isolation level settings. Background: As per the MySQL documentation, this error should occur for “operations that make a temporary copy of the original table and delete the original table when the temporary copy is built.”

Summarizing Books as Podcasts

O'Reilly

Like just about everyone, we were impressed by the ability of NotebookLM to generate podcasts: Two virtual people holding a discussion. You can give it some links, and it will generate a podcast based on the links. The podcasts were interesting and engaging. But they also had some limitations. The problem with NotebookLM is that, while you can give it a prompt, it largely does what it's going to do.

More Trending

Dynatrace wins InfoWorld’s 2024 Technology of the Year Award for DevOps: Observability

Dynatrace

We’re excited to share that Dynatrace has been recognized in the DevOps: Observability category of InfoWorld’s 2024 Technology of the Year awards! This honor reflects our dedication to helping organizations tackle complexity and achieve greater clarity in managing their digital environments. Why this award matters: The InfoWorld Technology of the Year awards celebrate standout solutions in AI, APIs, applications, business intelligence, cloud, data management, DevOps, and software development

Title Launch Observability at Netflix Scale

The Netflix TechBlog

Part 1: Understanding the Challenges. By Varun Khaitan, with special thanks to my stunning colleagues Mallika Rao, Esmir Mesic, and Hugo Marques. Introduction: At Netflix, we manage over a thousand global content launches each month, backed by billions of dollars in annual investment. Ensuring the success and discoverability of each title across our platform is a top priority, as we aim to connect every story with the right audience to delight our members.

How To Deal with a AUTO_INCREMENT Max Value Problem in MySQL and MariaDB

Percona

An application that goes down because it can no longer write to a table that has reached its maximum allowed auto-increment value may be one of a DBA's worst nightmares.

Unveiling Graph Structures in Microservices: Service Dependency Graph, Call Graph, and Causal Graph

Abhishek Tiwari

The rise of service-oriented architecture (SOA) and microservices architecture has led to a major shift in software development, enabling the creation of complex, distributed systems composed of independent, loosely coupled services. These architectures offer numerous benefits, including scalability, flexibility, and resilience. However, the distributed nature of these systems introduces new challenges related to understanding, managing, and analysing their behaviour.

AWS Performance Tuning: Why EC2 Autoscaling Isn’t a Silver Bullet

DZone

AWS EC2 Autoscaling is frequently regarded as the ideal solution for managing fluctuating workloads. It offers automatic adjustments of computing resources in response to demand, theoretically removing the necessity for manual involvement. Nevertheless, depending exclusively on EC2 Autoscaling can result in inefficiencies, overspending, and performance issues.

When things go sideways: Troubleshooting the OpenTelemetry Operator

Dynatrace

This blog post was co-written with Reese Lee. If you already have an application running in Kubernetes and are exploring using OpenTelemetry to gain insights into the health and performance of your app and cluster, you might be interested in an implementation of the Kubernetes Operator called the OpenTelemetry Operator. As you'll learn shortly, due to its range of capabilities, the Operator is your go-to for (almost) hassle-free OpenTelemetry management.

MySQL with Diagrams Part One: Replication Architecture

Percona

In this series, MySQL with Diagrams, I'll use diagrams to explain internals, architectures, and structures in as much detail as possible. In basic terms, here’s how replication works: the transactions are written into a binary log on the source side, carried into the replica, and applied.
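
The three steps named in this excerpt (write to a log on the source, carry it to the replica, apply it) can be illustrated with a toy model. This is only a sketch of the general log-shipping idea: the `Source` and `Replica` classes and their fields are hypothetical, and none of MySQL's actual replication machinery is represented.

```python
class Source:
    """Toy source server: every committed change is also appended to a log."""

    def __init__(self):
        self.data = {}
        self.binlog = []  # ordered list of (key, value) change events

    def commit(self, key, value):
        self.data[key] = value
        self.binlog.append((key, value))


class Replica:
    """Toy replica: pulls log events it has not yet seen and applies them."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of source log events already applied

    def sync(self, source):
        # Fetch only the events after our current position, in commit order.
        for key, value in source.binlog[self.applied:]:
            self.data[key] = value
            self.applied += 1


source = Source()
replica = Replica()
source.commit("a", 1)
source.commit("a", 2)
source.commit("b", 3)
replica.sync(source)
```

Tracking a position in the log lets the replica resume where it left off, so calling `sync` again applies only events committed since the previous call.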

The Hidden Cost of Success: Understanding Non-Fatal Errors in Microservices

Abhishek Tiwari

Recent research from Washington University in St. Louis and Uber Technologies reveals a critical but often overlooked phenomenon in microservices architectures: non-fatal errors. While these errors don’t cause system failures, they introduce significant performance overhead that can impact user experience and operational costs. The study by Lee et al. explores how prevalent non-fatal errors are and what impact they have on the exposed latency of top-level requests.

Charge Vertical Scaling With the Latest Java GCs

DZone

In the dynamic landscape of Java ecosystem enhancements, one could miss the important progress that has been made in Java Garbage Collection (GC) in recent years. Meanwhile, the latest generations of GC bring far-reaching consequences for running Java applications. This article intends to highlight the significant effects brought to us by ZGC and Shenandoah.

HTTP monitors on the latest Dynatrace platform extend insights into the health of your API endpoints and simplify test management

Dynatrace

The improved UI of the new Synthetic app makes managing your synthetic tests and analyzing their results easier and more effective. Exploratory analytics now cover more bespoke scenarios, allowing you to access any element of test results stored in the Dynatrace Grail data lakehouse. This allows you to build customized visualizations with Dashboards or perform in-depth analysis with Notebooks.

Our 10 most popular web performance articles of 2024

Speed Curve

We love writing articles and blog posts that help folks solve real web performance and UX problems. Here are the ones you loved most in 2024. (The number one item may surprise you!) Some of these articles come from our recently published Web Performance Guide – a collection of evergreen how-to resources (written by actual humans!) that will help you master website monitoring, analytics, and diagnostics.

Software Licensing and Open Source: Why Definitions Matter

Percona

Over the last few years, there have been many calls to evolve the Open Source Definition (OSD) to fit the modern world. Some would like to see noncompetitive licenses such as the Server Side Public License (SSPL) or Elastic License considered open source.

Microservice Architecture of Alibaba

Abhishek Tiwari

Alibaba has constructed a sophisticated microservices architecture to address the challenges of serving its vast user base and handling complex business operations. This article delves into the nuances of Alibaba’s microservices architecture, presenting critical insights into its design, scalability, performance optimisation, and resource management strategies.

How OpenAI’s Downtime Incident Teaches Us to Build More Resilient Systems

DZone

On December 11, 2024, OpenAI services experienced significant downtime due to an issue stemming from a new telemetry service deployment. This incident impacted API, ChatGPT, and Sora services, resulting in service disruptions that lasted for several hours. As a company that aims to provide accurate and efficient AI solutions, OpenAI has shared a detailed post-mortem report to transparently discuss what went wrong and how they plan to prevent similar occurrences in the future.

Predictable costs for Log Management & Analytics with new simplified licensing plan

Dynatrace

As cloud complexity increases and security concerns mount, organizations need log analytics to discover and investigate issues and gain critical business intelligence. But exploring the breadth of log analytics scenarios with most log vendors often results in unexpectedly high monthly log bills and aggressive year-over-year costs. To give organizations the freedom to explore log analytics without barriers due to cost concerns, Dynatrace is proud to announce a new Dynatrace Platform Subscription

ICYMI: Some of our most exciting product updates of 2024!

Speed Curve

Every year feels like a big year here at SpeedCurve, and 2024 was no exception. Here's a recap of product highlights designed to make your performance monitoring even better and easier! Our biggest achievements this year have centred on making it easier for you to: gather more meaningful real user monitoring (RUM) data; get actionable insights from Core Web Vitals; simplify your synthetic testing; and get expert performance coaching when and how you need it. Keep reading to learn more.

Percona Wins 2024 Digital Innovator Award from Intellyx

Percona

As organizations turn to the cloud for more and more products and services, many are discovering that the majority of these offerings come with hidden, yet significant, costs.

Distributed Traces of Microservices Architecture: Publicly Available Datasets

Abhishek Tiwari

Distributed traces are essential for understanding microservices behaviour, performance, and reliability. They provide insights into service dependencies, call graphs, and runtime dynamics, enabling researchers to develop and test tools for optimisation and fault diagnosis. However, there is a noticeable gap in terms of publicly available real-world distributed tracing datasets for microservices architecture research to bridge the divide between theoretical research and real-world implementation

Monitor Spring Boot Web Application Performance Using Micrometer and InfluxDB

DZone

As a Java developer, there's nothing more frustrating than dealing with sluggish application performance in production. Diagnosing issues within complex microservice architectures can quickly become a time-consuming and daunting task. Fortunately, the Spring Boot framework offers a powerful observability stack that streamlines real-time monitoring and performance analysis.

Five observability predictions for 2025

Dynatrace

As the digital world grows more complex, 2025 will bring a tipping point for organizations navigating increasingly dynamic and interconnected IT environments. Observability, long a cornerstone of IT operations, will take on transformative new roles. Driven by rapid advances in AI, evolving regulatory frameworks, and mounting sustainability pressures, observability will no longer be a passive diagnostic tool.

Performance Hero: Pat Meenan

Speed Curve

This month, we celebrate everything that OG performance hero Pat Meenan has done – and continues to do – for the web performance community. When we started the Performance Hero series earlier this year, we had an idea of the types of folks in our community we wanted to acknowledge: people who are making a difference in web performance; people who are humble; people who give without expectation; and people who don't necessarily crave the spotlight. When looking at these attributes – for

Percona Server for MongoDB 8.0: The Most Performant Ever

Percona

MongoDB Community Edition 8.0 has been available since October. At Percona, we took the time to examine this release carefully, check performance, and guarantee it works perfectly, stand-alone, and with other tools like Percona Backup for MongoDB and Percona Monitoring and Management.

Power Laws and Heavy-Tail Distributions in Hyperscale Microservices Architectures

Abhishek Tiwari

Recent analyses of Meta and Alibaba’s production microservices architectures reveal distinct patterns of heavy-tail and power law distributions. These patterns manifest in service scale, request patterns, and resource utilisation, offering insights into the inherent characteristics of large-scale distributed systems.

Which Flow Is Best for Your Data Needs: Time Series vs. Streaming Databases

DZone

Data is being generated from various sources, including electronic devices, machines, and social media, across all industries. However, unless it is processed and stored effectively, it holds little value. A significant evolution is taking place in the way data is organized for further analysis. Some databases prioritize organizing data based on its time of generation, while others focus on different functionalities.

Yes, the future is digital. More importantly, it's asset light.

The Agile Manager

The 20th century company acquired capital assets (property, plant and equipment); employed a large, low skilled secondary workforce to produce things with that PPE; and employed a small, high skilled primary workforce to manage both the secondary workforce and administrate the PPE. By comparison, the 21st century company rents infrastructure - commercial space, cloud services, computers - and both employs and contracts knowledge workers who collaborate on solving problems.

Cloud Efficiency at Netflix

The Netflix TechBlog

By J Han and Pallavi Phadnis. Context: At Netflix, we use Amazon Web Services (AWS) for our cloud infrastructure needs, such as compute, storage, and networking to build and run the streaming platform that we love. Our ecosystem enables engineering teams to run applications and services at scale, utilizing a mix of open-source and proprietary solutions. In turn, our self-serve platforms allow teams to create and deploy, sometimes custom, workloads more efficiently.

How to Replace Patroni and Etcd IP/Host Information in PostgreSQL

Percona

As we know, Patroni is a well-established standard for an HA framework for PostgreSQL clusters. From time to time, we need to perform maintenance tasks like upgrading the topology or making changes to the existing setup. Here, we will discuss mainly how we can replace the IP/Host information in Patroni and Etcd layers.

Cache Me If You Can: Taming the Caching Complexity of Microservice Call Graphs

Abhishek Tiwari

As microservices architectures have become increasingly prevalent in modern software systems, they’ve brought both tremendous benefits and significant challenges. One of the most pressing challenges has been maintaining performance at scale while dealing with complex service dependencies and network communication overhead. Today, I want to explore MuCache, an innovative framework recently presented by Zhang et al. at USENIX NSDI 2024 that tackles this challenge head-on by providing automat

How to implement an AIOps strategy at scale

Dynatrace

Imagine a day when your IT team resolves critical incidents before users even notice them. In today's multicloud world, complexity is the norm. But what if an AIOps strategy could transform that complexity into your organization's greatest advantage? IT operations teams must monitor, maintain, and optimize a broader mix of applications, infrastructure, and technologies, with newer digital business solutions only adding to the strain.

Part 1: A Survey of Analytics Engineering Work at Netflix

The Netflix TechBlog

This article is the first in a multi-part series sharing a breadth of Analytics Engineering work at Netflix, recently presented as part of our annual internal Analytics Engineering conference. We kick off with a few topics focused on how we're empowering Netflix to efficiently produce and effectively deliver high-quality, actionable analytic insights across the company.
