Testing, Traffic and Tuning - Technology Performance Pulse

Migrating Critical Traffic At Scale with No Downtime - Part 1

The Netflix TechBlog

MAY 4, 2023

By Shyam Gala, Javier Fernandez-Ivern, Anup Rokkam Pratap, and Devang Shah. Hundreds of millions of customers tune into Netflix every day, expecting an uninterrupted and immersive streaming experience. This approach has a handful of benefits.

Traffic

Traffic Latency Tuning Systems

Migrating Critical Traffic At Scale with No Downtime - Part 2

The Netflix TechBlog

JUNE 13, 2023

By Shyam Gala, Javier Fernandez-Ivern, Anup Rokkam Pratap, and Devang Shah. Picture yourself enthralled by the latest episode of your beloved Netflix series, delighting in an uninterrupted, high-definition streaming experience. This is where large-scale system migrations come into play.

Traffic

Traffic Metrics Systems Strategy

Migrating Netflix to GraphQL Safely

The Netflix TechBlog

JUNE 14, 2023

The three strategies we will discuss today are AB Testing, Replay Testing, and Sticky Canaries. To launch Phase 1 safely, we used AB Testing. To launch Phase 2 safely, we used Replay Testing and Sticky Canaries. We knew we could test the same query with the same inputs and consistently expect the same results.
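The replay-testing idea in this excerpt (the same query with the same inputs should consistently yield the same results) can be illustrated with a minimal sketch. The names `legacy_resolver`, `new_resolver`, and `replay` are hypothetical and do not come from the Netflix post; the two resolvers merely stand in for an existing and a migrated implementation.

```python
from typing import Any, Callable, Dict, List, Tuple

Query = Dict[str, Any]
Response = Dict[str, Any]


def legacy_resolver(query: Query) -> Response:
    # Hypothetical stand-in for the existing implementation.
    return {"id": query["id"], "title": query["title"].strip()}


def new_resolver(query: Query) -> Response:
    # Hypothetical stand-in for the migrated implementation.
    # It deliberately differs on titles with surrounding whitespace.
    return {"id": query["id"], "title": query["title"]}


def replay(
    queries: List[Query],
    control: Callable[[Query], Response],
    candidate: Callable[[Query], Response],
) -> List[Tuple[Query, Response, Response]]:
    """Run each recorded query against both implementations and
    return the cases where the two responses differ."""
    mismatches = []
    for query in queries:
        expected = control(query)
        actual = candidate(query)
        if expected != actual:
            mismatches.append((query, expected, actual))
    return mismatches


# Recorded inputs (made-up sample data for illustration).
recorded = [
    {"id": 1, "title": "Stranger Things"},
    {"id": 2, "title": " The Crown "},
]

diffs = replay(recorded, legacy_resolver, new_resolver)
for query, expected, actual in diffs:
    print(f"mismatch for id={query['id']}: {expected!r} != {actual!r}")
```

This kind of comparison only makes sense for idempotent reads, where replaying a request has no side effects.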

Traffic

Traffic Latency Cache Metrics

Keeping Netflix Reliable Using Prioritized Load Shedding

The Netflix TechBlog

NOVEMBER 2, 2020

How viewers are able to watch their favorite show on Netflix while the infrastructure self-recovers from a system failure. By Manuel Correa, Arthur Gonigberg, and Daniel West. Getting stuck in traffic is one of the most frustrating experiences for drivers around the world. Logs and background requests are examples of this type of traffic.

Traffic

Traffic Metrics Infrastructure Architecture

9 key DevOps metrics for success

Dynatrace

SEPTEMBER 28, 2021

Two important ways to improve this metric are to implement quality assurance testing throughout multiple development environments and to automate testing and DevOps processes. A change failure rate above 40% can indicate poor testing procedures, which means teams will need to make more changes than necessary, eroding efficiency.
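As a rough illustration of the metric in this excerpt, change failure rate is commonly computed as the share of changes that resulted in a failure. The sketch below assumes that definition; the function names and sample numbers are hypothetical, and only the 40% threshold comes from the excerpt.

```python
THRESHOLD_PERCENT = 40.0  # threshold quoted in the excerpt


def change_failure_rate(failed_changes: int, total_changes: int) -> float:
    """Return the percentage of changes that resulted in a failure."""
    if total_changes <= 0:
        raise ValueError("total_changes must be positive")
    if not 0 <= failed_changes <= total_changes:
        raise ValueError("failed_changes must be between 0 and total_changes")
    return 100.0 * failed_changes / total_changes


def needs_attention(rate_percent: float) -> bool:
    """Flag a rate strictly above the threshold."""
    return rate_percent > THRESHOLD_PERCENT


# Made-up sample: 9 failed changes out of 20 in the period.
rate = change_failure_rate(failed_changes=9, total_changes=20)
print(f"change failure rate: {rate:.1f}% (attention needed: {needs_attention(rate)})")
```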

DevOps

DevOps Metrics Traffic Efficiency

Efficient SLO event integration powers successful AIOps

Dynatrace

APRIL 5, 2024

For instance, consider how fine-tuned failure rate detection can provide insights for comprehensive understanding. Please refer to How to fine-tune failure detection (dynatrace.com) for further information. SLOs must be evaluated at 100%, even when there is currently no traffic. What characterizes a weak SLO?

Efficiency

Efficiency Traffic Tuning Metrics

How Dynatrace boosts production resilience with Site Reliability Guardian

Dynatrace

MAY 17, 2023

Validation tasks are then extended left to cover performance testing and release validation in a pre-production environment. While the first guardian validates the traffic, the second guardian checks the business transactions generated during the observation period. The functionality is implemented via an automated workflow.

DevOps

DevOps Traffic Latency Best Practices

Kubernetes vs Docker: What’s the difference?

Dynatrace

SEPTEMBER 29, 2021

The time and effort saved with testing and deployment are a game-changer for DevOps. This opens the door to auto-scalable applications, which effortlessly match the demands of rapidly growing and varying user traffic. Containers can be replicated or deleted on the fly to meet varying end-user traffic. What is Docker?

Open Source

Open Source DevOps Traffic Cloud

What is web application security? Everything you need to know.

Dynatrace

JUNE 9, 2021

Application security tests can be used to tell developers when they have introduced vulnerabilities into the application they’re building. Static Application Security Tests (SAST) analyze applications early in the pipeline when they are being written. Good web application security starts when the application is being developed.

Open Source

Open Source Entertainment Tuning Internet

Dynatrace Application Security detects and blocks attacks automatically in real-time

Dynatrace

FEBRUARY 10, 2022

Static Application Security Testing (SAST) solutions are a traditional way of addressing this. WAFs protect the network perimeter and monitor, filter, or block HTTP traffic. Compared to intrusion detection systems (IDS/IPS), WAFs are focused on the application traffic. Unfortunately, they also introduce risk. How to get started.

Traffic

Traffic Benchmarking Innovation Java

Improving our video encodes for legacy devices

The Netflix TechBlog

AUGUST 10, 2020

H.264/AVC Main profile family still represents a substantial portion of the members’ viewing hours and an even larger portion of the traffic. Further tuning of pre-defined encoding parameters. These figures were estimated on 200 full-length titles from our catalog and have been validated through extensive A/B testing.

Innovation

Innovation Traffic Network Mobile

A/B Testing Instant.Page With Netlify and Speedcurve

Tim Kadlec

MAY 21, 2020

It was a great excuse for running a split test. By serving one version of the site with instant.page in place to some traffic, and a site without it to another, I could compare the performance of them both over the same timespan and see how it shakes out. All that was left was to see if the split testing was actually working.

Testing

Testing Traffic Tuning Code

OneAgent for Linux on IBM Z (General Availability)

Dynatrace

NOVEMBER 20, 2019

The initial release of the solution with OneAgent version 1.173 is certified and tested to work on Red Hat Enterprise Linux (RHEL) distribution 6.9+. OneAgent for Z/Linux collects a number of network metrics: input and output traffic measured in bytes and packets, retransmissions, and connectivity. Stay tuned for more announcements.

Availability

Availability Hardware Java Tuning

Python at Netflix

The Netflix TechBlog

APRIL 29, 2019

Demand Engineering: Demand Engineering is responsible for Regional Failovers, Traffic Distribution, Capacity Operations and Fleet Efficiency of the Netflix cloud. Our Infrastructure Security team leverages Python to help with IAM permission tuning using Repokid. We are proud to say that our team’s tools are built primarily in Python.

Open Source

Open Source Network Infrastructure Big Data

How digital experience monitoring helps deliver business observability

Dynatrace

APRIL 26, 2022

STM generates traffic that replicates the typical path or behavior of a user on a network to measure performance (for example, response times, availability, packet loss, latency, jitter, and other variables). Learn more about Dynatrace today with this Power Demo: Dynatrace and Business Observability: Tying IT Metrics to Business Outcomes.

Monitoring

Monitoring Social Media IoT Metrics

Towards a Reliable Device Management Platform

The Netflix TechBlog

AUGUST 30, 2021

By Benson Ma, Alok Ahuja. Introduction: At Netflix, hundreds of different device types, from streaming sticks to smart TVs, are tested every day through automation to ensure that new software releases continue to deliver the quality of the Netflix experience that our customers enjoy. In this blog post, we will focus on the latter feature set.

Latency

Latency Traffic Transportation Cloud

Building Netflix’s Distributed Tracing Infrastructure

The Netflix TechBlog

OCTOBER 19, 2020

We took a hybrid head-based sampling approach that allows for recording 100% of traces for a specific and configurable set of requests, while continuing to randomly sample traffic per the policy set at ingestion point. This approach paints a richer observability portrait of application health. What’s next?
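The hybrid sampling decision described in this excerpt (record every trace for a configurable set of requests, randomly sample the rest) can be sketched as follows. This is a minimal illustration under stated assumptions, not Netflix's implementation: the function `should_record`, the rule format, and the attribute names are all hypothetical.

```python
import random
from typing import Dict, List, Optional


def should_record(
    request: Dict[str, str],
    always_record_rules: List[Dict[str, str]],
    sample_rate: float,
    rng: Optional[random.Random] = None,
) -> bool:
    """Decide once, at the head of a trace, whether to record it."""
    rng = rng or random.Random()
    # Rule match: record 100% of traces for the configured request set.
    for rule in always_record_rules:
        if all(request.get(key) == value for key, value in rule.items()):
            return True
    # Otherwise fall back to random sampling at the configured rate.
    return rng.random() < sample_rate


# Hypothetical rule: always trace playback requests from one device type.
rules = [{"device_type": "smart-tv", "endpoint": "/playback"}]
debug_request = {"device_type": "smart-tv", "endpoint": "/playback"}
other_request = {"device_type": "phone", "endpoint": "/browse"}

print(should_record(debug_request, rules, sample_rate=0.01))  # matches a rule
print(should_record(other_request, rules, sample_rate=0.01))  # sampled at random
```

Because the decision is made at the head of the request, it would typically be propagated to downstream services so that a trace is either recorded in full or not at all.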

Infrastructure

Infrastructure Transportation Storage Open Source

Dynatrace Cloud Automation Module provides observability-driven automation across the full lifecycle

Dynatrace

FEBRUARY 10, 2021

Dynatrace Cloud Automation allows easy analysis of the status and impact a release has on your business or on test results in any environment. This capability provides version information along with an additional insight into traffic and problems per version. What’s next.

Cloud

Cloud DevOps Speed Metrics

Zero Configuration Service Mesh with On-Demand Cluster Discovery

The Netflix TechBlog

AUGUST 29, 2023

In order for a service to talk to another, it needs to know two things: the name of the destination service, and whether or not the traffic should be secure. The ability to run in a degraded but available state during an outage is still a marked improvement over completely stopping traffic flow.

Traffic

Traffic Latency Cloud C++

Achieving observability in async workflows

The Netflix TechBlog

MAY 14, 2021

Prodicle Distribution: Our service is required to be elastic and handle bursty traffic. Our team was responsible for Google integrations, watermarking, bursty traffic management, and on-call support for this application. Early prototypes and load tests validated that the offering could meet our needs. Things got hairy.

Traffic

Traffic Java Latency Google

Get out-of-the-box visibility into your ARM platform (Early Adopter)

Dynatrace

MAY 1, 2020

The initial release of OneAgent for the ARM platform with OneAgent version 1.191 is certified and tested to work on SUSE Enterprise Linux 15.x. OneAgent for the ARM platform collects a number of network metrics: input and output traffic measured in bytes and packets, retransmissions, and connectivity. Stay tuned for more details.

Java

Java Hardware Tuning Metrics

Introducing Netflix’s Key-Value Data Abstraction Layer

The Netflix TechBlog

SEPTEMBER 18, 2024

Although clock-based token generation can suffer from clock skew, our tests on EC2 Nitro instances show drift is minimal (under 1 millisecond). To mitigate these issues, we implemented adaptive pagination which dynamically tunes the limits based on observed data.

Latency

Latency Storage Cache Servers

OneAgent for Linux on IBM Z now available in Early Adopter Release

Dynatrace

AUGUST 8, 2019

The initial release of the solution with OneAgent version 1.173 is certified and tested to work on Red Hat Enterprise Linux (RHEL) distribution 6.9+. OneAgent for Z/Linux collects a number of network metrics: input and output traffic measured in bytes and packets, retransmissions, and connectivity. Stay tuned for more announcements.

Availability

Availability Hardware Java Tuning

Dynatrace Application Security protects your applications in complex cloud environments

Dynatrace

DECEMBER 8, 2020

With DevSecOps processes having shifted security testing “left”, will the teams have enough time to manually analyze, assess, and manage risks based on sampled or scheduled scan results? For example, a test library is never deployed to production. Stay tuned – this is only the start.

Cloud

Cloud Open Source Internet Internet

Why PostgreSQL Is a Top Choice for Enterprise-level Databases

Percona

MARCH 23, 2023

PostgreSQL supports sharding, which allows data to be distributed across multiple servers, making it ideal for high-traffic websites and applications. It has a proven track record of handling large volumes of data and high-traffic websites. Reliability: PostgreSQL is known for its reliability and stability.

Database

Database Open Source Traffic Small Business

MySQL 8: Load Fine Tuning With Resource Groups

Percona

AUGUST 27, 2018

Summarizing here the whole set of steps on my environment:
1) Check the CAP_SYS_NICE:
getcap /opt/mysql_templates/mysql-8P/bin/mysqld
setcap cap_sys_nice+ep /opt/mysql_templates/mysql-8P/bin/mysqld
2) Create the user in MySQL and resource groups:
create user app2@'%' identified by 'test';
GRANT ALL PRIVILEGES ON `windmills2`.*
Will this work?

Tuning

Tuning Virtualization Testing Servers

Rebuilding Netflix Video Processing Pipeline with Microservices

The Netflix TechBlog

JANUARY 10, 2024

This release candidate then went through exhaustive testing which attempted to cover as large a surface area as possible. This testing stage took about two weeks. The service also provides options that allow fine-tuning latency, throughput, etc., depending on the use case.

Processing

Processing Media Latency Innovation

KeyCDN Launches New POP in Mexico

KeyCDN

NOVEMBER 4, 2021

Traffic from this POP will be billed towards Latin America according to our pricing. Alternatively, if you would like to request a POP in a certain location or a new feature, please let us know and stay tuned for more exciting announcements. Requests from Mexico were previously routed to the US, which is no longer needed.

Latency

Latency Tuning Cache Traffic

How To Calculate a Good MySQL Redo Log Size in MySQL 8

Percona

MARCH 6, 2023

Redo Logs: Starting from MySQL 8.0.30, the variable that should be tuned for optimizing the Redo Logs is innodb_redo_log_capacity, and we start with good news here: It’s dynamic! So you won’t have downtime if you need to tune this. This is far from a real scenario test, but it should help us in this particular case.

Tuning

Tuning Open Source Database Engineering

Meet Hydrogen: A React Framework For Dynamic, Contextual And Personalized E-Commerce

Smashing Magazine

NOVEMBER 8, 2021

But then, quickly advances to contextual pricing, juggling complexity of large and frequently updated product catalog, managing continuously running multivariate tests and promotion campaigns, and serving customer-tailored dynamic recommendations. Stay tuned for more in 2022! The journey, often, starts “simple” with localization.

Cache

Cache Best Practices Strategy Servers

Netflix Video Quality at Scale with Cosmos Microservices

The Netflix TechBlog

NOVEMBER 2, 2021

Perceptual quality measurements are used to drive video encoding optimizations, perform video codec comparisons, carry out A/B testing, and optimize streaming QoE decisions, to mention a few. Stay tuned for more details on these algorithmic innovations. In other words, how do we manage living a life in both worlds?

Media

Media Innovation Metrics Latency

In-product guidance accelerates Service Level Objectives (SLO) setup for confident deployments

Dynatrace

DECEMBER 9, 2020

As defined by Google, a typical SLO consists of the following fundamental information: Service-Level Indicator (SLI): The indicator used to measure successful service delivery. SLIs typically refer to metrics such as service success rate, crash-free mobile app users, successful synthetic test runs, or response time.

Metrics

Metrics Engineering Google Monitoring

MongoDB End of Life: How to Prepare

Scalegrid

JUNE 21, 2024

Although these mandatory updates usually occur without hitches, they have the potential to bring up compatibility issues that necessitate testing for a trouble-free experience. This guarantees a rapid experience that can efficiently handle the pressures of intense data traffic and intricate queries.

Database

Database Efficiency Strategy Tuning

Bandwidth-friendly Query Profiling for Azure SQL Database

SQL Performance

APRIL 15, 2020

This capability is essential when performance tuning since query events include discrete CPU and IO metrics as well as runtime parameters, which are key for troubleshooting query performance problems such as parameter sniffing. For some reason this thing will not go away! An Updated XE Session. sqlserver.existing_connection.

Azure

Azure Database Tuning Network

Performance Testing - Tools, Steps, and Best Practices

KeyCDN

AUGUST 15, 2019

Web performance is a broad subject, and you’ll find no shortage of performance testing tips and tutorials all over the web. Before you begin tuning your website or application, you must first figure out which metrics matter most to your users and establish some achievable benchmarks. What is Performance Testing?

Testing Tools

Testing Tools Best Practices Performance Testing Testing

HTTP/3: Practical Deployment Options (Part 3)

Smashing Magazine

SEPTEMBER 6, 2021

This explains the challenges involved in deploying and testing HTTP/3 yourself. In our own early tests, I found seriously diminishing returns at about 40 files. As such, a micro-optimization is, again, how you probably need to fine-tune things on a low level to really benefit from it. This is more in-depth and technical.

Network

Network Servers Cache Traffic

From Proprietary to Open Source: The Complete Guide to Database Migration

Percona

OCTOBER 18, 2023

This approach can minimize complexities but requires complete confidence in your preparations, tests, and abilities. This is where you will fine-tune authentication mechanisms, storage paths, security policies, and memory allocation settings to optimize them for your specific use case(s).

Open Source

Open Source Database Hardware Strategy

ML Platform Meetup: Infra for Contextual Bandits and Reinforcement Learning

The Netflix TechBlog

OCTOBER 18, 2019

Contextual and Multi-armed Bandits enable faster and adaptive alternatives to traditional A/B Testing. Some simulation results and some online A/B test results were shared, demonstrating substantial lifts in the primary business metrics, while keeping the secondary metrics above preset guardrails.

Infrastructure

Infrastructure Metrics Architecture Efficiency

Structural Evolutions in Data

O'Reilly

SEPTEMBER 19, 2023

You can download these models to use out of the box, or employ minimal compute resources to fine-tune them for your particular task. You can see a simulation as a temporary, synthetic environment in which to test an idea. (We’ll miss rush hour traffic.) “What if I bring my duffel bag instead of the roll-aboard?”

Hardware

Hardware Storage Big Data Blockchain

Crucial Redis Monitoring Metrics You Must Watch

Scalegrid

JANUARY 25, 2024

Understanding Redis Performance Indicators: Redis is designed to handle high traffic and low latency with its in-memory data store and efficient data structures. Best Practices for Redis Performance Tuning: Optimizing memory allocation is essential for improving Redis’s performance.

Metrics

Metrics Monitoring Latency Cache

The Speed of Time

Brendan Gregg

SEPTEMBER 25, 2021

Since instances of both CentOS and Ubuntu were running in parallel, I could collect flame graphs at the same time (same time-of-day traffic mix) and compare them side by side. There's also a test and println() in the loop to, hopefully, convince the compiler not to optimize-out an otherwise empty loop.

Speed

Speed Java AWS Virtualization

10 Steps to Prepare Your Website for High-Load Days: Are You Ready for Black Friday?

Rigor

SEPTEMBER 3, 2019

What you may be overlooking is that peak-event readiness is about more than just load testing or ensuring that your servers are up throughout a specific timeframe. At a minimum, Optimal recommends that you have RUM, APM, synthetic, and load test tools set up, instrumented, and producing meaningful data.

Website

Website Cache Traffic Ecommerce

HTTP/3 From A To Z: Core Concepts (Part 1)

Smashing Magazine

AUGUST 9, 2021

This explains the challenges involved in deploying and testing HTTP/3 yourself. For example, if the device is a firewall, it might be configured to block all traffic containing (unknown) extensions. In the early days of the Internet, encrypting traffic was quite costly in terms of processing. This is more in depth and technical.

Transportation

Transportation Internet Internet Network

Building a Profitable UberEats Clone: Your Ultimate Guide to Success

Tech News Gather

JUNE 16, 2023

Stay tuned to learn how to lay the foundation for a successful clone app that converts readers into leads. Choose a robust technology stack that can handle high traffic and accommodate future growth. Stay tuned for valuable insights on designing a captivating UberEats clone.

Social Media

Social Media Tuning Strategy Media
