article thumbnail

Migrating Critical Traffic At Scale with No Downtime?—?Part 1

The Netflix TechBlog

Replay Traffic Testing Replay traffic refers to production traffic that is cloned and forked over to a different path in the service call graph, allowing us to exercise new/updated systems in a manner that simulates actual production conditions. This approach has a handful of benefits.

Traffic 347
article thumbnail

Service level objectives: 5 SLOs to get started

Dynatrace

Fitness app : The fitness app should offer a response time of less than 500 milliseconds for exercise tracking and data recording. This SLO enables a smooth and uninterrupted exercise-tracking experience. Note : you might hear the term latency used instead of response time. Latency primarily focuses on the time spent in transit.

Latency 245
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Seamlessly Swapping the API backend of the Netflix Android app

The Netflix TechBlog

For each route we migrated, we wanted to make sure we were not introducing any regressions: either in the form of missing (or worse, wrong) data, or by increasing the latency of each endpoint. Being able to canary a new route let us verify latency and error rates were within acceptable limits. This meant that data that was static (e.g.

Latency 241
article thumbnail

Interpreting A/B test results: false negatives and power

The Netflix TechBlog

We then used simple thought exercises based on flipping coins to build intuition around false positives and related concepts such as statistical significance, p-values, and confidence intervals. As a result, if the test treatment results in a small reduction in the latency metric, it’s hard to successfully identify?

Testing 211
article thumbnail

Pushy to the Limit: Evolving Netflix’s WebSocket proxy for the future

The Netflix TechBlog

Where aws ends and the internet begins is an exercise left to the reader. Dynomite is a Netflix open source wrapper around Redis that provides a few additional features like auto-sharding and cross-region replication, and it provided Pushy with low latency and easy record expiry, both of which are critical for Pushy’s workload.

Latency 234
article thumbnail

Service level objective examples: 5 SLO examples for faster, more reliable apps

Dynatrace

Fitness app : The fitness app should offer a response time of less than 500 milliseconds for exercise tracking and data recording. This SLO enables a smooth and uninterrupted exercise-tracking experience. Note : you might hear the term latency used instead of response time. Latency primarily focuses on the time spent in transit.

Traffic 173
article thumbnail

Real user monitoring vs. synthetic monitoring: Understanding best practices

Dynatrace

connectivity, access, user count, latency) of geographic regions. For example, real-user monitoring metrics might reveal a user performance issue that you can then apply to synthetic testing to replicate the issue by exercising the same transaction across several different variables. Performance testing based on variable metrics (i.e.,