2024

The State of Observability 2024: Navigating Complexity With AI-Driven Insights

DZone

In today's fast-paced digital landscape, organizations are increasingly embracing multi-cloud environments and cloud-native architectures to drive innovation and deliver seamless customer experiences. However, the 2024 State of Observability report from Dynatrace reveals that the explosion of data generated by these complex ecosystems is pushing traditional monitoring and analytics approaches to their limits.

Analytics 325

Mastering Kubernetes deployments with Keptn: a comprehensive guide to enhanced visibility

Dynatrace

Deploying software in Kubernetes is often viewed as a straightforward process—just use kubectl or a GitOps solution like ArgoCD to deploy a YAML file, and you’re all set, right? Unfortunately, Kubernetes deployments can be fraught with challenges beyond the surface level. Numerous hurdles can hinder successful deployments, from resource constraints to external dependencies and monitoring inadequacies.

Trending Sources

Introducing SafeTest: A Novel Approach to Front End Testing

The Netflix TechBlog

By Moshe Kolodny. In this post, we’re excited to introduce SafeTest, a revolutionary library that offers a fresh perspective on End-To-End (E2E) tests for web-based User Interface (UI) applications. The Challenges of Traditional UI Testing: Traditionally, UI tests have been conducted through either unit testing or integration testing (also referred to as End-To-End (E2E) testing).

Testing 244

District heating: Using data centers to heat communities

All Things Distributed

An inside look at the Tallaght District Heating Scheme, where Heat Works is using recycled heat from an AWS data center to warm a community in Dublin, Ireland.

AWS 147

Linux Crisis Tools

Brendan Gregg

When you have an outage caused by a performance issue, you don't want to lose precious time just to install the tools needed to diagnose it. Here is a list of "crisis tools" I recommend installing on your Linux servers by default (if they aren't already), along with the (Ubuntu) package names that they come from:

Package | Provides | Notes
procps | ps(1), vmstat(8), uptime(1), top(1) | basic stats
util-linux | dmesg(1), lsblk(1), lscpu(1) | system log, device info
sysstat | iostat(1), mpstat

Servers 145
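
As a rough illustration of the "install them before the outage" advice in the Linux Crisis Tools excerpt, the sketch below checks whether the commands named in that excerpt are already on the PATH. The names `CRISIS_TOOLS` and `missing_tools` are hypothetical and made up for this example. The package-to-command mapping covers only what the truncated excerpt lists, so treat it as incomplete.

```python
import shutil

# Commands named in the excerpt, grouped by the Ubuntu package that provides them.
# The excerpt is cut off after "mpstat", so this mapping is deliberately incomplete.
CRISIS_TOOLS = {
    "procps": ["ps", "vmstat", "uptime", "top"],
    "util-linux": ["dmesg", "lsblk", "lscpu"],
    "sysstat": ["iostat", "mpstat"],
}


def missing_tools(tools=CRISIS_TOOLS, which=shutil.which):
    """Return {package: [commands not found on PATH]} for packages with gaps."""
    missing = {}
    for package, commands in tools.items():
        # shutil.which returns None when the command cannot be found on PATH.
        absent = [cmd for cmd in commands if which(cmd) is None]
        if absent:
            missing[package] = absent
    return missing


if __name__ == "__main__":
    gaps = missing_tools()
    if not gaps:
        print("All listed crisis tools are installed.")
    for package, commands in gaps.items():
        print(f"{package}: missing {', '.join(commands)}")
```

The `which` parameter exists only so the lookup can be swapped out in tests. On a real server the default `shutil.which` is what you would use.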

QCon London: Meta Used Monolithic Architecture to Ship Threads in Only Five Months

InfoQ

Zahan Malkani talked during QCon London 2024 about Meta’s journey from identifying the opportunity in the market to shipping the Threads application only five months later. The company leveraged Instagram's existing monolithic architecture and quickly iterated to create a new text-first microblogging service in record time.

The psychology of site speed and human happiness

Speed Curve

In the fourteen years that I've been working in the web performance industry, I've done a LOT of research, writing, and speaking about the psychology of page speed – in other words, why we crave fast, seamless online experiences. In fact, the entire first chapter of my book, Time Is Money (reprinted here courtesy of the good folks at O'Reilly), is dedicated to the subject.

Speed 134

More Trending

Redis, Valkey, and Percona’s Ongoing Support of Open Source

Percona

For me, the Redis story starts with… Memcached. Back in the early 2000s, “Web 2.0” was being built following the aftermath of the dot-com crash. The open source LAMP (Linux-Apache-MySQL-PHP/Perl/Python) stack was all the rage.

Beyond Problem and Solution Space: Better models for modern product development

Strategic Tech

I often encounter the phrases problem space and solution space. People use these words to try and articulate the types of work and activities they are referring to, or where they are in the process of building something new. Unfortunately, rather than aiding communication, I notice that these words are so highly ambiguous that more time is spent debating what they mean than is gained by using them to improve communication and collaboration.

What is System Testing? – Getting Started, Tips, and Tools

Testlodge

System testing involves analyzing the behavior and functionality of a fully integrated application. It is the third of the four levels of testing, performed after unit and integration testing but before user acceptance testing. A QA team member will usually do the assessing, or occasionally the task will fall to other team members such as product or project managers.

Systems 90

Getting Started With NCache Java Edition (Using Docker)

DZone

NCache Java Edition is a distributed caching tool that helps Java applications run faster, handle more users, and be more reliable. In today's world, where people expect apps to work quickly and without any problems, knowing how to use NCache Java Edition is very important. It's a key piece of technology for both developers and businesses who want to make sure their apps can give users fast access to data and a smooth experience.

Java 310

Google Cloud Next 2024: AI innovation for Google Cloud

Dynatrace

In today’s rapidly evolving landscape, incorporating AI innovation into business strategies is vital, enabling organizations to optimize operations, enhance decision-making processes, and stay competitive. The annual Google Cloud Next conference explores the latest innovations for cloud technology and Google Cloud. This year, Google’s event will take place from April 9 to 11 in Las Vegas.

Google 266

Bending pause times to your will with Generational ZGC

The Netflix TechBlog

The surprising and not so surprising benefits of generations in the Z Garbage Collector. By Danny Thomas, JVM Ecosystem Team. The latest long-term support release of the JDK delivers generational support for the Z Garbage Collector. More than half of our critical streaming video services are now running on JDK 21 with Generational ZGC, so it’s a good time to talk about our experience and the benefits we’ve seen.

Latency 228

What I've been reading since re:Invent

All Things Distributed

After a busy conference season, I've taken some time to catch up on reading and make a dent in the pile of books on my nightstand. Here's what I've started, finished, and picked up since re:Invent.

eBPF Documentary

Brendan Gregg

eBPF is a crazy technology – like putting JavaScript into the Linux kernel – and getting it accepted had so far been an untold story of strategy and ingenuity. The eBPF documentary, published late last year, tells this story by interviewing key players from 2014 including myself, and touches on new developments including Windows. (If you are new to eBPF, it is the name of a kernel execution engine that runs a variety of new programs in a performant and safe sandbox in the kernel, lik

Uber Builds Scalable Chat Using Microservices with GraphQL Subscriptions and Kafka

InfoQ

Uber replaced a legacy architecture built using the WAMP protocol with a new solution that takes advantage of GraphQL subscriptions. The main drivers for creating a new architecture were challenges around reliability, scalability, and observability/debuggability, as well as technical debt impeding the team’s ability to maintain the existing solution.

Building the future of performance with SpeedCurve

Speed Curve

I’m beyond excited to announce that I’m joining the SpeedCurve team this year! I’ll still be doing some consulting work, but I’ll be taking on fewer clients this year so I can focus on helping to make an already amazing performance tool even better, working alongside some of my favorite people in the performance community.

ChatGPT, Author of The Quixote

O'Reilly

TL;DR LLMs and other GenAI models can reproduce significant chunks of training data. Specific prompts seem to “unlock” training data. We have many current and future copyright challenges: training may not infringe copyright, but legal doesn’t mean legitimate—we consider the analogy of MegaFace where surveillance models have been trained on photos of minors, for example, without informed consent.

Percona Stands Firmly on the Side of Open Source

Percona

Percona is proud to formally state our support for the Valkey project and its community. We will soon offer commercial support for the open source Redis alternative, Valkey. The world of open source is no stranger to controversy.

DragonCrawl: Generative AI for High-Quality Mobile Testing

Uber Engineering

Learn how Uber improved mobile testing reliability, and increased productivity for thousands of engineers, using machine learning to create DragonCrawl, a highly stable and low-maintenance testing system.

Mobile 79

Interaction to Next Paint (INP) Explained

Gtmetrix

We explain what Interaction to Next Paint (INP) is, how it impacts performance, and why GTmetrix doesn’t track it (yet). Overview: Interaction to Next Paint (INP) is Google’s new “responsiveness” field metric that was first announced as an experimental metric in 2022.

Metrics 81

Achieving Kubernetes Monitoring Nirvana: Prometheus and Grafana Unleashed

DZone

In the ever-evolving landscape of container orchestration, Kubernetes has emerged as a frontrunner, offering unparalleled flexibility and scalability. However, with great power comes great responsibility — the responsibility to monitor and understand your Kubernetes clusters effectively. This is where Prometheus and Grafana step in, forming a dynamic duo that provides comprehensive insights into Kubernetes clusters.

The Dynatrace journey toward DORA compliance

Dynatrace

The Digital Operational Resilience Act (DORA) looms large on the horizon, casting its regulatory shadow over financial institutions. DORA is a European Union (EU) regulation that took effect on January 16, 2023, with full implementation starting January 17, 2025. DORA aims to bolster the IT security of financial entities such as banks, insurance companies, and investment firms.

Tuning 247

Supporting Diverse ML Systems at Netflix

The Netflix TechBlog

By David J. Berg, Romain Cledat, Kayla Seeley, Shashank Srikanth, Chaoying Wang, and Darin Yu. Netflix uses data science and machine learning across all facets of the company, powering a wide range of business applications from our internal infrastructure and content demand modeling to media understanding. The Machine Learning Platform (MLP) team at Netflix provides an entire ecosystem of tools around Metaflow, an open source machine learning infrastructure framework we started, to empower data sc

Systems 226

Setting Up Your Environment for Kubernetes Operators Using Docker, kubectl, and k3d

Percona Community

If you are just starting out in the world of Kubernetes operators, like me, preparing the environment for their installation should not be difficult. This blog will quickly guide you in setting up a minimal environment. Kubernetes operators are invaluable for automating complex database operations, tasks that Kubernetes does not handle directly.

C++ safety, in context

Sutter's Mill

Scope. To talk about C++’s current safety problems and solutions well, I need to include the context of the broad landscape of security and safety threats facing all software. I chair the ISO C++ standards committee and I work for Microsoft, but these are my personal opinions and I hope they will invite more dialog across programming language and security communities.

C++ 139

QCon London: Lessons Learned From Building LinkedIn’s AI/ML Data Platform

InfoQ

At the QCon London 2024 conference, Félix GV from LinkedIn discussed the AI/ML platform powering the company’s products. He specifically delved into Venice DB, the NoSQL data store used for feature persistence. The presenter shared the lessons learned from evolving and operating the platform, including cluster management and library versioning.

Hello INP! Here's everything you need to know about the newest Core Web Vital

Speed Curve

After years of development and testing, Google has added Interaction to Next Paint (INP) to its trifecta of Core Web Vitals – the performance metrics that are a key ingredient in its search ranking algorithm. INP replaces First Input Delay (FID) as the Vitals responsiveness metric. Not sure what INP means or why it matters? No worries – that's what this post is for. :) What is INP?

Mobile 82

To understand the risks posed by AI, follow the money

O'Reilly

By Rufus Rock, UCL; Tim O’Reilly, UCL; Ilan Strauss, UCL; and Mariana Mazzucato, UCL. This article is republished from The Conversation under a Creative Commons license. Time and again, leading scientists, technologists, and philosophers have made spectacularly terrible guesses about the direction of innovation. Even Einstein was not immune, claiming, “There is not the slightest indication that nuclear energy will ever be obtainable,” just ten years before Enrico Fermi completed cons

Protect Your PostgreSQL Database with pg_tde: Safe and Secure

Percona

Tech Preview release of pg_tde now available. As organizations collect, store, and analyze vast amounts of data, ensuring its confidentiality and integrity becomes a top priority. For PostgreSQL users, the tech preview release of the new encryption extension pg_tde delivers unmatched protection for vital data assets. What is pg_tde?

How Uber Serves Over 40 Million Reads Per Second from Online Storage Using an Integrated Cache

Uber Engineering

Learn how Uber serves over 40 million reads per second from its in-house, distributed database built on top of MySQL using an integrated caching solution: CacheFront.

Cache 75

The Return of the Frame Pointers

Brendan Gregg

Sometimes debuggers and profilers are obviously broken; sometimes it's subtle and hard to spot. From my flame graphs page: [Figure: CPU flame graph (partly broken)]. This is pretty common and usually goes unnoticed as the flame graph looks ok at first glance. But there are 15% of samples on the left, above "[unknown]", that are in the wrong place and missing frames.

Java 145

Automate Application Load Balancers With AWS Load Balancer Controller and Ingress

DZone

Automating AWS Load Balancers is essential for managing cloud infrastructure efficiently. This article delves into the importance of automation using the AWS Load Balancer controller and Ingress template. Whether you're new or experienced, grasping these configurations is vital to streamlining Load Balancer settings on Amazon Web Services, ensuring a smoother and more effective setup.

AWS 300

Beyond uptime: Unveiling the improved Dynatrace SLA

Dynatrace

To help our customers confidently navigate the complexities of the cloud and innovate faster and more securely, the Dynatrace platform must be delivered reliably. Service Level Agreements (SLAs) bridge your business objectives with our commitment to excellence. To transparently manage expectations and maintain trust with our customers, we expanded the Dynatrace SLA beyond accessing the user interface to cover the full range of relevant product categories, such as processing and retaining incoming

Azure 245

Evolving from Rule-based Classifier: Machine Learning Powered Auto Remediation in Netflix Data Platform

The Netflix TechBlog

By Binbing Hou, Stephanie Vezich Tamayo, Xiao Chen, Liang Tian, Troy Ristow, Haoyuan Wang, Snehal Chennuru, and Pawan Dixit. This is the first in a series on our work at Netflix leveraging data insights and Machine Learning (ML) to improve the operational automation around the performance and cost efficiency of big data jobs.

Tuning 210

Reporting Core Web Vitals With The Performance API

Smashing Magazine

By Geoff Graham (2024-02-27). This article is sponsored by DebugBear. There’s quite a buzz in the performance community with the Interaction to Next Paint (INP) metric becoming an official Core Web Vitals (CWV) metric in a few short weeks.

Forming an Architecture Modernization Enabling Team (AMET)

Strategic Tech

This article was co-authored with Eduardo da Silva (also published on his blog). Architecture modernization initiatives are strategic efforts involving many teams, usually for many months or years. They often compete with product/feature development work, resulting in them falling flat and failing to deliver the promised business benefits that triggered them.