Big Data, Design and Technology - Technology Performance Pulse

In-Stream Big Data Processing

Highly Scalable

AUGUST 20, 2013

The shortcomings and drawbacks of batch-oriented data processing were widely recognized by the Big Data community quite a long time ago. This system has been designed to supplement and succeed the existing Hadoop-based system, whose data processing latency and maintenance costs were too high.

Big Data

Big Data Processing Lambda Database

The Amazon.com 2010 Shareholder Letter Focusses on Technology.

All Things Distributed

APRIL 27, 2011

In the 2010 Shareholder Letter, Jeff Bezos writes about the unique technologies developed at Amazon.com over the years. Given that I have frequently written about many of these technologies on this blog, I asked investor relations to be allowed to reprint it here.

Technology

Technology AWS Storage

Data Engineers of Netflix - Interview with Pallavi Phadnis

The Netflix TechBlog

OCTOBER 28, 2021

Netflix’s unique work culture and petabyte-scale data problems are what drew me to Netflix. During earlier years of my career, I primarily worked as a backend software engineer, designing and building the backend systems that enable big data analytics.

Data Engineering

Data Engineering Engineering Big Data Software Engineering

Seven benefits of AIOps to transform your business operations

Dynatrace

JULY 5, 2022

AIOps combines big data and machine learning to automate key IT operations processes, including anomaly detection and identification, event correlation, and root-cause analysis. Like the development and design phases, these applications generate massive data volumes that offer relevant and actionable insights.

Artificial Intelligence

Artificial Intelligence Cloud Innovation Strategy

Driving down the cost of Big-Data analytics - All Things Distributed

All Things Distributed

AUGUST 18, 2011

The Amazon Elastic MapReduce (EMR) team announced today the ability to seamlessly use Amazon EC2 Spot Instances with their service, significantly driving down the cost of data analytics in the cloud.

Big Data

Big Data Analytics AWS Cloud

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and…

The Netflix TechBlog

MARCH 25, 2019

Finally, imagine yourself in the role of a data platform reliability engineer tasked with providing advanced lead time to data pipeline (ETL) owners by proactively identifying issues upstream to their ETL jobs. Design a flexible data model - Represent Enable seamless integration - push or pull.

Infrastructure

Infrastructure Big Data Transportation Architecture

What is a data lakehouse? Combining data lakes and warehouses for the best of both worlds

Dynatrace

OCTOBER 4, 2022

A data lakehouse provides a cost-effective storage layer for both structured and unstructured data. Therefore, it contains all of an organization’s data. Generally, the storage technology categorizes data into landing, raw, and curated zones depending on its consumption readiness. Data management.

Artificial Intelligence

Artificial Intelligence Storage Analytics Government

Top 15 Software Testing Trends to Watch Out in 2021

DZone

DECEMBER 28, 2020

The introduction of innovative technologies has brought the newest updates in software testing, development, design, and delivery. Nowadays, Big Data tests mainly include data testing, paving the way for the Internet of Things to become the center point. Besides, AI and ML seem to reach a new level.

Software

Software Software Testing Big Data

Data Engineers of Netflix - Interview with Kevin Wylie

The Netflix TechBlog

JULY 15, 2021

His favorite TV shows: Ozark, Breaking Bad, Black Mirror, Barry, and Chernobyl. Since I joined Netflix back in 2011, my favorite project has been designing and building the first version of our entertainment knowledge graph. I was later hired into my first purely data gig where I was able to deepen my knowledge of big data.

Data Engineering

Data Engineering Engineering Entertainment Big Data

Path to NoOps part 1: How modern AIOps brings NoOps within reach

Dynatrace

OCTOBER 25, 2022

NoOps is an advanced transformation of DevOps where many of the functions needed to manage, optimize and secure IT services and applications are automated within the design. This means organizations can’t be restricted to a single cloud technology. Thus, the concept of NoOps takes DevOps a step further.

DevOps

DevOps Big Data Cloud Innovation

What is ITOps? Why IT operations is more crucial than ever in a multicloud world

Dynatrace

DECEMBER 15, 2022

As more organizations adopt cloud-native technologies, traditional approaches to IT operations have been evolving. Complex cloud computing environments are increasingly replacing traditional data centers. In fact, Gartner estimates that 80% of enterprises will shut down their on-premises data centers by 2025. So, what is ITOps?

Artificial Intelligence

Artificial Intelligence DevOps Hardware Virtualization

DynatraceGo! APAC 2021: Lessons in thick data and keeping pace with the market

Dynatrace

AUGUST 10, 2021

Carrie called out how at Dynatrace we know it takes a village to achieve the extraordinary, from innovating reliable digital services at speed to learning how to adapt and thrive while managing our increasingly complex, dynamic technology environments. “Investing in data is easy but using it is really hard”. She wasn’t wrong.

DevOps

DevOps Innovation Big Data Cloud

AIOps observability adoption ascends in healthcare

Dynatrace

MARCH 14, 2022

Every day, healthcare organizations across the globe have embraced innovative technology to streamline the delivery of patient care. Many hospitals adopted telehealth and other virtual technology to deliver care and reduce the spread of disease. During the early months of the COVID-19 pandemic, this trend was undeniably apparent.

Healthcare

Healthcare Artificial Intelligence Innovation Strategy

Bulldozer: Batch Data Moving from Data Warehouse to Online Key-Value Stores

The Netflix TechBlog

OCTOBER 27, 2020

Data scientists and engineers collect this data from our subscribers and videos, and implement data analytics models to discover customer behaviour with the goal of maximizing user joy. The processed data is typically stored as data warehouse tables in AWS S3.

Latency

Latency Storage Big Data Tuning

What is APM?

Dynatrace

JUNE 1, 2020

Application Performance Monitoring, and the technologies and use cases it covers, has expanded rapidly. Artificial intelligence for IT operations (AIOps): AIOps platforms combine big data and machine learning functionality to support IT operations. How does Gartner define Application Performance Monitoring?

Artificial Intelligence

Artificial Intelligence Social Media Monitoring IoT

Expanding the AWS Cloud: Introducing the AWS Europe (London) Region

All Things Distributed

DECEMBER 13, 2016

Today, I'm happy to announce that the AWS Europe (London) Region, our 16th technology infrastructure region globally, is now generally available for use by customers worldwide. The British Government is also helping to drive innovation and has embraced a cloud-first policy for technology adoption.

AWS

AWS Cloud Artificial Intelligence IoT

Smashing Podcast Episode 41 With Eva PenzeyMoog: Designing For Safety

Smashing Magazine

AUGUST 9, 2021

In this episode, we’re talking about designing for safety. What does it mean to consider vulnerable users in our designs? Design for Safety from A Book Apart. Drew McLellan.

Design

Design Education Processing Network

What is a Distributed Storage System

Scalegrid

FEBRUARY 8, 2024

A distributed storage system is foundational in today’s data-driven landscape, ensuring data spread over multiple servers is reliable, accessible, and manageable. This guide delves into how these systems work, the challenges they solve, and their essential role in businesses and technology.

Storage

Storage Systems Big Data Azure

A Recap of the Data Engineering Open Forum at Netflix

The Netflix TechBlog

JUNE 20, 2024

Creating new development environments is cumbersome: Populating them with data is compute-intensive, and the deployment process is error-prone, leading to higher costs, slower iteration, and unreliable data. To handle errors efficiently, Netflix developed a rule-based classifier for error classification called “Pensive.”

Data Engineering

Data Engineering Engineering Entertainment Software Engineering

Hyper Scale VPC Flow Logs enrichment to provide Network Insight

The Netflix TechBlog

MAY 26, 2020

Requirements: There are multiple ways you can solve this problem and many technologies to choose from. As with any sustainable engineering design, focusing on simplicity is very important. These characteristics allow for an on-call response time that is relaxed and more in line with traditional big data analytical pipelines.

Network

Network Tuning AWS Traffic

Data Engineers of Netflix - Interview with Samuel Setegne

The Netflix TechBlog

JUNE 1, 2021

I personally loved looking at raw data and using it to understand patterns in the world through technology. Clinical data was often small enough to fit into memory on an average computer and only in rare cases would its computation require any technical ingenuity or massive computing power. For example - clinical

Data Engineering

Data Engineering Engineering Big Data Healthcare

Data Engineers of Netflix - Interview with Dhevi Rajendran

The Netflix TechBlog

JUNE 1, 2021

At Netflix, the work that data engineers do to produce data in a robust, scalable way is incredibly important to provide the best experience to our members as they interact with our service. Through these cross-functional efforts, I’ve also really gotten to learn and appreciate the nuances of payments.

Data Engineering

Data Engineering Engineering Software Engineering Big Data

What is AIOps? Everything you wanted to know

Dynatrace

OCTOBER 14, 2021

Gartner defines AIOps as the combination of “big data and machine learning to automate IT operations processes, including event correlation, anomaly detection, and causality determination.” Only deterministic AIOps technology enables fully automated cloud operations across the entire enterprise development lifecycle.

Artificial Intelligence

Artificial Intelligence DevOps Innovation Metrics

What is Application Performance Monitoring?

Dynatrace

JUNE 1, 2020

Gartner defines APM as: Application Performance Monitoring, and the technologies and use cases it covers, has expanded rapidly. Artificial intelligence for IT operations (AIOps): AIOps platforms combine big data and machine learning functionality to support IT operations.

Monitoring

Monitoring Performance Social Media Artificial Intelligence

What is RabbitMQ Used For

Scalegrid

JUNE 28, 2024

It leverages various exchange types to either route messages directly to designated queues following specific routing and binding keys or disperse them broadly like an indiscriminate town herald. Can RabbitMQ handle the high-throughput needs of big data applications? RabbitMQ’s real adaptability emerges with topic exchanges.

IoT

IoT Healthcare Programming Open Source

Music to my Ears - All Things Distributed

All Things Distributed

MARCH 28, 2011

Amazon S3 is used by enterprises of all sizes and is designed to handle scaling extremely well; it stores hundreds of billions of objects and easily performs several hundreds of thousands of storage transactions a second.

AWS

AWS Cloud Storage Internet

The End of Programming as We Know It

O'Reilly

FEBRUARY 4, 2025

Yet as the technology grew in capability, successful websites became more and more complex. Big data, web services, and cloud computing established a kind of internet operating system. There is so much to learn about how to apply the new technology. No code became a buzzword. Soon enough, everyone needed a website.

Programming

Programming Google Infrastructure Internet

Mastering Distributed SQL™ Databases in 2025

Scalegrid

JANUARY 10, 2025

They keep the features that developers like but can handle much more data, similar to NoSQL systems. Notably, they simplify handling big data flows, offer consistent transactions, and sustain high performance even when they’re used for real-time data analysis and complex queries.

Database

Database Scalability Best Practices Blockchain

Seer: leveraging big data to navigate the complexity of performance debugging in cloud microservices

The Morning Paper

MAY 14, 2019

Seer: leveraging big data to navigate the complexity of performance debugging in cloud microservices, Gan et al., ASPLOS’19. Finally, we show that Seer can identify application-level design bugs, and provide insights on how to better architect microservices to achieve predictable performance.

Big Data

Big Data Cloud Performance Hardware

New AWS feature: Run your website from Amazon S3 - All Things.

All Things Distributed

FEBRUARY 17, 2011

a Fast and Scalable NoSQL Database Service Designed for Internet Scale Applications. Driving down the cost of Big-Data analytics. The Amazon.com 2010 Shareholder Letter Focusses on Technology. Countdown to What is Next in AWS. Expanding the Cloud - Introducing the AWS South America (Sao Paulo) Region. APAC Summer Tour.

AWS

AWS Website Storage Servers

Post: InterviewCamp.io, Scrapinghub, Fauna, Sisu, Educative, PA File Sight, Etleap, Triplebyte, Stream

High Scalability

MARCH 30, 2020

has hours of system design content. They also do live system design discussions every week. Scrapinghub is hiring a Senior Software Engineer (Big Data/AI). Level up on in-demand technologies and prep for your interviews on Educative.io, featuring popular courses like the bestselling Grokking the System Design Interview.

Education

Education Software Engineering Engineering Big Data

Post: Scrapinghub, Fauna, Sisu, Educative, PA File Sight, Etleap, Triplebyte, Stream

High Scalability

MARCH 24, 2020

Scrapinghub is hiring a Senior Software Engineer (Big Data/AI). You will be designing and implementing distributed systems: a large-scale web crawling platform, integrating Deep Learning based web data extraction components, working on queue algorithms and large datasets, creating a development platform for other company departments, etc.

Education

Education Software Engineering Engineering Big Data

Expanding the Cloud - Managing Cold Storage with Amazon Glacier

All Things Distributed

AUGUST 20, 2012

From selecting the right technology, to maintaining multi-site facilities, to dealing with exponential and often unpredictable growth, to ensuring long-term digital integrity, digital archiving can be a major headache. With Amazon Glacier any organization now has access to the same data archiving capabilities as the world’s

Storage

Storage Cloud AWS Media

No Server Required - Jekyll & Amazon S3 - All Things Distributed

All Things Distributed

AUGUST 17, 2011

It is simple and elegant, as you would expect from someone who has won several design awards. I am still using the same layout and CSS I used with MT, as I prefer to make one change at a time: design comes next.

Servers

Servers Social Media AWS Website

The AWS GovCloud (US) Region - All Things Distributed

All Things Distributed

AUGUST 16, 2011

The Cloud First strategy is most visible with new Federal IT programs, which are all designed to be “Cloud Government and Big Data. One particular early use case for AWS GovCloud (US) will be massive data processing and analytics.

AWS

AWS Government Big Data Cloud

Introducing the AWS South America - All Things Distributed

All Things Distributed

DECEMBER 14, 2011

AWS

AWS Latency Storage Cloud

Amazon EC2 Cluster GPU Instances - All Things Distributed

All Things Distributed

NOVEMBER 15, 2010

These trade-offs have even impacted the way the lowest level building blocks in our computer architectures have been designed. This instance limit is a default usage limit, not a technology limit.

AWS

AWS Programming Latency Architecture

Expanding the Cloud - Cluster Compute Instances for Amazon EC2.

All Things Distributed

JULY 13, 2010

Cluster Compute Instances for Amazon EC2 are a new instance type specifically designed for High Performance Computing applications. During my academic career, I spent many years working on HPC technologies such as user-level networking interfaces, large scale high-speed interconnects, HPC software stacks, etc.

Cloud

Cloud AWS Automotive Latency

Expanding the Cloud - Introducing Amazon ElastiCache - All Things.

All Things Distributed

AUGUST 22, 2011

Cloud

Cloud Cache AWS Storage

Hacking with AWS at The Next Web Hackathon - All Things Distributed

All Things Distributed

MARCH 24, 2011

Over the past years The Next Web Conference has become a premier conference on internet life and its technologies. Up to 200 developers and designers will get together to hack up interesting applications using the Internet's APIs and SDKs. If you want to be that team of 5-7 developers/designers you can sign up using this form.

AWS

AWS Internet Storage

Expanding the Cloud with DNS - Introducing Amazon Route 53 - All.

All Things Distributed

DECEMBER 5, 2010

The Domain Name System is a wonderful practical piece of technology; it is a fundamental building block of our modern internet. We have designed Route 53 to propagate updates very quickly and give the customer the tools to find out when all changes have been propagated.

Cloud

Cloud Internet AWS

Structural Evolutions in Data

O'Reilly

SEPTEMBER 19, 2023

That came to mind when a friend raised a point about emerging technology’s fractal nature. Each time, the underlying implementation changed a bit while still staying true to the larger phenomenon of “Analyzing Data for Fun and Profit.” I am wired to constantly ask “what’s next?”

Hardware

Hardware Storage Big Data Blockchain

Why MySQL Could Be Slow With Large Tables

Percona

JANUARY 19, 2023

While the technologies have evolved and matured enough, there are still some people thinking that MySQL is only for small projects or that it can’t perform well with large tables. Make sure you design the data types correctly while planning for the future growth of the table. MyRocks is shipped in Percona Server for MySQL.

Open Source

Open Source Storage Database Big Data

Expanding the Cloud - Introducing the AWS Asia Pacific (Tokyo.

All Things Distributed

MARCH 2, 2011

AWS

AWS Cloud Games Latency
