Greenplum Database is a massively parallel processing (MPP) SQL database built on PostgreSQL. It can scale to multi-petabyte data workloads and presents a cluster of powerful servers behind a single SQL interface, so all of the data can be queried in one place.
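To make the "single SQL interface" point concrete, here is a minimal sketch of creating and querying a distributed table from Python; the coordinator host, credentials, and the sales table are illustrative assumptions, not part of the original article.

```python
# Minimal sketch: querying a Greenplum cluster through its single SQL interface.
# Host, credentials, and the "sales" table are illustrative assumptions.
import psycopg2

conn = psycopg2.connect(
    host="gp-master.example.com",  # hypothetical coordinator host
    dbname="analytics",
    user="gpadmin",
    password="secret",
)
with conn, conn.cursor() as cur:
    # DISTRIBUTED BY tells Greenplum how to spread rows across segment servers.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS sales (
            sale_id   bigint,
            region    text,
            amount    numeric
        ) DISTRIBUTED BY (sale_id);
    """)
    # The query is ordinary SQL; the planner parallelizes it across all
    # segments and returns a single result set to the client.
    cur.execute("SELECT region, sum(amount) FROM sales GROUP BY region;")
    for region, total in cur.fetchall():
        print(region, total)
conn.close()
```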
The shortcomings of batch-oriented data processing were recognized by the Big Data community long ago. Distributed in-stream data processing clearly has much in common with query processing in distributed relational databases. Basics of Distributed Query Processing.
Then, big data analytics technologies, such as Hadoop, NoSQL, Spark, or Grail, the Dynatrace data lakehouse technology, interpret this information. Here are the six steps of a typical ITOA process: Define the data infrastructure strategy. NoSQL database. Why use a data lakehouse for causal AI? Apache Spark.
Having a distributed and scalable graph database system is highly sought after in many enterprise scenarios. Do Not Be Misled: Designing and implementing a scalable graph database system has never been a trivial task.
The strongest Kubernetes growth areas are security, databases, and CI/CD technologies. Of the organizations in the Kubernetes survey, 71% run databases and caches in Kubernetes, representing a +48% year-over-year increase. Java, Go, and Node.js
As cloud and big data complexity scales beyond what traditional monitoring tools can handle, next-generation cloud monitoring and observability are becoming necessities for IT teams. Database monitoring. This ensures the database queries are performant, while also identifying host problems. Website monitoring.
In addition to providing visibility for core Azure services like virtual machines, load balancers, databases, and application services, we’re happy to announce support for the following 10 new Azure services, with many more to come soon: Virtual Machines (classic ones). Effortlessly optimize Azure database performance.
We adopted the following mission statement to guide our investments: “Provide a complete and accurate data lineage system enabling decision-makers to win moments of truth.” Nonetheless, Netflix's data landscape (see below) is complex, and many teams collaborate effectively to share responsibility for managing our data systems.
Heading into 2024, SQL databases will remain essential in data management, increasingly using distributed systems to meet growing needs for scalability and reliability. According to 2023 statistics, 49% of web applications use an SQL-based database, with SQL having a 75% adoption rate in the IT industry.
To drive better outcomes using hybrid cloud architectures, it helps to understand their benefits—and how to orchestrate them seamlessly. What is hybrid cloud architecture? Hybrid cloud architecture is a computing environment that shares data and applications on a combination of public clouds and on-premises private clouds.
As teams try to gain insight into this data deluge, they have to balance the need for speed, data fidelity, and scale with capacity constraints and cost. To solve this problem, Dynatrace launched Grail, its causational data lakehouse, in 2022. And without the encumbrances of traditional databases, Grail performs fast.
Logs highlight observability challenges. Ingesting, storing, and processing the unprecedented explosion of data from sources such as software as a service, multicloud environments, containers, and serverless architectures can be overwhelming for today’s organizations. Grail enables 100% precision insights into all stored data.
This happens at an unprecedented scale and introduces many interesting challenges; one of the challenges is how to provide visibility of Studio data across multiple phases and systems to facilitate operational excellence and empower decision making. End-to-End Schema Evolution: Schema is a key component of Data Mesh.
Cloud application security remains challenging because organizations lack end-to-end visibility into cloud architecture. As organizations migrate applications to the cloud, they must balance the agility that microservices architecture brings with the complexity and lack of transparency that can also come with it.
Data scientists and engineers collect this data from our subscribers and videos, and implement data analytics models to discover customer behaviour with the goal of maximizing user joy. The processed data is typically stored as data warehouse tables in AWS S3.
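As a rough illustration of landing processed results as warehouse tables in S3, here is a minimal sketch using pyarrow; the bucket, path, and columns are hypothetical and not Netflix's actual pipeline.

```python
# Minimal sketch: persisting processed analytics data as a warehouse-style
# Parquet table in S3. Bucket, prefix, and schema are illustrative assumptions.
import pyarrow as pa
import pyarrow.parquet as pq
import pyarrow.fs as pafs

table = pa.table({
    "subscriber_id": [101, 102, 103],
    "video_id": [7, 7, 9],
    "watch_seconds": [1200, 340, 2650],
    "ds": ["2024-01-01", "2024-01-01", "2024-01-02"],  # partition column
})

s3 = pafs.S3FileSystem(region="us-east-1")
pq.write_to_dataset(
    table,
    root_path="my-example-bucket/warehouse/watch_facts",  # hypothetical location
    partition_cols=["ds"],
    filesystem=s3,
)
```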
Choosing the right database often comes down to MongoDB vs MySQL. This article will help you understand the core differences in data structure, scalability, and use cases. Whether you need a relational database for complex transactions or a NoSQL database for flexible data storage, we've got you covered.
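A minimal sketch of the data-model difference, assuming local MongoDB and MySQL servers; the orders collection/table and credentials are illustrative only.

```python
# Minimal sketch contrasting the two data models; hosts, credentials,
# and the "orders" collection/table are illustrative assumptions.
from pymongo import MongoClient
import mysql.connector

# MongoDB: schema-flexible document; nested fields are stored as-is.
mongo = MongoClient("mongodb://localhost:27017")
mongo.shop.orders.insert_one({
    "order_id": 42,
    "customer": {"name": "Ada", "tier": "gold"},   # nested document
    "items": [{"sku": "A1", "qty": 2}],            # embedded array
})

# MySQL: fixed schema; the same data is normalized into columns (and, in
# practice, separate tables for items), with transactional guarantees.
db = mysql.connector.connect(host="localhost", user="app", password="secret",
                             database="shop")
cur = db.cursor()
cur.execute("""CREATE TABLE IF NOT EXISTS orders (
                   order_id INT PRIMARY KEY,
                   customer_name VARCHAR(100),
                   customer_tier VARCHAR(20))""")
cur.execute("INSERT INTO orders VALUES (%s, %s, %s)", (42, "Ada", "gold"))
db.commit()
```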
I took a big-data-analysis approach, which started with another problem visualization. For this visualization I used the same backend architecture as for the real-time visualization I presented previously. The raw event and problem data from Dynatrace was stored in InfluxDB for analysis. But that didn’t work for me.
by Jun He , Akash Dwivedi , Natallia Dzenisenka , Snehal Chennuru , Praneeth Yenugutala , Pawan Dixit At Netflix, Data and Machine Learning (ML) pipelines are widely used and have become central for the business, representing diverse use cases that go beyond recommendations, predictions and data transformations.
At its core, a distributed storage system comprises three main components: a controller for managing the system’s operations, an internal datastore where information is held, and databases geared towards ensuring scalability, partitioning capabilities, and high availability for all types of data.
Helios also serves as a reference architecture for how Microsoft envisions its next generation of distributed big-data processing systems being built. These two narratives of reference architecture and ingestion/indexing system are interwoven throughout the paper. Why do we need a new reference architecture?
Key Takeaways Redis offers complex data structures and additional features for versatile data handling, while Memcached excels in simplicity with a fast, multi-threaded architecture for basic caching needs. Memcached shines in scenarios where a simple, fast, and efficient caching solution is required without data persistence.
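A minimal sketch of that contrast, assuming local Redis and Memcached servers; the keys and values are illustrative.

```python
# Minimal sketch; local servers and key names are illustrative assumptions.
import redis
from pymemcache.client.base import Client as MemcacheClient

# Redis: rich data structures, e.g. a sorted set kept in ranked order.
r = redis.Redis(host="localhost", port=6379)
r.zadd("leaderboard", {"alice": 3100, "bob": 2800})
top = r.zrevrange("leaderboard", 0, 9, withscores=True)

# Memcached: simple get/set of opaque values -- fast and multi-threaded,
# but no structure or persistence; ideal for plain look-aside caching.
mc = MemcacheClient(("localhost", 11211))
mc.set("user:42:profile", b'{"name": "Ada"}', expire=300)
profile = mc.get("user:42:profile")
```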
Job Openings in AWS - Senior Leader in Database Services. This week it is an opening for senior leaders with AWS Database Services. AWS Database Services is responsible for setting the database strategy and delivering distributed structured storage services to our AWS customers.
To our shareowners: Random forests, naïve Bayesian estimators, RESTful services, gossip protocols, eventual consistency, data sharding, anti-entropy, Byzantine quorum, erasure coding, vector clocks. Look inside a current textbook on software architecture, and you'll find few patterns that we don't apply at Amazon.
Defining Hybrid Cloud Strategy: The decision-making process about where to situate data and applications is vital to any hybrid cloud solution.
Today’s streaming analytics architectures are not equipped to make sense of this rapidly changing information and react to it as it arrives. Incoming data is saved into data storage (historian database or log store) for query by operational managers who must attempt to find the highest priority issues that require their attention.
Building general purpose architectures has always been hard; there are often so many conflicting requirements that you cannot derive an architecture that will serve all, so we have often ended up focusing on one side of the requirements that allow you to serve that area really well. From CPU to GPU.
We built AutoOptimize to efficiently and transparently optimize the data and metadata storage layout while maximizing their cost and performance benefits. This article will list some of the use cases of AutoOptimize, discuss the design principles that help enhance efficiency, and present the high-level architecture.
The reality is that many traditional BI solutions are built on top of legacy desktop and on-premises architectures that are decades old. QuickSight is a cloud-powered BI service built from the ground up to address the big data challenges around speed, complexity, and cost.
Seer: leveraging big data to navigate the complexity of performance debugging in cloud microservices Gan et al., Seer uses a lightweight RPC-level tracing system to collect request traces and aggregate them in a Cassandra database. We’re not told how Seer figures out that a major architectural change has happened.
A key part of the Cloud Drive architecture is a Metadata Service that allows customers to quickly search and organize their digital collections within Cloud Drive.
Be sure to bring your questions about AWS architecture, cost optimization, services and features, and anything else AWS-related. Topics include Introduction to AWS, Big Data, Compute & Networking, Architecture, Mobile & Gaming, Databases, Operations, Security, and more. AWS Technical Bootcamps.
Startups such as Facebook, Uber, and Pinterest adopted MySQL in its early days; now large, successful companies, they prove that MySQL can run large databases and heavily used sites. It was developed for optimizing data storage and access for big data sets.
There are sessions in many different categories: Architecture, Big Data, HPC, Compute & Networking, Storage, Databases, Security, Tools & Languages, Media Sharing & Content Delivery, Managing AWS Resources, Enterprise IT, Mobile, Start-up, and more.
Additionally, many high-end HPC applications take advantage of knowing their in-house hardware platforms to achieve major speedup by exploiting the specific processor architecture.
While registrars manage the namespace in the DNS naming architecture, DNS servers are used to provide the mapping between names and the addresses used to identify an access point.
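A minimal sketch of that name-to-address mapping from the client side, using only the Python standard library; the domain is an example.

```python
# Minimal sketch: resolving a name to the addresses that identify an
# access point, via the standard library resolver. The domain is an example.
import socket

infos = socket.getaddrinfo("example.com", 443, proto=socket.IPPROTO_TCP)
for family, _type, _proto, _canon, sockaddr in infos:
    label = "IPv6" if family == socket.AF_INET6 else "IPv4"
    print(label, sockaddr[0])
```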
ETL stands for extract, transform, load, and it is generally used for data warehousing and data integration. ETL is a product of the relational database era and has not evolved much in the last decade. There are several emerging data trends that will define the future of ETL in 2018. Unified data management architecture.
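A minimal extract-transform-load sketch; the CSV source and SQLite target are stand-ins for real operational systems and a real warehouse.

```python
# Minimal extract-transform-load sketch; the CSV source and SQLite target
# stand in for real operational systems and a real warehouse.
import csv
import sqlite3

# Extract: read raw rows from the source system.
with open("orders.csv", newline="") as f:
    raw_rows = list(csv.DictReader(f))

# Transform: clean and reshape before loading.
rows = [
    (int(r["order_id"]), r["country"].strip().upper(), float(r["amount"]))
    for r in raw_rows
    if r["amount"]  # drop rows with a missing amount
]

# Load: write the conformed rows into the warehouse table.
db = sqlite3.connect("warehouse.db")
db.execute("CREATE TABLE IF NOT EXISTS orders (order_id INT, country TEXT, amount REAL)")
db.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
db.commit()
```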
Coupled with stateless application servers to execute business logic and a database-like system to provide persistent storage, they form a core component of popular data center service architectures. We’ve seen similar high marshalling overheads in big data systems too.) Fetching too much data in a single query (i.e.,
Visiting future customers is equally exciting, as you get a chance to understand their current architecture, whether it is a migration, and how they plan to exploit cloud services in their new setup.
Understanding Throughput-Oriented Architectures - background article in CACM on massively parallel and throughput vs latency oriented architectures. The Big Idea: Biomimetic Architecture - The National Geographic came in the mail this week with a beautiful pull-out of Gaudí's Sagrada Família, the online version is only a summary.
However, telematics architectures face challenges in responding to telemetry in real time. Current Telematics Architecture. The volume of incoming telemetry challenges current telematics systems to keep up and quickly make sense of all the data. Challenges for Current Architectures.
Cheap storage and on-demand compute in the cloud coupled with the emergence of new big data frameworks and tools are forcing us to rethink the whole ETL and data warehousing architecture. Then we perform frequent batch ETL from application databases to a data warehouse. Classic ETL. Late transformation.
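A minimal sketch of the "late transformation" idea, loading raw records first and reshaping them later inside the store; it assumes a SQLite build with the JSON functions, and all table names are illustrative.

```python
# Sketch of "late transformation" (ELT): raw records are landed first and
# reshaped later with SQL inside the warehouse. Table names are illustrative.
import json
import sqlite3

db = sqlite3.connect("lakehouse.db")
db.execute("CREATE TABLE IF NOT EXISTS raw_events (payload TEXT)")

# Load step: land events untouched; schema decisions are deferred.
events = [{"user": "ada", "amount": 12.5}, {"user": "bob", "amount": 3.0}]
db.executemany("INSERT INTO raw_events VALUES (?)",
               [(json.dumps(e),) for e in events])

# Transform step: run later, in-warehouse, against the raw landing table.
db.execute("""
    CREATE TABLE spend_by_user AS
    SELECT json_extract(payload, '$.user')        AS user,
           SUM(json_extract(payload, '$.amount')) AS total
    FROM raw_events
    GROUP BY json_extract(payload, '$.user')
""")
db.commit()
```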
LinkedIn introduced Couchbase as a centralized caching tier for scaling member profile reads to handle increasing traffic that has outgrown their existing database cluster. The new solution achieved over 99% hit rate, helped reduce tail latencies by more than 60% and costs by 10% annually. By Rafal Gancarz
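A minimal cache-aside sketch of the pattern such a centralized caching tier implements; the in-process dict stands in for the cache cluster and fetch_from_db for the member-profile database.

```python
# Minimal cache-aside sketch of a centralized caching tier for profile reads.
# The dict stands in for the cache cluster; fetch_from_db for the database.
import time

CACHE: dict[str, tuple[float, dict]] = {}
TTL_SECONDS = 300

def fetch_from_db(member_id: str) -> dict:
    # Placeholder for the slow, expensive source-of-truth read.
    return {"id": member_id, "name": "Ada", "headline": "Engineer"}

def get_profile(member_id: str) -> dict:
    entry = CACHE.get(member_id)
    if entry and time.time() - entry[0] < TTL_SECONDS:
        return entry[1]                        # cache hit: no database read
    profile = fetch_from_db(member_id)         # cache miss: fall back to the DB
    CACHE[member_id] = (time.time(), profile)  # populate for subsequent reads
    return profile
```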
Uber leverages real-time analytics on aggregate data to improve the user experience across our products, from fighting fraudulent behavior on Uber Eats to forecasting demand on our platform.
Microsoft engineering is actually sending quite a few folks over the Atlantic to come talk about SQL Server 2017, SQL Server on Linux, GDPR, Performance, Security, Azure Data Lake, Azure SQL Database, Azure SQL Data Warehouse, and Azure CosmosDB. Best practices on Building a Big Data Analytics Solution – Michael Rys.