In this blog post, we explain what Greenplum is and break down the Greenplum architecture, its advantages, major use cases, and how to get started. Greenplum Database is an open-source, hardware-agnostic MPP database for analytics, based on PostgreSQL and developed by Pivotal, which was later acquired by VMware.
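Because Greenplum speaks the PostgreSQL wire protocol, ordinary Postgres clients work against it. Here is a minimal sketch, assuming the psycopg2 driver and a reachable Greenplum coordinator; the host, credentials, and table name are illustrative placeholders:

```python
# Minimal sketch: connecting to Greenplum with a standard PostgreSQL
# driver (pip install psycopg2-binary). Host, credentials, and table
# name are hypothetical.
import psycopg2

conn = psycopg2.connect(host="gp-coordinator", dbname="analytics",
                        user="gpadmin", password="secret")
with conn, conn.cursor() as cur:
    # DISTRIBUTED BY is Greenplum's extension for choosing how rows
    # are hashed across segment hosts in the MPP cluster.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS page_views (
            user_id BIGINT,
            url     TEXT,
            ts      TIMESTAMP
        ) DISTRIBUTED BY (user_id);
    """)
conn.close()
```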
The shortcomings and drawbacks of batch-oriented data processing were widely recognized by the big data community quite a long time ago. Incremental computation over sliding windows is a group of techniques widely used in digital signal processing, in both software and hardware.
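To make the idea concrete, here is a small sketch of an incremental sliding-window computation: rather than re-summing the whole window for every new sample, the running sum is updated in O(1) by adding the incoming value and subtracting the one that falls out of the window. The window size and input stream are illustrative.

```python
from collections import deque

def sliding_mean(stream, window=4):
    """Yield the mean of the last `window` values, updated incrementally."""
    buf, total = deque(), 0.0
    for x in stream:
        buf.append(x)
        total += x                  # add the incoming sample
        if len(buf) > window:
            total -= buf.popleft()  # retire the sample leaving the window
        yield total / len(buf)

# Example: a moving average over a small stream of readings.
print(list(sliding_mean([1, 2, 3, 4, 5, 6], window=3)))
```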
To drive better outcomes using hybrid cloud architectures, it helps to understand their benefits, and how to orchestrate them seamlessly. What is hybrid cloud architecture? Hybrid cloud architecture is a computing environment that shares data and applications across a combination of public clouds and on-premises private clouds.
Additionally, ITOA gathers and processes information from applications, services, networks, operating systems, and cloud infrastructure hardware logs in real time. Then, big data analytics technologies, such as Hadoop, NoSQL, Spark, or Grail (the Dynatrace data lakehouse technology), interpret this information.
This blog will explore these two systems and how they perform auto-diagnosis and remediation across our Big Data Platform and Real-time infrastructure. This has led to a dramatic reduction in the time it takes to detect issues in hardware or bugs in recently rolled out data platform software.
Besides the traditional system hardware, storage, routers, and software, ITOps also includes virtual components of the network and cloud infrastructure. Although modern cloud systems simplify tasks, such as deploying apps and provisioning new hardware and servers, hybrid cloud and multicloud environments are often complex.
On-premises data centers invest in higher-capacity servers since they provide more flexibility in the long run, while the procurement price of hardware is only one of many cost factors. Specifically, they provide asynchronous communication within microservices architectures and high-throughput distributed systems.
Kubernetes has emerged as the go-to container orchestration platform for data engineering teams. In 2018, widespread adoption of Kubernetes for big data processing is anticipated. Organisations are already using Kubernetes for a variety of workloads [1] [2], and data workloads are up next. Key challenges: performance.
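As a taste of what running a data workload on Kubernetes looks like, here is a minimal sketch that submits a one-off batch Job through the official Python client; the image, command, and namespace are hypothetical placeholders:

```python
# Minimal sketch: submitting a batch data job to Kubernetes via the
# official Python client (pip install kubernetes). The image, command,
# and namespace below are hypothetical.
from kubernetes import client, config

config.load_kube_config()  # uses your local ~/.kube/config

container = client.V1Container(
    name="wordcount",
    image="example.com/wordcount:latest",  # hypothetical image
    command=["/opt/run-job.sh"],
)
job = client.V1Job(
    api_version="batch/v1",
    kind="Job",
    metadata=client.V1ObjectMeta(name="wordcount"),
    spec=client.V1JobSpec(
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(containers=[container],
                                  restart_policy="Never")
        )
    ),
)
client.BatchV1Api().create_namespaced_job(namespace="default", body=job)
```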
Computer architecture is an important and exciting field of computer science, which enables many other fields (e.g., big data processing, machine learning, quantum computing, and so on). For those of us who pursued computer architecture as a career, this is well understood. Why is that? Should we be alarmed as a community?
Key takeaways: Distributed storage systems benefit organizations by enhancing data availability, fault tolerance, and system scalability, leading to cost savings from reduced hardware, energy, and personnel needs. This strategy reduces the volume of data needed during retrieval operations.
This blog post gives a glimpse of the computer systems research papers presented at the USENIX Annual Technical Conference (ATC) 2019, with an emphasis on systems that use new hardware architectures. As a consequence, the vast majority of past papers focused on conventional x86 or GPU-accelerated architectures.
Building general-purpose architectures has always been hard; there are often so many conflicting requirements that no single architecture can serve them all, so we have often ended up focusing on one side of the requirements and serving that area really well. From CPU to GPU: general-purpose GPU programming.
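As a small illustration of general-purpose GPU programming, here is a sketch that moves a NumPy-style array computation onto the GPU, assuming the CuPy library and a CUDA-capable device are available:

```python
# Minimal sketch of general-purpose GPU programming with CuPy
# (pip install cupy); assumes a CUDA-capable GPU is present.
import numpy as np
import cupy as cp

x_cpu = np.random.rand(1_000_000).astype(np.float32)

x_gpu = cp.asarray(x_cpu)            # copy host -> device
y_gpu = cp.sqrt(x_gpu) * 2.0 + 1.0   # element-wise kernels run on the GPU
y_cpu = cp.asnumpy(y_gpu)            # copy device -> host

print(y_cpu[:5])
```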
Seer: leveraging big data to navigate the complexity of performance debugging in cloud microservices, Gan et al. When a QoS violation is predicted to occur and a culprit microservice is located, Seer uses a lower-level tracing infrastructure with hardware monitoring primitives to identify the reason behind the QoS violation.
The reality is that many traditional BI solutions are built on top of legacy desktop and on-premises architectures that are decades old. The cost and complexity to implement, scale, and use BI makes it difficult for most companies to make data analysis ubiquitous across their organizations. Enter Amazon QuickSight.
Additionally, many high-end HPC applications take advantage of knowing their in-house hardware platforms to achieve major speedups by exploiting the specific processor architecture. Given the specialized nature of these platforms, they require dedicated resources to maintain and operate, placing a significant burden on the IT organization.
Cluster management, a common software infrastructure among technology companies, aggregates compute resources from a collection of physical hosts into a shared resource pool, amplifying compute power and allowing for the flexible use of data center hardware.
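The core idea can be sketched in a few lines: a cluster manager tracks per-host capacity and places each task on any host with enough free resources, so the fleet behaves like one large pool. The hosts, capacities, and first-fit policy below are illustrative, not any real scheduler:

```python
# Toy sketch of a cluster manager's shared resource pool: hosts are
# aggregated and tasks are placed first-fit. Hosts, capacities, and
# the placement policy are hypothetical.
hosts = {"host-a": {"cpu": 16, "mem_gb": 64},
         "host-b": {"cpu": 8,  "mem_gb": 32}}

def place(task_cpu, task_mem_gb):
    """Return the first host with enough free CPU and memory, or None."""
    for name, free in hosts.items():
        if free["cpu"] >= task_cpu and free["mem_gb"] >= task_mem_gb:
            free["cpu"] -= task_cpu
            free["mem_gb"] -= task_mem_gb
            return name
    return None

print(place(4, 16))   # -> host-a
print(place(14, 40))  # -> None: no single host has enough headroom left
```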
A common theme across all these trends is to remove complexity by simplifying data management as a whole. In 2018, we anticipate that ETL will either lose relevance or the ETL process will disintegrate and be consumed by new data architectures. Unified data management architecture.
Shell leverages AWS for big data analytics to help achieve these goals. It makes use of the Eagle Genomics platform running on AWS, with the result that Unilever's digital data program now processes genetic sequences twenty times faster, without incurring higher compute costs.
ProxySQL: a feature-rich, open-source MySQL proxy solution that allows query routing for the most common MySQL architectures (PXC/Galera, Replication, Group Replication, etc.). It was developed for optimizing data storage and access for big data sets. It is available under a paid subscription.
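To illustrate the query-routing idea, here is a hedged sketch that adds a rule through ProxySQL's MySQL-protocol admin interface, sending read-only SELECTs to a reader hostgroup; the credentials are ProxySQL's well-known defaults and the hostgroup numbers are illustrative:

```python
# Sketch: adding a ProxySQL query rule through its admin interface
# (pip install pymysql). Credentials are ProxySQL's defaults and the
# hostgroup numbers are illustrative; adjust for a real deployment.
import pymysql

admin = pymysql.connect(host="127.0.0.1", port=6032,
                        user="admin", password="admin", autocommit=True)
with admin.cursor() as cur:
    # Route read-only SELECTs to hostgroup 20 (replicas).
    cur.execute("""
        INSERT INTO mysql_query_rules
            (rule_id, active, match_digest, destination_hostgroup, apply)
        VALUES (10, 1, '^SELECT', 20, 1)
    """)
    cur.execute("LOAD MYSQL QUERY RULES TO RUNTIME")
    cur.execute("SAVE MYSQL QUERY RULES TO DISK")
admin.close()
```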
Introduction Memory systems are evolving into heterogeneous and composable architectures. Heterogeneous and Composable Memory (HCM) offers a feasible solution for terabyte- or petabyte-scale systems, addressing the performance and efficiency demands of emerging big-data applications. Using emulation (e.g.
Marketers use big data and artificial intelligence to find out more about the future needs of their customers. We should ponder how we can organize the 'production' of data in such a way that we ultimately come out with a competitive advantage. These mechanisms need to be lean, seamless, and effective.
Discover data sources to gain insights into your resource efficiency and environmental impact, including the AWS Customer Carbon Footprint Tool and proxy metrics from the AWS Cost & Usage Reports. This lightning talk explores how companies can cut costs and carbon emissions through architectural best practices and workload optimization.