article thumbnail

What is Greenplum Database? Intro to the Big Data Database

Scalegrid

Greenplum Database is an open-source , hardware-agnostic MPP database for analytics, based on PostgreSQL and developed by Pivotal who was later acquired by VMware. This feature-packed database provides powerful and rapid analytics on data that scales up to petabyte volumes. What Exactly is Greenplum? At a glance – TLDR.

Big Data 321
article thumbnail

In-Stream Big Data Processing

Highly Scalable

The shortcomings and drawbacks of batch-oriented data processing were widely recognized by the Big Data community quite a long time ago. The article is based on a research project developed at Grid Dynamics Labs. Towards Unified Big Data Processing. Partitioning and Shuffling.

Big Data 154
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

article thumbnail

Write Optimized Spark Code for Big Data Applications

DZone

Apache Spark is a powerful open-source distributed computing framework that provides a variety of APIs to support big data processing. PySpark is the Python API for Apache Spark , which allows Python developers to write Spark applications using Python instead of Scala or Java.

Big Data 173
article thumbnail

Sustainability: Thoughts from a software engineer

Dynatrace

Until recently, improvements in data center power efficiency compensated almost entirely for the increasing demand for computing resources. The rise of big data, cryptocurrencies, and AI means the IT sector contributes significantly to global greenhouse gas emissions. However, this trend is now reversing.

article thumbnail

Driving down the cost of Big-Data analytics - All Things Distributed

All Things Distributed

Driving down the cost of Big-Data analytics. The Amazon Elastic MapReduce (EMR) team announced today the ability to seamlessly use Amazon EC2 Spot Instances with their service, significantly driving down the cost of data analytics in the cloud. The posting on the AWS developer blog also has some more background.

article thumbnail

Moving HPC to the Cloud: A Guide for 2020

High Scalability

This is a guest post by Limor Maayan-Wainstein , a senior technical writer with 10 years of experience writing about cybersecurity, big data, cloud computing, web development, and more. High performance computing (HPC) enables you to solve complex problems which cannot be solved by regular computing.

Cloud 261
article thumbnail

Understanding gRPC Concepts, Use Cases, and Best Practices

DZone

As we are progressing with application development, among various things, there is one primary thing we are less worried about: computing power. Because with the advent of cloud providers, we are less worried about managing data centers. This leads to an increase in the size of data as well.