What is Greenplum Database? Intro to the Big Data Database

ScaleGrid

When handling large amounts of complex data, or big data, chances are that your main machine will start getting crushed by all of the data it has to process to produce your analytics results. Greenplum addresses this with a massively parallel architecture and a cost-based query optimizer built for large-scale big data workloads.
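
To make the optimizer angle concrete, here is a minimal sketch of inspecting Greenplum's cost-based plans from Python. It assumes a reachable Greenplum cluster and a hypothetical sales table; the connection details are placeholders. The "optimizer" session setting toggles GPORCA, Greenplum's cost-based optimizer.

```python
# Minimal sketch: viewing a cost-based (GPORCA) query plan in Greenplum.
# Assumes a reachable cluster and a hypothetical "sales" table; the
# connection details below are placeholders.
import psycopg2

conn = psycopg2.connect(host="gp-master.example.com", port=5432,
                        dbname="analytics", user="gpadmin", password="secret")
conn.autocommit = True

with conn.cursor() as cur:
    # GPORCA, Greenplum's cost-based optimizer, is toggled per session
    # via the "optimizer" setting.
    cur.execute("SET optimizer = on")
    cur.execute("EXPLAIN SELECT region, sum(amount) FROM sales GROUP BY region")
    for (line,) in cur.fetchall():
        print(line)

conn.close()
```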

3 Performance Tricks for Dealing With Big Data Sets

DZone

This article describes three tricks I used when dealing with big data sets (on the order of 10 million records) that proved to enhance performance dramatically. The first trick: use a CLOB instead of a result set.
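
The first trick lends itself to a short example. Below is a minimal sketch of the general idea, not the article's exact code: have the database aggregate many rows into a single CLOB and fetch it in one round trip, instead of pulling millions of rows through a result set. It assumes Oracle with the python-oracledb driver and a hypothetical events table; connection details are placeholders.

```python
# Minimal sketch of "CLOB instead of result set": the database collapses
# millions of rows into one CLOB that the client fetches in a single round
# trip. Assumes Oracle and a hypothetical "events" table.
import json
import oracledb

conn = oracledb.connect(user="app", password="secret",
                        dsn="db.example.com/orclpdb1")

with conn.cursor() as cur:
    # JSON_ARRAYAGG returns the whole result as one JSON document in a
    # CLOB, so the client pays one fetch instead of millions of row trips.
    cur.execute("""
        SELECT JSON_ARRAYAGG(
                   JSON_OBJECT('id' VALUE id, 'payload' VALUE payload)
                   RETURNING CLOB)
        FROM events
    """)
    clob = cur.fetchone()[0]
    rows = json.loads(clob.read())

print(len(rows), "records decoded from one CLOB")
conn.close()
```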

In-Stream Big Data Processing

Highly Scalable

The shortcomings and drawbacks of batch-oriented data processing were widely recognized by the Big Data community quite a long time ago. While simple flows run through a single pipeline, other flows are more sophisticated: one Storm topology can pass data to another topology via Kafka or Cassandra. The article then looks toward unified big data processing with systems such as Apache Spark [10].
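
The Kafka hand-off pattern is easy to sketch. Below, a plain Python stage stands in for a Storm topology: it consumes one topic, transforms each event, and produces to the next stage's input topic. The broker address and topic names are assumptions, using the kafka-python library.

```python
# Minimal sketch of chaining stream-processing stages through Kafka, the
# pattern the article describes for Storm topologies. A plain Python loop
# stands in for a topology; broker and topic names are placeholders.
from kafka import KafkaConsumer, KafkaProducer

consumer = KafkaConsumer("stage-one-output",
                         bootstrap_servers="kafka.example.com:9092",
                         value_deserializer=lambda b: b.decode("utf-8"))
producer = KafkaProducer(bootstrap_servers="kafka.example.com:9092",
                         value_serializer=lambda s: s.encode("utf-8"))

for record in consumer:
    # Transform in this stage, then hand the event to the next stage's
    # input topic -- Kafka decouples the two pipelines.
    enriched = record.value.upper()
    producer.send("stage-two-input", value=enriched)
```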

Introduction to Azure Data Lake Storage Gen2

DZone

Built on Azure Blob Storage, Azure Data Lake Storage Gen2 is a suite of features for big data analytics. Data Lake Storage Gen2 combines the capabilities of Azure Data Lake Storage Gen1 and Azure Blob Storage; for instance, it offers scale, file-level security, and file system semantics.
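
A minimal sketch of those Gen2 traits using the azure-storage-file-datalake SDK: file system semantics (create, append, flush on a path) and file-level security (a POSIX-style ACL on a single file). The account name, key, container, and path below are placeholders.

```python
# Minimal sketch of two Gen2 capabilities the excerpt names: file system
# semantics and file-level security. Account, key, container, and path
# are placeholders.
from azure.storage.filedatalake import DataLakeServiceClient

service = DataLakeServiceClient(
    account_url="https://mystorageacct.dfs.core.windows.net",
    credential="account-key-placeholder")

fs = service.get_file_system_client("raw")
file_client = fs.get_file_client("logs/2024/app.csv")

# Hierarchical-namespace file semantics: create, append, flush.
data = b"ts,level,msg\n"
file_client.create_file()
file_client.append_data(data, offset=0, length=len(data))
file_client.flush_data(len(data))

# File-level security: a POSIX-style ACL on an individual file.
file_client.set_access_control(acl="user::rw-,group::r--,other::---")
```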

Understanding gRPC Concepts, Use Cases, and Best Practices

DZone

With the advent of cloud providers, we worry less about managing data centers; everything is available on demand within seconds. This also leads to an increase in the size of data, with big data generated and transported through various mediums, often in single requests.
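
For large payloads, gRPC's client-side streaming avoids sending one giant request. A minimal sketch, assuming stubs generated with grpcio-tools from a hypothetical ingest.proto (sketched in the comments); the host and file name are placeholders as well.

```python
# Minimal sketch of moving a large payload over gRPC as a client-side
# stream of chunks instead of one giant request. Assumes stubs generated
# (via grpcio-tools) from a hypothetical ingest.proto like:
#
#   service Ingest { rpc Upload(stream Chunk) returns (Ack); }
#   message Chunk { bytes data = 1; }
#   message Ack   { uint64 bytes_received = 1; }
import grpc
import ingest_pb2, ingest_pb2_grpc  # hypothetical generated modules

CHUNK = 1 << 20  # 1 MiB per message keeps frames small and memory flat

def chunks(path):
    with open(path, "rb") as f:
        while piece := f.read(CHUNK):
            yield ingest_pb2.Chunk(data=piece)

channel = grpc.insecure_channel("ingest.example.com:50051")
stub = ingest_pb2_grpc.IngestStub(channel)
ack = stub.Upload(chunks("big-dataset.bin"))  # streams chunks until EOF
print("server received", ack.bytes_received, "bytes")
```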

What is IT operations analytics? Extract more data insights from more sources

Dynatrace

Big data analytics technologies, such as Hadoop, NoSQL, Spark, or Grail (the Dynatrace data lakehouse technology), then interpret this information. The article covers why ITOA is important, why a data lakehouse suits causal AI, and the six steps of a typical ITOA process, beginning with defining the data infrastructure strategy.
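
As a rough illustration of the "interpret this information" step, here is a minimal PySpark sketch that aggregates operational telemetry, with Spark standing in for the analytics layer. The log location and field names are assumptions.

```python
# Minimal sketch of interpreting operational data with Spark, one of the
# big data technologies the excerpt names. The log path and the service,
# status, and latency_ms fields are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("itoa-sketch").getOrCreate()

logs = spark.read.json("s3a://ops-logs/2024/*.json")

# Per-service error counts and average latency, worst offenders first.
(logs.groupBy("service")
     .agg(F.avg("latency_ms").alias("avg_latency_ms"),
          F.sum(F.when(F.col("status") >= 500, 1).otherwise(0))
           .alias("server_errors"))
     .orderBy(F.desc("server_errors"))
     .show())
```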

Kubernetes for Big Data Workloads

Abhishek Tiwari

Kubernetes has emerged as the go-to container orchestration platform for data engineering teams. In 2018, widespread adoption of Kubernetes for big data processing is anticipated. Organisations are already using Kubernetes for a variety of workloads [1] [2], and data workloads are up next. The post examines key challenges, including performance.
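
A minimal sketch of the idea with the official Kubernetes Python client: submitting a batch data job and letting the orchestrator handle placement and retries. The image, namespace, and resource figures are assumptions.

```python
# Minimal sketch of running a batch data job on Kubernetes with the
# official Python client. The image, command, namespace, and resource
# numbers are placeholders; the orchestrator handles scheduling and retries.
from kubernetes import client, config

config.load_kube_config()  # uses your local kubeconfig

job = client.V1Job(
    metadata=client.V1ObjectMeta(name="etl-batch"),
    spec=client.V1JobSpec(
        backoff_limit=3,  # retry failed pods up to 3 times
        template=client.V1PodTemplateSpec(
            spec=client.V1PodSpec(
                restart_policy="Never",
                containers=[client.V1Container(
                    name="etl",
                    image="registry.example.com/etl-job:latest",
                    args=["--date", "2024-01-01"],
                    resources=client.V1ResourceRequirements(
                        requests={"cpu": "2", "memory": "4Gi"}))]))))

client.BatchV1Api().create_namespaced_job(namespace="data", body=job)
```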