Big Data and Development - Technology Performance Pulse

What is Greenplum Database? Intro to the Big Data Database

Scalegrid

MAY 13, 2020

Greenplum Database is an open-source, hardware-agnostic MPP database for analytics, based on PostgreSQL and developed by Pivotal, which was later acquired by VMware. This feature-packed database provides powerful and rapid analytics on data that scales up to petabyte volumes. What Exactly is Greenplum? At a glance – TLDR.

Big Data

Big Data Database Artificial Intelligence Open Source

In-Stream Big Data Processing

Highly Scalable

AUGUST 20, 2013

The shortcomings and drawbacks of batch-oriented data processing were widely recognized by the Big Data community quite a long time ago. The article is based on a research project developed at Grid Dynamics Labs. Towards Unified Big Data Processing. Partitioning and Shuffling.

Big Data

Big Data Processing Lambda Database

Write Optimized Spark Code for Big Data Applications

DZone

MARCH 7, 2023

Apache Spark is a powerful open-source distributed computing framework that provides a variety of APIs to support big data processing. PySpark is the Python API for Apache Spark, which allows Python developers to write Spark applications using Python instead of Scala or Java.

Big Data

Big Data Code Tuning Open Source

Sustainability: Thoughts from a software engineer

Dynatrace

MARCH 17, 2025

Until recently, improvements in data center power efficiency compensated almost entirely for the increasing demand for computing resources. The rise of big data, cryptocurrencies, and AI means the IT sector contributes significantly to global greenhouse gas emissions. However, this trend is now reversing.

Software Engineering

Software Engineering Engineering Software Software

Driving down the cost of Big-Data analytics - All Things Distributed

All Things Distributed

AUGUST 18, 2011

Driving down the cost of Big-Data analytics. The Amazon Elastic MapReduce (EMR) team announced today the ability to seamlessly use Amazon EC2 Spot Instances with their service, significantly driving down the cost of data analytics in the cloud. The posting on the AWS developer blog also has some more background.

Big Data

Big Data Analytics AWS Cloud

Moving HPC to the Cloud: A Guide for 2020

High Scalability

SEPTEMBER 14, 2020

This is a guest post by Limor Maayan-Wainstein, a senior technical writer with 10 years of experience writing about cybersecurity, big data, cloud computing, web development, and more. High performance computing (HPC) enables you to solve complex problems which cannot be solved by regular computing.

Cloud

Cloud Big Data Virtualization Efficiency

Understanding gRPC Concepts, Use Cases, and Best Practices

DZone

JANUARY 19, 2023

As we are progressing with application development, among various things, there is one primary thing we are less worried about: computing power. Because with the advent of cloud providers, we are less worried about managing data centers. This leads to an increase in the size of data as well.

Best Practices

Best Practices Transportation Big Data Latency

What is IT operations analytics? Extract more data insights from more sources

Dynatrace

MAY 1, 2023

Then, big data analytics technologies, such as Hadoop, NoSQL, Spark, or Grail, the Dynatrace data lakehouse technology, interpret this information. Here are the six steps of a typical ITOA process: Define the data infrastructure strategy. Identify data use cases and develop a scalable delivery model with documentation.

Analytics

Analytics Artificial Intelligence Big Data Open Source

Kubernetes for Big Data Workloads

Abhishek Tiwari

DECEMBER 27, 2017

Kubernetes has emerged as the go-to container orchestration platform for data engineering teams. In 2018, widespread adoption of Kubernetes for big data processing is anticipated. Organisations are already using Kubernetes for a variety of workloads [1] [2] and data workloads are up next. Key challenges. Performance.

Big Data

Big Data Storage Benchmarking Hardware

What is software automation? Optimize the software lifecycle with intelligent automation

Dynatrace

JUNE 26, 2023

Software automation enables digital supply chain stakeholders — such as digital operations, DevSecOps, ITOps, and CloudOps teams — to orchestrate resources across the software development lifecycle to bring innovative, high-quality products and services to market faster. What is software analytics?

Software

Software Software Analytics Big Data

An overview of end-to-end entity resolution for big data

The Morning Paper

DECEMBER 13, 2020

An overview of end-to-end entity resolution for big data, Christophides et al., It’s an important part of many modern data workflows, and an area I’ve been wrestling with in one of my own projects. A variety of supervised, semi-supervised, and unsupervised matching techniques have also been developed. 2020, Article No.

Big Data

Big Data Open Source Processing Analytics

Path to NoOps part 1: How modern AIOps brings NoOps within reach

Dynatrace

OCTOBER 25, 2022

The need for developers and innovation is now even greater. NoOps is a concept in software development that seeks to automate processes and eliminate the need for an extensive IT operations team. But it might also result in the entire software development process falling apart.

DevOps

DevOps Big Data Cloud Innovation

Data Engineers of Netflix — Interview with Pallavi Phadnis

The Netflix TechBlog

OCTOBER 28, 2021

Netflix’s unique work culture and petabyte-scale data problems are what drew me to Netflix. During earlier years of my career, I primarily worked as a backend software engineer, designing and building the backend systems that enable big data analytics. How’s data engineering similar and different from software engineering?

Data Engineering

Data Engineering Engineering Big Data Software Engineering

Data Engineers of Netflix — Interview with Kevin Wylie

The Netflix TechBlog

JULY 15, 2021

I stumbled into data engineering rather than making an intentional career move into the field. I started my career as an application developer with basic familiarity with SQL. I was later hired into my first purely data gig where I was able to deepen my knowledge of big data. What drew you to Netflix?

Data Engineering

Data Engineering Engineering Entertainment Big Data

Seven benefits of AIOps to transform your business operations

Dynatrace

JULY 5, 2022

AIOps combines big data and machine learning to automate key IT operations processes, including anomaly detection and identification, event correlation, and root-cause analysis. A truly modern AIOps solution also serves the entire software development lifecycle to address the volume, velocity, and complexity of multicloud environments.

Artificial Intelligence

Artificial Intelligence Cloud Innovation Strategy

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and…

The Netflix TechBlog

MARCH 25, 2019

We adopted the following mission statement to guide our investments: “Provide a complete and accurate data lineage system enabling decision-makers to win moments of truth.” Nonetheless, Netflix data landscape (see below) is complex and many teams collaborate effectively for sharing the responsibility of our data system management.

Infrastructure

Infrastructure Big Data Transportation Architecture

Python at Netflix

The Netflix TechBlog

APRIL 29, 2019

Subsequently, many useful libraries get developed, making the language even more desirable to learn and use. We’ve developed a time series correlation system used both inside and outside the team as well as a distributed worker system to parallelize large amounts of analytical work to deliver results quickly.

Open Source

Open Source Network Infrastructure Big Data

What is cloud monitoring? How to improve your full-stack visibility

Dynatrace

JANUARY 11, 2023

As cloud and big data complexity scales beyond the ability of traditional monitoring tools to handle, next-generation cloud monitoring and observability are becoming necessities for IT teams. What is cloud monitoring?

Cloud

Cloud Monitoring Best Practices Infrastructure

What is IT automation?

Dynatrace

JULY 6, 2022

Developing automation takes time. This kind of automation can support key IT operations, such as infrastructure, digital processes, business processes, and big-data automation. Big data automation tools. Automating routine IT tasks eliminates the human element—and the potential mistakes that come with it.

Artificial Intelligence

Artificial Intelligence Tuning Strategy Big Data

No need to compromise visibility in public clouds with the new Azure services supported by Dynatrace

Dynatrace

JULY 6, 2020

As adoption rates for Microsoft Azure continue to skyrocket, Dynatrace is developing a deeper integration with the platform to provide even more value to organizations that run their businesses on Azure or use it as a part of their multi-cloud strategy. See the health of your big data resources at a glance. Azure Front Door.

Azure

Azure Cloud Big Data Virtualization

Big / Bug Data: Analyzing the Apache Flink Source Code

DZone

DECEMBER 21, 2020

Applications used in the field of Big Data process huge amounts of information, and this often happens in real time. Naturally, such applications must be highly reliable so that no error in the code can interfere with data processing. It is an open-source framework for distributed processing of large amounts of data.

Code

Code Java Big Data Open Source

Top 15 Software Testing Trends to Watch Out in 2021

DZone

DECEMBER 28, 2020

The introduction of innovative technologies has brought the newest updates in software testing, development, design, and delivery. Nowadays, Big Data tests mainly include data testing, paving the way for the Internet of Things to become the center point. Besides, AI and ML seem to reach a new level.

Software

Software Software Testing Big Data

Business Insights extends support for optimizing Core Web Vitals

Dynatrace

APRIL 21, 2021

On the Dynatrace Business Insights team, we have developed analytical views and an approach to help you get started. To do this effectively, you need a big data processing approach. Not all pages are equally important, and development resources are top priority. How do you know where to focus first with failing pages?

Traffic

Traffic Mobile Metrics Analytics

DynatraceGo! APAC 2021: Lessons in thick data and keeping pace with the market

Dynatrace

AUGUST 10, 2021

I had the privilege of setting everyone up for the day alongside my co-host Carrie Mott, Head of Marketing and Business Development for APAC. BPAY is in the midst of its digital transformation journey in which it is discovering the critical importance of developing “contemporary ways of designing, operating, and using” its software.

DevOps

DevOps Innovation Big Data Cloud

Experiences with approximating queries in Microsoft’s production big-data clusters

The Morning Paper

SEPTEMBER 8, 2019

Experiences with approximating queries in Microsoft’s production big-data clusters Kandula et al., Microsoft’s big data clusters have 10s of thousands of machines, and are used by thousands of users to run some pretty complex queries. The accuracy was considered adequate by the developer. VLDB’19.

Big Data

Big Data Analytics Latency Azure

Introduction to Grafana, Prometheus, and Zabbix

DZone

FEBRUARY 6, 2024

If the data sources are not available then customized plugins can be developed to integrate these data sources. Grafana is used widely these days to monitor and visualize the metrics for 100s or 1000s of servers, Kubernetes Platforms, Virtual Machines, Big Data Platforms, etc.

Big Data

Big Data Open Source Virtualization Metrics

A Recap of the Data Engineering Open Forum at Netflix

The Netflix TechBlog

JUNE 20, 2024

On April 18th, 2024, we hosted the inaugural Data Engineering Open Forum at our Los Gatos office, bringing together data engineers from various industries to share, learn, and connect. At the conference, our speakers share their unique perspectives on modern developments, immediate challenges, and future prospects of data engineering.

Data Engineering

Data Engineering Engineering Entertainment Software Engineering

Formulating ‘Out of Memory Kill’ Prediction on the Netflix App as a Machine Learning Problem

The Netflix TechBlog

JULY 21, 2022

We at Netflix, as a streaming service running on millions of devices, have a tremendous amount of data about device capabilities/characteristics and runtime data in our big data platform. With large data, comes the opportunity to leverage the data for predictive and classification based analysis.

Big Data

Big Data Cache Engineering Data Engineering

What is container orchestration?

Dynatrace

MARCH 24, 2023

By embracing public cloud and hybrid cloud computing environments, IT teams can further accelerate development and automate software deployment and management. Container technology enables organizations to efficiently develop cloud-native applications or to modernize legacy applications to take advantage of cloud services.

Infrastructure

Infrastructure Open Source Operating System Cloud

What is ITOps? Why IT operations is more crucial than ever in a multicloud world

Dynatrace

DECEMBER 15, 2022

We’ll discuss how the responsibilities of ITOps teams changed with the rise of cloud technologies and agile development methodologies. Adding application security to development and operations workflows increases efficiency. So, what is ITOps? What is ITOps? CloudOps teams are one step further in the digital supply chain.

Artificial Intelligence

Artificial Intelligence DevOps Hardware Virtualization

Kubernetes in the wild report 2023

Dynatrace

JANUARY 16, 2023

The immense growth of Kubernetes presents new security challenges in runtime and increased complexity in hardening CI/CD pipelines in development. Big data : To store, search, and analyze large datasets, 32% of organizations use Elasticsearch. Andreas Berger, Dynatrace Senior Principal Application Security.

Open Source

Open Source Java Operating System Programming

Data Movement in Netflix Studio via Data Mesh

The Netflix TechBlog

JULY 26, 2021

Processors with Different Inputs/Outputs Data Mesh allows developers to contribute processors to the platform. Processors are not necessarily centrally developed and managed. However, the Data Mesh platform team strives to provide and manage the most highly leveraged processors (e.g. Iceberg, ElasticSearch, etc).

Big Data

Big Data Government Processing Analytics

Bulldozer: Batch Data Moving from Data Warehouse to Online Key-Value Stores

The Netflix TechBlog

OCTOBER 27, 2020

Data scientists and engineers collect this data from our subscribers and videos, and implement data analytics models to discover customer behaviour with the goal of maximizing user joy. The processed data is typically stored as data warehouse tables in AWS S3.

Latency

Latency Storage Big Data Tuning

A guide to Autonomous Performance Optimization

Dynatrace

SEPTEMBER 15, 2020

In my recent Performance Clinic with Stefano Doni, CTO & Co-Founder of Akamas, I made the statement, “Application development and release cycles today are measured in days, instead of months. Supported technologies include cloud services, big data, databases, OS, containers, and application runtimes like the JVM.

Performance

Performance Java Metrics Cloud

End-to-end observability provides deep insights into user behavior for British Columbia Lottery Corporation

Dynatrace

APRIL 19, 2023

“We also have some big data analytics use cases that help extend the Dynatrace platform, see how performance is affecting behavior, and identify long-term trends,” Rushlo said. BCLC has started onboarding developers into Dynatrace and monitoring additional services.

Entertainment

Entertainment Analytics Healthcare Games

RSA Guide 2023: Cloud application security remains core challenge for organizations

Dynatrace

APRIL 11, 2023

The focus on bringing various organizational teams together—such as development, business, and security teams—makes sense as observability data, security data, and business event data coalesce in these cloud-native environments. As organizations develop new applications, vulnerabilities will continue to emerge.

Cloud

Cloud DevOps Open Source Retail

What is behavior analytics?

Dynatrace

AUGUST 14, 2023

As organizations become more familiar with baseline trends in user behavior, teams can develop hypotheses for why users are taking certain actions, and later verify with relevant analytics. When rolling out new features, user behavior analysts can use these capabilities to track customer adoption and make sure goals are proceeding on track.

Analytics

Analytics Social Media Website IoT

Expanding the AWS Cloud: Introducing the AWS Europe (London) Region

All Things Distributed

DECEMBER 13, 2016

With the launch of the AWS Europe (London) Region, AWS can enable many more UK enterprise, public sector and startup customers to reduce IT costs, address data locality needs, and embark on rapid transformations in critical new areas, such as big data analysis and Internet of Things. Fraud.net is a good example of this.

AWS

AWS Cloud Artificial Intelligence IoT

Applying real-world AIOps use cases to your operations

Dynatrace

OCTOBER 17, 2022

Artificial intelligence for IT operations, or AIOps, combines big data and machine learning to provide actionable insight for IT teams to shape and automate their operational strategy. This makes developing, operating, and securing modern applications and the environments they run on practically impossible without AI.

DevOps

DevOps Artificial Intelligence Healthcare Innovation

Exploratory analytics and collaborative analytics capabilities democratize insights across teams

Dynatrace

APRIL 25, 2023

Exploratory analytics with collaborative analytics capabilities can be a lifeline for CloudOps, ITOps, site reliability engineering, and other teams struggling to access, analyze, and conquer the never-ending deluge of big data. These analytics can help teams understand the stories hidden within the data and share valuable insights.

Analytics

Analytics Big Data Media Operating System

The next generation of developer productivity

O'Reilly

AUGUST 15, 2023

To follow up on our previous survey about low-code and no-code tools, we decided to run another short survey about tools specifically for software developers—including, but not limited to, GitHub Copilot and ChatGPT. We’re interested in how “developer enablement” tools of all sorts are changing the workplace.

Development

Development Programming Speed Open Source

AIOps observability adoption ascends in healthcare

Dynatrace

MARCH 14, 2022

With containers, microservices, applications, and other components that are constantly broken down and rebuilt as part of the software development lifecycle (SDLC), IT teams struggle to identify true issues and deliver software that enables doctors and nurses to deliver care. Today’s enterprise environments are dynamic and complex.

Healthcare

Healthcare Artificial Intelligence Innovation Strategy

Evolving from Rule-based Classifier: Machine Learning Powered Auto Remediation in Netflix Data…

The Netflix TechBlog

MARCH 4, 2024

Operational automation–including but not limited to, auto diagnosis, auto remediation, auto configuration, auto tuning, auto scaling, auto debugging, and auto testing–is key to the success of modern data platforms.

Tuning

Tuning Efficiency Big Data Engineering

How Our Paths Brought Us to Data and Netflix

The Netflix TechBlog

SEPTEMBER 18, 2020

I bring my breadth of big data tools and technologies while Julie has been building statistical models for the past decade. My work is typically developed in R or Python. [Chris] Julie and I joined the Streaming DSE team at Netflix a few years ago and have been close colleagues and friends since then. Do they cause less errors?

Analytics

Analytics Education Innovation Engineering
