Data Engineering and Database - Technology Performance Pulse

Financial Data Engineering in SAS

DZone

JANUARY 8, 2024

Financial data engineering in SAS involves the management, processing, and analysis of financial data using the various tools and techniques provided by the SAS software suite. Here are some key aspects of financial data engineering in SAS: ...

Data Engineering

Data Engineering Engineering Database Software

Data Engineers of Netflix - Interview with Kevin Wylie

The Netflix TechBlog

JULY 15, 2021

Data Engineers of Netflix - Interview with Kevin Wylie. This post is part of our “Data Engineers of Netflix” series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix. Kevin, what drew you to data engineering?

Data Engineering

Data Engineering Engineering Entertainment Big Data

Leveraging Infrastructure as Code for Data Engineering Projects: A Comprehensive Guide

DZone

JULY 3, 2023

Data engineering projects often require the setup and management of complex infrastructures that support data processing, storage, and analysis. In this article, we will explore the benefits of leveraging IaC for data engineering projects and provide detailed implementation steps to get started.

Data Engineering

Data Engineering Infrastructure Code Engineering

Ready-to-go sample data pipelines with Dataflow

The Netflix TechBlog

DECEMBER 3, 2022

Optionally, this step can use the Write-Audit-Publish pattern to ensure that data is correct before it is made available to the rest of the company. See example below:

    - template:
        id: wap
        type: wap
        tables:
          - ${CATALOG}/${DATABASE}/${TABLE}
        write_jobs:
          - job:
              id: write
              type: Spark
              spark:
                script: $S3{./src/sparksql_write.sql}
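The Write-Audit-Publish idea mentioned in this teaser can be sketched in a few lines of Python. This is only an illustrative sketch, not the Dataflow template or Netflix's implementation: the function `write_audit_publish`, the in-memory `published` list standing in for a table, and the two audit checks are all hypothetical names chosen for the example.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Audit = Callable[[List[Row]], bool]


def write_audit_publish(rows: List[Row], audits: List[Audit], published: List[Row]) -> bool:
    """Stage rows, run every audit on the staged copy, and publish only if all pass."""
    # Write: stage the new data where downstream consumers cannot see it.
    staging = list(rows)
    # Audit: every check must pass on the staged data.
    if not all(audit(staging) for audit in audits):
        return False  # the published table is left untouched
    # Publish: expose the audited rows to consumers.
    published.extend(staging)
    return True


def non_empty(rows: List[Row]) -> bool:
    """Hypothetical audit: the batch must contain at least one row."""
    return len(rows) > 0


def no_null_ids(rows: List[Row]) -> bool:
    """Hypothetical audit: every row must carry a non-null id."""
    return all(row.get("id") is not None for row in rows)


if __name__ == "__main__":
    table: List[Row] = []
    print(write_audit_publish([{"id": 1}], [non_empty, no_null_ids], table))
    print(write_audit_publish([{"id": None}], [non_empty, no_null_ids], table))
    print(table)
```

The point of the pattern is that a failed audit never changes what consumers read; in a real warehouse the staging area and the publish step would be table-level operations rather than list copies.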

Best Practices

Best Practices Code Testing Data Engineering

Presentation: LIquid: A Large-Scale Relational Graph Database

InfoQ

JUNE 7, 2024

Scott Meyer discusses LIquid, the graph database built to host LinkedIn, serving a ~15Tb graph at ~2M QPS. By Scott Meyer

Database

Database Data Engineering Scalability Engineering

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and…

The Netflix TechBlog

MARCH 25, 2019

We adopted the following mission statement to guide our investments: “Provide a complete and accurate data lineage system enabling decision-makers to win moments of truth.” Netflix’s diverse data landscape made it challenging to capture all the right data and conform it to a common data model.

Infrastructure

Infrastructure Big Data Transportation Architecture

Presentation: Why a Hedge Fund Built Its Own Database

InfoQ

AUGUST 20, 2024

James Munro discusses ArcticDB, the practicalities of building a performant time-series datastore, and why transactions, particularly the Isolation in ACID, are just not worth it. By James Munro

Database

Database Performance Data Engineering Scalability

5 key areas for tech leaders to watch in 2020

O'Reilly

FEBRUARY 18, 2020

The results for data-related topics are both predictable and—there’s no other way to put it—confusing. Starting with data engineering, the backbone of all data work (the category includes titles covering data management, i.e., relational databases, Spark, Hadoop, SQL, NoSQL, etc.).

Software Architecture

Software Architecture DevOps Data Engineering Architecture

Optimizing data warehouse storage

The Netflix TechBlog

DECEMBER 21, 2020

Some of the optimizations are prerequisites for a high-performance data warehouse. Sometimes Data Engineers write downstream ETLs on ingested data to optimize the data/metadata layouts to make other ETL processes cheaper and faster. Both automatic (event-driven) as well as manual (ad-hoc) optimization.

Storage

Storage Latency Efficiency Data Engineering

How Data Inspires Building a Scalable, Resilient and Secure Cloud Infrastructure At Netflix

The Netflix TechBlog

MARCH 5, 2019

While our engineering teams have built and continue to build solutions to lighten this cognitive load (better guardrails, improved tooling, …), data and its derived products are critical elements to understanding, optimizing and abstracting our infrastructure. Give us a holler if you are interested in a thought exchange.

Infrastructure

Infrastructure Cloud Scalability AWS

SQL Extensions for Time-Series Data in QuestDB

DZone

JANUARY 13, 2023

Using some sample data sets, you will learn how designated timestamps work and how to use extended SQL syntax to write queries on time-series data. Introduction Traditionally, SQL has been used for relational databases and data warehouses.

IoT

IoT Analytics Database Metrics

Microservices Adoption in 2020

O'Reilly

JULY 15, 2020

Technical roles represented in the “Other” category include IT managers, data engineers, DevOps practitioners, data scientists, systems engineers, and systems administrators. Perhaps; we’ll take a look at that next, specifically with respect to containers, centrally managed databases, and monolithic UIs.

Database

Database Architecture Education Systems

Zendesk Moves from DynamoDB to MySQL and S3 to Save over 80% in Costs

InfoQ

DECEMBER 29, 2023

Zendesk reduced its data storage costs by over 80% by migrating from DynamoDB to a tiered storage solution using MySQL and S3. The company considered different storage technologies and decided to combine the relational database and the object store to strike a balance between queryability and scalability while keeping costs down.

Storage

Storage Scalability Database Technology

Sponsored Post: Fauna, Sisu, Educative, PA File Sight, Etleap, PerfOps, Triplebyte, Stream

High Scalability

NOVEMBER 12, 2019

Etleap is analyst-friendly, enterprise-grade ETL-as-a-service, built for Redshift and Snowflake data warehouses and S3/Glue data lakes. Our intuitive software allows data engineers to maintain pipelines without writing code, and lets analysts gain access to data in minutes instead of months.

Education

Education Java Software Engineering Engineering

Sponsored Post: Fauna, Sisu, Educative, PA File Sight, Etleap, PerfOps, Triplebyte, Stream

High Scalability

OCTOBER 29, 2019

Etleap is analyst-friendly, enterprise-grade ETL-as-a-service, built for Redshift and Snowflake data warehouses and S3/Glue data lakes. Our intuitive software allows data engineers to maintain pipelines without writing code, and lets analysts gain access to data in minutes instead of months.

Education

Education Java Software Engineering Engineering

Sponsored Post: Fauna, Sisu, Educative, PA File Sight, Etleap, PerfOps, Triplebyte, Stream

High Scalability

JANUARY 7, 2020

Etleap is analyst-friendly, enterprise-grade ETL-as-a-service, built for Redshift and Snowflake data warehouses and S3/Glue data lakes. Our intuitive software allows data engineers to maintain pipelines without writing code, and lets analysts gain access to data in minutes instead of months.

Education

Education Java Software Engineering Engineering

Sponsored Post: Fauna, Sisu, Educative, PA File Sight, Etleap, PerfOps, Triplebyte, Stream

High Scalability

DECEMBER 12, 2019

Etleap is analyst-friendly, enterprise-grade ETL-as-a-service, built for Redshift and Snowflake data warehouses and S3/Glue data lakes. Our intuitive software allows data engineers to maintain pipelines without writing code, and lets analysts gain access to data in minutes instead of months.

Education

Education Java Software Engineering Engineering

Orchestrating Data/ML Workflows at Scale With Netflix Maestro

The Netflix TechBlog

OCTOBER 18, 2022

It is a general-purpose workflow orchestrator that provides a fully managed workflow-as-a-service (WAAS) to the data platform at Netflix. It serves thousands of users, including data scientists, data engineers, machine learning engineers, software engineers, content producers, and business analysts, for various use cases.

Java

Java Scalability Traffic Architecture

Expanding the Cloud: Introducing Amazon QuickSight

All Things Distributed

OCTOBER 7, 2015

However, the data infrastructure to collect, store and process data is geared toward developers (e.g., ...). In AWS’ quest to enable the best data storage options for engineers, we have built several innovative database solutions like Amazon RDS, Amazon RDS for Aurora, Amazon DynamoDB, and Amazon Redshift.

Cloud

Cloud Big Data AWS Analytics

How LinkedIn Serves Over 4.8 Million Member Profiles per Second

InfoQ

JULY 3, 2023

LinkedIn introduced Couchbase as a centralized caching tier for scaling member profile reads to handle increasing traffic that has outgrown their existing database cluster. The new solution achieved over 99% hit rate, helped reduce tail latencies by more than 60% and costs by 10% annually. By Rafal Gancarz
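The general idea of a caching tier in front of a database, and of the hit rate quoted in this teaser, can be illustrated with a small cache-aside sketch. This is not LinkedIn's design and does not use the Couchbase API: `ProfileCache`, the `load_from_db` callback, and the in-memory dictionary are hypothetical stand-ins chosen for the example.

```python
from typing import Callable, Dict, Optional

Profile = Dict[str, str]


class ProfileCache:
    """Cache-aside reader: serve from the cache, fall back to the database on a miss."""

    def __init__(self, load_from_db: Callable[[str], Optional[Profile]]) -> None:
        self._load_from_db = load_from_db      # hypothetical database lookup
        self._store: Dict[str, Profile] = {}   # in-memory stand-in for the cache tier
        self.hits = 0
        self.misses = 0

    def get(self, member_id: str) -> Optional[Profile]:
        if member_id in self._store:
            self.hits += 1
            return self._store[member_id]
        # Miss: read from the database and populate the cache for later reads.
        self.misses += 1
        profile = self._load_from_db(member_id)
        if profile is not None:
            self._store[member_id] = profile
        return profile

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


if __name__ == "__main__":
    database = {"m1": {"name": "Ada"}}
    cache = ProfileCache(database.get)
    for _ in range(4):
        cache.get("m1")
    print(cache.hit_rate())
```

A production tier would also need expiry and invalidation when profiles change, which this sketch leaves out.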

Cache

Cache Latency Traffic Database

5 data integration trends that will define the future of ETL in 2018

Abhishek Tiwari

DECEMBER 27, 2017

ETL stands for extract, transform, load, and it is generally used for data warehousing and data integration. ETL is a product of the relational database era and it has not evolved much in the last decade. There are several emerging data trends that will define the future of ETL in 2018.

Big Data

Big Data Artificial Intelligence Storage Hardware

ETL Workflow Modeling

Abhishek Tiwari

APRIL 14, 2018

An ETL workflow is responsible for the extraction of data from the source systems, their cleaning, transformation, and loading into the target data warehouse. There are existing formal methods to model the schema of source systems or databases such as entity-relationship diagram (ERD). Closing thoughts.
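The stages named in this teaser (extraction, cleaning, transformation, loading) can be sketched as a chain of small functions. This is a minimal sketch under stated assumptions, not the modeling method the article describes: the row shape, the `amount` and `amount_cents` fields, and the list standing in for the warehouse are all hypothetical.

```python
from typing import Dict, Iterable, List

Row = Dict[str, object]


def extract(source: Iterable[Row]) -> List[Row]:
    """Extract: copy rows out of the source system."""
    return [dict(row) for row in source]


def clean(rows: List[Row]) -> List[Row]:
    """Clean: drop rows whose amount is missing."""
    return [row for row in rows if row.get("amount") is not None]


def transform(rows: List[Row]) -> List[Row]:
    """Transform: convert the decimal-string amount into integer cents."""
    return [
        {"id": row["id"], "amount_cents": round(float(str(row["amount"])) * 100)}
        for row in rows
    ]


def load(rows: List[Row], warehouse: List[Row]) -> int:
    """Load: append rows to the target and report how many were written."""
    warehouse.extend(rows)
    return len(rows)


def run_etl(source: Iterable[Row], warehouse: List[Row]) -> int:
    """Run the four stages in order."""
    return load(transform(clean(extract(source))), warehouse)


if __name__ == "__main__":
    source_rows = [{"id": 1, "amount": "2.50"}, {"id": 2, "amount": None}]
    target: List[Row] = []
    print(run_etl(source_rows, target), target)
```

Keeping each stage as its own function makes the workflow easy to model as a sequence of steps with explicit inputs and outputs.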

Best Practices

Best Practices Database Design Efficiency

A case for ELT

Abhishek Tiwari

DECEMBER 22, 2017

In the classic extract, transform, and load (ETL) model, we store entities in their corresponding application databases, i.e., as rows in relational tables. Then we perform frequent batch ETL from application databases to a data warehouse.

Big Data

Big Data Retail Storage Google

Presentation: Azure Cosmos DB: Low Latency and High Availability at Planet Scale

InfoQ

JULY 14, 2023

Mei-Chin Tsai and Vinod Sridharan discuss the internal architecture of Azure Cosmos DB and how it achieves high availability, low latency, and scalability. By Mei-Chin Tsai, Vinod Sridharan

Latency

Latency Azure Availability Scalability

Spice up your Analytics: Amazon QuickSight Now Generally Available in N. Virginia, Oregon, and Ireland.

All Things Distributed

NOVEMBER 15, 2016

They require teams of data engineers to spend months building complex data models and synthesizing the data before they can generate their first report. Let’s walk through some of the core experiences of QuickSight that make it so easy to set up, connect to your data sources, and build visualizations in minutes.

Analytics

Analytics Availability Media Social Media
