Transformation pipeline

The Dataddo transformation pipeline is a framework for data transformation modeled on the concept of data processing pipelines. API responses from third-party services enter a multi-stage pipeline that transforms each response into a dataset that the Dataddo platform can ingest.

Stages

The Dataddo transformation pipeline consists of multiple stages. Each stage transforms the API response as it passes through the pipeline. Pipeline stages can appear multiple times in the pipeline.
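
The chaining described here can be pictured with a minimal Python sketch. It is not Dataddo's implementation: `run_pipeline` and the three stand-in stage functions are hypothetical names, and each stage is assumed to be a function that takes a list of objects and returns a new list.

```python
from typing import Any, Callable, Dict, Iterable, List

Obj = Dict[str, Any]
Stage = Callable[[List[Obj]], List[Obj]]  # assumed shape of a stage

def run_pipeline(objects: List[Obj], stages: Iterable[Stage]) -> List[Obj]:
    """Feed the output of each stage into the next one, in order."""
    for stage in stages:
        objects = stage(objects)
    return objects

# Hypothetical stand-ins for real stages.
def drop_inactive(objects: List[Obj]) -> List[Obj]:
    return [o for o in objects if o.get("active")]

def add_total(objects: List[Obj]) -> List[Obj]:
    return [{**o, "total": o["price"] * o["qty"]} for o in objects]

def keep_large(objects: List[Obj]) -> List[Obj]:
    return [o for o in objects if o["total"] >= 100]

response = [
    {"price": 10, "qty": 20, "active": True},
    {"price": 5, "qty": 2, "active": True},
    {"price": 50, "qty": 4, "active": False},
]

# A filtering stage appears twice: before and after the reshaping stage.
result = run_pipeline(response, [drop_inactive, add_total, keep_large])
print(result)
```

The second filter can only run after the reshaping stage, because it depends on the `total` field that stage adds. This is why the same kind of stage may need to appear more than once.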

Stage     Description
$project  Reshapes the API response in the stream, for example by adding new fields or removing existing fields.
$unwind   Deconstructs an array field from the API response into one object per element. In each output object, the array is replaced by one of its element values. For each input object, outputs N objects, where N is the number of array elements; N is zero for an empty array.
$match    Filters the stream so that only matching objects pass, unmodified, into the next pipeline stage. For each input object, outputs either one object (a match) or no object (no match).
$group    Groups input objects by a specified identifier expression and applies the accumulator expression(s), if specified, to each group. Consumes all input objects and outputs one object per distinct group. The output objects contain only the identifier field and, if specified, the accumulated fields.
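
The input and output behavior of the four stages can be sketched in plain Python. This is an illustration under stated assumptions, not Dataddo's syntax or code: the function names and signatures are hypothetical, `match` takes a Python predicate instead of an expression, and `group` uses a sum as its only accumulator.

```python
from collections import defaultdict

def project(objects, keep):
    """Reshape each object; here, by keeping only the listed fields."""
    return [{k: obj[k] for k in keep if k in obj} for obj in objects]

def unwind(objects, field):
    """One output object per array element; an empty array yields none."""
    return [{**obj, field: element} for obj in objects for element in obj[field]]

def match(objects, predicate):
    """Pass matching objects through unmodified and drop the rest."""
    return [obj for obj in objects if predicate(obj)]

def group(objects, key, sum_field):
    """One output object per distinct key, holding only the key and the sum."""
    totals = defaultdict(int)
    for obj in objects:
        totals[obj[key]] += obj[sum_field]
    return [{key: k, sum_field: total} for k, total in totals.items()]

orders = [
    {"customer": "a", "items": [3, 4], "note": "x"},
    {"customer": "b", "items": [], "note": "y"},
    {"customer": "a", "items": [5], "note": "z"},
]

projected = project(orders, ["customer", "items"])   # drops "note"
unwound = unwind(projected, "items")                 # 2 + 0 + 1 objects
matched = match(unwound, lambda o: o["items"] > 3)   # keeps 4 and 5
grouped = group(matched, "customer", "items")        # one object per customer
print(grouped)
```

Customer "b" disappears at the unwind step because its array is empty, which is the N = 0 case from the table.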

Expressions

Some pipeline stages take a pipeline expression as the operand. Pipeline expressions specify the transformation to apply to the input objects. Expressions have a standard JSON object structure and can contain other expressions.

Pipeline expressions operate only on the current object in the pipeline and cannot refer to data from other objects. Expression operations transform objects in memory.
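
A nested expression can be pictured as a small tree that is evaluated against one object at a time. The sketch below is an assumption-laden illustration, not Dataddo's evaluator: `evaluate` and `OPERATORS` are hypothetical names, and the `"$field"` notation for reading a field from the current object is an assumed convention that the text above does not specify.

```python
# Hypothetical operator table covering three of the arithmetic operators.
OPERATORS = {
    "$add": lambda *numbers: sum(numbers),
    "$subtract": lambda first, second: first - second,
    "$abs": abs,
}

def evaluate(expr, obj):
    """Recursively evaluate a JSON-style expression against a single object."""
    if isinstance(expr, dict):
        # Assumption: an expression object holds exactly one operator.
        (name, operands), = expr.items()
        if not isinstance(operands, list):
            operands = [operands]
        return OPERATORS[name](*(evaluate(o, obj) for o in operands))
    if isinstance(expr, str) and expr.startswith("$"):
        # Assumed field-reference syntax: "$price" reads obj["price"].
        return obj[expr[1:]]
    return expr  # anything else is treated as a literal value

expression = {"$subtract": [{"$add": ["$price", "$tax"]}, {"$abs": "$discount"}]}
record = {"price": 100, "tax": 20, "discount": -15}

print(evaluate(expression, record))  # (100 + 20) - abs(-15)
```

`evaluate` receives only the current object, so an expression has no way to reach data held in other objects.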

Expression operators

Arithmetic

Arithmetic expressions perform mathematical operations on numbers.

Operator   Description
$abs       Returns the absolute value of a number.
$add       Adds numbers to return the sum.
$subtract  Returns the result of subtracting the second value from the first.
$ceil      Returns the smallest integer greater than or equal to the specified number.
$floor     Returns the largest integer less than or equal to the specified number.
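
For readers who want to check the rounding and sign behavior, the descriptions above correspond to the following Python equivalents. The `ARITHMETIC` mapping is a hypothetical illustration and says nothing about how Dataddo implements these operators.

```python
import math

# Hypothetical mapping from operator name to an equivalent Python callable.
ARITHMETIC = {
    "$abs": abs,
    "$add": lambda *numbers: sum(numbers),
    "$subtract": lambda first, second: first - second,
    "$ceil": math.ceil,
    "$floor": math.floor,
}

print(ARITHMETIC["$abs"](-3))
print(ARITHMETIC["$add"](1, 2, 3))
print(ARITHMETIC["$subtract"](10, 4))
print(ARITHMETIC["$ceil"](-2.1), ARITHMETIC["$floor"](-2.1))
```

For negative input, the ceiling rounds toward zero and the floor rounds away from it: the ceiling of -2.1 is -2 and the floor is -3.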

Boolean

Boolean expressions evaluate their argument expressions as booleans and return a boolean as a result.

Operator  Description
$and      Returns true only when all its expressions evaluate to true. Accepts any number of argument expressions.
$not      Returns the boolean value that is the opposite of its argument expression. Accepts a single argument expression.
$or       Returns true when any of its expressions evaluates to true. Accepts any number of argument expressions.
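
The three operators behave like Python's `all`, `not`, and `any`. The sketch below uses a hypothetical `BOOLEAN` mapping and assumes Python's truthiness rules when converting arguments to booleans; the rules Dataddo applies to non-boolean values are not stated above and may differ.

```python
# Hypothetical mapping; arguments are assumed to be already-evaluated values.
BOOLEAN = {
    "$and": lambda *values: all(bool(v) for v in values),
    "$or": lambda *values: any(bool(v) for v in values),
    "$not": lambda value: not bool(value),
}

print(BOOLEAN["$and"](True, True, False))
print(BOOLEAN["$or"](False, False, True))
print(BOOLEAN["$not"](False))
```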

Comparison

Operator  Description
$cmp      Returns 0 if the two values are equivalent, 1 if the first value is greater than the second, and -1 if the first value is less than the second.
$eq       Returns true if the values are equivalent.
$gt       Returns true if the first value is greater than the second.
$gte      Returns true if the first value is greater than or equal to the second.
$lt       Returns true if the first value is less than the second.
$lte      Returns true if the first value is less than or equal to the second.
$ne       Returns true if the values are not equivalent.
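
The three-way result of $cmp and the boolean results of the other operators can be sketched with Python's `operator` module. `three_way_compare` and `COMPARISON` are hypothetical names, and the sketch only covers values that Python can already order, such as two numbers or two strings. How Dataddo compares values of different types is not covered here.

```python
import operator

def three_way_compare(first, second):
    """Return 1, 0, or -1, as described for $cmp."""
    return (first > second) - (first < second)

# Hypothetical mapping from operator name to an equivalent Python callable.
COMPARISON = {
    "$cmp": three_way_compare,
    "$eq": operator.eq,
    "$gt": operator.gt,
    "$gte": operator.ge,
    "$lt": operator.lt,
    "$lte": operator.le,
    "$ne": operator.ne,
}

print(COMPARISON["$cmp"](2, 1), COMPARISON["$cmp"](1, 1), COMPARISON["$cmp"](1, 2))
print(COMPARISON["$gte"](3, 3), COMPARISON["$ne"](3, 3))
```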