Data Wrangling
Single Line to Multiline
Description :
The Single Line to Multiline operation converts single-line JSON to multiline JSON.
The resulting JSON string can be used to store or transmit data in a structured format.
Number of Parameters : 1
Parameter : Chopkey
Chopkey adds a new key on the fly, with either a static or a dynamic value.
Below is an example where Chopkey is used to add a new key.
['bizdata_dataset_response']['data']
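As a rough illustration of the conversion described above, the following Python sketch walks a key path into a payload and emits one JSON object per line. It assumes that "multiline JSON" means one object per line and that the example value is a path to the array of records; this page confirms neither, and the function name and sample data are hypothetical.

```python
import json


def single_line_to_multiline(payload, key_path):
    """Follow key_path into payload and return one JSON object per line."""
    node = payload
    for key in key_path:
        node = node[key]  # descend one level per key in the path
    return "\n".join(json.dumps(record) for record in node)


# Hypothetical single-line payload holding an array of records.
payload = {"bizdata_dataset_response": {"data": [{"id": 1}, {"id": 2}]}}
multiline = single_line_to_multiline(payload, ["bizdata_dataset_response", "data"])
print(multiline)
```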
Delimiter to JSON
Description :
The Delimiter to JSON operation converts delimited data to JSON data.
The resulting JSON string can be used to store or transmit data in a structured format.
Number of Parameters : 6
Parameter : Key Data
Key Data holds the delimited data that we want to convert into JSON data.
Below is an example where we are using DataTable as key.
['DataTable']
Parameter : Delimiter
Delimiter is used to separate values in a list, record or file.
Below is an example where we are using , as delimiter.
,
Parameter : Fields
Fields specifies the header names. It is optional: the user can specify the fields, or leave it empty, in which case predefined fields are generated.
Below is an example where we are using Customer ID, Organization Name, Month and Item as the header names.
"Customer ID","Organization Name","Month","Item"
Parameter : Autodetect Column Names
Autodetect Column Names is false if the user defines the fields; otherwise it is true.
Below is an example where it is set to true, which applies to predefined fields.
true
Parameter : Skip Header
Skip Header is true if the user defines the fields; otherwise it is false.
Below is an example where it is set to false, which applies to predefined fields.
false
Parameter : Response Key
Response Key is the key in which the converted JSON data will be stored.
Below is an example where the JSON data will be stored in datatable.
datatable
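The parameters above can be sketched in Python with the standard csv module. This is only an approximation under stated assumptions: the function name and sample values are hypothetical, autodetection is assumed to take column names from the first row, and Skip Header is assumed to drop that first row when the user supplies Fields.

```python
import csv
import io


def delimiter_to_json(text, delimiter=",", fields=None, skip_header=False):
    """Convert delimited text into a list of dicts (one per data row)."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if fields is None:
        # Assumed autodetect behaviour: first row carries the column names.
        header, body = rows[0], rows[1:]
    else:
        header = list(fields)
        # Assumed Skip Header behaviour: drop the first row of the data.
        body = rows[1:] if skip_header else rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical payload: Key Data is ['DataTable'], Response Key is datatable.
payload = {
    "DataTable": "Customer ID,Organization Name,Month,Item\n"
                 "101,Acme,Jan,Widget\n"
                 "102,Globex,Feb,Gadget"
}
payload["datatable"] = delimiter_to_json(payload["DataTable"], ",")
print(payload["datatable"])
```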
JSON to Delimiter
Description :
The JSON to Delimiter operation converts JSON data to delimited data.
The resulting delimited data can be used to store or transmit data in a structured format.
Number of Parameters : 6
Parameter : Key Data
Key Data holds the JSON data that needs to be converted into delimited data.
Below is an example where you provide the key name ['items'] which holds your JSON data.
['items']
Parameter : Delimiter
Delimiter is used to separate values in a list, record, or file.
Below is an example where we are using \t as the delimiter.
\t
Parameter : Fields
Fields specifies the header names. It is optional: the user can specify the fields, or leave it empty, in which case predefined fields are generated.
Below is an example where we are using Item1, Item2 and Item3 as the header names.
"Item1","Item2","Item3"
Parameter : Autodetect Column Names
Autodetect Column Names is false if the user defines the fields; otherwise it is true.
Below is an example where it is set to true, which applies to predefined fields.
true
Parameter : Skip Header
Skip Header is true if the user defines the fields; otherwise it is false.
Below is an example where it is set to false, which applies to predefined fields.
false
Parameter : Response Key
Response Key is the key in which the converted delimited data will be stored.
Below is an example where the delimited data will be stored in delimiter_data.
delimiter_data
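The reverse conversion can be sketched the same way. The Python example below is a minimal illustration, not the product's implementation: the function name and sample records are hypothetical, and it assumes that when Fields is empty the column names come from the keys of the first record.

```python
import csv
import io


def json_to_delimiter(records, delimiter="\t", fields=None):
    """Serialise a list of dicts into delimited text with a header row."""
    # Assumption: without explicit fields, use the first record's keys.
    header = list(fields) if fields else list(records[0].keys())
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(header)
    for record in records:
        writer.writerow([record.get(name, "") for name in header])
    return buffer.getvalue()


# Hypothetical payload: Key Data is ['items'], Response Key is delimiter_data.
payload = {
    "items": [
        {"Item1": "a", "Item2": "b", "Item3": "c"},
        {"Item1": "d", "Item2": "e", "Item3": "f"},
    ]
}
payload["delimiter_data"] = json_to_delimiter(payload["items"], "\t")
print(payload["delimiter_data"])
```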
Data Aggregation
Description:
The Data Aggregation operation processes raw data by grouping and summarizing it, which makes the data easier to understand and analyze.
Number of Parameters : 4
Parameter : Agg Data Key
Agg Data Key is left empty for multiline data and holds the data key for single-line data.
Use this parameter when you have single-line JSON data; otherwise leave it blank.
Below is an example where we are using ['bizdata_dataset_response']['items'] as the Agg Data Key for single-line data.
['bizdata_dataset_response']['items']
Parameter : Groupby Key
Groupby Key is the name of the key that the user wants to group by. This is typically a unique identifier in the dataset.
Below is an example where we are using Orders as the Groupby Key. The dataset contains Orders and Order Line Items, and one Order can have multiple items.
"Orders"
Parameter : Array Key
Array Key is the key under which the grouped records are held. The user can provide any key name.
Below is an example where Order Lines is used as the Array Key.
"Order Lines"
Parameter : Array Key Nested Columns
Array Key Nested Columns lists, as comma-separated key names, the keys that should be available inside the Array Key.
Below is an example where we are using id, name and year as the column names.
"id","name","year"
Unpivot
Description:
The Unpivot operation converts a single object into a list of objects, based on the transposed values and the transposed key names given as parameters.
Number of Parameters : 2
Parameter : Transposed Key Name
Transposed Key Name specifies the names of the keys that need to be transposed.
Below is an example where we are using bucket_type and bucket_value as the key names to be transposed.
"bucket_type","bucket_value"
Parameter : Transposed Value
Transposed Value specifies the values that need to be transposed.
Below is an example where the transposed values are on_hand, purchase_orders and goods_in_transit.
"on_hand", "purchase_orders","goods_in_transit"
Pivot
Description:
The Pivot operation combines multiple dictionaries (objects) into a single dictionary (object), based on the transposed value and the transposed key name provided by the user.
Number of Parameters : 3
Parameter : Get Key
Get Key is left empty for multiline data and holds the get key for single-line data.
Below is an example where we are using items as the Get Key because the data is single-line.
items
Parameter : Transposed Key Name
Transposed Key Name specifies the name of the key that needs to be transposed.
Below is an example where we are using Item Id and Item Name as the key names.
"Item Id","Item Name"
Parameter : Transposed Value
Transposed Value specifies which values need to be transposed.
Below is an example where specific item IDs, OrderID-1 and OrderID-2, are transposed.
"OrderID-1", "OrderID-2"
Example Scenario :
Consider the following input data representing a dataset with various attributes.
{"bizdata_dataset":{
"id":123,
"name":"sample",
"lastname":"dataset",
"attributes":[
{
"attributename":"item",
"attributevalue":"27",
"attribute_code":12234
},
{
"attributename":"item2",
"attributevalue":"47",
"attribute_code":12334
},
{
"attributename":"item1",
"attributevalue":"37",
"attribute_code":13234
}]}}
Pivot Operation Parameters:
Parameter : Get Key
attributes
Parameter : Transposed Key Name
"attributename"
Parameter : Transposed Value
"attributevalue"
Sample Output :
After applying the Pivot Operation, the input data will be transformed as follows:
{"bizdata_dataset":{
"id": 123,
"name": "sample",
"lastname": "dataset",
"attributes":[
{
"attributename": "item",
"attributevalue": "27",
"attribute_code": 12234
},
{
"attributename": "item2",
"attributevalue": "47",
"attribute_code": 12334
},
{
"attributename": "item1",
"attributevalue": "37",
"attribute_code": 13234
}],
"item": "27",
"item2": "47",
"item1": "37"}}
Explanation of Output :
- The original array of attributes remains unchanged.
- The attributename values ("item", "item2", "item1") have been transposed to the root level of the bizdata_dataset object as keys.
- The corresponding attributevalue values ("27", "47", "37") are now associated with the newly created keys ("item", "item2", "item1").
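The example scenario above can be reproduced with a few lines of Python. This is a sketch that mirrors the documented input and output rather than the operation's actual implementation; the function name pivot is hypothetical, and it assumes the Get Key is resolved relative to the bizdata_dataset object.

```python
import copy


def pivot(container, get_key, key_name, value_name):
    """Promote each entry of container[get_key] to a key/value on container."""
    result = copy.deepcopy(container)  # leave the input untouched
    for entry in container[get_key]:
        result[entry[key_name]] = entry[value_name]
    return result


source = {
    "bizdata_dataset": {
        "id": 123,
        "name": "sample",
        "lastname": "dataset",
        "attributes": [
            {"attributename": "item", "attributevalue": "27", "attribute_code": 12234},
            {"attributename": "item2", "attributevalue": "47", "attribute_code": 12334},
            {"attributename": "item1", "attributevalue": "37", "attribute_code": 13234},
        ],
    }
}
pivoted = {
    "bizdata_dataset": pivot(
        source["bizdata_dataset"], "attributes", "attributename", "attributevalue"
    )
}
print(pivoted)
```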
Single Line to Tuple
Description:
The Single Line to Tuple operation converts single-line data to tuples.
Number of Parameters : 3
Parameter : Singleline Key
Singleline Key is the key from which the single-line dataset is read.
Below is an example where the key DataTable holds the single-line data.
['DataTable']
Parameter : Table Headers
Table Headers specifies the order of the values in the converted tuple data.
Below is an example where we are using the key names Item, Customer and Month. The values of these keys will appear in the same sequence in the tuple data.
"Item","Customer","Month"
Parameter : Tuple Key
Tuple Key is the key that holds the tuple data.
Below is an example where we are using data as the Tuple Key.
data
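A minimal Python sketch of this conversion follows. It assumes the single-line data is a list of objects and that each object becomes one tuple ordered by Table Headers; the function name and sample record are hypothetical.

```python
def single_line_to_tuple(records, headers):
    """Convert each record into a tuple ordered by the given headers."""
    return [tuple(record.get(header) for header in headers) for record in records]


# Hypothetical payload: Singleline Key is ['DataTable'], Tuple Key is data.
payload = {
    "DataTable": [
        {"Customer": "Acme", "Item": "Widget", "Month": "Jan"},
        {"Customer": "Globex", "Item": "Gadget", "Month": "Feb"},
    ]
}
payload["data"] = single_line_to_tuple(payload["DataTable"], ["Item", "Customer", "Month"])
print(payload["data"])
```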
Tuple to Single line
Description:
The Tuple to Single line operation converts tuple data into single-line data.
Number of Parameters : 3
Parameter : Tuple Key
Tuple Key is the key from which the user's tuple data is read.
Below is an example where the key Data holds the tuple data.
['Data']
Parameter : Headers
Headers are the headers, or key names, of the new JSON.
Below is an example where we are using the key names Item, Customer and Month. The values of these keys will appear in the same sequence in the single-line data.
"Item","Customer","Month"
Parameter : Singleline Key
Singleline Key is the key in which the converted single-line data is stored.
Below is an example where we are using datatable to store the single-line data.
datatable
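The inverse of the previous operation can be sketched in Python by pairing each tuple with the Headers. As before, this is an illustration under assumptions: the function name and sample tuples are hypothetical, and the single-line result is assumed to be a list of objects.

```python
def tuple_to_single_line(rows, headers):
    """Pair each tuple's values with the headers to rebuild objects."""
    return [dict(zip(headers, row)) for row in rows]


# Hypothetical payload: Tuple Key is ['Data'], Singleline Key is datatable.
payload = {"Data": [("Widget", "Acme", "Jan"), ("Gadget", "Globex", "Feb")]}
payload["datatable"] = tuple_to_single_line(payload["Data"], ["Item", "Customer", "Month"])
print(payload["datatable"])
```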
Delimiter to Array
Number of Parameters : 2
Parameter : Dl Key
Dl Key is the key whose delimited value is converted into an array.
Below is an example of a Dl Key.
"email"
Parameter : Delimiter
Delimiter is the character used to separate the delimited values. It can be any delimiter, such as ",", "\t" or "|".
Below is an example of a Delimiter.
","
Example :
Input = {"data":"a,b,c,d"}
Output = ["a","b","c","d"]
In the above example, the Delimiter is "," and the Dl Key is "data".
How it works : This operation converts delimiter-separated values into an array.
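The example above amounts to a string split, as this short Python sketch shows. The function name is hypothetical, and how the operation treats whitespace or empty values is not covered here.

```python
def delimiter_to_array(record, dl_key, delimiter=","):
    """Split the delimited string stored under dl_key into a list."""
    return record[dl_key].split(delimiter)


record = {"data": "a,b,c,d"}
values = delimiter_to_array(record, "data", ",")
print(values)  # ['a', 'b', 'c', 'd']
```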
Zipfile in Base64
Number of Parameters : 5
Parameter: Source Key
This key contains all the records.
In the example below, "items" serves as the Source Key.
"items"
Parameter: File Name Key
This key holds the file name.
In the given example, "FILE_NAME" will serve as the key for the File Name Key.
"FILE_NAME"
Parameter: File Extension Key
This key contains the file extension.
In the example below, "EXTENSION" will act as the key for the file extension.
"EXTENSION"
Parameter: File Data Key
This key contains the file's data.
In the example below, "FILE_DATA" is designated as the key for the file's data.
"FILE_DATA"
Parameter: Base64 Response Key
This key holds the final base64-encoded string of the zip file.
In the example below, "File_string" is designated as the key for the Base64 Response key.
"File_string"
Example:
Input = {"data": {"items": [{"FILE_NAME": "file_01", "EXTENSION": ".csv", "FILE_DATA": "bnIsdGVzdGluZyxvcHMNCjEsZmlsZTEsemlwb3BzDQoyLGZpbGUxLHppcG9wcw0KMyxmaWxlMSx6aXBvcHM="},
{"FILE_NAME": "file_02", "EXTENSION": ".tsv", "FILE_DATA": "bnIJdGVzdGluZwlvcHMNCjEJZmlsZTIJemlwb3BzDQoyCWZpbGUyCXppcG9wcw0KMwlmaWxlMgl6aXBvcHM="},
{"FILE_NAME": "file_03", "EXTENSION": ".psv", "FILE_DATA": "bnJ8dGVzdGluZ3xvcHMNCjF8ZmlsZTJ8emlwb3BzDQoyfGZpbGUyfHppcG9wcw0KM3xmaWxlMnx6aXBvcHM="}]}}
Output = {"File_string": "Encoded zip file string"}
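The flow in this example can be approximated in Python with the standard zipfile and base64 modules. The sketch assumes that FILE_DATA holds base64-encoded file content (the sample values above decode cleanly as base64) and that each archive entry is named FILE_NAME followed by EXTENSION; the function name and the short sample records are hypothetical.

```python
import base64
import io
import zipfile


def zip_to_base64(records, name_key, ext_key, data_key):
    """Pack each record into an in-memory zip and return it base64-encoded."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for record in records:
            entry_name = record[name_key] + record[ext_key]
            # Assumption: the file data arrives base64-encoded.
            archive.writestr(entry_name, base64.b64decode(record[data_key]))
    return base64.b64encode(buffer.getvalue()).decode("ascii")


# Hypothetical records, shorter than the documented sample.
payload = {
    "items": [
        {"FILE_NAME": "file_01", "EXTENSION": ".csv",
         "FILE_DATA": base64.b64encode(b"nr,testing\n1,file1").decode("ascii")},
        {"FILE_NAME": "file_02", "EXTENSION": ".tsv",
         "FILE_DATA": base64.b64encode(b"nr\ttesting\n1\tfile2").decode("ascii")},
    ]
}
payload["File_string"] = zip_to_base64(
    payload["items"], "FILE_NAME", "EXTENSION", "FILE_DATA"
)
print(payload["File_string"][:40])
```

Decoding File_string with base64 and opening the bytes with zipfile.ZipFile recovers the original files, which is a quick way to check the result.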