Filter Operation Use Cases
In this guide, we will be discussing various scenarios in which Filter operation can be used and how the users can add them to their integration bridge.
To show these scenarios, a dataset of 12 records has been used in multiple ways. The dataset consists of records:
Scenario 1: With Target as Datalake and 1 filter condition
In this scenario, one filter condition will be applied on the data coming from the source and then the filtered record will be sent to datalake.
Operations Used:
- To send the records to datalake, we apply Singleline to Multiline operation as the data coming from source is a singleline JSON data.
- Next, the filter condition is applied as per the requirement (Here, Filter condition applied is ID not equal to 2, which means records associated with ID 2 should not be sent to target).
- Once the filter operation is applied, create a new column using Append, which will help represent the status of the records sent, end the filter condition using Filter Ends, and aggregate the data using Data Aggregation for sending to the Target.
Example: Data in Source
{
"id": 1,
"email": "george.bluth@reqres.in",
"first_name": "George",
"last_name": "Bluth"
},
{
"id": 2,
"email": "janet.weaver@reqres.in",
"first_name": "Janet",
"last_name": "Weaver"
},
{
"id": 3,
"email": "emma.wong@reqres.in",
"first_name": "Emma",
"last_name": "Wong"
}
Filter Condition:
data['id'] !=2
Output in Datalake:
id |
|
first_name |
last_name |
Status |
1 |
george.bluth@reqres.in |
George |
Bluth |
Success |
3 |
emma.wong@reqres.in |
Emma |
Wong |
Success |
Scenario 2: With Target as API and 1 filter condition
In this scenario, one filter condition will be applied to the data coming from the source and then the filtered record will be sent to the target which is API.
Operations Used:
- To send the records to the target (API), we apply Singleline to Multiline operation as the data coming from source is a singleline JSON data.
- Next, the filter condition is applied as per the requirement (Here, Filter condition applied is ID not equal to 3, which means records associated with ID 3 should not be sent to target).
- Once the filter operation is applied, create a new column using Append, which will help represent the status of the records sent, end the filter condition using Filter Ends, and now send the records to the target.
Note: Further operations can be used in the integration bridge depending on the Target and data type.
Example: Data in Source
{
"id": 1,
"email": "george.bluth@reqres.in",
"first_name": "George",
"last_name": "Bluth"
},
{
"id": 2,
"email": "janet.weaver@reqres.in",
"first_name": "Janet",
"last_name": "Weaver"
},
{
"id": 3,
"email": "emma.wong@reqres.in",
"first_name": "Emma",
"last_name": "Wong"
}
Filter Condition:
data['id'] !=3
Output in Target:
{
"id": 1,
"email": "george.bluth@reqres.in",
"first_name": "George",
"last_name": "Bluth"
},
{
"id": 2,
"email": "janet.weaver@reqres.in",
"first_name": "Janet",
"last_name": "Weaver"
}
Scenario 3: With Target as Database and 1 filter condition
In this scenario, one filter condition will be applied to the data coming from the source and then the filtered record will be sent to the Database.
Operations Used:
- To send the records to database, we apply Singleline to Multiline operation on the data coming from source, as it is a singleline JSON data.
- Next, the filter condition is applied as per the requirement (Here, Filter condition applied is ID not equal to 1, which means records associated with ID 1 should not be sent to target).
- Once the filter operation is applied, create a new column using Append, which will help represent the status of the records sent, and end the filter condition using Filter Ends.
- To send the data to the Database, we need to create a Tuple. But before creating the tuple, aggregate all the columns using Data Aggregation and store it in a tuple key.
- Use Singleline to Tuple and send the records to the Database.
Example: Data in Source
{
"id": 1,
"email": "george.bluth@reqres.in",
"first_name": "George",
"last_name": "Bluth"
},
{
"id": 2,
"email": "janet.weaver@reqres.in",
"first_name": "Janet",
"last_name": "Weaver"
},
{
"id": 3,
"email": "emma.wong@reqres.in",
"first_name": "Emma",
"last_name": "Wong"
}
Filter Condition:
data['id'] !=1
Output in Datalake:
id |
|
first_name |
last_name |
Status |
2 |
janet.weaver@reqres.in |
Janet |
Weaver |
Success |
3 |
emma.wong@reqres.in |
Emma |
Wong |
Success |
Scenario 4: With Target as Datalake but more than 1 filter condition
In this scenario, more than one filter condition will be applied to the data coming from the source and then the filtered record will be sent to the Datalake.
Operations Used:
- To send the records to Datalake, we will use the operation Singleline to Multiline as your records are in singleline JSON data.
- Now using Append operation, we will add one dummy key-value pair as this will help to aggregate all records after using Filter operation more than once.
- Here we will Aggregate Data based on a dummy key.
- Once the key is created, apply the filter conditions along with the Append operation to create a new column for showing the status of the conditions applied.
- Now, aggregate all the records using the dummy key created earlier and send the records to the target.
Example: Data in Source
{
"id": 1,
"email": "george.bluth@reqres.in",
"first_name": "George",
"last_name": "Bluth"
},
{
"id": 2,
"email": "janet.weaver@reqres.in",
"first_name": "Janet",
"last_name": "Weaver"
}
Condition 1
· Filter Condition:
data['api_status'] == 200
· Append Operation:
"Status": "Success"
Condition 2
· Filter Condition:
data['api_status'] != 200
· Append Operation:
"Status": "Failure"
Output in Datalake:
id |
|
first_name |
last_name |
Status |
1 |
george.bluth@reqres.in |
George |
Bluth |
Success |
2 |
janet.weaver@reqres.in |
Janet |
Weaver |
Success |
Scenario 5: With Target as Database but more than 1 filter condition
In this scenario, more than one filter condition will be applied to the data coming from the source and then the filtered record will be sent to the Database.
Operations Used:
- To send the records to Datalake, we will use the operation Singleline to Multiline as your records are in singleline JSON data.
- Now using Append operation, we will add one dummy key-value pair as this will help to aggregate all records after using Filter operation more than once.
- Here we will Aggregate Data based on a dummy key.
- Once the key is created, apply the filter conditions along with the Append operation to create a new column for showing the status of the conditions applied.
- Now, aggregate all the records using the dummy key created earlier.
- To send the records to the Database, create a tuple using Singleline to Tuple and send the records to the target.
Example: Data in Source
{
"id": 4,
"email": "eve.holt@reqres.in",
"first_name": "Eve",
"last_name": "Holt"
},
{
"id": 5,
"email": "charles.morris@reqres.in",
"first_name": "Charles",
"last_name": "Morris"
}
Condition 1
· Filter Condition:
data['api_status'] == 200
· Append Operation:
"Status": "Success"
Condition 2
· Filter Condition:
data['api_status'] != 200
· Append Operation:
"Status": "Failure"
Output in Database:
id |
|
first_name |
last_name |
Status |
4 |
eve.holt@reqres.in |
Eve |
Holt |
Success |
5 |
charles.morris@reqres.in |
Charles |
Morris |
Success |
Scenario 6: With Target as API but more than 1 filter condition
In this scenario, more than one filter condition will be applied to the data coming from the source and then the filtered record will be sent to the target (API).
Operations Used:
- To send the records to Datalake, we will use the operation Singleline to Multiline as your records are in singleline JSON data.
- Now using the Append operation, we will add one dummy key-value pair as this will help to aggregate all records after using the Filter operation more than once.
- Here we will Aggregate Data based on a dummy key.
- Once the key is created, apply the filter conditions along with the Append operation to create a new column for showing the status of the conditions applied.
- Now, aggregate all the records using the dummy key created earlier and send the records to the target which is API.
Note: Other required operations can also be used in the integration process based on the target and data type.
Example: Data in Source
{
"id": 7,
"email": "michael.lawson@reqres.in",
"first_name": "Michael",
"last_name": "Lawson"
},
{
"id": 8,
"email": "lindsay.ferguson@reqres.in",
"first_name": "Lindsay",
"last_name": "Ferguson"
}
Condition 1
· Filter Condition:
data['api_status'] == 200
· Append Operation:
"Status": "Success"
Condition 2
· Filter Condition:
data['api_status'] != 200
· Append Operation:
"Status": "Failure"
Output in Target:
id |
|
first_name |
last_name |
Status |
7 |
michael.lawson@reqres.in |
Michael |
Lawson |
Success |
8 |
lindsay.ferguson@reqres.in |
Lindsay |
Ferguson |
Success |
Scenario 7: With more than 1 filter condition and multiple Targets
Case A: Database and Datalake as Target
In this scenario, more than one filter condition will be applied to the data coming from the source and then the filtered record will be sent to multiple targets which are Database and Datalake.
Operations Used:
In this scenario, we have two targets, Database and Datalake. Therefore, we will be sending the records to Database, followed by Datalake.
- We will start by using the operation Singleline to Multiline as your records are in singleline JSON data.
- Now using the Append operation, we will add one dummy key-value pair as this will help to aggregate all records after using the multiple Filter operation.
- Once the key is created, apply the filter conditions along with the Append operation to create a new column for showing the status of the conditions applied.
- Using the same Append Operation, create another dummy key for storing the filtered records of the first condition.
- Use the Filter end to close the first condition and use the Filter operation to apply the next condition.
- Close the second condition using Filter end.
- Using Append, create two keys, as created for first filter condition.
- Again, use the filter condition to send records to the target.
- Since the records for first condition will be sent to Database, we will use the Data aggregation operation and aggregate the records using the key created for first condition.
- Use Singleline to Tuple to create a tuple for records to be sent to database.
- Use Data Aggregation and aggregate the filtered values of second condition using the key created earlier.
- Use Singleline to Multiline for sending records to the Datalake.
Note: In the two keys created using Append operation, first key will represent the filter condition whereas the second key will help in aggregating the records for second filter condition.
Example: Data in Source
{
"id": 1,
"email": "george.bluth@reqres.in",
"first_name": "George",
"last_name": "Bluth"
},
{
"id": 2,
"email": "janet.weaver@reqres.in",
"first_name": "Janet",
"last_name": "Weaver"
},
{
"id": 3,
"email": "emma.wong@reqres.in",
"first_name": "Emma",
"last_name": "Wong"
},
{
"id": 4,
"email": "eve.holt@reqres.in",
"first_name": "Eve",
"last_name": "Holt"
},
{
"id": 5,
"email": "charles.morris@reqres.in",
"first_name": "Charles",
"last_name": "Morris"
},
{
"id": 6,
"email": "tracey.ramos@reqres.in",
"first_name": "Tracey",
"last_name": "Ramos",
"avatar": "https://reqres.in/img/faces/6-image.jpg"
},
{
"id": 7,
"email": "michael.lawson@reqres.in",
"first_name": "Michael",
"last_name": "Lawson"
},
{
"id": 8,
"email": "lindsay.ferguson@reqres.in",
"first_name": "Lindsay",
"last_name": "Ferguson"
}
Condition 1 (For Database)
· Filter Condition:
data['id'] == 4 or data['id'] == 5
· Append Operation:
"Status": "Success",
"filter_1": "yes"
Condition 2 (For Datalake)
· Filter Condition:
data['id'] !=4 and data['id']!=5
· Append Operation:
"Status": "Failure",
"filter_2": "no"
- Filter Operation to Aggregate Data:
data['k1'] =='v1'
Note: The filter operation to aggregate data is being used in 3rd and 4th filter operation in the bridge.
Output in Target (Database):
id |
|
first_name |
last_name |
Status |
4 |
eve.holt@reqres.in |
Eve |
Holt |
Success |
5 |
charles.morris@reqres.in |
Charles |
Morris |
Success |
Output in Target (Datalake):
id |
|
first_name |
last_name |
Status |
1 |
george.bluth@reqres.in |
George |
Bluth |
Failure |
2 |
janet.weaver@reqres.in |
Janet |
Weaver |
Failure |
3 |
emma.wong@reqres.in |
Emma |
Wong |
Failure |
6 |
tracey.ramos@reqres.in |
Tracey |
Ramos |
Failure |
7 |
michael.lawson@reqres.in |
Michael |
Lawson |
Failure |
8 |
lindsay.ferguson@reqres.in |
Lindsay |
Ferguson |
Failure |
Case B: Datalake and Database as Target
In this scenario, more than one filter condition will be applied to the data coming from the source and then the filtered record will be sent to multiple targets which are Datalake and Database.
Operations Used:
In this scenario, we have two targets, Datalake and Database. Therefore, here first we will be sending the records to Datalake, followed by Database.
- We will start by using the operation Singleline to Multiline as your records are in singleline JSON data.
- Now using the Append operation, we will add one dummy key-value pair as this will help to aggregate all records after using the multiple Filter operation.
- Once the key is created, apply the filter conditions along with the Append operation to create a new column for showing the status of the conditions applied.
- Using the same Append Operation, create another dummy key for storing the filtered records of the first condition.
- Use the Filter end to close the first condition and use the Filter operation to apply the next condition.
- Close the second condition using Filter end.
- Using Append, create two keys, as created for first filter condition.
- Again, use the filter condition to send records to the target.
- Since the records for first condition will be sent to Datalake, we will use the Data aggregation operation and aggregate the records using the key created for first condition.
- Use Singleline to Multiline to convert the records into Multiline JSON data and using DL ingestion send the records to datalake.
- End the filter condition for first condition using Filter end.
- Use another filter operation for sending the records to Database.
- Use Data Aggregation and aggregate the filtered values of the second condition using the key created earlier.
- Use Singleline to Tuple for sending records to the Database.
Note: In the two keys created using Append operation, first key will represent the filter condition whereas the second key will help in aggregating the records for second filter condition.
Example: Data in Source
{
"id": 1,
"email": "george.bluth@reqres.in",
"first_name": "George",
"last_name": "Bluth"
},
{
"id": 2,
"email": "janet.weaver@reqres.in",
"first_name": "Janet",
"last_name": "Weaver"
},
{
"id": 3,
"email": "emma.wong@reqres.in",
"first_name": "Emma",
"last_name": "Wong"
},
{
"id": 4,
"email": "eve.holt@reqres.in",
"first_name": "Eve",
"last_name": "Holt"
},
{
"id": 5,
"email": "charles.morris@reqres.in",
"first_name": "Charles",
"last_name": "Morris"
},
{
"id": 6,
"email": "tracey.ramos@reqres.in",
"first_name": "Tracey",
"last_name": "Ramos",
"avatar": "https://reqres.in/img/faces/6-image.jpg"
},
{
"id": 7,
"email": "michael.lawson@reqres.in",
"first_name": "Michael",
"last_name": "Lawson"
},
{
"id": 8,
"email": "lindsay.ferguson@reqres.in",
"first_name": "Lindsay",
"last_name": "Ferguson"
}
Condition 1 (For Datalake)
· Filter Condition:
data['id'] == 3
· Append Operation:
"Status": "Success",
"filter_1": "yes"
Condition 2 (For Database)
· Filter Condition:
data['id'] !=3
· Append Operation:
"Status": "Failure",
"filter_2": "no"
- Filter Operation to Aggregate Data:
data['k1'] =='v1'
Note: The filter operation to aggregate data is being used in 3rd and 4th filter operation in the bridge.
Output in Target (Datalake):
id |
|
first_name |
last_name |
Status |
3 |
emma.wong@reqres.in |
Emma |
Wong |
Success |
Output in Target (Database):
id |
|
first_name |
last_name |
Status |
1 |
george.bluth@reqres.in |
George |
Bluth |
Failure |
2 |
janet.weaver@reqres.in |
Janet |
Weaver |
Failure |
4 |
eve.holt@reqres.in |
Eve |
Holt |
Failure |
5 |
charles.morris@reqres.in |
Charles |
Morris |
Failure |
6 |
tracey.ramos@reqres.in |
Tracey |
Ramos |
Failure |
7 |
michael.lawson@reqres.in |
Michael |
Lawson |
Failure |
8 |
lindsay.ferguson@reqres.in |
Lindsay |
Ferguson |
Failure |
Case C: Database and API as Target
In this scenario, more than one filter condition will be applied to the data coming from the source and then the filtered record will be sent to multiple targets which are Database and API.
Operations Used:
In this scenario, we have two targets, API and Database. Therefore, here first we will be sending the records to API, followed by Database.
- We will start by using the operation Singleline to Multiline as your records are in singleline JSON data.
- Now using the Append operation, we will add one dummy key-value pair as this will help to aggregate all records after using the multiple Filter operation.
- Once the key is created, apply the filter conditions along with the Append operation to create a new column for showing the status of the conditions applied.
- Using the same Append Operation, create another dummy key for storing the filtered records of the first condition.
- Use the Filter end to close the first condition and use the Filter operation to apply the next condition.
- Close the second condition using Filter end.
- Using Append, create two keys, as created for first filter condition.
- Again, use the filter condition to send records to the target. Since the records for first condition will be sent to API as target, we will use the Data aggregation operation and aggregate the records using the key created for first condition.
- Use Singleline to Multiline to convert the records into Multiline JSON data and using API operation send the records to required target.
- End the filter condition for first condition using Filter end.
- Use another filter operation for sending the records to Database.
- Use Data Aggregation and aggregate the filtered values of the second condition using the key created earlier.
- Use Singleline to Tuple for sending records to the Database.
Note: In the two keys created using Append operation, first key will represent the filter condition whereas the second key will help in aggregating the records for second filter condition.
Example: Data in Source
{
"id": 1,
"email": "george.bluth@reqres.in",
"first_name": "George",
"last_name": "Bluth"
},
{
"id": 2,
"email": "janet.weaver@reqres.in",
"first_name": "Janet",
"last_name": "Weaver"
},
{
"id": 3,
"email": "emma.wong@reqres.in",
"first_name": "Emma",
"last_name": "Wong"
},
{
"id": 4,
"email": "eve.holt@reqres.in",
"first_name": "Eve",
"last_name": "Holt"
},
{
"id": 5,
"email": "charles.morris@reqres.in",
"first_name": "Charles",
"last_name": "Morris"
},
{
"id": 6,
"email": "tracey.ramos@reqres.in",
"first_name": "Tracey",
"last_name": "Ramos",
"avatar": "https://reqres.in/img/faces/6-image.jpg"
},
{
"id": 7,
"email": "michael.lawson@reqres.in",
"first_name": "Michael",
"last_name": "Lawson"
},
{
"id": 8,
"email": "lindsay.ferguson@reqres.in",
"first_name": "Lindsay",
"last_name": "Ferguson"
}
Condition 1 (For Database)
· Filter Condition:
data['id'] == 3
· Append Operation:
"Status": "Success",
"filter_1": "yes"
Condition 2 (For API)
· Filter Condition:
data['id'] !=3
· Append Operation:
"Status": "Failure",
"filter_2": "no"
- Filter Operation to Aggregate Data:
data['k1'] =='v1'
Note: The filter operation to aggregate data is being used in 3rd and 4th filter operation in the bridge.
Output in Target (API):
id |
|
first_name |
last_name |
Status |
3 |
emma.wong@reqres.in |
Emma |
Wong |
Success |
Output in Target (Database):
id |
|
first_name |
last_name |
Status |
1 |
george.bluth@reqres.in |
George |
Bluth |
Failure |
2 |
janet.weaver@reqres.in |
Janet |
Weaver |
Failure |
4 |
eve.holt@reqres.in |
Eve |
Holt |
Failure |
5 |
charles.morris@reqres.in |
Charles |
Morris |
Failure |
6 |
tracey.ramos@reqres.in |
Tracey |
Ramos |
Failure |
7 |
michael.lawson@reqres.in |
Michael |
Lawson |
Failure |
8 |
lindsay.ferguson@reqres.in |
Lindsay |
Ferguson |
Failure |
Case D: API and Datalake as Target
In this scenario, more than one filter condition will be applied to the data coming from the source and then the filtered record will be sent to multiple targets which are API and Datalake.
Operations Used:
In this scenario, we have two targets, API and Database. Therefore, here first we will be sending the records to API, followed by Datalake.
- We will start by using the operation Singleline to Multiline as your records are in singleline JSON data.
- Now using the Append operation, we will add one dummy key-value pair as this will help to aggregate all records after using the multiple Filter operation.
- Once the key is created, apply the filter conditions along with the Append operation to create a new column for showing the status of the conditions applied.
- Using the same Append Operation, create another dummy key for storing the filtered records of the first condition.
- Use the Filter end to close the first condition and use the Filter operation to apply the next condition.
- Close the second condition using Filter end.
- Using Append, create two keys, as created for first filter condition.
- Again, use the filter condition to send records to the target.
- Since the records for first condition will be sent to API as target, we will use the Data aggregation operation and aggregate the records using the key created for first condition.
- Use Singleline to Multiline to convert the records into Multiline JSON data and using API operation send the records to required target.
- End the filter condition for first condition using Filter end.
- Use another filter operation for sending the records to Datalake.
- Use Data Aggregation and aggregate the filtered values of the second condition using the key created earlier.
- Use Singleline to Multiline for sending records to the Datalake.
Note: In the two keys created using Append operation, first key will represent the filter condition whereas the second key will help in aggregating the records for second filter condition.
Example: Data in Source
{
"id": 1,
"email": "george.bluth@reqres.in",
"first_name": "George",
"last_name": "Bluth"
},
{
"id": 2,
"email": "janet.weaver@reqres.in",
"first_name": "Janet",
"last_name": "Weaver"
},
{
"id": 3,
"email": "emma.wong@reqres.in",
"first_name": "Emma",
"last_name": "Wong"
},
{
"id": 4,
"email": "eve.holt@reqres.in",
"first_name": "Eve",
"last_name": "Holt"
},
{
"id": 5,
"email": "charles.morris@reqres.in",
"first_name": "Charles",
"last_name": "Morris"
}
Condition 1 (For API)
· Filter Condition:
data['id'] !=5
· Append Operation:
"Status": "Success",
"filter_1": "yes"
Condition 2 (For Datalake)
· Filter Condition:
data['id'] == 5
· Append Operation:
"Status": "Failure",
"filter_2": "no"
- Filter Operation to Aggregate Data:
data['k1'] =='v1'
Note: The filter operation to aggregate data is being used in 3rd and 4th filter operation in the bridge.
Output in Target (API):
id |
|
first_name |
last_name |
Status |
1 |
george.bluth@reqres.in |
George |
Bluth |
Success |
2 |
janet.weaver@reqres.in |
Janet |
Weaver |
Success |
3 |
emma.wong@reqres.in |
Emma |
Wong |
Success |
4 |
eve.holt@reqres.in |
Eve |
Holt |
Success |
Output in Target (Datalake):
id |
|
first_name |
last_name |
Status |
5 |
charles.morris@reqres.in |
Charles |
Morris |
Failure |