Tutorial
Request
The SaaS ETL platform required a tutorial on adding a new source to a module that was already up and running.
Solution
"Add statistics from another source Integration" page.
I extended a demo scenario data set I had developed earlier so that it could be reused in this tutorial too, making it part of a series. Since the platform's UI was fairly technical, the tutorial also had to include JSON code examples, some parameters, and screenshots showing exactly where in the UI they are used.
Platform and tools used
Docs site running on a static site generator; Git, VS Code, Markdown, macOS image editors.
Related: Merge Module Visual Editor guide
Getting ready
Before connecting a new data source to the consolidated report, determine which data you need to collect from the source. For example, the data source provides date, platform, gender, currency and country, whereas you need all of these except gender. You will need that list for the first step of the connection process.
Note:
Keep in mind that the input integration can either be a raw data integration OR another Merge Module.
Depending on that entities and metrics list, determine which of the existing data flows is the most suitable for the one you’re setting up (common entities and metrics, even if named differently, will help).
Example: two data sources have data by installs, three others by in-app events. The integration you’re connecting collects data by in-app events, so it’s better to integrate it into the flow with the three corresponding integrations.
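The choice described above can be sketched in Python as a simple overlap count. This is only an illustration: the flow names and field lists are hypothetical, and it assumes field names have already been normalized (fields that are named differently across sources would have to be mapped first).

```python
def pick_flow(new_fields, flows):
    """Return the name of the flow that shares the most fields with the new source."""
    new = set(new_fields)
    return max(flows, key=lambda name: len(new & set(flows[name])))


# Hypothetical existing flows: one by installs, one by in-app events.
flows = {
    "by_installs": ["date", "campaign", "installs"],
    "by_inapp_events": ["date", "campaign", "event_name", "event_count"],
}

# Everything the new source provides, minus the field we do not need.
available = ["date", "campaign", "gender", "event_name", "event_count"]
needed = [field for field in available if field != "gender"]

best = pick_flow(needed, flows)
print(best)
```

The `needed` list is the one to carry into the first step of the connection process.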
1. List the new data columns
1.1. Pause the integration
Change the integration status to Pause, see how to do that here.
1.2. Add the fields
In the integration config’s entities and metrics, list the new columns you want to add to your MM integration from the new data source.
Example: we’re adding purchases.
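As a rough sketch, the same edit can be expressed in Python on a config loaded as a dictionary. The entry shape (key, title, type) mirrors the example config in this tutorial; the helper `add_metric` is hypothetical and is not part of the platform.

```python
import json


def add_metric(config, key, title, metric_type="absolute"):
    """Append a metric to config["metrics"] unless its key is already listed."""
    if any(metric["key"] == key for metric in config["metrics"]):
        return config
    config["metrics"].append({"key": key, "title": title, "type": metric_type})
    return config


# A trimmed-down config holding only the sections touched in this step.
config = {
    "entities": [
        {"key": "date", "title": "Date", "type": "date", "primary": True},
        {"key": "campaign", "title": "Campaign"},
    ],
    "metrics": [
        {"key": "installs", "title": "Installs", "type": "absolute"},
        {"key": "clicks", "title": "Clicks", "type": "absolute"},
    ],
}

add_metric(config, "purchases", "Purchases")
add_metric(config, "purchases", "Purchases")  # repeated call changes nothing
print(json.dumps(config["metrics"], indent=2))
```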
2. Input Integration node
Add the Input Integration component by dragging the corresponding node from the nodes panel onto the diagram area (see the editor UI).
- Labeling
Label it using these templates:
_mediaSourceName_integrationId_anyNote_
_dataSourceName_integrationId_mediaSourceName_ (for the Files module)
Examples:
appsflyer_999_android_by_installs
files_998_someSemiAutomatedDataSource
Every node’s label must accurately tell the user that component’s role and be easy to copy and paste into another component’s code when needed.
- Data to collect
  - In the component’s settings sidebar, select the Breakdowns (entities) and Metrics (metrics) that you want to collect from that data source and put into the data columns of your MM integration.
    Example: we’re adding the integration 206 as a new data source and collecting “Date”, “Campaign” and “Purchases” from it.
  - If there are any filters or macros you’d like to apply, switch to the JSON-code view and write them out in filters.
  - If there are any meta data fields that you wish to collect, enter the paths in meta.
    Example: we don’t have to use macros, filter the stats, or collect the meta data, so the corresponding fields are empty.
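The labeling templates and the node settings above can be sketched in Python. The helpers `make_label` and `make_input_component` are hypothetical and only build plain dictionaries; the parameter names (moduleId, breakdowns, metrics, timezone, dateColumn, filters, meta) are taken from the example config in this tutorial, and the default timezone is simply the one used there.

```python
def make_label(source_name, integration_id, note=""):
    """Build a node label: sourceName_integrationId, optionally followed by a note."""
    parts = [source_name, str(integration_id)]
    if note:
        parts.append(note)
    return "_".join(parts)


def make_input_component(module_id, breakdowns, metrics,
                         timezone="Europe/Moscow", date_column="date",
                         filters=None, meta=None):
    """Build an Input Integration node shaped like the ones in the example config."""
    return {
        "name": "input/core",
        "params": {
            "moduleId": module_id,
            "breakdowns": list(breakdowns),
            "metrics": list(metrics),
            "timezone": timezone,
            "dateColumn": date_column,
            "filters": filters or {},  # empty: no filters or macros applied
            "meta": meta or {},        # empty: no meta data collected
        },
    }


print(make_label("appsflyer", 999, "android_by_installs"))
print(make_label("files", 998, "someSemiAutomatedDataSource"))

# Integration 206: collect date, campaign name and purchases.
component = make_input_component(206, ["date", "campaign_name"], ["purchases"])
print(component)
```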
3. data_source
For debugging purposes, it is recommended to add the data_source tech field (“Data Source” column) using the New Entities component:
- key: data_source
- value: must be the same as the Input Integration node’s label, e.g. _mediaSourceName_integrationId_ or _dataSourceName_integrationId_mediaSourceName_
- node label: data_source_integrationId_
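A minimal Python sketch of this step is given here. The component shape follows the transform/addConst entries in the example config; the helper names are hypothetical, the sample row is made up, and `tag_rows` only emulates the assumed effect of the component (a constant column added to every row unless it already exists), not the platform's actual implementation.

```python
def make_data_source_component(input_label):
    """Build a New Entities node that writes the input node's label into data_source."""
    return {
        "name": "transform/addConst",
        "params": [
            {
                "targetField": "data_source",
                "value": input_label,
                "replace": False,
                "skipIfExists": True,
            }
        ],
    }


def tag_rows(rows, input_label):
    """Assumed effect: add data_source to each row, keeping any existing value."""
    return [{**row, "data_source": row.get("data_source", input_label)} for row in rows]


component = make_data_source_component("googleads_206")

# Made-up sample row from integration 206.
rows = [{"date": "2024-01-01", "campaign_name": "summer", "purchases": 3}]
tagged = tag_rows(rows, "googleads_206")
print(tagged)
```

With the column in place, each row of the merged report shows which input node it came from.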
4. Integrate into the data flow
4.1. Fields renaming
The data source you’re adding to your MM integration may provide the same type of data as the existing data sources do, but under different keys, e.g. campaign and camp_name, spend and cost. In that case, rename the new data source’s column keys using the Rename component:
{
  "camp_name": "campaign",
  "cost": "spend"
}
The same goes for the date entities groups: rename them to match the existing data flow, for merging purposes.
Thus, when renaming metrics while connecting a new raw data source integration to already connected ones, it is better to keep the names and titles already in use in the data flows, since all the fields, Conditional Switches, Value Mappers, etc. are already set to those names.
Example: the new data source has the campaign name in the campaign_name field, whereas in our data flow such data is stored in the campaign field, hence the renaming.
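To illustrate why renaming matters for merging, here is a small Python sketch that applies a rename map to rows before combining them. The function `rename_columns` is hypothetical and the sample rows are made up; it emulates the idea behind the Rename component, not its actual implementation.

```python
def rename_columns(rows, mapping):
    """Return copies of rows with keys renamed per mapping; other keys are kept."""
    return [
        {mapping.get(key, key): value for key, value in row.items()}
        for row in rows
    ]


rename_map = {"camp_name": "campaign", "cost": "spend"}

# Made-up rows: one from an existing source, one from the new source.
existing = [{"date": "2024-01-01", "campaign": "summer", "spend": 10.0}]
new_source = [{"date": "2024-01-01", "camp_name": "summer", "cost": 4.5}]

merged = existing + rename_columns(new_source, rename_map)
columns = sorted({key for row in merged for key in row})
print(columns)
```

Without the rename step the merged rows would carry five columns instead of three, and the same kind of data would be split between campaign and camp_name. The sketch does not handle the case where a row already contains both the old and the new key.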
Config’s JSON-code example
Example config of the data flow above.
{
  "name": "mergeModule_205",
  "title": "Merge Module",
  "type": "rewritable",
  "entities": [
    {
      "key": "date",
      "title": "Date",
      "type": "date",
      "primary": true
    },
    {
      "key": "campaign",
      "title": "Campaign"
    }
  ],
  "metrics": [
    {
      "key": "installs",
      "title": "Installs",
      "type": "absolute"
    },
    {
      "key": "clicks",
      "title": "Clicks",
      "type": "absolute"
    },
    {
      "key": "purchases",
      "title": "Purchases",
      "type": "absolute"
    }
  ],
  "params": {
    "macros": {},
    "components": {
      "facebook": {
        "name": "input/core",
        "params": {
          "moduleId": 176,
          "breakdowns": [
            "date",
            "campaign"
          ],
          "metrics": [
            "installs",
            "clicks"
          ],
          "timezone": "Europe/Moscow",
          "dateColumn": "date",
          "filters": {}
        },
        "diagramComponentLocation": "14 14"
      },
      "tiktok": {
        "name": "input/core",
        "params": {
          "moduleId": 177,
          "breakdowns": [
            "date",
            "cmp"
          ],
          "metrics": [
            "clicks",
            "converts"
          ],
          "timezone": "Europe/Moscow",
          "dateColumn": "date",
          "filters": {}
        },
        "diagramComponentLocation": "14 239"
      },
      "rename": {
        "name": "transform/rename",
        "params": {
          "map": {
            "cmp": "campaign",
            "converts": "installs",
            "campaign_name": "campaign"
          }
        },
        "diagramComponentLocation": "511.9951171875 188.99999999999994"
      },
      "output": {
        "name": "output/core",
        "params": {
          "fillMissedColumns": true,
          "ignoreExtraColumns": true
        },
        "diagramComponentLocation": "760.99267578125 188.99999999999994"
      },
      "new_entities": {
        "name": "transform/addConst",
        "params": [
          {
            "targetField": "data_source",
            "value": "facebook_176",
            "replace": false,
            "skipIfExists": true
          }
        ],
        "diagramComponentLocation": "262.99755859375 89"
      },
      "new_entities_copy": {
        "name": "transform/addConst",
        "params": [
          {
            "targetField": "data_source",
            "value": "tiktok_177",
            "replace": false,
            "skipIfExists": true
          }
        ],
        "diagramComponentLocation": "262.99755859375 238.99999999999994"
      },
      "new_entities_copy_copy": {
        "name": "transform/addConst",
        "params": [
          {
            "targetField": "data_source",
            "value": "googleads_206",
            "replace": false,
            "skipIfExists": true
          }
        ],
        "diagramComponentLocation": "268 432"
      },
      "googleads": {
        "name": "input/core",
        "params": {
          "moduleId": 206,
          "breakdowns": [
            "date",
            "campaign_name"
          ],
          "metrics": [
            "purchases"
          ],
          "timezone": "Europe/Moscow",
          "dateColumn": "date",
          "filters": {},
          "meta": {}
        },
        "diagramComponentLocation": "20 432"
      }
    },
    "relations": {
      "output": {
        "IN": [
          [
            "rename",
            "OUT"
          ]
        ]
      },
      "new_entities": {
        "IN": [
          [
            "facebook",
            "OUT"
          ]
        ]
      },
      "new_entities_copy": {
        "IN": [
          [
            "tiktok",
            "OUT"
          ]
        ]
      },
      "rename": {
        "IN": [
          [
            "new_entities_copy",
            "OUT"
          ],
          [
            "new_entities",
            "OUT"
          ],
          [
            "new_entities_copy_copy",
            "OUT"
          ]
        ]
      },
      "new_entities_copy_copy": {
        "IN": [
          [
            "googleads",
            "OUT"
          ]
        ]
      }
    }
  },
  "version": 2
}
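Before saving a config like the one above, a quick consistency check can catch wiring mistakes. The Python sketch below is hypothetical and not a platform feature: it assumes the config layout shown in this tutorial, checks that every node referenced in relations exists in components, and checks that every column requested by an input node, after renaming, is declared in entities or metrics. It runs on a trimmed-down config with only the new integration 206 branch.

```python
def unknown_nodes(config):
    """List node names used in relations that are missing from components."""
    components = config["params"]["components"]
    relations = config["params"]["relations"]
    used = set(relations)
    for ports in relations.values():
        for links in ports.values():
            used.update(source for source, _port in links)
    return sorted(used - set(components))


def undeclared_columns(config):
    """List input columns that, after renaming, are not in entities or metrics."""
    declared = {item["key"] for item in config["entities"] + config["metrics"]}
    components = list(config["params"]["components"].values())
    rename = {}
    for comp in components:
        if comp["name"] == "transform/rename":
            rename.update(comp["params"]["map"])
    collected = set()
    for comp in components:
        if comp["name"] == "input/core":
            collected.update(comp["params"]["breakdowns"] + comp["params"]["metrics"])
    return sorted({rename.get(col, col) for col in collected} - declared)


config = {
    "entities": [{"key": "date"}, {"key": "campaign"}],
    "metrics": [{"key": "purchases"}],
    "params": {
        "components": {
            "googleads": {"name": "input/core", "params": {
                "breakdowns": ["date", "campaign_name"], "metrics": ["purchases"]}},
            "rename": {"name": "transform/rename", "params": {
                "map": {"campaign_name": "campaign"}}},
            "output": {"name": "output/core", "params": {}},
        },
        "relations": {
            "rename": {"IN": [["googleads", "OUT"]]},
            "output": {"IN": [["rename", "OUT"]]},
        },
    },
}

problems = unknown_nodes(config) + undeclared_columns(config)
print(problems)
```

The check only looks at columns requested by input nodes; constant fields added by New Entities nodes, such as data_source, are outside its scope.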
4.2. Data transformations
Depending on the goals you want this Merge Module to help you with, add and configure other MM components. Find the full list of them with their descriptions here: MM Components.