ProcessSegments
Loads a file containing arrays of segmentation data into a segment table and creates a file of unique segment IDs.
Key | Value(s) | Description |
---|---|---|
method | "ProcessSegments" | Loads a file containing arrays of segmentation data into a segment table and creates a file of unique segment IDs. |
sourcepath | "Path" | Path to the directory containing the raw data to be processed. If the data are stored in more than one folder, this should be the root folder immediately above the individual data folders. |
targetpath | "Path" | Root folder for storage of processed data, generally the same as sourcepath. The hash-key file is stored here under the filename maindatafile.dat. |
segspath | "Path" | Root folder for processed segment files. |
dirs[] | [ "subfolder1", "subfolder2", "..." ] | List of folders containing raw data. |
childfolder | "FolderName" | Optional. Name of the child folder inside each daily folder, if the daily folders contain child folders. |
verbose | true/false | Default = false. If true, provides additional logging in the "info" section of the API response. |
finalConvert | true/false | Default = true. If true, creates maindatafile.txt from maindatafile.dat, ready for loading into a datajet table. |
cleanOnStart | true/false | Default = false. If true, removes maindatafile.dat before processing starts. |
sample | true/false | Default = false. If true, loads only the first file in each specified folder. |
writekeys | true/false | Default = true. If true, writes out the segment files. Set to false to test generation of the hash-key file only. |
numericFoldersOnly | true/false | If true, ignores non-numeric folders when identifying sub-folders to process. |
ignoreCompressedFolders | true/false | If true, ignores folders containing only compressed data (*.gz, *.zip, *.rar). |
project | | |
lastFolder | | Maximum number of folders to process, starting with the most recent folder name and going backwards (assuming that folders have date names, e.g., 20240213). |
maxLines | | Deprecated in v 6.11.11.01. Maximum number of lines to process, up to 2,000,000,000 (2 billion). |
collated | true/false | Deprecated in v 6.11.11.01. Default = true. Processing optimizes audience calculation performance. |
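The keys in the table can also be assembled programmatically. The following is a minimal Python sketch, not part of the product: the helper name `build_process_segments_request` and the paths in the usage lines are hypothetical, the defaults are copied from the table, and how the resulting JSON is submitted to the API is outside the scope of this sketch.

```python
import json

# Defaults as documented in the table; keys with no documented default
# are only included when the caller supplies them.
DOCUMENTED_DEFAULTS = {
    "verbose": False,
    "finalConvert": True,
    "cleanOnStart": False,
    "sample": False,
    "writekeys": True,
}


def build_process_segments_request(sourcepath, segspath, dirs,
                                   targetpath=None, childfolder=None,
                                   **options):
    """Hypothetical helper: return a ProcessSegments request as a dict."""
    request = {
        "method": "ProcessSegments",
        "sourcepath": sourcepath,
        # The table notes that targetpath is generally the same as sourcepath.
        "targetpath": targetpath if targetpath is not None else sourcepath,
        "segspath": segspath,
        "dirs": list(dirs),
    }
    if childfolder is not None:
        request["childfolder"] = childfolder
    request.update(DOCUMENTED_DEFAULTS)
    request.update(options)  # caller overrides, e.g. sample=True
    return request


# Usage with made-up paths: serialize the request body as JSON.
example = build_process_segments_request(
    "/data/raw/", "/data/raw/segs/", ["20240318", "20240319"], sample=True
)
print(json.dumps(example, indent=2))
```

Keeping the documented defaults explicit in the request makes each run self-describing, at the cost of a slightly longer body.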
The following shows an example ProcessSegments request:
{
  "method": "ProcessSegments",
  "sourcepath": "/home/engine/datasources/OneTouch/Eyeota/mft/US/",
  "targetpath": "/home/engine/datasources/OneTouch/Eyeota/mft/US/",
  "segspath": "/home/engine/datasources/OneTouch/Eyeota/mft/US/segs/",
  "dirs": [
    "20240318",
    "20240319",
    "20240320",
    "20240321",
    "20240322",
    "20240323",
    "20240324"
  ],
  "finalConvert": true,
  "cleanOnStart": true,
  "sample": true,
  "writekeys": true,
  "numericFoldersOnly": true,
  "ignoreCompressedFolders": true,
  "description": "Process Eyeota Segments",
  "project": "eyeota",
  "tooltip": "Takes raw data in unzipped format and turns into segment files and hash key file"
}
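The `dirs` list in this request covers seven consecutive date-named folders. A list like this could be generated rather than typed by hand. The sketch below assumes folder names follow the `YYYYMMDD` pattern seen in the example; the function name `date_folders` is hypothetical.

```python
from datetime import date, timedelta


def date_folders(first_day, count):
    """Return `count` consecutive YYYYMMDD folder names starting at first_day."""
    return [
        (first_day + timedelta(days=offset)).strftime("%Y%m%d")
        for offset in range(count)
    ]


# Reproduces the seven folder names used in the request above.
print(date_folders(date(2024, 3, 18), 7))
```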
The following shows how to use ProcessSegments with data stored in a sub-folder of the primary folders:
{
  "method": "ProcessSegments",
  "sourcepath": "/home/engine/datasources/OneTouch/Eyeota/mft/US/",
  "targetpath": "/home/engine/datasources/OneTouch/Eyeota/mft/US/",
  "segspath": "/home/engine/datasources/OneTouch/Eyeota/mft/US/segs/",
  "dirs": [
    "20240318",
    "20240319",
    "20240320",
    "20240321",
    "20240322",
    "20240323"
  ],
  "childfolder": "HEMSHA2",
  "finalConvert": true,
  "cleanOnStart": true,
  "sample": false,
  "verbose": false,
  "writekeys": true,
  "description": "Process Eyeota Segments",
  "project": "Q1Patch1Eyeota_Pro"
}
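Before submitting a request that uses `childfolder`, it can help to list the input folders the request is expected to cover. The sketch below assumes the layout `sourcepath/<dir>/<childfolder>`, which matches the `<dir>/<childfolder>` entries (for example `20240707/HEMSHA2`) in the sample reportFile on this page; the function name `expected_input_folders` is hypothetical, and the engine's actual folder resolution may differ.

```python
from pathlib import PurePosixPath


def expected_input_folders(sourcepath, dirs, childfolder=None):
    """List the folders a request is assumed to read from.

    Assumption: each entry in dirs sits directly under sourcepath, and
    childfolder (if given) sits directly under each of those entries.
    """
    root = PurePosixPath(sourcepath)
    folders = []
    for name in dirs:
        folder = root / name
        if childfolder:
            folder = folder / childfolder
        folders.append(str(folder))
    return folders


for path in expected_input_folders(
    "/home/engine/datasources/OneTouch/Eyeota/mft/US/",
    ["20240318", "20240319"],
    childfolder="HEMSHA2",
):
    print(path)
```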
Sample reportFile contents:
{
  "ProcessSegment": "2024.7.22.1",
  "sourcepath": "...mft/US/",
  "targetpath": ".../datasource-audience/",
  "segspath": ".../datasource-audience/segs/",
  "childfolder": "HEMSHA2",
  "dirs": [
    "20240707",
    "20240706",
    "20240705",
    "20240704",
    "20240703",
    "20240702",
    "20240701"
  ],
  "writekeys": true,
  "checking": false,
  "sample": false,
  "verbose": false,
  "count": false,
  "followonly": false,
  "collated": true,
  "files_processed": 757,
  "folders_processed": [
    "20240707/HEMSHA2",
    "20240706/HEMSHA2",
    "20240705/HEMSHA2",
    "20240704/HEMSHA2",
    "20240703/HEMSHA2",
    "20240702/HEMSHA2",
    "20240701/HEMSHA2"
  ],
  "folders_processed_lines": [
    535469843,
    836427346,
    109439092,
    598006467,
    738713871,
    46902468,
    93804936
  ],
  "processedDirs": [
    "20240707"
  ],
  "totalLines": 2000000000,
  "uniqueLines": 1179447368,
  "maxLines": 2000000000
}
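A report like the one above can be summarized with a few lines of Python. This is a sketch under stated assumptions: the function name `summarize_report` is hypothetical, the field meanings are inferred from their names only, and the sketch reports the numbers without interpreting how `folders_processed_lines`, `totalLines`, and `maxLines` relate to one another. The values in `sample_report` are copied from the report above, trimmed to the fields used.

```python
def summarize_report(report):
    """Hypothetical helper: pull a few headline numbers out of a reportFile dict."""
    total = report.get("totalLines")
    unique = report.get("uniqueLines")
    max_lines = report.get("maxLines")
    return {
        "folders": len(report.get("folders_processed", [])),
        "per_folder_line_sum": sum(report.get("folders_processed_lines", [])),
        # Share of totalLines that were unique, when both counts are present.
        "unique_ratio": unique / total if total and unique is not None else None,
        # True when totalLines has reached the reported maxLines value.
        "reached_max_lines": (
            total is not None and max_lines is not None and total >= max_lines
        ),
    }


sample_report = {
    "folders_processed": [
        "20240707/HEMSHA2", "20240706/HEMSHA2", "20240705/HEMSHA2",
        "20240704/HEMSHA2", "20240703/HEMSHA2", "20240702/HEMSHA2",
        "20240701/HEMSHA2",
    ],
    "folders_processed_lines": [
        535469843, 836427346, 109439092, 598006467,
        738713871, 46902468, 93804936,
    ],
    "totalLines": 2000000000,
    "uniqueLines": 1179447368,
    "maxLines": 2000000000,
}

print(summarize_report(sample_report))
```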
A second sample reportFile, from a run over 15 folders:
{
  "sourcepath": "/home/engine/datasources/OneTouch/Eyeota/mft/US/",
  "targetpath": "/home/engine/datasources/campaignRoot/onetouch-dev01/eyeota-audience/",
  "segspath": "/home/engine/datasources/campaignRoot/onetouch-dev01/eyeota-audience/segs/",
  "childfolder": "HEMSHA2",
  "dirs": [
    "20240508",
    "20240507",
    "20240506",
    "20240505",
    "20240504",
    "20240503",
    "20240502",
    "20240501",
    "20240430",
    "20240429",
    "20240428",
    "20240427",
    "20240426",
    "20240425",
    "20240424"
  ],
  "writekeys": true,
  "checking": false,
  "sample": false,
  "verbose": false,
  "count": false,
  "followonly": false,
  "processedDirs": [
    "20240508",
    "20240507",
    "20240506",
    "20240505",
    "20240504",
    "20240503",
    "20240502"
  ]
}
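In this report, `dirs` lists 15 folders while `processedDirs` lists 7. The sketch below shows one way to find the requested folders that do not appear in `processedDirs`. Whether absence from `processedDirs` means a folder still needs processing is an assumption, not something this page states, and the function name `unlisted_dirs` is hypothetical.

```python
def unlisted_dirs(report):
    """Return entries of dirs that are absent from processedDirs, in order."""
    listed = set(report.get("processedDirs", []))
    return [name for name in report.get("dirs", []) if name not in listed]


# Values copied from the report above, trimmed to the two fields used.
report = {
    "dirs": [
        "20240508", "20240507", "20240506", "20240505", "20240504",
        "20240503", "20240502", "20240501", "20240430", "20240429",
        "20240428", "20240427", "20240426", "20240425", "20240424",
    ],
    "processedDirs": [
        "20240508", "20240507", "20240506", "20240505",
        "20240504", "20240503", "20240502",
    ],
}

print(unlisted_dirs(report))
```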