ProcessSegments



PROTOTYPE

TEMPORARY API
NOTE: This API call WILL change in the next release. Any scripts developed with this function will need to be reviewed and modified at that time, and any data created using this function will probably have to be rebuilt.


Loads a file containing arrays of segmentation data into a segment table and creates a file of unique segment IDs.

| Key | Value(s) | Description |
|---|---|---|
| method | "ProcessSegments" | Loads a file containing arrays of segmentation data into a segment table and creates a file of unique segment IDs. |
| sourcepath | "Path" | Path to the directory containing the raw data to be processed. If data are stored in more than one folder, this should be the root folder immediately above the individual data folders. |
| targetpath | "Path" | Root folder for storage of processed data. Generally the same as sourcepath. This is where the hash-key file is stored; its filename is maindatafile.dat. |
| segspath | "Path" | Root folder for processed segment files. |
| dirs | ["subfolder1", "subfolder2", "..."] | List of folders containing raw data. |
| childfolder | "FolderName" | Optional. Name of the child folders, if the daily folders contain them. |
| verbose | true/false | Default = false. If true, provides additional logging in the "info" section of the API response. |
| finalConvert | true/false | Default = true. If true, creates maindatafile.txt from maindatafile.dat, ready for loading into a datajet table. |
| cleanOnStart | true/false | Default = false. If true, removes maindatafile.dat before starting processing. |
| sample | true/false | Default = false. If true, loads the first file in each specified folder. |
| writekeys | true/false | Default = true. If true, writes out the segment files. Set to false to test generation of the hash-key file only. |
| numericFoldersOnly | true/false | Ignore non-numeric folders when identifying sub-folders to process. |
| ignoreCompressedFolders | true/false | Ignore folders containing only compressed data (*.gz, *.zip, *.rar). |
| project | | |
| lastFolder | | Maximum number of folders to process, starting with the most recent folder name and going backwards (assuming that folders have date names, e.g., 20240213). |
| maxLines | | Maximum number of lines to process, up to 2,000,000,000 (2 billion). |
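The folder-related options (numericFoldersOnly, ignoreCompressedFolders, lastFolder) can be pictured with a small Python sketch. This is only an illustration of the behaviour the descriptions suggest, not the engine's implementation: the function name `select_folders` and the dictionary standing in for a directory listing are hypothetical, and the exact order in which the engine applies the filters is an assumption.

```python
# Extensions the parameter descriptions list as "compressed data".
COMPRESSED_EXTS = (".gz", ".zip", ".rar")


def select_folders(folders, numeric_only=False, ignore_compressed=False, last_folder=None):
    """Pick the sub-folders to process (illustrative only).

    `folders` maps a folder name to the list of file names it contains.
    It is a hypothetical stand-in for a real directory listing.
    """
    names = sorted(folders)
    if numeric_only:
        # Keep only names made entirely of digits, e.g. "20240213".
        names = [n for n in names if n.isdigit()]
    if ignore_compressed:
        # Drop folders whose files are ALL compressed archives.
        names = [
            n for n in names
            if not (folders[n] and all(f.lower().endswith(COMPRESSED_EXTS) for f in folders[n]))
        ]
    if last_folder is not None:
        # Date-style names sort chronologically, so take the newest N, newest first.
        names = sorted(names, reverse=True)[:last_folder]
    return names


listing = {
    "20240211": ["a.dat"],
    "20240212": ["b.gz"],            # only compressed data
    "20240213": ["c.dat", "d.gz"],   # mixed content, so it is kept
    "tmp": ["x.dat"],                # non-numeric name
}
selected = select_folders(listing, numeric_only=True, ignore_compressed=True, last_folder=2)
print(selected)
```

With date-named folders, plain string sorting is enough to find the most recent ones, which is why the lastFolder description assumes names such as 20240213.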


{
  "method": "ProcessSegments",
  "sourcepath": "/home/engine/datasources/OneTouch/Eyeota/mft/US/",
  "targetpath": "/home/engine/datasources/OneTouch/Eyeota/mft/US/",
  "segspath": "/home/engine/datasources/OneTouch/Eyeota/mft/US/segs/",
  "dirs": [
    "20240318",
    "20240319",
    "20240320",
    "20240321",
    "20240322",
    "20240323",
    "20240324"
  ],
  "finalConvert": true,
  "cleanOnStart": true,
  "sample": true,
  "writekeys": true,
  "numericFoldersOnly": true,
  "ignoreCompressedFolders": true,
  "description": "Process Eyeota Segments",
  "project": "eyeota",
  "tooltip": "Takes raw data in unzipped format and turns into segment files and hash key file"
}
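A request body like the one above can also be assembled programmatically. The Python sketch below is a hypothetical helper (`build_process_segments_request` is not part of the API); it only builds the JSON document and checks the documented 2 billion ceiling on maxLines. How the request is submitted to the engine is outside the scope of this sketch.

```python
import json

MAX_LINES_LIMIT = 2_000_000_000  # upper bound stated for maxLines


def build_process_segments_request(sourcepath, segspath, dirs, targetpath=None, **options):
    """Assemble a ProcessSegments request body as a dict (hypothetical helper)."""
    max_lines = options.get("maxLines")
    if max_lines is not None and max_lines > MAX_LINES_LIMIT:
        raise ValueError("maxLines may not exceed 2,000,000,000")
    request = {
        "method": "ProcessSegments",
        "sourcepath": sourcepath,
        # targetpath is generally the same as sourcepath, so default to it.
        "targetpath": targetpath or sourcepath,
        "segspath": segspath,
        "dirs": list(dirs),
    }
    # Remaining keys (sample, writekeys, project, ...) are passed through unchanged.
    request.update(options)
    return request


body = build_process_segments_request(
    "/home/engine/datasources/OneTouch/Eyeota/mft/US/",
    "/home/engine/datasources/OneTouch/Eyeota/mft/US/segs/",
    ["20240318", "20240319"],
    sample=True,
    project="eyeota",
)
print(json.dumps(body, indent=2))
```

Setting "sample": true, as in the example request, is a cheap way to validate paths and folder names before committing to a full run.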

The following shows how to use ProcessSegments with data stored in a sub-folder of the primary folders:

{
  "method": "ProcessSegments",
  "sourcepath": "/home/engine/datasources/OneTouch/Eyeota/mft/US/",
  "targetpath": "/home/engine/datasources/OneTouch/Eyeota/mft/US/",
  "segspath": "/home/engine/datasources/OneTouch/Eyeota/mft/US/segs/",
  "dirs": [
    "20240318",
    "20240319",
    "20240320",
    "20240321",
    "20240322",
    "20240323"
  ],
  "childfolder": "HEMSHA2",
  "finalConvert": true,
  "cleanOnStart": true,
  "sample": false,
  "verbose": false,
  "writekeys": true,
  "description": "Process Eyeota Segments",
  "project": "Q1Patch1Eyeota_Pro"
}
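To make the effect of childfolder concrete, the following Python sketch lists the directories that would hold the raw data for a request like the one above. It assumes the layout is sourcepath/dir/childfolder, which is inferred from the parameter description rather than confirmed; `raw_data_dirs` is a hypothetical name.

```python
from pathlib import PurePosixPath


def raw_data_dirs(sourcepath, dirs, childfolder=None):
    """Return the directories expected to hold raw data (assumed layout)."""
    root = PurePosixPath(sourcepath)
    result = []
    for name in dirs:
        path = root / name
        if childfolder:
            # Assumption: the child folder sits directly inside each daily folder.
            path = path / childfolder
        result.append(str(path))
    return result


paths = raw_data_dirs(
    "/home/engine/datasources/OneTouch/Eyeota/mft/US/",
    ["20240318", "20240319"],
    childfolder="HEMSHA2",
)
for p in paths:
    print(p)
```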


Sample reportFile contents:

{  
  "sourcepath": "/home/engine/datasources/OneTouch/Eyeota/mft/US/",
  "targetpath": "/home/engine/datasources/campaignRoot/onetouch-dev01/eyeota-audience/",
  "segspath": "/home/engine/datasources/campaignRoot/onetouch-dev01/eyeota-audience/segs/",
  "childfolder": "HEMSHA2",
  "dirs": [
    "20240508",
    "20240507",
    "20240506",
    "20240505",
    "20240504",
    "20240503",
    "20240502",
    "20240501",
    "20240430",
    "20240429",
    "20240428",
    "20240427",
    "20240426",
    "20240425",
    "20240424"
  ],
  "writekeys": true,
  "checking": false,
  "sample": false,
  "verbose": false,
  "count": false,
  "followonly": false,
  "processedDirs": [
    "20240508",
    "20240507",
    "20240506",
    "20240505",
    "20240504",
    "20240503",
    "20240502"
  ]
}
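One possible use of the report is to find folders that were requested but do not appear in processedDirs. The Python sketch below does this with the dirs and processedDirs values from the sample above. It assumes processedDirs lists the folders that were actually completed, which this article does not state explicitly; `unprocessed_dirs` is a hypothetical helper.

```python
def unprocessed_dirs(report):
    """List requested folders missing from processedDirs, in request order."""
    done = set(report.get("processedDirs", []))
    return [d for d in report.get("dirs", []) if d not in done]


# dirs and processedDirs copied from the sample report; other keys omitted.
report = {
    "dirs": [
        "20240508", "20240507", "20240506", "20240505", "20240504",
        "20240503", "20240502", "20240501", "20240430", "20240429",
        "20240428", "20240427", "20240426", "20240425", "20240424",
    ],
    "processedDirs": [
        "20240508", "20240507", "20240506", "20240505",
        "20240504", "20240503", "20240502",
    ],
}
remaining = unprocessed_dirs(report)
print(remaining)
```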
