ProcessSegments

Loads a file containing arrays of segmentation data into a segment table and creates a file of unique segment IDs.

| Key | Value(s) | Description |
|---|---|---|
| method | "ProcessSegments" | Loads a file containing arrays of segmentation data into a segment table and creates a file of unique segment IDs. |
| sourcepath | "Path" | Path to the directory containing raw data to be processed. If data are stored in more than one folder, this should be the root folder immediately above the individual data folders. |
| targetpath | "Path" | Root folder for storage of processed data. Generally the same as sourcepath. Location where the HASH-KEY file is stored. Filename is maindatafile.dat. |
| segspath | "Path" | Root folder for processed segment files. |
| dirs[] | ["subfolder1", "subfolder2", "..."] | List of folders containing raw data. |
| childfolder | "FolderName" | Optional. Name of child folders if the daily folders contain them. |
| verbose | true/false | Default = false. If true, provides additional logging in the "info" section of the API response. |
| finalConvert | true/false | Default = true. If true, creates maindatafile.txt from maindatafile.dat, ready for loading into a datajet table. |
| cleanOnStart | true/false | Default = false. If true, removes maindatafile.dat before starting processing. |
| sample | true/false | Default = false. If true, loads the first file in each specified folder. |
| writekeys | true/false | Default = true. If true, writes out the segment files. Set to false to test generation of the hash-key file only. |
| numericFoldersOnly | true/false | Ignore non-numeric folders when identifying sub-folders to process. |
| ignoreCompressedFolders | true/false | Ignore folders containing only compressed data (*.gz, *.zip, *.rar). |
| project | | |
| lastFolder | | Maximum number of folders to process, starting with the most recent folder name and going backwards (assuming that folders have date names, e.g., 20240213). |
| maxLines | | Deprecated in v 6.11.11.01. Maximum number of lines to process, up to 2,000,000,000 (2 billion). |
| collated | true/false | Deprecated in v 6.11.11.01. Default = true. Processing optimizes audience calculation performance. |
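
The folder-selection options described above (numericFoldersOnly and lastFolder) can be illustrated with a short Python sketch. This is not the engine's implementation: select_folders is a hypothetical helper written only from the descriptions in the table, and it assumes folder names are dates in YYYYMMDD form so that sorting the names in descending order puts the most recent folder first.

```python
def select_folders(names, numeric_folders_only=False, last_folder=None):
    """Hypothetical helper mirroring the documented descriptions only."""
    candidates = list(names)
    if numeric_folders_only:
        # Keep only names made up entirely of digits, e.g. "20240213".
        candidates = [n for n in candidates if n.isdigit()]
    # Most recent first, assuming YYYYMMDD names sort chronologically.
    candidates.sort(reverse=True)
    if last_folder is not None:
        # Cap the number of folders, counting back from the most recent.
        candidates = candidates[:last_folder]
    return candidates


folders = ["20240211", "20240213", "segs", "20240212", "tmp"]
selected = select_folders(folders, numeric_folders_only=True, last_folder=2)
print(selected)
```

With the sample list, the two non-numeric names are dropped and the two most recent date folders are kept.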


{
  "method": "ProcessSegments",
  "sourcepath": "/home/engine/datasources/OneTouch/Eyeota/mft/US/",
  "targetpath": "/home/engine/datasources/OneTouch/Eyeota/mft/US/",
  "segspath": "/home/engine/datasources/OneTouch/Eyeota/mft/US/segs/",
  "dirs": [
    "20240318",
    "20240319",
    "20240320",
    "20240321",
    "20240322",
    "20240323",
    "20240324"
  ],
  "finalConvert": true,
  "cleanOnStart": true,
  "sample": true,
  "writekeys": true,
  "numericFoldersOnly": true,
  "ignoreCompressedFolders": true,
  "description": "Process Eyeota Segments",
  "project": "eyeota",
  "tooltip": "Takes raw data in unzipped format and turns into segment files and hash key file"
}
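
A request like the one above can also be assembled programmatically. The following Python sketch builds the dirs array from a start date and a day count, then serializes the payload as JSON. build_request is a hypothetical helper, not part of the product; how the payload is submitted to the engine is outside the scope of this sketch.

```python
import json
from datetime import date, timedelta


def build_request(root, start, days, **options):
    """Hypothetical helper: assemble a ProcessSegments payload."""
    # One date-named folder (YYYYMMDD) per consecutive day.
    dirs = [(start + timedelta(days=i)).strftime("%Y%m%d") for i in range(days)]
    request = {
        "method": "ProcessSegments",
        "sourcepath": root,
        "targetpath": root,  # generally the same as sourcepath
        "segspath": root + "segs/",
        "dirs": dirs,
    }
    request.update(options)  # e.g. sample, writekeys, cleanOnStart
    return request


request = build_request(
    "/home/engine/datasources/OneTouch/Eyeota/mft/US/",
    date(2024, 3, 18),
    7,
    finalConvert=True,
    cleanOnStart=True,
    sample=True,
)
print(json.dumps(request, indent=2))
```

Generating the folder names from dates avoids typing errors in long dirs lists; any of the optional keys from the table can be passed as keyword arguments.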

The following shows how to use ProcessSegments with data stored in a sub-folder of the primary folders:

{
  "method": "ProcessSegments",
  "sourcepath": "/home/engine/datasources/OneTouch/Eyeota/mft/US/",
  "targetpath": "/home/engine/datasources/OneTouch/Eyeota/mft/US/",
  "segspath": "/home/engine/datasources/OneTouch/Eyeota/mft/US/segs/",
  "dirs": [
    "20240318",
    "20240319",
    "20240320",
    "20240321",
    "20240322",
    "20240323"
  ],
  "childfolder": "HEMSHA2",
  "finalConvert": true,
  "cleanOnStart": true,
  "sample": false,
  "verbose": false,
  "writekeys": true,
  "description": "Process Eyeota Segments",
  "project": "Q1Patch1Eyeota_Pro"
}
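
As a rough illustration of what childfolder implies, the Python sketch below lists the relative folders that would hold the raw data. It assumes the layout is <dir>/<childfolder> under sourcepath, which is consistent with the folders_processed entries in the sample report; child_paths is a hypothetical helper.

```python
import posixpath


def child_paths(dirs, childfolder=None):
    """Hypothetical helper: relative folders implied by dirs and childfolder."""
    if not childfolder:
        # No child folder: data is expected directly in each listed folder.
        return list(dirs)
    return [posixpath.join(d, childfolder) for d in dirs]


paths = child_paths(["20240318", "20240319"], "HEMSHA2")
print(paths)
```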


Sample reportFile contents:

{
  "ProcessSegment": "2024.7.22.1",
  "sourcepath": "...mft/US/",
  "targetpath": ".../datasource-audience/",
  "segspath": ".../datasource-audience/segs/",
  "childfolder": "HEMSHA2",
  "dirs": [
    "20240707",
    "20240706",
    "20240705",
    "20240704",
    "20240703",
    "20240702",
    "20240701"
  ],
  "writekeys": true,
  "checking": false,
  "sample": false,
  "verbose": false,
  "count": false,
  "followonly": false,
  "collated": true,
  "files_processed": 757,
  "folders_processed": [
    "20240707/HEMSHA2",
    "20240706/HEMSHA2",
    "20240705/HEMSHA2",
    "20240704/HEMSHA2",
    "20240703/HEMSHA2",
    "20240702/HEMSHA2",
    "20240701/HEMSHA2"
  ],
  "folders_processed_lines": [
    535469843,
    836427346,
    109439092,
    598006467,
    738713871,
    46902468,
    93804936
  ],
  "processedDirs": [
    "20240707"
  ],
  "totalLines": 2000000000,
  "uniqueLines": 1179447368,
  "maxLines": 2000000000
}

{
  "sourcepath": "/home/engine/datasources/OneTouch/Eyeota/mft/US/",
  "targetpath": "/home/engine/datasources/campaignRoot/onetouch-dev01/eyeota-audience/",
  "segspath": "/home/engine/datasources/campaignRoot/onetouch-dev01/eyeota-audience/segs/",
  "childfolder": "HEMSHA2",
  "dirs": [
    "20240508",
    "20240507",
    "20240506",
    "20240505",
    "20240504",
    "20240503",
    "20240502",
    "20240501",
    "20240430",
    "20240429",
    "20240428",
    "20240427",
    "20240426",
    "20240425",
    "20240424"
  ],
  "writekeys": true,
  "checking": false,
  "sample": false,
  "verbose": false,
  "count": false,
  "followonly": false,
  "processedDirs": [
    "20240508",
    "20240507",
    "20240506",
    "20240505",
    "20240504",
    "20240503",
    "20240502"
  ]
}
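
A reportFile can be inspected with a few lines of Python. The sketch below lists the requested folders that do not appear in processedDirs and, when both line counts are present, the share of unique lines. The exact meaning of processedDirs is not documented here, so treating it as "folders already handled" is an assumption; summarize_report is a hypothetical helper, and the input is a shortened illustrative report that reuses field names and values from the samples above.

```python
import json


def summarize_report(report):
    """Hypothetical helper: derive a few figures from a reportFile dict."""
    processed = set(report.get("processedDirs", []))
    # Requested folders that do not appear in processedDirs.
    pending = [d for d in report.get("dirs", []) if d not in processed]
    summary = {"pending_dirs": pending}
    total = report.get("totalLines")
    unique = report.get("uniqueLines")
    if total and unique is not None:
        # Fraction of processed lines that were unique.
        summary["unique_ratio"] = unique / total
    return summary


report = json.loads("""
{
  "dirs": ["20240508", "20240507", "20240506"],
  "processedDirs": ["20240508", "20240507"],
  "totalLines": 2000000000,
  "uniqueLines": 1179447368
}
""")
summary = summarize_report(report)
print(summary["pending_dirs"])
print(round(summary["unique_ratio"], 4))
```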
