Processes *.tsv input files to create smaller input data files for ProcessSegments.

| Key | Value(s) | Description |
|---|---|---|
| method | "PreProcessSegments" | Processes *.tsv input files to create smaller input data files for ProcessSegments |
| sourcepath | "Path to input data" | Location of input files (Note: this should match the "sourcepath" key for ProcessSegments) |

Input files are in *.tsv format.

Two types of pre-processed file are generated:
- *.tsv.hash
- *.tsv.hash.seg

Once the pre-processed files have been generated, the original *.tsv files can be deleted. Pre-processed files are roughly one third the size of the *.tsv input files.
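
Before deleting any *.tsv input, it may be worth confirming that both pre-processed files exist for it. The following is a minimal sketch of such a check, not part of the tool itself. It assumes that the pre-processed files are written next to their input and share its full name plus a suffix (for example `data.tsv` gives `data.tsv.hash` and `data.tsv.hash.seg`); the function name `deletable_tsv_files` is hypothetical.

```python
from pathlib import Path


def deletable_tsv_files(sourcepath):
    """Return the *.tsv files whose .hash and .hash.seg companions both exist.

    Assumption (not confirmed by the documentation): pre-processed files sit
    in the same directory as their input and are named <input>.hash and
    <input>.hash.seg.
    """
    ready = []
    for tsv in sorted(Path(sourcepath).glob("*.tsv")):
        hash_file = tsv.with_name(tsv.name + ".hash")
        seg_file = tsv.with_name(tsv.name + ".hash.seg")
        # Only report the input as safe to remove when both outputs are present.
        if hash_file.is_file() and seg_file.is_file():
            ready.append(tsv)
    return ready
```

Calling `deletable_tsv_files("/home/datasources/campaignsource")` only lists candidates; it deletes nothing, so the result can be reviewed before any files are removed.
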

To use the pre-processed files and enable the performance enhancements, ProcessSegments must have the following property set:

    "intermediateMode": true,

Example configuration:

    {
      "method": "PreProcessSegments",
      "sourcepath": "/home/datasources/campaignsource",
      "project": "campaign-full"
    }
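
The two requirements stated above (a matching "sourcepath" and "intermediateMode" set to true on ProcessSegments) can be checked with a short script before a run. This is a sketch under stated assumptions: the ProcessSegments configuration shown here is hypothetical, its "method" value is assumed to be "ProcessSegments", and any further keys that method needs are omitted. The helper name `check_pair` is also hypothetical.

```python
import json

# PreProcessSegments configuration, as in the example above.
pre_process = {
    "method": "PreProcessSegments",
    "sourcepath": "/home/datasources/campaignsource",
    "project": "campaign-full",
}

# Hypothetical ProcessSegments configuration. Only "sourcepath" and
# "intermediateMode" are taken from the documentation; the "method" value
# is an assumption and other required keys are not shown.
process = {
    "method": "ProcessSegments",
    "sourcepath": pre_process["sourcepath"],
    "intermediateMode": True,
}


def check_pair(pre_config, process_config):
    """Return a list of problems; an empty list means the pair is consistent."""
    problems = []
    if pre_config.get("sourcepath") != process_config.get("sourcepath"):
        problems.append("sourcepath differs between the two configurations")
    if process_config.get("intermediateMode") is not True:
        problems.append("intermediateMode is not set to true for ProcessSegments")
    return problems


# Python's True is serialised as JSON true.
process_json = json.dumps(process, indent=2)
```

`check_pair(pre_process, process)` returns an empty list for the pair above, and `process_json` holds the configuration as JSON text ready to be written to a file.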