PreProcessSegments

Processes *.tsv input files to create smaller input data files for ProcessSegments

Key         | Value(s)              | Description
method      | "PreProcessSegments"  | Processes *.tsv input files to create smaller input data files for ProcessSegments
sourcepath  | "Path to input data"  | Location of input files (Note: this should match the "sourcepath" key for ProcessSegments)

Input files must be in *.tsv format.

Two types of pre-processed file are generated:

  • *.tsv.hash
  • *.tsv.hash.seg

Once the pre-processed files have been generated, the original *.tsv files can be deleted. Pre-processed files are roughly one third the size of the *.tsv input files.

To use the pre-processed files and enable the associated performance enhancements, ProcessSegments must have the following property set:

"intermediateMode": true,

Example PreProcessSegments configuration:

{
  "method": "PreProcessSegments",
  "sourcepath": "/home/datasources/campaignsource",
  "project": "campaign-full"
}
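
For comparison, a matching ProcessSegments configuration might look like the sketch below. It is an illustration only: it assumes ProcessSegments accepts the same "method", "sourcepath" and "project" keys as the PreProcessSegments example above, and it omits any other keys ProcessSegments may require. The two points taken from this page are that "sourcepath" should match the value used for PreProcessSegments and that "intermediateMode" is set to true.

```json
{
  "method": "ProcessSegments",
  "sourcepath": "/home/datasources/campaignsource",
  "project": "campaign-full",
  "intermediateMode": true
}
```

With "sourcepath" pointing at the same directory in both configurations, ProcessSegments reads the *.tsv.hash and *.tsv.hash.seg files written by PreProcessSegments.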