XProcess


Calls out to a third-party process. For example, PScript runs a Python script.

Key | Value(s) | Description
method | "XProcess" |
name | "NameOfExternalProcessor" | Available processes: NDJSONProc.exe, PScript.exe, KaggleDownload.exe, DJFScanProc.exe, {TODO}
wait | true/false | If true, the script waits for XProcess to return before continuing.
param | {} | Parameters passed to the external process. The structure and fields are listed below.
project | "ActiveProjectName" |

Structure of param:

{
"script": "ScriptPathAndFileName",
"inFile": "InputPathAndFileName",
"outFile": "OutputPathAndFileName",
"params": [
"param1: value1",
"param2: value2",
"param3: value3",
"..."
]
}

Fields of param:

  • "script" - the path and name of the script to execute. The root location is the DataJet bin folder. Best practice is to store scripts in the "plugins" folder.
  • "inFile" - input file to the process. If the input file is produced by DataJet, it is usually generated using the Export method. Passed in as sys.argv[2].
  • "outFile" - output file path. Passed in as sys.argv[3]. Best practice is to output to a subfolder in Data Sources and use the %DATAPATH% variable. NOTE: the %OUTPUT% system variable is NOT resolved by XProcess and so cannot be used.
  • "params": [] - additional parameters for the process. Passed in as sys.argv[4]...sys.argv[N].



Overview

XProcess allows external data processing to be added to scripts, i.e., data processing that takes place outside the core DataJet server. Used with a programming language such as Python, it provides a simple but powerful way of extending the core DataJet functionality.

A call to XProcess generally follows these steps:

  1. Export data from DataJet (for example, Export into a file)
  2. Use the external process to:
    1. Open the data file and process it
    2. Create an output file
  3. Import the output file back into DataJet
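
The external-process step (step 2) can be sketched as follows. This is a minimal illustration, assuming the argument order used on this page (error file in sys.argv[1], input file in sys.argv[2], output file in sys.argv[3]). The function name process_file and the upper-casing transformation are hypothetical placeholders for real processing.

```python
import sys


def process_file(input_file, output_file):
    """Hypothetical processing step: copy each line, upper-cased."""
    with open(input_file, "r", encoding="utf-8") as src, \
         open(output_file, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(line.upper())


def main(argv):
    # argv[1] is the error file; argv[2] and argv[3] are inFile and outFile
    input_file, output_file = argv[2], argv[3]
    process_file(input_file, output_file)


# Only run when called with the expected arguments
if __name__ == "__main__" and len(sys.argv) > 3:
    main(sys.argv)
```

Keeping the processing in its own function makes it possible to test the script outside DataJet before wiring it into an XProcess call.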

Exporting Data

Data can be exported in the following ways:

  • Using one of the export methods (e.g., Export, ExportIntoWorkbook)
  • Export to file from an analytical process (e.g., AnalyseSegments, Overlap, IProfile, etc.)

Processing

The most common process to use is PScript, which runs a Python script:

{
  "method": "XProcess",
  "name": "PScript",
  "wait": true,
  "param": {
    "language": "python",
    "script": "%PLUGINS%analyse-segments-top-and-bottom.py",
    "inFile": "%DATAPATH%AnalyseSegments/AS_composite_index_full.csv",
    "outFile": "%DATAPATH%AnalyseSegments/",
    "params": [
      "interleaved: true",
      "sort_type: Index",
      "output_type: ITO",
      "topFileName: TableC_TopN",
      "bottomFileName: TableD_BottomN",
      "thresholds: I 1.2, T 500, O 50",
      "create_suffix: false"
    ]
  },
  "description": "PScript - Analyse Segments Top & Bottom Targets",
  "project": "AnalyseSegments"
}

Sample processing within the Python script:

import sys

# Write errors to the filename in sys.argv[1]
err_file = sys.argv[1]

...

# Get input file and output location from command line arguments
input_file = sys.argv[2]
output_location = sys.argv[3]

# Read parameters from sys.argv[4] to sys.argv[10]
for i in range(4, min(11, len(sys.argv))):
    param = sys.argv[i]  # one entry from "params"
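
The parameter strings in the example call use a "name: value" form (for example, "sort_type: Index"). Below is a minimal sketch of turning them into a dictionary, assuming each entry of "params" arrives as a single command-line argument. The helper name parse_params is hypothetical and not part of DataJet.

```python
import sys


def parse_params(args):
    """Split "name: value" strings into a dict (hypothetical helper)."""
    params = {}
    for arg in args:
        # Split on the first colon only, so values may contain colons
        name, sep, value = arg.partition(":")
        if sep:  # skip anything that is not in "name: value" form
            params[name.strip()] = value.strip()
    return params


if __name__ == "__main__":
    # Parameters start at sys.argv[4]
    print(parse_params(sys.argv[4:]))
```

All values are kept as strings here; converting them (for example, "true" to a boolean) is left to the script that uses them.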


Output File Format

The output format can be whatever the XProcess script requires. The following outputs can be easily imported back into DataJet:

  • Excel - use CreateTableFromWorkbook. Use this if the file format generated by XProcess is unknown.
  • Delimited file - use CreateTableFromFile. Use this if the field format generated by XProcess is known, or have the XProcess script also generate an Excel definition file.
  • Direct from file - use CreateFieldsFromFile. NOTE: be very cautious with this method, as it can lead to unbalanced tables.
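
As an illustration of the delimited-file option, the sketch below writes a tab-delimited file with a header row using Python's standard csv module. The helper name write_delimited, the field names in the usage comment, and the choice of tab as the delimiter are assumptions for this example; use whatever layout the subsequent import step is set up to read.

```python
import csv


def write_delimited(path, fieldnames, rows, delimiter="\t"):
    """Write rows (a list of dicts) to a delimited text file with a header."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, delimiter=delimiter)
        writer.writeheader()
        writer.writerows(rows)


# Hypothetical usage inside an XProcess script:
# write_delimited(output_file, ["segment", "index"],
#                 [{"segment": "A", "index": 1.2}])
```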

Importing Data

Data can be imported in the following ways:

  • Into a new table, using CreateTableFromFile, CreateTableFromWorkbook, etc.
  • Into an existing table, using CreateFieldsFromFile

Debugging

If XProcess returns with success but the process has failed, first look in the engine server log to see whether any unhandled errors have been reported.

If the process correctly writes its errors to the error file named in sys.argv[1], those errors are reported back in the script editor.
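
Below is a minimal sketch of writing a more detailed message to the error file, assuming the error-file path arrives in sys.argv[1] as described above. The helper name run_with_error_file and the use of a Python traceback as the message are illustrative choices, not a documented DataJet requirement.

```python
import traceback


def run_with_error_file(main, err_file):
    """Run main(); on failure, write the traceback to err_file."""
    try:
        main()
        return True
    except Exception:
        try:
            with open(err_file, "w", encoding="utf-8") as f:
                f.write(traceback.format_exc())
        except OSError:
            pass  # nothing more can be done if the error file is unwritable
        return False


# Hypothetical usage:
# run_with_error_file(my_main, sys.argv[1])
```

The error file is only created when main() raises, so a successful run leaves nothing behind.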

Configuring a server to use XProcess

Examples

Python

JSON

{
  "method": "XProcess",
  "name": "PScript",
  "wait": true,
  "param": {
    "script": "%PLUGINS%dmetaphone.py",
    "inFile": "%DATAPATH%%SFIELD%.txt",
    "outFile": "%DATAPATH%%SFIELD%_metaphone.txt",
    "params": []
  },
  "project": "#project#"
}

{
  "method": "XProcess",
  "name": "PScript",
  "wait": true,
  "param": {
    "language": "python",
    "script": "%PLUGINS%external_process.py",   
    "inFile": "%DATAPATH%nationwide/nwi/dj_source/allant_individual_A_ACX.txt",
    "outFile": "%DATAPATH%nationwide/nwi/dj_source/amp_footprint.txt",
    "params": []
  },
  "project": "Nationwide-NWI-B"
}

Sample Python Script (*.py)

For full details of implementing XProcess, look in the plugins folders for examples.

import sys

errFile = ""

try:
    errFile = sys.argv[1]
    # ... (your main code goes here) ...

except Exception:
    try:
        if errFile != "":
            # Report the failure through the error file
            with open(errFile, "w") as eFile:
                eFile.write("Regression Failed\n")
    except Exception:
        pass  # Silently fail if we can't even write the error file