Model Processing - Quick Reference

Article In Progress...

Before processing models in R, make sure that R is installed on the DataJet server (or on the desktop, if working on a local machine).

Before processing models in Python, make sure that Python is installed on the DataJet server.

Model Processing Overview

TODO: Video - Integrated Modelling in DataJet

Prepare Source Data

Source data for a model can be either of the following:

  • The output of any profile report
  • Any discrete field from the data catalog

Develop Model

  • Use Engineering | User Defined Fields | Discrete Processor | R to open the R Model Processor, or Engineering | User Defined Fields | Discrete Processor | python to open the Python Model Processor
  • Select a model template from the SYSTEM or USER library and load it into the code panel using <- use
  • Modify the model template as necessary, or create a new model from a blank template by editing the code in the processing panel
  • Test the model by dropping source data into the source panel, selecting Test, and verifying the model output in the output panel
  • Make the model available to analysis reports by using Push Model to Library

Apply Model

  • Open an existing Multi-Function Profile report from the Reports tab, or create a new Analysis | Multi-Function Profile report
  • Select the model from the Model drop-down list
  • Calculate the model
  • Save the report, and/or export the results to a table or file

Model Processor Console

Source Panel

Drag and drop DataJet objects into the panel to provide source data for the model. The following objects can be added to the source panel:

  • Reports:
    • Profile
    • Index Profile
    • Multi-Function Profile
  • Database fields – discrete fields only

Processing Panel

Load an existing model from the SYSTEM or USER library. To load a model:

  1. Select the Libraries tab
  2. Select a library: USER or SYSTEM
  3. Select a model
  4. Select <- use
  5. Edit or develop the model in the Code tab

TODO: Review {

Libraries

SYSTEM Library

The SYSTEM library is a standard library that contains various example models.

SYSTEM models are stored in GitHub and are automatically refreshed whenever the Model Processor starts up.

Only models for the selected programming language (i.e. Python or R) are displayed in the Libraries tab.

If the DataJet server is not connected to the internet, no SYSTEM models will be available.

For SYSTEM models to be available in the Multi-Function Profile report, they must first be “loaded” into the system:

  1. From the Libraries tab, select a SYSTEM model
  2. Select <- use
  3. From the Code tab, select Push To Library

 

USER Library

The USER library contains models that are specific to the current installation. These can include:

  • custom imported models
  • SYSTEM models that have been configured for the active installation or edited in some way
  • new models that have been developed using the R Model Processing report

USER models are stored in the DataJet MongoDB database on the DataJet server.

For USER models to be available in the Multi-Function Profile report, they must first be “loaded” into the system:

  1. From the Libraries tab, select a USER model
  2. Select <- use
  3. From the Code tab, select Push To Library

Output Panel

The output panel displays the model output, as generated by the Test option.

TODO: What does it mean if Test is disabled? How do you fix this?

Results Panel

R

Displays the content of the “grid” object in the model output:

# Attach the results grid to the data model, then write the model out as JSON
dataModel$grid = grid
finalModel = toJSON(dataModel)
write_file(finalModel, myargs[2])

Python

 TODO

 

Data Panel

R

Displays the content of the “associatedData” item in the model output:

# Attach the associated data to the data model, then write the model out as JSON
dataModel$associatedData = adata
finalModel = toJSON(dataModel)
write_file(finalModel, myargs[2])
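
As an illustration of how adata might be populated, the sketch below fits a k-means model with R's built-in kmeans function and stores two of the associatedData properties listed under Data Model Overview. The sample points and the mapping of the kmeans results onto cluster_centers and inertia are assumptions made for this example, not a documented DataJet contract.

```r
# Made-up two-column data set with two well-separated groups.
set.seed(42)
points = rbind(
  matrix(rnorm(20, mean = 0, sd = 0.3), ncol = 2),
  matrix(rnorm(20, mean = 5, sd = 0.3), ncol = 2)
)

# Fit k-means with two clusters (kmeans ships with R in the stats package).
fit = kmeans(points, centers = 2, nstart = 5)

# Assumed mapping onto the associatedData property names used in this article.
adata = list(
  cluster_centers = fit$centers,      # one row per cluster centre
  inertia = fit$tot.withinss          # total within-cluster sum of squares
)

dataModel = list()
dataModel$associatedData = adata
```

The remaining properties (DataRows, DataColumns, plottableCentroids) are left out because their expected content is not described here.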

Python

 TODO

 

 

Data Model Overview

  • Load Data from command line/file
  • Create JSON data model object
  • Access dataModel contents

TODO: Complete Grid

The dataModel objects are listed below. Each entry gives the object, its description where one is documented, and its properties as sub-items.

  • grid: Content of results grid (grid[[1]] = setup grid, grid[[2]] = results grid, grid[[3]] = ??? TODO)
    • name
    • tag
    • headers
    • data
  • suggestedChart
  • chart
    • objectType
    • name
    • categories
    • values
    • chartType
    • x
    • y
    • z
  • associatedData
    • DataRows
    • DataColumns
    • cluster_centers
    • plottableCentroids
    • inertia
  • headerInfo
    • name
    • type [label, value]
    • datatype
    • fieldType
    • axisOverride
    • graphAble
  • rows
  • hasTotalRow [TRUE, FALSE]
  • headers
  • hasTotalColumn
  • hasNullRow
  • hasNullColumn

TODO: Is there more than 1 grid?

  • Process data
  • Update dataModel contents
  • Write JSON data out to file
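
The six steps in this overview can be sketched in R as follows. This is a minimal sketch, assuming that toJSON and fromJSON come from the jsonlite package and that myargs[1] and myargs[2] hold the input and output file paths, as in the examples under Working with R. The sample input and the "processed" tag are made up so that the script also runs outside DataJet.

```r
library(jsonlite)  # assumption: toJSON/fromJSON come from jsonlite

myargs = commandArgs(trailingOnly = TRUE)

# Fall back to temporary files and a tiny made-up input when no paths are supplied.
if (length(myargs) < 2) {
  myargs = c(tempfile(fileext = ".json"), tempfile(fileext = ".json"))
  sample_input = list(grid = list(list(name = "example", headers = c("segment", "count"))))
  writeLines(toJSON(sample_input, auto_unbox = TRUE), myargs[1])
}

# 1. Load data from the input file
raw = paste(readLines(myargs[1], warn = FALSE), collapse = "\n")

# 2. Create the data model object from the JSON text
dataModel = fromJSON(raw, simplifyVector = FALSE)

# 3. Access dataModel contents
grid = dataModel$grid

# 4. Process data (placeholder: tag the first grid)
grid[[1]]$tag = "processed"

# 5. Update dataModel contents
dataModel$grid = grid

# 6. Write JSON data out to the output file
finalModel = toJSON(dataModel, auto_unbox = TRUE)
writeLines(finalModel, myargs[2])
```

Passing simplifyVector = FALSE keeps the parsed JSON as nested lists, so elements can be addressed with [[ ]] in the same way as grid[[1]] and grid[[2]] above.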

Working with R

Accessing dataModel Input

Data is read from the model interface (the JSON input from the source panel) using the file path passed in commandArgs (myargs[1]), or by reading directly from a file:

#Ex1.1 Using readLines and commandArgs to load data from the source panel:
myargs = commandArgs(trailingOnly=TRUE)
data = readLines(myargs[1])

#Ex1.2 Using read_file and commandArgs to load data from the source panel:
library(readr)  # provides read_file and default_locale
myargs = commandArgs(trailingOnly=TRUE)
data = read_file(myargs[1], locale = default_locale())

#Ex1.3 Reading data directly from a file:
data = read_file("d:/model_in_6379.json", locale = default_locale())

Additional input parameters can be accessed using myargs[3], myargs[4], and so on.
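
A minimal sketch of reading the optional parameters with fallback values, assuming (as the examples in this article suggest) that myargs[1] is the input file and myargs[2] is the output file. The helper get_arg and the parameter names n_clusters and label are hypothetical and exist only for illustration.

```r
# Hypothetical helper: return the i-th argument, or a default if it was not supplied.
get_arg = function(args, i, default = NULL) {
  if (length(args) >= i && !is.na(args[i]) && nzchar(args[i])) args[i] else default
}

myargs = commandArgs(trailingOnly = TRUE)

input_path  = get_arg(myargs, 1)   # JSON input from the source panel
output_path = get_arg(myargs, 2)   # file the model output is written to

# Hypothetical extra parameters; command-line arguments always arrive as text.
n_clusters = as.integer(get_arg(myargs, 3, default = "3"))
label      = get_arg(myargs, 4, default = "model")
```

Because commandArgs returns a character vector, numeric parameters need an explicit conversion such as as.integer or as.numeric before use.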

 

Model Processing in R

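R objects such as dataModels and data frames are used to apply modelling to the source data:

#Ex2.1 Using a for loop to process data
# seq_along(data) is safe when data is empty, unlike 1:length(data)
for (i in seq_along(data)) {
  data[i] = data[i]  # placeholder: replace with the processing for each element
}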
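
For tabular data, a data frame is often more convenient than an element-by-element loop. The sketch below converts a grid into a data frame, adds a derived column, and writes the rows back. The grid layout (a headers vector plus one list per row) and the sample values are assumptions for illustration; the actual layout of the DataJet grid may differ.

```r
# Hypothetical grid: column headers plus one list entry per row.
grid = list(
  headers = c("segment", "count"),
  data = list(
    list("A", 120),
    list("B", 60),
    list("C", 20)
  )
)

# Convert the rows into a data frame, one column per header.
df = do.call(rbind, lapply(grid$data, function(row) {
  as.data.frame(setNames(row, grid$headers), stringsAsFactors = FALSE)
}))

# Vectorised processing: add each segment's share of the total count.
df$share = df$count / sum(df$count)

# Write the processed rows back into the grid structure.
grid$headers = names(df)
grid$data = lapply(seq_len(nrow(df)), function(i) unname(as.list(df[i, ])))
```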

Model Output in R

Data is returned to the model interface/output panel by writing it to the file named in the commandArgs character vector (myargs[2]), or by writing directly to a file:

#Ex3.1 Outputting data using commandArgs and writeLines:
writeLines(data, myargs[2])

#Ex3.2 Outputting data using commandArgs and write_file:
finalModel = toJSON(dataModel)
write_file(finalModel, myargs[2])
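
If toJSON comes from the jsonlite package (an assumption; this article does not name the package), single values are written as one-element JSON arrays unless auto_unbox = TRUE is set. The sketch below shows both forms for a made-up data model. Which form the model interface expects is not stated here, so check the result in the output panel.

```r
library(jsonlite)  # assumption: the JSON package in use is jsonlite

dataModel = list(name = "example")  # made-up, minimal data model

boxed   = toJSON(dataModel)                     # {"name":["example"]}
unboxed = toJSON(dataModel, auto_unbox = TRUE)  # {"name":"example"}

# A temporary file stands in for myargs[2] so the sketch runs on its own.
output_path = tempfile(fileext = ".json")
writeLines(unboxed, output_path)
```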

 

 

 

