DataJet provides statistical functions for preparing and generating data, including normalization and standardization. These functions are available under Engineering | Functions | Statistical.
Data Preparation
Function | Description | Formula |
---|---|---|
Standardize | Rescales data to a mean of 0 and a standard deviation of 1 | (x - avg) / stdev |
Normalize | Rescales data to a min of 0 and a max of 1 | (x - min) / range, where range = max - min |
Normalize@Zero | Rescales data to a min of -1 and a max of 1 | ((2.0 * (x - min)) / range) - 1, where range = max - min |
MeanCentric | Centres data around the mean | x - avg |
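The formulas in the table can be sketched in plain Python to show what each function does to a column of values. This is only an illustration of the arithmetic, not DataJet's implementation: the function names are hypothetical, and the use of the population standard deviation is an assumption, since the table does not say which variant DataJet uses.

```python
from statistics import mean, pstdev


def standardize(xs):
    """(x - avg) / stdev. Assumption: population standard deviation."""
    avg = mean(xs)
    sd = pstdev(xs)
    return [(x - avg) / sd for x in xs]


def normalize(xs):
    """(x - min) / range, with range = max - min. Result lies in [0, 1]."""
    lo = min(xs)
    rng = max(xs) - lo
    return [(x - lo) / rng for x in xs]


def normalize_at_zero(xs):
    """((2.0 * (x - min)) / range) - 1. Result lies in [-1, 1]."""
    lo = min(xs)
    rng = max(xs) - lo
    return [(2.0 * (x - lo)) / rng - 1 for x in xs]


def mean_centric(xs):
    """x - avg. Shifts the data so that its mean is 0."""
    avg = mean(xs)
    return [x - avg for x in xs]


values = [2.0, 4.0, 6.0, 8.0]
print(normalize(values))
print(normalize_at_zero(values))
print(mean_centric(values))
print(standardize(values))
```

All of the divisions fail with a division by zero when every value in the column is identical (range and stdev are both 0). The sketch does not guard against this, and how DataJet handles that case is not stated here.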
Access this functionality by:
- Selecting a source field in the Database Tree
- Using the Engineering | Function | Designer
- Using the Engineering | Functions | Statistical menu
- Creating a BuildBakedField method in the script editor
Database Tree
Function Designer and Engineering Functions
Script Editor
{
  "method": "BuildBakedField",
  "project": "D5",
  "targetTable": "UKDeaths",
  "overwrite": true,
  "name": "stand_Totaldeaths",
  "function": "statistical",
  "p1": "STANDARDIZE(A)",
  "p2": "Totaldeaths"
}
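When the same method is needed for several fields, the JSON can be generated rather than typed by hand. The following is a minimal Python sketch under stated assumptions: the helper `build_standardize_method` is hypothetical, it reuses only the keys and values that appear in the example above, and the `stand_` name prefix simply copies the example's naming.

```python
import json


def build_standardize_method(project, table, field):
    """Return a BuildBakedField method dict mirroring the example above.

    Hypothetical helper: only the keys shown in the documented example
    are used, and only the STANDARDIZE(A) expression is emitted.
    """
    return {
        "method": "BuildBakedField",
        "project": project,
        "targetTable": table,
        "overwrite": True,
        "name": f"stand_{field}",  # naming copied from the example
        "function": "statistical",
        "p1": "STANDARDIZE(A)",
        "p2": field,
    }


method = build_standardize_method("D5", "UKDeaths", "Totaldeaths")
print(json.dumps(method, indent=2))
```

The printed text can be pasted into the script editor. The sketch does not send anything to DataJet.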
Data Generation
DataJet also provides the ability to generate data according to various distributions:
Distribution | Description | Reference |
---|---|---|
Normal | Produces real values from a standard normal (Gaussian) distribution | Normal distribution - Wikipedia |
LogNormal | Produces real values from a log-normal distribution | Log-normal distribution - Wikipedia |
ChiSquared | Produces real values from a chi-squared distribution | Chi-squared distribution - Wikipedia |
Cauchy | Produces real values from a Cauchy distribution | Cauchy distribution - Wikipedia |
Poisson | Produces integer values from a Poisson distribution | Poisson distribution - Wikipedia |
Binomial | Produces integer values from a binomial distribution | Binomial distribution - Wikipedia |
Bernoulli | Produces Boolean values from a Bernoulli distribution | Bernoulli distribution - Wikipedia |
UniformI | Produces integer values evenly distributed across a range | |
UniformR | Produces real values evenly distributed across a range | |
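To make the value types in the table concrete, the sketch below draws one sample from each distribution using only Python's standard library. It illustrates the distributions themselves and is not DataJet's generator: all function names, parameter names, and defaults are assumptions, and whether DataJet's integer range is inclusive at both ends is not stated in the table.

```python
import math
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable


def normal(mu=0.0, sigma=1.0):
    return rng.gauss(mu, sigma)


def lognormal(mu=0.0, sigma=1.0):
    return rng.lognormvariate(mu, sigma)


def chi_squared(k):
    # Sum of k squared standard normal draws.
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


def cauchy(location=0.0, scale=1.0):
    # Inverse-CDF method applied to a uniform draw.
    return location + scale * math.tan(math.pi * (rng.random() - 0.5))


def bernoulli(p):
    # True with probability p.
    return rng.random() < p


def binomial(n, p):
    # Number of successes in n independent Bernoulli trials.
    return sum(bernoulli(p) for _ in range(n))


def poisson(lam):
    # Knuth's multiplication method; adequate for small lam.
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def uniform_i(lo, hi):
    # Integer in [lo, hi], both ends inclusive (an assumption of this sketch).
    return rng.randint(lo, hi)


def uniform_r(lo, hi):
    return rng.uniform(lo, hi)


for name, draw in [
    ("Normal", normal()), ("LogNormal", lognormal()),
    ("ChiSquared", chi_squared(3)), ("Cauchy", cauchy()),
    ("Poisson", poisson(3.0)), ("Binomial", binomial(10, 0.3)),
    ("Bernoulli", bernoulli(0.5)), ("UniformI", uniform_i(1, 6)),
    ("UniformR", uniform_r(0.0, 1.0)),
]:
    print(f"{name}: {draw!r}")
```

The real-valued generators return floats, Poisson, Binomial and UniformI return integers, and Bernoulli returns a Boolean, matching the value types listed in the table.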