- 2 Minutes to read
- Print
- DarkLight
- PDF
CTOD
- 2 Minutes to read
- Print
- DarkLight
- PDF
String: CTOD
Attempts to create a new discrete field, usually by having applied a filter to a continuous string, integer or double.
Purpose
- Create a discrete field from a continuous field
- Create a copy of a continuous field for use as a dimension in a profile
- Create a copy of a field for use in a join
- Turn a C (continuous) into a D (discrete)
- Large number of records on the table (> 50 million)
- Large number of discrete values (>300,000)
Return Value
Property | Value |
---|---|
FieldType | String / Integer / Long(Integer) / Double |
FieldSize | Low / Med / High |
DataType | Discrete |
DataSize | Short / Integer / LongInteger / Double |
Example Return Value: Return values are unchanged from Input value. CTOD is not applicable for Dates or DateTimes.
Parameters
Parameter | JSON | Description |
---|---|---|
Table | “targetTable”: “MyTableName” | The target table on which the new field will be created |
Filter | “dataset”: {DataSet_JSON} | Optional. If a filter is applied, records not in the filter recordset will be returned as null. |
Function | "function":"" | String |
String Function | “p1”: “CTOD(A)” | CTOD(A) |
A | “p2”: | Required. The discrete field to copy as a continuous field. Supports:
|
MAX_ENTRIES | The threshold for crossing from discrete data type to continuous data type. |
JSON Sample
{
"method": "BuildBakedField",
"targetTable": "BikeTripData",
"overwrite": true,
"name": "Journey_Random2",
"function": "string",
"p1": "CTOD(A)",
"p2": "Journey_ID",
"MAX_ENTRIES": 1000000,
"project": "Model1"
}
Usage Notes
On very large tables, it may be necessary to use a filter in order to extract a discrete field from a continuous field. In the example below, a random sampe of size 800,000 has been applied to the BikeTripData table in order to create the Journey_Random field.
Another approach is to use the BuildBakedField method and extend the MAX_ENTRIES threshold. This approach should be used with care, as DataJet's threshold limits are designed to maximize performance across the application. Having a large number of very high cardinality indexed fields (i.e., string fields above 200,000 unique values) may result in a perceptible (i.e., greater than 1 second) response time for operations such as populating the Context Panel grid.
<<TODO: Analyse field function >>
See Also:
- QQ - How do I use the integrated model processor on very large tables?
- Getting Started - Data Types
- Expression Field
Example
Example | Details |
---|---|
Description | Convert a continuous string field to a discrete string |
Input |
|
Sample |