- 4 Minutes to read
- Print
- DarkLight
- PDF
Delimited File
- 4 Minutes to read
- Print
- DarkLight
- PDF
Filename
Use the file selector button to navigate to a file in the DataSources directory
This will bring up the Remote Data Source dialog which allows the user to navigate through the Data Sources structure:
(NOTE: To navigate to data on the local machine, press SHIFT when pressing the file selector button. Generally this is not advised as it could lead to very slow load times, depending on the client-server architecture, as well as creating issues when sharing scripts with other users.)
Filename can be an individual file, or the contents of a directory. To load the content of a directory, the method must be edited in the JSON window of Script Editor. For more information on this see CreateTableFromFile
When using the file selector, the filename will include the System Variable %DATAPATH%. For more information on System Variables and their use in script, see System Variables. If specifying data files that are not in the DataSources directory or sub-directory, be sure that the server has the necessary permissions to access and load data.
Headers
If selected, headers will be created from the first line of the file. If unselected, automatic field names will be created of the form Field[N] where [N] is the ordinal position of the column in file. Note that a definition file can be used instead to define header names - for more information see CreateTableFromFile | definitionFile.
Delimiter
The following delimiters are supported:
- Tab
- Comma
- Pipe
- Semi-Colon
- HSTART
Other characters can be specified as a delimiter by editing the method JSON directly. For more information on this see CreateTableFromFile
File Format
Supported File formats are
- ASCII8
- UTF16ASUTF8
Table Name
A table name must be provided. By default, the filename will be used. If the table already exists, the process will generate an error. See Field and Table Names for more information.
Field Definition
By default each field will be assigned a field type of SKIP - this means that the field will not be loaded.
Field-Data Types can be specified by highlighting a row and right-clicking in the Field Type column. The field-data type selector will be displayed. Supported field-data types are:
- SKIP (field will not be loaded)
- STRING DISCRETE
- STRING CONTINUOUS
- INTEGER DISCRETE
- INTEGER CONTINUOUS
- LONG DISCRETE
- LONG CONTINUOUS
- DATE
- DOUBLE DISCRETE
- DOUBLE CONTINUOUS
- DATETIME
All rows that are selected in the Field-Type grid will have the selected type assigned to them.
To auto-assign field-types, use Delimited File Importer | Tools | Auto-Guess Data Types. Depending on the size of the file, this can take some minutes to run. Note that if loading a single table from multiple files, Auto-Guess may assume a field is discrete based on the content of the file used to detect types, but when all files for the table are processed, the data-type may need to change. This could cause a load to fail - to assess data types, try loading a single file, and then run a data audit to assess the percentage of records that are discrete for each field.
Preview
Once a file has been selected, the Delimited File Importer will load a sample of the data into the preview grid.
NOTE: Very wide files may not have a preview displayed.
Menus
Tools | Alter Field Names
- To Lower Case - casts all field names to lower case
- To Upper Case - casts all field names to upper case
Tools | Set Pre-Header Skip
If header data is on line 4, enter 3 in "Set Pre-Header Skip"
Tools | Reset Data Types
Resets all Field-Data types to SKIP
Tools | Auto Guess Data Types
Analyses file to guess field-data types.
NOTE: This may take some minutes to run.
File | Upload
File Upload will open the Remote File Manager in the DataSources directory, allowing the user to upload a local file to the server:
Strip
Strips the provided character from all data as it is loaded
Date Format
Specifies the default format of dates in the file. Available formats are:
- YYYY-MM-DD
- MM-DD-YYYY
- DD-MM-YYYY
- YYYYMMDD
- MMDDYYYY
- DDMMYYYY
- M/D/Y
- D/M/Y
The default is YYYY-MM-DD. Formats can be over-ridden on a field by field basis by editing the Field-Data Type for the column.
Datetime Format
Specifies the default format of datetime fields in the file. Available formats are:
- YYYY-MM-DD HH:MM:SS
- DD-MM-YYYY HH:MM:SS
- YYYY-MM-DD HH:MM
- DD-MM-YYYY HH:MM
- DD-MM-YYYY HH:MM:SS
The default is YYYY-MM-DD HH:MM:SS. Formats can be over-ridden on a field by field basis by editing the Field-Data Type for the column.
See also QQ - Can I load DateTime data in format DD/MM/YYYY HH:SS?
Save
Clicking Save will create a new CreateTableFromFile entry in script:
{
"project": "Demonstration",
"method": "CreateTableFromFile",
"action": "LOAD",
"table": "01aJourneyDataExtract10Jan16-23J",
"filename": "%DATAPATH%BikeData/BikeTripData/*",
"definition": [
"Rental Id|CONTINUOUS|INTEGER|",
"Duration|DISCRETE|INTEGER|BYTE",
"Bike Id|CONTINUOUS|INTEGER|",
"End Date|CONTINUOUS|DATETIME|",
"EndStation Id|CONTINUOUS|INTEGER|",
"EndStation Name|DISCRETE|STRING|BYTE",
"Start Date|CONTINUOUS|DATETIME|",
"StartStation Id|CONTINUOUS|INTEGER|",
"StartStation Name|DISCRETE|STRING|BYTE"
],
"loading": [
"Rental Id",
"Duration",
"Bike Id",
"End Date",
"EndStation Id",
"EndStation Name",
"Start Date",
"StartStation Id",
"StartStation Name"
],
"sample": false,
"preHeaderSkip": 0,
"dateFormat": "YYYY-MM-DD",
"fileFormat": "ASCII8",
"skipFirstLine": true,
"delimiter": "COMMA",
"stripCharacter": "",
"dateTimeFormat": "YYYY-MM-DD HH:MM:SS"
}
To Edit the JSON using the editor, select the CreateTableFromFile method in the Script Panel, right-click and choose Edit.
To modify parameters, select the CreateTableFromFile method in the Script Panel, right-click in the JSON window and select the desired options.
Sample
NOTE: Before starting a load on an unknown data source, consider using the Sample option in script editor to load a subset of the data.