Delimited File
  • 4 Minutes to read
  • Dark
    Light
  • PDF

Delimited File

  • Dark
    Light
  • PDF

Article summary

Filename

Use the file selector button to navigate to a file in the DataSources directory


This will bring up the Remote Data Source dialog which allows the user to navigate through the Data Sources structure:

(NOTE: To navigate to data on the local machine, press SHIFT when pressing the file selector button.    Generally this is not advised as it could lead to very slow load times, depending on the client-server architecture, as well as creating issues when sharing scripts with other users.)

Filename can be an individual file, or the contents of a directory. To load the content of a directory, the method must be edited in the JSON window of Script Editor.   For more information on this see CreateTableFromFile

When using the file selector, the filename will include the System Variable %DATAPATH%.  For more information on System Variables and their use in script, see System Variables.  If specifying data files that are not in the DataSources directory or sub-directory, be sure that the server has the necessary permissions to access and load data.

Headers

If selected, headers will be created from the first line of the file.   If unselected, automatic field names will be created of the form Field[N] where [N] is the ordinal position of the column in file.  Note that a definition file can be used instead to define header names - for more information see CreateTableFromFile | definitionFile

Delimiter

The following delimiters are supported:

  • Tab
  • Comma
  • Pipe
  • Semi-Colon
  • HSTART

Other characters can be specified as a delimiter by editing the method JSON directly. For more information on this see CreateTableFromFile

File Format

Supported File formats are

  • ASCII8
  • UTF16ASUTF8

Table Name

A table name must be provided.   By default, the filename will be used.   If the table already exists, the process will generate an error.  See Field and Table Names for more information.

Field Definition

By default each field will be assigned a field type of SKIP - this means that the field will not be loaded. 

Field-Data Types can be specified by highlighting a row and right-clicking in the Field Type column.  The field-data type selector will be displayed.  Supported field-data types are:

  • SKIP (field will not be loaded)
  • STRING DISCRETE
  • STRING CONTINUOUS
  • INTEGER DISCRETE
  • INTEGER CONTINUOUS
  • LONG DISCRETE
  • LONG CONTINUOUS
  • DATE
  • DOUBLE DISCRETE
  • DOUBLE CONTINUOUS
  • DATETIME

All rows that are selected in the Field-Type grid will have the selected type assigned to them.

To auto-assign field-types, use Delimited File Importer | Tools | Auto-Guess Data Types.     Depending on the size of the file, this can take some minutes to run.   Note that if loading a single table from multiple files, Auto-Guess may assume a field is discrete based on the content of the file used to detect types, but when all files for the table are processed, the data-type may need to change.   This could cause a load to fail - to assess data types, try loading a single file, and then run a data audit to assess the percentage of records that are discrete for each field.    

Preview

Once a file has been selected, the Delimited File Importer will load a sample of the data into the preview grid.   

NOTE: Very wide files may not have a preview displayed. 


Tools | Alter Field Names

  • To Lower Case - casts all field names to lower case
  • To Upper Case - casts all field names to upper case

Tools | Set Pre-Header Skip

If header data is on line 4, enter 3 in "Set Pre-Header Skip"

Tools | Reset Data Types

Resets all Field-Data types to SKIP

Tools | Auto Guess Data Types

Analyses file to guess field-data types.

NOTE: This may take some minutes to run.

File | Upload

File Upload will open the Remote File Manager in the DataSources directory, allowing the user to upload a local file to the server:

Strip

Strips the provided character from all data as it is loaded

Date Format

Specifies the default format of dates in the file.  Available formats are:

  • YYYY-MM-DD
  • MM-DD-YYYY
  • DD-MM-YYYY
  • YYYYMMDD
  • MMDDYYYY
  • DDMMYYYY
  • M/D/Y
  • D/M/Y

The default is YYYY-MM-DD.  Formats can be over-ridden on a field by field basis by editing the Field-Data Type for the column.

Datetime Format

Specifies the default format of datetime fields in the file.  Available formats are:

  • YYYY-MM-DD HH:MM:SS
  • DD-MM-YYYY HH:MM:SS
  • YYYY-MM-DD HH:MM
  • DD-MM-YYYY HH:MM
  • DD-MM-YYYY HH:MM:SS 

The default is YYYY-MM-DD HH:MM:SS.  Formats can be over-ridden on a field by field basis by editing the Field-Data Type for the column.

See also QQ - Can I load DateTime data in format DD/MM/YYYY HH:SS?

Save

Clicking Save will create a new CreateTableFromFile entry in script:

{
  "project": "Demonstration",
  "method": "CreateTableFromFile",
  "action": "LOAD",
  "table": "01aJourneyDataExtract10Jan16-23J",
  "filename": "%DATAPATH%BikeData/BikeTripData/*",
  "definition": [
    "Rental Id|CONTINUOUS|INTEGER|",
    "Duration|DISCRETE|INTEGER|BYTE",
    "Bike Id|CONTINUOUS|INTEGER|",
    "End Date|CONTINUOUS|DATETIME|",
    "EndStation Id|CONTINUOUS|INTEGER|",
    "EndStation Name|DISCRETE|STRING|BYTE",
    "Start Date|CONTINUOUS|DATETIME|",
    "StartStation Id|CONTINUOUS|INTEGER|",
    "StartStation Name|DISCRETE|STRING|BYTE"
  ],
  "loading": [
    "Rental Id",
    "Duration",
    "Bike Id",
    "End Date",
    "EndStation Id",
    "EndStation Name",
    "Start Date",
    "StartStation Id",
    "StartStation Name"
  ],
  "sample": false,
  "preHeaderSkip": 0,
  "dateFormat": "YYYY-MM-DD",
  "fileFormat": "ASCII8",
  "skipFirstLine": true,
  "delimiter": "COMMA",
  "stripCharacter": "",
  "dateTimeFormat": "YYYY-MM-DD HH:MM:SS"
}

To Edit the JSON using the editor, select the CreateTableFromFile method in the Script Panel, right-click and choose Edit.

To modify parameters, select the CreateTableFromFile method in the Script Panel, right-click in the JSON window and select the desired options.

Sample

NOTE: Before starting a load on an unknown data source, consider using the Sample option in script editor to load a subset of the data. 










Was this article helpful?

What's Next