Release Note: XXX


Article In Progress...

Key Features

This release focuses on improving analytical workflows for audience creation. Analytical reports and algorithms have been extended to provide additional processing capability and to allow data to flow from one process to another under script control.

  • Workflows: Sample workflow scripts and documentation for Audience Creation, Target Selection, Statistical Analysis, etc.
  • Enhanced Segment Correlation (AnalyseSegments): Now includes:
    • Indexing columns to show over- and under-representation
    • Direct extraction of underlying records from the generated output report via Data Table Viewer
    • Multiple output formats: Parquet, Excel, table, data table
    • Integration with Multi-Field Statistics report
    • Custom processing of results via Secondary Processing, allowing thresholds and other filters to be applied (i.e., integration with the Python Model Processor)
    • Extensible charting via integration with Python matplotlib
  • Index Profile: Now includes:
    • Custom processing of results via Secondary Processing, allowing thresholds and other filters to be applied (i.e., integration with the Python Model Processor)
    • Target Dataset Iteration - repeat the calculation for a set of target datasets and output results in a single grid, or as a dashboard
    • Dimension Iteration - repeat the calculation for a set of dimensions (i.e., columns) and output results to a dashboard, or extract segments that meet specified criteria into a dataset collection
    • Caching - Integration with Data Table Viewer
    • Field Template can be used to provide Multiple Dimensions
  • Overlap Report: New analysis report. Allows the calculation of intersections across multiple datasets. Supports:
    • Up to 19 datasets, base filter, target filter, record extraction, result filters, resolution tables, measures, export to table, export to file, copy and paste, hit counts
    • Visualisation - UpSet plot of segment inclusion/exclusion and hit-count pie chart also included
    • Caching - Integration with Data Table Viewer
    • Target Dataset Iteration - repeat the calculation for a set of target datasets and output results in a single grid 
    • Integration with Venn Report?
  • Data Table Viewer: Provides the ability to view pre-calculated data tables as grids.  Multiple models - AnalyseSegments, Overlap, Index Profile - can be calculated during data refresh/load, ready for exploration and review by analysts.  Export Data to Table, Excel and Parquet.  
    • With Data Extraction:  AnalyseSegments
    • Without Data Extraction: CreateSegmentModel, GenProfile, IProfile, Overlap, QueryMatrix, Venn
  • Segment Dataset Type: Enables segments to be created directly from the Campaign report, seamlessly integrating them into all analysis and reporting workflows.  Removes the need to pivot data from the campaign system to the Primary Contact Table.
  • Venn: Improved labelling, overlap analysis grid, extraction of data from the grid, integration with overlap report
  • Automated Data Generation: multiple methods of automating and modifying repeated output from analytical processes 
    • foreach: Automates calculation of analytical processes and reports by iterating through input lists of fields and datasets. Creates groups of objects (fields, templates and dataset collections) for use in downstream processing. Available initially for:
      • Index Profile - Iterate by Target Dataset or Dimension
      • Overlap Analysis - Iterate by Target Dataset
      • Aggregate - Iterate by Source Field
    • BuildReport: Builds a multi-report dashboard by iterating through a supplied field list.  Initially for:
      • Index Profile
    • secondaryProcess: Runs custom processing on output from analytical processes.  Initially for:
      • Index Profile
      • Profile
      • AnalyseSegments
      • Overlap
      • Query Matrix
    • CreateCollectionFromProcess: Takes an Analytical Object and creates datasets from output rows. Includes support for secondary processing. NOTE: Only extracts data from single-dimension axes. Data is taken from first column. Generic support for all processes that create a data table model.
  • Administrator: Direct access to Data Tables and context-sensitive menus from Runner. Added DataTableModel to packages. Packages available in Remote File Manager. "help" and "state" commands added to the command line.
  • Pre-Load Processing: Improved handling of poor-quality data during load or as part of pre-load processing: delimiter check during load, filename prefix, DelimiterCheck, MultiDelimiterCheck, pre-load processing scripts, load from Parquet
  • Extended Python Integration: Analysis reports now support export to Parquet and JSON, and have extensible plotting features. Model Processor now accepts multiple report types.
    • Model Processor Data Inputs: discrete fields, multi-function profiles, profiles.  Now also accepts: index profiles and query matrices 
    • Export to Parquet: AnalyseSegments, ExecuteReport, GenProfile, IProfile, Overlap
  • General UI:  CreateTemplateFromFile, Display of Dataset Collections and Field Templates in Context Panel, TranslateDatasetCollection, TranslateTemplate, Import Collections from Packages, SaveCollectionAsField, Open in Discovery
  • General Bug Fixes: 
  • Module Updates: Mongo P1 Security patch
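
As an illustration of what the Overlap Report listed above calculates, the following is a minimal sketch in plain Python, not the product's implementation. It counts records by the exact combination of datasets they belong to, which is the kind of breakdown an UpSet plot displays. The dataset names, the record IDs and the `overlap_counts` helper are all hypothetical.

```python
from collections import Counter


def overlap_counts(datasets):
    """Count records by the exact combination of datasets containing them.

    `datasets` maps a (hypothetical) dataset name to a set of record IDs.
    """
    membership = {}
    for name, records in datasets.items():
        for record_id in records:
            membership.setdefault(record_id, set()).add(name)
    # Each record contributes to exactly one combination (exclusive overlap).
    return Counter(frozenset(names) for names in membership.values())


# Illustrative data only.
datasets = {
    "email": {1, 2, 3, 4},
    "web": {3, 4, 5},
    "store": {4, 5, 6},
}

counts = overlap_counts(datasets)
for combo, n in sorted(counts.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(" & ".join(sorted(combo)), n)
```

Because each record is assigned to a single combination, the counts sum to the number of distinct records across all datasets.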

Install Requirements

The following modules need to be installed in order to support all features of this release:


Documentation TODO:

DONE:

  • secondaryProcess
  • ModelSnippetAdd
  • Model Processor - rewrite to fit User Defined Field format
  • Model Processor - where are models stored?  Git Hub


BLOCKED:

  • Index Profile - Create Collection From Collection
  • CreateCollectionFromProcess - can this work with dropped dataset
  • BuildReport - can this work with dropped dataset
  • Shortcuts - Waiting for build
  • TranslateTemplate - waiting for build
  • TranslateCollection - waiting for build
  • AnalyseSegments 
    • plotter: chart TODO - remove references to this - create a plotter JSON object
    • implement TopbyBottomBy as a secondary process and document.

TODO:

  • Identify high risk changes and add flag to the documentation
  • Target Dataset Selection Workflow
  • dataset JSON object
  • plotter JSON object
  • Update Datasets and Collections
  • Iterating Calculations
  • Data Table Viewer
  • Intrinsic Functions - update to include segment[]
  • Runner 
  • Database build and preparation - add DelimiterCheck, MultiDelimiterCheck, Pre-load processing scripts, load from parquet
  • Workflows 
    • Audience Creation
    • Data Preparation
    • Basic Segment Analysis
    • Noise/Signal Detection
    • Statistical Analysis
  • Deep Dive - Data Table Format
  • Parquet export - add to: Profile, Overlap, IProfile, AnalyseSegments, ExecuteReport??
  • Exporting Data
  • Segment Correlation (AnalyseSegments) - finish plotter and secondaryProcess.   re-locate
  • Multi-Field Statistics
  • Scripting
    • Overlap
    • Delimiter Check
    • MultiDelimiterCheck
  • Command Line - help, state
  • CopyUpFromTemplate - article incomplete
  • CopyDownFromTemplate - article incomplete
  • Summary of API changes



New Feature Summary

Enhanced Segment Correlation



New Features

  • DM-639 P7 AMP-220 Extend ranking to support the use of continuous fields as the ranking field (REMAP function)
  • Segment Dataset Type - Datasets can now be created directly from campaign segments and used interchangeably with other datasets in queries, reports etc.  Removes the need to create pivot tables before accessing segment datasets.
  • Project Variables
  • Data Model Viewer - allows existing Data Model tables to be viewed, and underlying records extracted and saved
    • The following methods generate models that are supported by the Data Model viewer:
      • AnalyseSegments
      • CreateSegmentModel
      • {TODO}
  • CreateSegmentModel - Optimised version of AnalyseSegments that creates a table but no spreadsheet.  Model is automatically added to Model Storage Hub.  
  • AnalyseSegments, CreateSegmentModel
    • Outputs a Data Model table that allows underlying records to be selected from a cell
    • Index calculations allow for rapid identification of segments that are under or over indexed for specified target datasets
    • Manage Data Sets wizard for editing of target Datasets now available in Runner
  • Script Editor | Previous - opens previously executed scripts and their target projects
  • Overlap report
  • Models and Templates can now be imported when injecting data via a package
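
The index calculations mentioned above can be illustrated with a short sketch. This release note does not state the formula the product uses, so the sketch assumes the conventional definition: a segment's share of the target dataset divided by its share of the base, multiplied by 100. The segment names, counts, the `index_scores` helper and the threshold of 120 are hypothetical.

```python
def index_scores(base_counts, target_counts):
    """Return an index per segment: 100 * (target share / base share).

    Assumed convention: 100 means the segment is represented in the target
    at the same rate as in the base; above 100 is over-indexed, below 100
    is under-indexed.
    """
    base_total = sum(base_counts.values())
    target_total = sum(target_counts.values())
    scores = {}
    for segment, base_n in base_counts.items():
        if base_n == 0 or target_total == 0:
            scores[segment] = None  # index is undefined for this segment
            continue
        base_share = base_n / base_total
        target_share = target_counts.get(segment, 0) / target_total
        scores[segment] = round(100 * target_share / base_share, 1)
    return scores


# Illustrative counts only.
base = {"18-34": 300, "35-54": 500, "55+": 200}
target = {"18-34": 60, "35-54": 30, "55+": 10}

scores = index_scores(base, target)

# A threshold filter of the kind a secondary process might apply.
over_indexed = [s for s, v in scores.items() if v is not None and v >= 120]
print(scores, over_indexed)
```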


Improvements

  • DM-777 Post-profile processing of date columns can now handle year-month columns
  • DM-778 INJECTION 'debugIgnoreInjectionErrors' added to InjectPackage to help debug a project when the repo is not available
  • Data Model Table display added to Runner output - allows Runner output to be viewed in grid form
  • Templates can be added as default views to datasets
  • Scripts support %CURRENT_VERSION% - allows DataJet server version to be used as needed in object descriptions and script development
  • DATEFROM supports constants (previously only supported fields)
  • Expression "Replace" allows an empty string for "replacement_string"

Administration Tools

  • Command Processor - Admin | Command Interface - modifications:
    • Re-initialise campaign manager with "Reset campaign manager" command - {TODO: Why might you need this?}
    • "States" command added to command window - allows different service states to be verified (e.g., Mongo, Campaign, Script Hub, Package Hub)
    • Upgraded to support the default grid control
    • Support for "help" or "?"
  • Packages have their own folder in Remote File Manager - supports easy retrieval of package files for debug purposes
  • C


New API Calls

  • CreateSegmentModel
  • DataTableQueryModel
  • ListDataTableModels
  • DropDataTableModel
  • GetDataTableModel
  • ExportDataTableModelIntoTable
  • SetDefaultTemplate
  • GetDefaultTemplate
  • Overlap


Details

DM-639 P7 AMP-220 Extend support for ranking to support continuous fields as ranking field.

Ranking operates on discrete fields. To rank by a continuous field, first remap the field (REMAP function) to reduce the number of possible values, and then use the remapped field as the ranking field.
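
The remap-then-rank approach can be sketched in plain Python. This is not the REMAP function itself, whose syntax is not shown in this note. It only illustrates the idea of mapping a continuous value to a small set of bands and ranking on the band. The band boundaries, labels, sample values and the `remap` helper are hypothetical, and the ranking shown is a simple dense rank chosen for illustration.

```python
from bisect import bisect_right


def remap(value, boundaries, labels):
    """Map a continuous value to a discrete band label.

    `boundaries` must be sorted; a value equal to a boundary falls into
    the higher band.
    """
    return labels[bisect_right(boundaries, value)]


# Hypothetical bands for a continuous "spend" field.
boundaries = [100.0, 500.0]
labels = ["low", "medium", "high"]

spend = [12.5, 250.0, 99.99, 730.0, 500.0, 45.0]
banded = [remap(v, boundaries, labels) for v in spend]

# Dense rank on the remapped (discrete) field, highest band first.
band_order = {label: i for i, label in enumerate(labels)}
distinct = sorted(set(banded), key=band_order.get, reverse=True)
rank_of = {label: r for r, label in enumerate(distinct, start=1)}
ranks = [rank_of[b] for b in banded]

print(banded)
print(ranks)
```

Six distinct spend values collapse to three bands, so the ranking field has only three possible values.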