QSAS: The QM Science Analysis System

DCM (Data Configuration Map) File Specification

Introduction

This is the specification of the format of DCM files for Qdos databases.

File Specification

The file is line-oriented text. The basic format is CSV (comma separated variable), as used by e.g. Microsoft Excel. In this format a field may be surrounded by double quotes, e.g. "field one".
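
As an illustration of the format, the following minimal sketch splits DCM text into string fields. It assumes that the default (Excel) dialect of Python's csv module matches the quoting used in DCM files; the function name read_dcm_fields is hypothetical and not part of the specification.

```python
import csv
import io


def read_dcm_fields(text):
    """Split DCM text into one list of string fields per non-empty line."""
    reader = csv.reader(io.StringIO(text))
    # csv.reader yields an empty list for a blank line; skip those.
    return [row for row in reader if row]


sample = 'dcm_info,"Merge of my data, plus Fred\'s 24-nov-2001"\ncomment,anything,here\n'
rows = read_dcm_fields(sample)
# The quoted field keeps its embedded comma as part of a single field.
```

Quoting matters mainly for free-text fields, where an unquoted comma would otherwise start a new field.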

Line Components

All fields within a line are treated as strings.

There are two cases where string substitution can take place:

Aliases
If an alias called "alias_name" has been declared using an "alias" line, then the string "$(alias_name)" is replaced by the value of the alias. The substitution occurs when the file is read. If "alias_name" is not defined, a null string value should be used and, possibly, a warning reported. Aliases should have "file scope" when multiple configuration files are merged together. For example, if File1 merges File2, then the aliases of File2 take precedence over those of File1, but the aliases of File1 are still available while File2 is being read. (The behaviour should be like the scope of variables in a C program.)
Environment Variable Aliases
If there is an environment variable called "env_name", then the string "$[env_name]" is replaced by the value of the environment variable. The substitution occurs when the file is read. If "env_name" is not defined, a null string value should be used; no warning or error should be reported.
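
The two substitution rules can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the function name substitute and the warn_log list are hypothetical, substituted values are not rescanned for further tokens (the specification does not say whether they should be), and file scope is modelled with collections.ChainMap, which is one possible choice among several.

```python
import re
from collections import ChainMap

# Matches either $(alias_name) or $[env_name].
TOKEN_RE = re.compile(r"\$\(([^)]*)\)|\$\[([^\]]*)\]")


def substitute(line, aliases, environ, warn_log=None):
    """Replace $(name) from aliases and $[name] from environ in one pass."""

    def replace(match):
        alias_name, env_name = match.group(1), match.group(2)
        if alias_name is not None:
            if alias_name not in aliases and warn_log is not None:
                # Undefined alias: null string, optionally with a warning.
                warn_log.append("undefined alias: " + alias_name)
            return aliases.get(alias_name, "")
        # Undefined environment variable: null string, no warning.
        return environ.get(env_name, "")

    return TOKEN_RE.sub(replace, line)


# File scope (assumption: modelled as nested mappings).
file1_aliases = ChainMap({"XX": "hello", "YY": "keep"})
file2_aliases = file1_aliases.new_child()  # scope of the merged file
file2_aliases["XX"] = "farewell"           # shadows File1's XX inside File2 only
```

With this layout, an alias set while reading File2 disappears once File2's scope is dropped, while File1's aliases remain visible throughout.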

Data Priority

Each data entry in the DataMap has a "priority value", a numerical value used to decide between entries when those entries overlap (at least in some sense, which may depend on the type of the entry). In general, the entry with the highest priority value takes precedence over other entries (everything else being equal).

For time structured data the precedence rules are as follows:

  1. At any time, the entry with the highest priority value takes precedence.
  2. If two data entries have equal priority, then the entry with the earlier start time has precedence.
  3. If two data entries have equal priority, and equal start times or both contain the requested start time, then the entry taking precedence is not defined. (This is because entries will probably be sorted on start time, and the order of entries as read in may not be preserved.)
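
The precedence rules above can be sketched as a sort key. The Entry class and the function names are hypothetical, and times are plain numbers for brevity. Where rule 3 leaves the outcome undefined, this sketch simply returns whichever entry the sort places first, which is an arbitrary choice and not one mandated by the specification.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    priority: float
    start: float
    end: float
    name: str


def precedence_order(entries):
    """Order entries by highest priority first, then by earlier start time."""
    return sorted(entries, key=lambda e: (-e.priority, e.start))


def select_entry(entries, t):
    """Return the preferred entry whose (inclusive) interval contains t, or None."""
    for entry in precedence_order(entries):
        if entry.start <= t <= entry.end:
            return entry
    return None


entries = [Entry(0, 0, 10, "v1"), Entry(10, 5, 15, "v2")]
chosen = select_entry(entries, 6)  # both cover t=6; v2 has the higher priority
```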

Line Types

A Data Map Configuration file contains different types of line, which are indicated by the value in the first field. Lines without a recognised first field should be silently ignored. The following line types are defined:

comment, alias, unalias, merge, data, dcm_info, data_info
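
A reader might classify lines along the following lines. This is only a sketch: the name line_type is hypothetical, and exact, case-sensitive matching of the first field is an assumption, since the specification does not say whether case or surrounding white space is significant.

```python
RECOGNISED_LINE_TYPES = frozenset(
    {"comment", "alias", "unalias", "merge", "data", "dcm_info", "data_info"}
)


def line_type(fields):
    """Return the line type, or None if the line should be silently ignored."""
    if fields and fields[0] in RECOGNISED_LINE_TYPES:
        return fields[0]
    return None
```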

Comment line

  field_0 = "comment"
The line (any number of fields) is ignored.

DCM Information Line

  field_0 = "dcm_info"
  field_1 = information ...

  example: dcm_info,Merge of my data plus Fred's 24-nov-2001
Specifies information about the DCM itself that applications can use to inform a user about the DCM. There can be any number of dcm_info lines, and these are all accumulated as a sequence of strings. Care must be taken if there is a comma in the information; use of quotes around the field is recommended in this case.


Data Information Line

  field_0 = "data_info"
  field_1 = dcm_data_base_type
  field_2 = data_name
  field_3 = information ...

  example: data_info,cdf:ts,/ampte/mag_field,Magnetic field in nT
Specifies information about a particular data source name (in the context of a particular data base type) that applications can use to inform a user about that data source. There can be any number of data_info lines, and these are all accumulated as a sequence of strings. The first such line should be self-contained and concise, since it may be used to give a one-line summary of the data source. Care must be taken if there is a comma in the information; use of quotes around the field is recommended in this case.
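
One way to accumulate dcm_info and data_info lines is sketched below. The class InfoStore and its methods are hypothetical. Rejoining any extra fields with commas (to recover text from an unquoted comma) is an assumption about what "information ..." means; the specification itself only recommends quoting.

```python
from collections import defaultdict


class InfoStore:
    """Accumulates dcm_info and data_info strings in the order they are read."""

    def __init__(self):
        self.dcm_info = []
        self.data_info = defaultdict(list)  # keyed by (data base type, data name)

    def add(self, fields):
        if fields[0] == "dcm_info" and len(fields) >= 2:
            self.dcm_info.append(",".join(fields[1:]))
        elif fields[0] == "data_info" and len(fields) >= 4:
            self.data_info[(fields[1], fields[2])].append(",".join(fields[3:]))

    def summary(self, db_type, data_name):
        """Return the first data_info string for a source, or None."""
        lines = self.data_info.get((db_type, data_name))
        return lines[0] if lines else None


store = InfoStore()
store.add(["dcm_info", "Merge of my data plus Fred's 24-nov-2001"])
store.add(["data_info", "cdf:ts", "/ampte/mag_field", "Magnetic field in nT"])
store.add(["data_info", "cdf:ts", "/ampte/mag_field", "GSE coordinates", " 5 s resolution"])
```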

Alias Line

  field_0 = "alias"
  field_1 = alias_name
  field_2 = alias_string
  field_3 (optional) = alias_string
  ... etc ...


  example: alias,AMPTE_PP_ELX_84,/mission/ampte/1984/cdf/elx
Specifies an alias string for the given alias name, such that whenever the string "$(alias_name)" is encountered it is substituted by its alias string.

Alias substitution occurs IMMEDIATELY after a line is read from the configuration file. So an alias can change its value in the course of the file being read. For example:

    alias,XX,hello
    data,$(XX)/world      ( equivalent to:  data,hello/world )
    alias,XX,farewell
    data,$(XX)/world      ( equivalent to:  data,farewell/world )
In the alternative form, where there are optional alias strings, the alias value is set to the first non-null string. The effect is a conditional alias statement. Thus: alias,XX,,,fred is equivalent to alias,XX,fred

The intended use of this form is where one of the alias strings is another alias or environment alias. For example: alias,QDOS_MAIN_DIR,$[QDOS_DIR],/qdos/main has the effect of setting the alias QDOS_MAIN_DIR from the environment variable QDOS_DIR if it exists, but to "/qdos/main" otherwise.
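
The conditional form can be sketched as below. The function name apply_alias_line is hypothetical, the fields are assumed to have already had alias and environment substitution applied, and setting the alias to a null string when every candidate is null is an assumption (the specification does not cover that case).

```python
def apply_alias_line(fields, aliases):
    """Set aliases[alias_name] to the first non-null alias string on the line."""
    name = fields[1]
    # Assumption: if every candidate is null, the alias becomes a null string.
    aliases[name] = next((s for s in fields[2:] if s != ""), "")


aliases = {}
apply_alias_line(["alias", "XX", "", "", "fred"], aliases)
# With QDOS_DIR unset, "$[QDOS_DIR]" has already been replaced by "":
apply_alias_line(["alias", "QDOS_MAIN_DIR", "", "/qdos/main"], aliases)
```

Because substitution happens as each line is read, a later alias line for the same name simply overwrites the earlier value.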

Unalias line

  field_0 = "unalias"
  field_1 = alias_name
Removes alias_name from list of aliases.
** There could be an "unalias_all" function, but it is not clear
** how useful this would be, and how it would interact with the
** file based scope of aliases.

Merge line

  field_0 = "merge"
  field_1 = relative_priority_value
  field_2 = merge_dmc_file_name
Merges another DCM file, named "merge_dmc_file_name", into the current Data Map. Data entries in the merge file are added to the current list; each data entry has its priority changed by the numerical relative_priority_value, which can be any value (including negative values). The string representation should satisfy the rules for the C library stdio scanf "%lg" representation (i.e. double).
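
A minimal sketch of the priority adjustment follows. It assumes that "changed by" means the relative value is added to each merged entry's priority, and it uses Python's float() as a stand-in for scanf "%lg" (the two accept similar but not identical inputs). Entries are represented as hypothetical (priority, payload) pairs.

```python
def shift_priorities(entries, relative_priority_text):
    """Return merged entries with the relative priority added to each priority."""
    delta = float(relative_priority_text)  # stand-in for scanf "%lg"
    return [(priority + delta, payload) for priority, payload in entries]


merged = shift_priorities([(0.0, "a"), (2.5, "b")], "10")
```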

If the merge_dmc_file_name is not absolute (i.e. does not start with "/"), then the merge file is searched for in the following order:

  1. Directory of the current DCM file
  2. Current working directory, i.e. $[PWD]
  3. $(QCONFIG_DIR), if the alias exists
  4. $[QCONFIG_DIR], if the environment variable exists
Example:
    merge,10,data_map_v2
    merge,0,data_map_v1
This combines entries from data_map_v2 and data_map_v1, giving the v2 entries a higher priority than the v1 entries. The effect is that v1 data is available for (e.g.) times when v2 is not available.
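
The search order for a merge file could be sketched as follows. The function name merge_candidates is hypothetical, paths are handled with posixpath because the specification uses "/" as the separator, and falling back to os.getcwd() when PWD is not set is an assumption.

```python
import os
import posixpath


def merge_candidates(name, current_file, aliases, environ):
    """Return the paths to try, in order, for a merge file name."""
    if name.startswith("/"):
        return [name]  # absolute name: no search
    dirs = [
        posixpath.dirname(current_file),     # 1. directory of the current file
        environ.get("PWD") or os.getcwd(),   # 2. current working directory
    ]
    if "QCONFIG_DIR" in aliases:             # 3. alias, if it exists
        dirs.append(aliases["QCONFIG_DIR"])
    if "QCONFIG_DIR" in environ:             # 4. environment variable, if it exists
        dirs.append(environ["QCONFIG_DIR"])
    return [posixpath.join(d, name) for d in dirs]


paths = merge_candidates(
    "data_map_v2",
    "/cfg/main.dcm",
    {"QCONFIG_DIR": "/alias/cfg"},
    {"PWD": "/work", "QCONFIG_DIR": "/env/cfg"},
)
```

A caller would then take the first candidate that exists on disk, for example by testing each path with os.path.isfile.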

Data line: cdf:ts (CDF time structured database)

  field_0 = "data"
  field_1 = "cdf:ts"
  field_2 = priority_value
  field_3 = data_source_name
  field_4 = CDF_data_variable
  field_5 = CDF_time_tags_variable
  field_6 = CDF_time_tags_type
  field_7 = full_data_file_name
  field_8 = time_interval_start
  field_9 = time_interval_end
  field_10 = number_items_in_file
  field_11 (optional) = data_availability_map

  example:

   data,cdf:ts,0,edi/c1/Evec,EDI_PP%E_C1,Epoch,cdf_epoch,$(EDI_C1)/edidata.cdf,
     01-Jan-2000 00:00:00.000,01-Jan-2000 12:00:00.000,720,300:zzz0000zSCzzznn

  [Note: line above has been split to be more readable.]

  priority_value         ...  numerical value to be assigned to this
                              data entry. The string representation
                              should satisfy the rules for the C library
                              stdio scanf "%lg" representation (ie double).
  data_source_name       ...  data name as supplied to the Qdos data source.
                              This is a name within the root context of
                              the data source, and so should not start with
                              a "/".
  CDF_data_variable      ...  name of CDF variable for data
  CDF_time_tags_variable ...  name of CDF variable for time tags
  CDF_time_tags_type     ...  type/format for time tags
                              Recognized types: "cdf_epoch"
                              * Others might be added.
  full_data_file_name    ...  full (ie absolute) data file name
  time_interval_start    ...  (inclusive) as ISO format time string
  time_interval_end      ...  (inclusive) as ISO format time string
  number_items_in_file   ...  number of data items in the file; each
                              item is a complete data object with a
                              single time tag. This is the same as
                              the number of records in the file for
                              CDF_data_variable (and CDF_time_tags_variable).
  data_availability_map  ...  text string to be used to show data availability
                              within time interval. Optional.
                              Format: res":"DAchars
                              res = numeric value of resolution (ie time
                              quantum size) in seconds for each bit of
                              the data availability bit pattern
                              DAchars = character encoding of 
                                        data availability bit pattern,
                                        where 1/0 indicates
                                        presence/absence of any data within
                                        the respective subinterval
                                        The encoding is a 6 bits per word
                                        value mapped to the characters
                                        0-9,@,?,A-Z,a-z (a total of 64 chars).
                              Example: 200:ajaj00000zzzzzzz3

There will generally be many lines with the same data_source_name, each line providing the file and CDF variable name for different time intervals. 
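
A cdf:ts data line, once split into fields, could be mapped onto a record as sketched below. The names CdfTsEntry and parse_cdf_ts are hypothetical, time strings are kept as text rather than parsed, and number_items_in_file is assumed to be an integer.

```python
from typing import NamedTuple, Optional


class CdfTsEntry(NamedTuple):
    priority: float
    data_source_name: str
    cdf_data_variable: str
    cdf_time_tags_variable: str
    cdf_time_tags_type: str
    full_data_file_name: str
    time_interval_start: str
    time_interval_end: str
    number_items_in_file: int
    data_availability_map: Optional[str]


def parse_cdf_ts(fields):
    """Build a CdfTsEntry from the fields of a 'data,cdf:ts,...' line."""
    if len(fields) < 11 or fields[0] != "data" or fields[1] != "cdf:ts":
        raise ValueError("not a cdf:ts data line")
    # field_11 is optional; treat a missing or null field as absent.
    availability = fields[11] if len(fields) > 11 and fields[11] else None
    return CdfTsEntry(float(fields[2]), *fields[3:10], int(fields[10]), availability)


line = (
    "data,cdf:ts,0,edi/c1/Evec,EDI_PP%E_C1,Epoch,cdf_epoch,$(EDI_C1)/edidata.cdf,"
    "01-Jan-2000 00:00:00.000,01-Jan-2000 12:00:00.000,720,300:zzz0000zSCzzznn"
)
fields = line.split(",")  # no quoted commas in this example line
entry = parse_cdf_ts(fields)
```

Entries sharing a data_source_name can then be grouped, for instance in a dictionary keyed on that name, to cover successive time intervals.
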
Page created by Janet Barnes, csc-support-dl@imperial.ac.uk

Last updated: September 2001