Mandatory Variable Metadata

These metadata provide formatting information specific to the named variable. There is no preferred order for parameters within a variable metadata block.

Each block of variable metadata takes the form,

START_VARIABLE = name

parameter = value

$\vdots$ $\vdots$

END_VARIABLE = name
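As an illustration of the block structure above, the following is a minimal sketch of a reader that collects the parameters of each variable block into a dictionary. It assumes one `parameter = value` pair per line and ignores comments, quoting and continuation lines; the function name `parse_variable_blocks` and the variable name `B_vec` are hypothetical and are not part of the format.

```python
def parse_variable_blocks(lines):
    """Collect parameter = value pairs found between START_VARIABLE and END_VARIABLE."""
    blocks = {}
    current = None  # name of the block currently being read, if any
    for raw in lines:
        line = raw.strip()
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "START_VARIABLE":
            current = value
            blocks[current] = {}
        elif key == "END_VARIABLE":
            if value != current:
                raise ValueError(f"END_VARIABLE {value!r} does not close {current!r}")
            current = None
        elif current is not None:
            # Parameter order inside a block carries no meaning.
            blocks[current][key] = value
    if current is not None:
        raise ValueError(f"variable block {current!r} is not closed")
    return blocks


# Hypothetical header fragment, for illustration only.
header = """
START_VARIABLE = B_vec
VALUE_TYPE = FLOAT
SIZES = 3
END_VARIABLE = B_vec
""".splitlines()

blocks = parse_variable_blocks(header)
```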

Where data are provided in science units, metadata sufficient to describe those data for science use must also be provided. The following metadata are required whenever meaningful.

VALUE_TYPE

This identifies the data type and is necessary for conversion from the ASCII entry. Allowed values are:
ISO_TIME
FLOAT
DOUBLE
INT
CHAR
BYTE

SIZES

This is essential for any variable that has more than one element, such as arrays and vectors. The value string must comprise one comma-separated integer for each array index of the variable, each integer giving the size of the array over that index. Thus an 8 by 54 array would have the entry
SIZES = 8,54
It is not required for scalars.

DATA

The concept of a variable that is fixed for all records is supported for CEF files. Data for these `non-record-varying' variables must be supplied within the header variable metadata segment, and no entry is then allowed in the data records. The presence of a parameter `DATA' will be taken to indicate that this is a non-record-varying variable. The value(s) associated with this parameter are the data for that variable, given as a comma-separated list. Non-record-varying variables are particularly useful for label variables. For array data, elements appear in the natural C ordering (last index varies fastest). Data lines for arrays may be continued using `\' as a continuation marker following one of the commas separating the list of values.
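The continuation and ordering rules can be sketched as follows. This is a minimal illustration that assumes integer values and a continuation marker at the very end of a line; the helper names `join_continued` and `reshape_c_order` are hypothetical.

```python
def join_continued(lines):
    """Join physical lines that end with the backslash continuation marker."""
    joined, buffer = [], ""
    for line in lines:
        stripped = line.strip()
        if stripped.endswith("\\"):
            buffer += stripped[:-1].rstrip()  # drop the marker, keep the trailing comma
        else:
            joined.append(buffer + stripped)
            buffer = ""
    if buffer:
        raise ValueError("continuation marker on the last line")
    return joined


def reshape_c_order(flat, sizes):
    """Nest a flat list according to SIZES, with the last index varying fastest."""
    expected = 1
    for n in sizes:
        expected *= n
    if len(flat) != expected:
        raise ValueError(f"expected {expected} values, got {len(flat)}")
    # Chunk by the last index first, then work outwards.
    for n in reversed(sizes[1:]):
        flat = [flat[i:i + n] for i in range(0, len(flat), n)]
    return flat


# Hypothetical DATA entry for a variable with SIZES = 2,3.
lines = ["DATA = 1, 2, 3, \\", "       4, 5, 6"]
entry = join_continued(lines)[0]
values = [int(v) for v in entry.split("=", 1)[1].split(",")]
array = reshape_c_order(values, [2, 3])
```

With this ordering, `array[0][2]` is the third value in the list and `array[1][0]` is the fourth.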

REPRESENTATION_i

This is a rigorous generalisation of the ISTP FRAME attribute for CAA, and allowed values are enumerated in the Metadata Dictionary, CAA-CDPP-TN-0002. It is only used for tensor types (vectors and tensors). There are as many of these attributes as there are indices i in the tensor. A vector takes only REPRESENTATION_1. This should have the same number of entries as the dimension of index i. For example, a cartesian pressure tensor takes
REPRESENTATION_1 = "x", "y", "z"
REPRESENTATION_2 = "x", "y", "z"
indicating that the components are
"xx","xy","xz","yx","yy","yz","zx","zy","zz"
A complete tensor in real 3-space will have three components in each index. An incomplete tensor may be specified where the dimension of one or more indices is less than three. For example a vector measured in the xy-plane will have
REPRESENTATION_1 = "x", "y"
and this in turn implies it may be rotated in the xy-plane but not around any other axis. This will often duplicate LABEL_i, but is provided separately because it identifies TENSOR type data explicitly, and this has science processing connotations, for example the ability to rotate.

TENSOR_ORDER

This is the number of indices for a tensor data type. Vectors have order (rank) 1.

SI_CONVERSION

Required for all data in science units. Text string of the form
number>SI unit
where number is the conversion factor to SI units, i.e. the factor by which the variable must be multiplied to convert it into SI units. The string SI unit is the standard unit that it converts to. For example the magnetic field for FGM may be in nT, and to convert to tesla the value of SI_CONVERSION should be `1.0e-9>T'. For compound units the grammar will be of a standard form: distinct unit dimensions will be separated by space characters and powers (signed) will be preceded by the caret, ^. Non-dimensional qualifiers, which do not appear in the SI units list, are to be enclosed in parentheses `()'. For example, `m s^-1' or `(number electrons) m^-3'. Similarly `(percent)' and `(ratio)' would provide user information on dimensionless quantities. Non-integer powers are permitted, e.g. `Hz^-0.5'. SI units should be one of:

s second
kg kilogram
m metre
Hz hertz
A ampere
K kelvin
J joule
V volt
T tesla
Pa pascal
C coulomb
H henry [needed for $\mu_0$ ]
F farad [needed for $\epsilon_0$ ]
W watt
N newton
ohm
mho
rad radian
sr steradian
degree [alternative angle measure, not SI, but convenient and often used].
unitless [added for compliance with the MDD documentation, and used only when no units can be specified, e.g. Tpar/Tperp]
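The grammar described for SI_CONVERSION can be checked with a short parser. The sketch below is illustrative only: it assumes that the factor and the unit string are separated by `>`, that powers are attached directly to the unit with `^`, and that qualifiers in parentheses contain no nested parentheses. The names `SI_UNITS` and `parse_si_conversion` are hypothetical.

```python
import re

# Unit names taken from the list above.
SI_UNITS = {
    "s", "kg", "m", "Hz", "A", "K", "J", "V", "T", "Pa", "C", "H", "F",
    "W", "N", "ohm", "mho", "rad", "sr", "degree", "unitless",
}

# A token is either a parenthesised qualifier or a run of non-space characters.
TOKEN = re.compile(r"\([^)]*\)|\S+")


def parse_si_conversion(text):
    """Split an SI_CONVERSION string into (factor, [(unit, power), ...]).

    Qualifiers such as "(ratio)" are returned with a power of None.
    """
    factor_text, separator, unit_text = text.partition(">")
    if not separator:
        raise ValueError("missing '>' between factor and unit")
    factor = float(factor_text)
    terms = []
    for token in TOKEN.findall(unit_text):
        if token.startswith("("):
            terms.append((token, None))  # non-dimensional qualifier
            continue
        name, _, power = token.partition("^")
        if name not in SI_UNITS:
            raise ValueError(f"unknown SI unit {name!r}")
        terms.append((name, float(power) if power else 1.0))
    return factor, terms


field = parse_si_conversion("1.0e-9>T")
density = parse_si_conversion("1.0e6>(number electrons) m^-3")
```

Multiplying a value in nT by `field[0]` gives tesla, and multiplying a density in cm^-3 by `density[0]` gives m^-3.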

FIELDNAM

Descriptive name for the field, generally longer than LABLAXIS.

LABLAXIS

Short name suitable for labelling the data. The units need not be included as they are supplied separately in UNITS. In the case of vectors and arrays this will be the common stem applicable for all dimensions. For example, a cartesian velocity might have LABLAXIS ``V'' with the component specified by LABEL_1 (see below).

DEPEND_i and LABEL_i

These contain information identifying the i $^{th}$ dimension in an array variable. Either one or the other of these attributes must exist for each index of an array, but never both.

The LABEL_i attribute is used when it is sufficient to provide a text label for each entry for this index. For example a vector such as a velocity in cartesian gse coordinates will take a LABEL_1 attribute such as
LABEL_1 = "x", "y", "z"
and the stem "V" is provided by the LABLAXIS (see above). Similarly, a pressure tensor in cartesians would have
LABEL_1 = "x", "y", "z"
LABEL_2 = "x", "y", "z"
and stem "P" provided by LABLAXIS. This permits construction of labels for individual components from the stem plus appropriate entry under LABEL_i.
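A minimal sketch of this label construction is given below. The format does not prescribe how stem and component entries are joined, so plain concatenation is an assumption made here, and the function name `component_labels` is hypothetical. Components are generated with the last index varying fastest, matching the C ordering used for array data.

```python
from itertools import product


def component_labels(stem, *label_lists):
    """Build one label per component from a LABLAXIS stem and the LABEL_i entries."""
    # itertools.product cycles its rightmost argument fastest (C ordering).
    return [stem + "".join(combo) for combo in product(*label_lists)]


xyz = ["x", "y", "z"]
velocity_labels = component_labels("V", xyz)
pressure_labels = component_labels("P", xyz, xyz)
```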

By contrast, DEPEND_i will normally point to another variable, as it is used when arrays need descriptions that are more complex than labels. Vectors, tensors, and non-ordered arrays, such as Status, have LABEL_i attributes rather than DEPEND_i. Quantities such as power spectra and particle spectra will require DEPEND_i attributes to describe numerical quantities such as bin boundaries and other metadata describing the physical quantities associated with that array index (see below). The variables named by DEPEND_i will in general take a LABLAXIS attribute themselves to provide a label stem for the numerical bin descriptions, but they do not normally require LABEL_i or DEPEND_i attributes themselves.

There must be as many of these parameters as there are entries in the SIZES value. Thus a 3-D array would require three parameters, DEPEND_1, DEPEND_2 and DEPEND_3 (or equivalently LABEL_i for one or more of the indices). The variables identified may be either non-record-varying or supplied for each record to allow for changing bin boundaries such as variable energy sweeps. Each of the identified dimension variables must have a defining variable block in the header as normal.

DEPEND variables should be 1-D arrays of the same size as the dimension for that array index. Thus for an array with the first index varying over azimuthal angular bins, a DEPEND_1 attribute describing these N azimuthal angular bins might point to a variable Dimension_phi of size N, giving the centre of each bin.

Where DEPEND_i is providing information on binning (e.g. angle, energy and frequency ranges) then DEPEND_i will itself also have a DELTA_PLUS and a DELTA_MINUS attribute providing bin edge offsets. The lower bin boundary is given by the appropriate element in DEPEND_i - DELTA_MINUS and the upper boundary by DEPEND_i + DELTA_PLUS. The bin width is given by DELTA_PLUS + DELTA_MINUS.

If binning is regular, so that all bins have the same width, then a single value may be provided for the DELTA attributes; otherwise they must each have the same number of entries, N, as the dimension of index i. Where bins are open ended, e.g. "above 100 MeV", then either DEPEND_i and DELTA_PLUS (DELTA_MINUS) should represent an estimate of the instrument responsiveness to give an effective location and width, or they should fit a logical progression from lower bins. Descriptive attributes should be provided to explain this to the user.
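The bin boundary rules above can be sketched as follows. The function name `bin_edges` is hypothetical, and the sketch assumes that the DEPEND_i values and DELTA attributes have already been converted to numbers.

```python
def bin_edges(centres, delta_minus, delta_plus):
    """Return (lower, upper) boundaries for each bin of a DEPEND_i variable.

    Each DELTA argument may be a single number (regular binning) or a
    sequence with one entry per bin.
    """
    n = len(centres)

    def expand(delta):
        values = list(delta) if isinstance(delta, (list, tuple)) else [delta]
        if len(values) == 1:
            values = values * n  # a single value applies to every bin
        if len(values) != n:
            raise ValueError(f"expected 1 or {n} DELTA entries, got {len(values)}")
        return values

    minus, plus = expand(delta_minus), expand(delta_plus)
    # Lower boundary: value - DELTA_MINUS; upper boundary: value + DELTA_PLUS.
    return [(c - m, c + p) for c, m, p in zip(centres, minus, plus)]


regular = bin_edges([10.0, 20.0, 30.0], 5.0, 5.0)
irregular = bin_edges([10.0, 30.0], [5.0, 15.0], [5.0, 25.0])
widths = [upper - lower for lower, upper in irregular]
```

The widths recovered from the boundaries equal DELTA_PLUS + DELTA_MINUS for each bin.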

The syntax does not permit the use of N+1 elements to describe a dimension of size N, even though this would be more efficient for uniform touching bins (i.e. just specifying the N+1 bin boundaries).

Complex data values are represented by an extra dimension taking a LABEL_i indicating either "Re" or "Im". If this is the last index then the real and imaginary parts of a value are successive entries in the record.
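For the case where the "Re"/"Im" index is the last one, the pairing of successive entries can be sketched as below. The function name `to_complex` is hypothetical, and the order "Re" then "Im" within each pair is an assumption that should be checked against the LABEL_i entry of the actual variable.

```python
def to_complex(flat):
    """Pair successive (Re, Im) entries of a record into complex numbers."""
    if len(flat) % 2:
        raise ValueError("expected an even number of entries")
    # Even positions hold real parts, odd positions hold imaginary parts.
    return [complex(real, imag) for real, imag in zip(flat[0::2], flat[1::2])]


spectrum = to_complex([1.0, 2.0, 3.0, -4.0])
```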

DELTA_PLUS and DELTA_MINUS

DEPEND_i variables, including the time tags which are DEPEND_0 for the data, may have these. They describe the range over which the data are integrated, representative, etc., and locate the position of the time tag or value within this range. In the case of time, DELTA_PLUS and DELTA_MINUS give the sampling interval or other representative time interval in seconds, as a float. Alternatively, DELTA_PLUS and DELTA_MINUS for time variables could themselves be variables, e.g. if they are record-varying, in which case they require their own metadata. For an array of energies used as dependencies for array data, DELTA_PLUS and DELTA_MINUS used together with the energy value provide a complete description of the energy bin over which the measurement was made or is representative.

It is also advisable to supply extra information to describe the relationships between the DEPEND variables and the parent variable, such as the method required to construct the volume of the bin. No syntax is prescribed for these parameters, which should be text, but examples of possible descriptors are listed below.

For some data it will be necessary to specify different area and volume factors depending on the slice to be taken through the data. Added parameters may be specified to cover any integration or processing factors appropriate.

For example, consider a 2-D array of phase space densities calculated at centre energies $E_i$ and starting polar angles $\theta_j$ and over a constant azimuthal angular range of $15^\circ$ . Suitable DEPEND parameters, together with the metadata of the variables they point to, could take the values

DEPEND_1 = E
and the variable E block would contain metadata
DELTA_PLUS = WE
DELTA_MINUS = WE

where WE is another variable (or a single constant entry or a comma separated list of the right length) and

DEPEND_2 = theta
and the variable theta block would contain metadata
DELTA_PLUS = Wth
DELTA_MINUS = 0

Other attributes may be defined to show the use of these form factors in calculating quantities associated with the binning, such as

Theta_Factor = "TFactor[j] is cos(theta[j])-cos(theta[j]+Wth[j])"
Energy_Factor = "EFactor[i] for volume is E[i]^2 *2*WE[i] + (2/3)WE[i]^3"
Volume_Factor = "V[i][j] = EFactor[i]*TFactor[j]* pi * 15/180"

In this example the energy dimension is tagged by an array giving the mid energies for each energy bin, and another for the bin half-width. The extra attributes then describe constructing the volume element of the [i][j] element in the data array. Thus, if the data array is in partial densities, then the phase space density is found by dividing the data dn[i][j] by V[i][j]. (In practice, the partial density would probably be an integration in velocity and require converting from energy to velocity, with a different V[i][j].)

For fixed bin widths of 100 eV, say, the subscript would be removed from the WE[i] in the above, and a separate WE variable would not be used.
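The form factors quoted in the example attributes can be evaluated directly. The sketch below assumes that theta and Wth are in radians and that the azimuthal range is given in degrees, as in the text strings above; the function name `volume_factors` and its argument names are hypothetical. EFactor is the integral of E^2 over the bin from E - WE to E + WE, and TFactor is the integral of sin(theta) from theta to theta + Wth.

```python
import math


def volume_factors(E, WE, theta, Wth, azimuth_deg=15.0):
    """Evaluate V[i][j] = EFactor[i] * TFactor[j] * pi * azimuth_deg / 180."""
    # Integral of E^2 dE over [E - WE, E + WE].
    efactor = [e ** 2 * 2 * w + (2.0 / 3.0) * w ** 3 for e, w in zip(E, WE)]
    # Integral of sin(theta) d(theta) over [theta, theta + Wth].
    tfactor = [math.cos(t) - math.cos(t + w) for t, w in zip(theta, Wth)]
    azimuth = math.pi * azimuth_deg / 180.0  # azimuthal range in radians
    return [[ef * tf * azimuth for tf in tfactor] for ef in efactor]


# Consistency check: one bin covering 0..2 in E and the full sphere in angle
# should give the volume of a sphere of radius 2.
full_sphere = volume_factors([1.0], [1.0], [0.0], [math.pi], azimuth_deg=360.0)[0][0]
```

For fixed bin widths, the same value of WE would simply be repeated for every energy.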

Anthony Allen 2009-10-19