Reference guide to programs and utilities
This section has usage information for the numerous programs and scripts
used in the DiFX system. Basic help information for most or all of these
programs can be gotten by typing the program name with either no command
line arguments or with a -h option, depending on the program. In the
usage descriptions below, arguments in square brackets
\([\) \(]\) are optional and can often include multiple
different parameters. Cases where 1 or more arguments of a certain type
(such as files) can be passed to the program, the usage instructions
will look like arg1 \([\ \cdots\) argN\(]\), with the
implication that N arguments of this type were passed. In cases where 0
arguments of that type is also allowed, that first argument will also be
in square brackets. If it is not obvious from the program name, the
software package containing the program follows the section header. The
package that includes each program is included in its section heading.
Note that several VLBA specific programs are discussed in this manual
that are not documented here, such as tsm. These are preexisting
programs that may be documented elsewhere and are less likely to be
useful outside VLBA operations. Also note that programs from the
nrao_difx_db package are internal to NRAO and in general are not
applicable outside VLBA operations. The code for these programs can be
made available upon request.
apd2antenna \(\mathrm{(package: difx2fits)}\)
Python program apd2antenna will read a .apd file (see
Sec. [sec:apd]), which stores baseline-based fringe-fit
solutions, and writes to stdout an antenna-based set of fringe-fit
values. These are determined through a least-squares fit. A reference
antenna must be specfied.
Usage:
apd2antennaapdFile refAntapdFile is a
.apdfile (§[sec:apd]), created bydifx2fitsrefAnt is a 1-based number or text string indicating the reference antenna to be used
Example:
apd2antenna DQ1206.apd 2
A reference antenna is required as the least-squares solution is ill-determined otherwise. The reference antenna has its delay, rate, and phase set to zero in the process.
The output of this program, which is sent to stdout is documented in
Sec. [sec:aapd]. It is conventional to redirect the
output of this program to a file ending in .aapd .
avgDiFX \(\mathrm{(package: difxio)}\)
avgDiFX is a utility packaged with difxio. It reads two complete
DiFX filesets that need to cover the same timerange and have essentially
the same array structure and produces a new complete fileset. The
.input, .calc, and .im files, along with any extracted pulse
cal data, are copied from the first of the two datasets. The visibility
data from the two file sets is averaged. Certain pulsar processing may
benefit from this capability.
Usage:
avgDiFXconfigFile1 configFile2 outputConfigFileExample:
avgDiFX pass1_01.input pass2_01.input output_01.input
bp2antenna \(\mathrm{(package: difx2fits)}\)
Python program bp2antenna will read a .bandpass file (see
Sec. [sec:filebandpass]), which stores
baseline-based bandpass solutions determined by difx2fits, and
writes to stdout an antenna-based set of bandpasses. These are
determined through a least-squares fit separately to phase and
amplitude. This program requests a reference antenna be provided and a
minimum of three antennas are required. This antenna’s phases are fixed
to be zero and all other antennas’ phases are determined relative to
that antenna. If a negative number is provided for the reference
antenna, the antenna phases will be adjusted after antenna-based
solutions are found such that the average bandpass phase at each
frequency is zero. The bandpass can be smoothed using the low-pass
filter LPF option. The value to be provided is measured in MHz;
bandpass features smaller than this are smoothed out. Optionally a pulse
cal data file (determined by non-DiFX program pcalanal) can be applied
to create “absolute phase” antenna-based bandpasses. This feature is
experimental. If autocorrelation bandpasses are included in the
.bandpass file, they will be ignored.
Usage:
bp2antennabandpassFile refAnt \([\) LPF \([\) pcalFile \(]\) \(]\)bandpassFile is a
.bandpassfile (§[sec:filebandpass]), created bydifx2fitsrefAnt is a 1-based number or text string indicating the reference antenna to be used
LPF is the low pass filter parameter, measured in MHz
pcalFile contains pulse cal phases after removing dominant delays (FIXME: to be documented…)
Example:
bp2antenna DQ1206.bandpass 2
The output of this program, which is sent to stdout is documented in
Sec. [sec:abp]. It is conventional to redirect the output
of this program to a file ending in .abp .
calcif2
Program calcif2 evaluates the delay model, producing a delay model
file (ending with .im) from a file containing the source, antenna
and scan timing information (ending with .calc). The detailed
calculations are performed by the Goddard CALC program. Prior to this
writing (May 4, 2013), the only option for the calculation back-end was
CALC version 9 with NRAO additions which add ocean loading and
near-field corrections (accurate as close as a few
\(\times 10^5\) km). Now new options are being introduced, including
CALC 9 with the Sekido-Fukushima near-field model (using the –sekido
option). Note that this option requires an installation of a special
version of CALC that is not covered in this document.
Instead of calling CALC for every tabulated model row, calcif2
computes a 5th degree polynomial every 120 seconds (typically), very
closely resembling the delay model generation used at the VLBA hardware
correlator. These polynomials are then evaluated at each model point.
This results in a tremendous speedup at negligible loss of accuracy. By
default calcif2 will call CALC three times for each model point and
calculates more accurate \(u, v, w\) coordinates from delay
measurements made over a small patch of the sky:
where \(l, m\) are angular coordinates (in radians) relative to the delay center on the sky, \(\tau\) is the delay at the delay center and \(c\) is the speed of light.
Normally calcif2 will be called by difxqueue, startdifx, or
another higher-level program if needed.
calcif2 connects via Remote Procedure Call (RPC) to an instance of
CalcServer which must be running on a computer identified by
environment variable $CALC_SERVER, or by the specified computer if
the -s option is used. If the output files (specified in the
.calc file) exist and are current (have newer modification times
than the .calc file, then the files will not be recreated unless the
force option is used.
In addition to calculating the delay model, this program computes the baseline vectors, \(u,v,w\) (relative to Earth center on a per-antenna basis) and source elevation vs. time.
Usage:
calcif2\([\) options \(]\) \(\{\)-a\(\mid\) calcFile1 \([\) calcFile2 \([\cdots]\) \(]\) \(\}\)options can be:
-hor–help: print usage information and exit-aor–all: run on all.calcfiles found in the current directory-vor–verbose: print more verbose logging/debug info-qor–quiet: print less verbose logging/debug info-for–force: rerun even if output files exist and are current-nor–noaber: don’t perform aberration \(u,v,w\) corrections-For–fit: Instead of producing an \(n\) term polynomial from \(n\) samples, calculate more samples and perform a fit. This is not of general use as tests have shown that the improvement is negligible.-zor–allow-neg-delay: don’t zero delays that are negative (i.e., shadowed by Earth)-Aor–noatmos: don’t include atmosphere in calculation of \(u,v,w\)-sserver or–serverserver : connect to server, not$CALC_SERVER-oorder or–orderorder : make polynomials with order+1 terms (default 5)-iint or–intervalint : make a polynomial every int seconds (default 120)–override-versionignore difx version clashes
calcFile is a
.calcfile (§[sec:input]), such as one generated byvex2difx(§1.102)Example 1:
calcif2 job1420.000.calc job1421.000.calcExample 2:
calcif2 -s kepler job1420.000.calcExample 3:
calcif2 -a -i 60
CalcServer
Program CalcServer contains the Goddard Space Flight Center CALC
package version 9.1, used to compute geometric delay models for VLBI
applications. It is a repackaged version of the same source code that is
used to compute models on the VLBA correlator. It is configured to run
as a server. All of its interactions are via RPC calls from other
programs, such as calcif2, which could be running on the same or
different computer. This program only needs to be started once on a
given machine using the startCalcServer script. It should probably
be set to start automatically upon boot of the machine on which
CalcServer runs. Environment variable $CALC_SERVER should be set
to the name of the computer on which CalcServer is running.
Start:
startCalcServerTest:
checkCalcServer $CALC_SERVERStop:
killall CalcServer
Note that CalcServer must be installed (with make install) to be
usable as the paths for various files are permanently set in the
executables at compile time. At this time it seems CalcServer cannot
be compiled for 64-bit machines.
checkdir \(\mathrm{(package: mk5daemon)}\)
Program checkdir can be used to check the integrity of one or more
.dir files that are stored at a location pointed by environment
variable MARK5_DIR_PATH . Even after many years of use, the Mark5
units tend to be a weak point in the reliability of correlation. Since
reading the module directory and examining a bit of data from each scan
are the first actions done to a module, many of the possible problems
show up at this time. This utility looks for a number of possible
problems, including scans that could not be decoded, overlapping or
out-of-order scans, scans with illegal format parameters and others.
This program makes no attempt to fix problems. It is up to the operator
to determine if a problem is real or not and if further action should be
taken. In cases where many scans are not properly decoded it is
worthwhile to rename (or remove) the .dir file in question and
regenerate the directory. A second directory read often succeeds when a
first one does not.
Usage:
checkdir\([\) options \(]\) \([\) module list \(]\)options can be:
-hor–help: print usage information and exit-vor–verbose: be more verbose in execution (-v -vfor more)-qor–quite: be less verbose in execution-aor–all: run on all files in $MARK5_DIR_PATH-sor–show: print the entire directory file to screen-Hor–histogram: print a histogram of record rates
Example 1:
checkdir -aExample 2:
checkdir NRAO-123 NRAO+266Example 3:
checkdir -s NRAO+233
Either -a or a list of module names can be provided (but not both
simultaneously). If the former, a less verbose output will be generated
by default. Except in the lowest verbosity mode (the default for
-a), module directories without any detected problems will show a
one line summary consisting of the number of scans and the time range of
the module.
checkmpifxcorr \(\mathrm{(package: mpifxcorr)}\)
Program checkmpifxcorr reads the .input and other associated
files for a DiFX job and parses them with the same logic used by
mpifxcorr in order to determine their validity.
Usage:
checkmpifxcorr\([\) options \(]\) configFile …options can be:
-h: print usage information and exit-f -s -e -w -i -v -d: select the verbosity level of output (options refer to “fatal”, “severe”, “error”, “warning”, “info”, “verbose”, and “debug” levels).
All of the files referenced from the provided configuration (.input)
files are read as well (excepting any baseband files or the .vex
file). This check has proven especially useful for pulsar processing.
The default verbosity level will lead to printing of any problems at the
“warning” level or worse. See
Sec. [sec:difxalertmessage] for details on
the severity levels.
cleanVDIF \(\mathrm{(package: vdifio)}\) [sec:printVDIF]
Program cleanVDIF loops through a VDIF file writing valid content to
a new output file.
Usage:
cleanVDIFinputvdiffile outputvdiffile Mbps \([\) options \(]\)inputvdiffile is the recorded VDIF file to clean
outputvdiffile is the corrected VDIF file to write
Mbps is the data rate in megabits/second
options can be:
-vor–verbose: be verbose in execution
Example:
cleanVDIF bad.vdif good.vdif 256
condition \(\mathrm{(package: nrao\_difx\_db)}\)
This is an NRAO-only program owing to its ties to the VLBA database.
Program condition is mainly used to extract Mark5 module
conditioning reports from the database but also has the means to
manually import data into the database. When querying (with the find
action), one or more “identifiers” can be supplied which can be either
the names of the Mark5 modules or serial number of individual disks (or
a mix of the two!). Environment variable VLBA_DB must be set to
point to the correct database.
Usage:
condition\([\) options \(]\) action + argsoptions can be:
-hor–help: print usage information and exit-vor–verbose: be verbose in execution
action can be one of:
add report1 \([\)report2 \(\cdots\) \(]\)
find identifier1 \([\)identifier2 \(\cdots\) \(]\)
report is the name of a file containing one or more condition reports from
SSEraseidentifier is either a Mark5 module VSN or a hard disk serial number
Example 1:
condition add NRAO-040Example 2:
condition find NRAO-042Example 3:
condition find NRAO+342 NRAO+270Example 4:
condition find Y66M3BQE
condition_watch \(\mathrm{(package: nrao\_difx\_db)}\)
This is an NRAO-only program owing to its ties to the VLBA database.
Program condition_watch is meant to run as a background process on
the correlator head node. Its function is to receive
Mark5ConditionMessages emitted by a special version of SSErase
(the module conditioning program) and stuff this data into the database.
This program is automatically started by mk5daemon when it is
supplied with the -w or –condition-watch arguments. When
restarting mk5daemon by hand, make sure that a duplicate copy of
condition_watch is not left running. Environment variable
VLBA_DB must be set to point to the correct database.
Usage:
condition_watch\([\) options \(]\)options can be:
-hor–help: print usage information and exit
Example:
condition_watch
countVDIFpackets \(\mathrm{(package: vdifio)}\) [sec:countVDIFpackets]
Program countVDIFpacket loops through a VDIF file and counts number
of valid and skipped frames. Packet counts are performed only on the
thread ID requested.
Usage:
countVDIFpacketsvdiffile Mbps threadIdvdiffile is the recorded VDIF file
Mbps is the data rate in megabits/second
threadId is the threadId to report on
Example:
countVDIFpackets example.vdif 256 3
cpumon \(\mathrm{(package: difxmessage)}\)
Program cpumon is a program that listens for difxLoad messages
multicast from the Mark5 units and displays the information; updating
the display as new messages are received.
Usage:
cpumon
Make sure the terminal is at least 60 characters wide and is at least as tall as there are computers that may transmit information. To quit, use ctrl-C. The columns displayed are:
Computer name
CPU load averaged over 10 seconds
Memory usage / Total memory
Network receive rate (Mbps)
Network transmit rate (Mbps)
Number of CPU cores
diffDiFX.py \(\mathrm{(package: vis2screen)}\)
Program diffDiFX.py generates a context-sensitive difference of two
DiFX output files for detailed version testing. Corresponding visibility
records are differenced and statistics on the differences are
accumulated and printed at the end of the processing.
Usage:
diffDiFX.py\([\) options \(]\) \(\{\) difxfile1 difxfile2\(]\) \(\}\)options can be:
-hor–help: print usage information and exit-f FREQor–freq=FREQ: Only look at visibilities from this FREQ index-b BASELINEor–baseline=BASELINE: Only look at visibilities from this BASELINE num-t THRESHOLDor–threshold=THRESHOLD: Display any difference that exceeds THRESHOLD-e EPSILONor–epsilon=EPSILON: Display any differences that exceeds allowed numerical error EPSILON-s SKIPRECORDSor–skiprecords=SKIPRECORDS: Skip SKIPRECORDS records before starting comparison-m MAXRECORDSor–maxrecords=MAXRECORDS: Stop after comparing MAXRECORDS (if >0) records-p PRINTINTERVALor–printinterval=PRINTINTERVAL: Print a summary every PRINTINTERVAL records-c MAXCHANNELSor–maxchannels=MAXCHANNELS: The length of the array that will be allocated to hold vis results-vor–verbose: Turn verbose printing on-i INPUTFILEor–inputfile=INPUTFILE: Parse INPUTFILE for the correlation setup–matchheaders: On seeing a header mismatch, skip through file 2 looking for next match
difxfile1 is the first difx file to compare
difxfile2 is the second difx file to compare
- Example:
difxDiFX.py -i example_1.input example_1.difx/DIFX_55523_025239.s0000.b0000comparison_1.difx/DIFX_55523_025239.s0000.b0000
If the error for any record exceeds the specified threshold a verbose error message is printed. Summary statistics are printed at the end of the file. Warnings are printed if the headers do not match between the two files.
difx2fits
Program difx2fits creates a FITS output file from the native output
format created by mpifxcorr and several other files carrying
information about the observation. Multiple input file sets can be
specified. A separate output FITS file is created for each unique
frequency setup encountered. When run, difx2fits requires the
following files to be present for each DiFX file set being converted:
baseFilename
.difx/baseFilename
.inputbaseFilename
.calcbaseFilename
.im
Several other files are optional and are typically used to populate calibration and ancillary tables:
baseFilename
.flagflagspcaltsysweather$GAIN_CURVE_PATH/.difx/*.history
With the exception of the gain curve files, all the input files to
difx2fits are expected to be in the current working directory or in
the place indicated by the .input file. As the visibility file
(.difx) is read, any records that are all zero are omitted.
Usage:
difx2fits\([\) options \(]\) \(\{\)-d\(\mid\) baseFilename1 \([\cdots\)baseFilenameN\(]\) \([\)outFile\(]\) \(\}\)options can be:
-hor–help: print usage information and exit-nor–no-model: don’t write model (ML) table-sscale or–scalescale : scale visibility data by scale-tinterval or–deltatinterval : generate.jobmatrixfile with time intervals of length interval seconds–difx-tsys-intervalinterval : the Difx-derived tsys interval (sec) (default 30.0s averaging)–difx-pcal-intervalinterval : the Difx-derived pcal interval (sec) (default 30.0s averaging)-Sor–sniff-all: sniff all bins and phase centers, not just the first-Tinterval or–sniff-timeinterval : use interval as the sniffer time resolution-vor–verbose: increase verbosity of output; use twice or thrice to get even more-dor–difx: run on all.difxfiles found in the directory-kor–keep-order: don’t sort the antennas by name-1or–dont-combine: make a separate FITS file for each input job-xor–dont-sniff: don’t generate sniffer output files-0or–zero: don’t put visibility data in FITS file–binb : Select on this pulsar bin number–phasecentrep : (U.S. spelling okay too) Create a FITS file for all the p\(^{\rm th}\) phase centers (default 0)–override-version: ignore difx version clashes–bandpass: write the.bandpassfile (see Sec. [sec:filebandpass])-mnJob or–max-jobsnJob : split into more FITS files after reaching nJob input files.–eop-merge-modemode : sets conditions for allowing jobs with different EOPs to be merged or not; options arestrict(default),drop,relaxed–clock-merge-modemode : sets conditions for allowing jobs with different clock models to be merged or not; options arestrict(default) ordrop–antpol: use antenna-based polarization labels as in VEX. Note: fits-idi file will violate original specifications and abide extended specifications.–polxy2hv: re-labels all polarizations XY to HV. Requires –antpol option.
baseFilename is the prefix of the jobfile to convert; it is okay to use the
.difxfilename insteadoutFile is the name of the
FITSfile to produce; if not provided one will be made based on the project codeExample 1:
difx2fits dq109_1 DQ109.FITSExample 2:
difx2fits -v -v -d
Environment variables respected:
DIFX_GROUP_ID: if set, run difx2fits withumask(2).DIFX_LABEL: the local name of the difx install. Used to verify matching versions and put inside FITS file; if not set,DIFX_VERSIONwill be used instead.DIFX_MAX_SNIFFER_MEMORY: maximum amount mf memory (bytes) to allow sniffer to use.DIFX_VERSION: the difxbuild version name.GAIN_CURVE_PATH: directory containing gain files.TCAL_PATH: a directory containing Tcal value files.TCAL_FILE: ifTCAL_PATHis not set, use the file pointed to by this env. var.
Unless adjusted with the –difx-pcal-interval and
–difx-tsys-interval parameters, the respective PC and TY data will
be time averaged to the default of 30 seconds. For geodetic VLBI or
other observations with very short scans you may want to shorten the
averaging time.
Unless disabled with the –dont-sniff or -x flag, four “sniffer”
output files (.acb, .apd, .wts and .xcb) will be written
for each .FITS file produced. These files are used by difxsniff
and its associated programs to produce data plots that are used to
assess data quality.
Unless disabled by setting interval to a non-positive number with the
-t or –deltat option, an output file with suffix .jobmatrix
will be produced. This file contains an ASCII art diagram of which jobs
contributed to each .FITS file produced as a function of both time
and antenna.
Unless augmented with option –antpol the produced FITS files are
fully compliant with the original FITS-IDI specifications. In case of
mixed mode polarization (XY against RL) or certain feed types (HV in
particular) the option –antpol allows to force output of
noncompliant but polarization correct FITS files; FITS-IDI numerical
parameter STK_1 is set to the new value of -9, indicating to
post-processing software that it should refer to the existing FITS-IDI
character parameters Antenna1Feed1 and Antenna2Feed1 (populated from
VEX) for the polarization details.
If submitting a bug report for difx2fits, please include in it the
full output of difx2fits -v -v and the .input and .calc
files.
difx2fits displays several diagnostics during the conversion
process, separately for each output FITS file. The size of each FITS
table is printed; a zero size indicates that table is not produced. For
the visibility table, input files contributing to the output are
printed. Scan information is printed at increased verbosity levels. Not
all DiFX output visibilities are written to the FITS file. Accounting of
the disposition of the visibilities is provided. Invalid records are
those containing infinite or NaN values and indicate a likely bug in the
software. Flagged records are those identified by vex2script (or other
DiFX file set generation programs) in the .flag file as being
illogical, such as cases where a particular baseline during a job
belongs to a different subarray. When using integration times longer
than 1 second it is possible for one visibility to span two scans. Such
records are dropped. Finally any visibilities, produced outside a normal
scan start/stop time are dropped; this should not occur unless the
.calc file is modified between correlation and FITS creation.
If there are any files matching .difx/*.history for .difx/
output being converted to .FITS, the contents of these files will be
inserted into the FITS HISTORY table.
difx2mark4
Program difx2mark4 creates a Mark4 output file set from
mpifxcorr input and output files. When run, difx2mark4 requires
the following files to be present for each file set being converted:
baseFilename
.difx/baseFilename
.inputbaseFilename
.im
as well as the .vex file referenced in the .input file, which
may be common to many DiFX file sets.
Usage:
difx2mark4\([\) options \(]\) baseFilename1 \([\cdots\)baseFilenameN\(]\)options can be:
-hor–help: print usage information and exit-vor–verbose: increase verbosity of output; use twice or thrice to get even more-dor–difx: run on all.difxfiles found in the directory-kor–keep-order: don’t sort the antennas by name–override-version: ignore difx version clashes-ror–raw: suppresses normalization of amplitudes-por–pretend: do a dry run-eor–experiment-numbern : set the experiment number to n which must be a 4 digit number (default is 1234)-bcode flo fhi : Override freq band codes. Frequencies are in MHz. Multiple parameters of this kind can be specified.
baseFilename is the prefix of the jobfile to convert without the underscore and job number
Example
difx2mark4 dq109
difxarch \(\mathrm{(package: nrao\_difx\_db)}\)
Program difxarch is a simple script that moves FITS files
produced by makefits from the correlation queue staging area
(defined by the DIFX_QUEUE_BASE environment variable) to the archive
staging area (defined by environment variable DIFX_ARCHIVE_ROOT). A
process running on the archive computer will periodically monitor new
files in this staging area and will then copy them to the actual
archive. In order to prevent premature pick-up of these files, they are
first moved into a directory with a name beginning with a period
(.). This directory is renamed without the period once all files to
be archived are copied.
Usage:
difxarch\([\) options \(]\) passName1 \([\cdots\)passNameN\(]\)options can be:
-hor–help: print usage information and exit-vor–verbose: increase verbosity of output-por–pretend: generate SQL and bash commands but don’t execute them–override-versionignore DiFX version clashes
passName is the name of a correlator pass; a file called passName
.fitslistis expected to be presentExample:
difxarch -v clock
difxbuild
Program difxbuild aids in the installation of DiFX onto a cluster.
Full documentation on the install process can be found in
§[sec:installdifxbuild], so details will
not be shown here. Command syntax is as follows:
Usage:
difxbuild\([\) options \(]\) command \([\) command arguments \(]\)options can be:
-hor–help: print usage information and exit-dor–documentation: print full in-line documentation to screen-tot–todo: print developer’s to-do list-vor–verbose: increase verbosity of output-qor–quiet: decrease output verbosity-por–pretend: generate SQL and bash commands but don’t execute them-Vor–version: print version and quit
command is one of the
difxbuildcommands, such asbuildorsvn; the program help information will list all optionscommand arguments are options for some commands
difxcalc11
difxcalculator \(\mathrm{(package: difxio)}\)
Program difxcalculator looks at a set of DiFX input files (.input,
.calc, etc.) and reports/calculates key operating parameters. This
program is inspired by the difx_calculator.xls spread sheet
available at
https://github.com/difx/difx/wiki/calculator.
Usage:
difxcalculator\([\) options \(]\) baseName \([\)speedUp\(]\)options can be:
-hor–help: print usage information and exit
baseName is the base name of a correlator job
speedUp is the expected processing rate relative to real-time
Example:
difxcalculator mt933_01
Known bugs:
Does not take into consideration multiple phase centers or zoom bands.
difxclean \(\mathrm{(package: nrao\_difx\_db)}\)
Program difxclean simply deletes all data from $DIFX_QUEUE/ for
a particular project. It also removes all jobs with status not equal to
COMPLETE for this project from the DIFXQUEUE table of the
database. It is intended that is be run at the same time the project is
released, meaning data has been correlated and archived. The user does
not have to be in any particular directory when running this program.
Usage:
difxclean\([\) options \(]\) projectoptions can be:
-hor–help: print usage information and exit-por–pretend: don’t actually do the erasure
project is the name of the project to be “cleaned” out
Example:
difxclean mt917
difxcopy \(\mathrm{(package: misc\_utils)}\)
Python program difxcopy is used to copy DiFX input (and other) files
to a different directory. In the process, explicit references to other
files that are being copied are changed to reflect their new file system
path. For a given file prefix, prefix, the following files are copied
if they exist: prefix.input, prefix.calc,
prefix.flag, prefix.delay, prefix.uvw,
prefix.rate, and prefix.im. The .vex file referenced
within the .calc file is also copied.
Usage:
difxcopy\([\) options \(]\) jobPrefix1 \([\cdots\)jobPrefixN\(]\) destDiroptions can be:
-hor–help: print usage information and exit-vor–verbose: possibly increase verbosity of output
jobPrefixN is the file prefix of a job, e.g.,
mt911_04would be the prefix corresponding to input filemt911_04.inputdestDir is the directory in which the copied and modified files will be placed
Example:
difxcopy mt911_02 mt911_03 mt911_04 /home/difx/queue/MT911
difxdiagnosticmon \(\mathrm{(package: difxmessage)}\)
Program difxdiagnosticmon listens for multicast messages of the
difxStatus variety and simply prints their contents to the terminal.
This is mainly useful for debugging mpifxcorr. Diagnostic
information that is produced includes status of internal buffers, memory
usage, execution time, data throughput and lost subintegrations.
Usage:
difxdiagnosticmon\([\) options \(]\)options can be:
-hor–help: print usage information and exit
difxlog \(\mathrm{(package: difxmessage)}\)
Program difxlog can be used to collect DiFX multicast messages for a
particular correlator job and write them to a file. A new instance of
difxlog must be started for each job being run. Normally this will
be done automatically if implemented in the particular deployment of
DiFX. Both startdifx and mk5daemon can start DiFX and will
instantiate a difxlog process as needed. Only messages of type
DifxAlertMessage and DifxStatusMessage are collected and written
to the output file.
Usage:
difxlogjobIdentity outFile \([\) logLevel pidWatch \(]\)jobIdentity : the name of the job being run (specifically, it should match the
identifierfield in the DifxMessage being sent).outFile : the name of the output file containing log information.
logLevel : the minimim message severity to retain (see §[sec:difxalertmessage]).
pidWatch : the program id (in the Unix sense) of the mpifxcorr process running.
Example:
difxlog mt911_04 mt911_04.difxlog 4 1243
Unless a pidWatch value is specified, difxlog will run until
killed. If a pidWatch value is provided, difxlog will quit as soon
as that process stops running. The loglevel parameter can be used to
select the maximum severity level to write to the log. The possible
values and their meanings are:
0 |
Fatal |
|
1 |
Severe |
an internal error that should never happen happened (likely bug) |
2 |
Error |
a problem was encountered in the data processing |
3 |
Warning |
something suboptimal was noted |
4 |
Informative |
a note containing progress information |
5 |
Verbose |
more detailed progress information |
6 |
Debug |
values probably of use only to software developers |
difxqueue \(\mathrm{(package: nrao\_difx\_db)}\)
This is an NRAO-only program owing to its ties to the VLBA database.
Python program difxqueue is a program used to maintain the DiFX
correlator queue. There are two main responsibilities in doing so:
copying or deleting files in the correlator queue directory (which is
project specific: $DIFX_QUEUE_BASE/projectName) and maintaining
the database entries for each queued job. In the VLBA context, this
program is the main interface between the analysts and the correlator
operators. This program is mainly intended to work on one job pass at
a time rather than single jobs or whole projects. In some cases one job
pass could be one job, or it could be a whole project (or both), but in
many cases there will be multiple passes per project with possibly
multiple jobs per pass. It is possible for difxqueue to operate on
individual jobs when a list of job numbers is provided. The first
command line argument describes the action to perform. Each subsequent
argument is then context dependent; see the examples or run with the
-h command line parameter to get a feel for the variety of options
allowed. Once a job has been correlated successfully, its status will be
COMPLETE. There is no need to delete a job from the queue once it
is complete. Doing so will require recorrelation if the results of that
job are still needed. Each job in the queue has a priority. The smaller
the priority, the lower the number. By default a queued job will have
priority 2.
Usage:
difxqueue\([\) options \(]\) action \([\) args \(]\)options can be:
-hor–help: print usage information and exit-ppriority or–prioritypriority : set the priority of jobs to priority-qqueuedir or–queuedirqueuedir : manually set the staging directory-vor–verbose: increase verbosity of output-dor–db-only: do not copy/delete/move files; operate only on database–override-version: ignore DiFX version clashes
action : the action to perform
add: add job(s) to the queue, usually a whole pass at a time. Default priority is 2; use the-poption to set the priority if a different priority is required.Example 1:
difxqueue add clockExample 2:
difxqueue add mt911 1 2 3 4Example 3:
difxqueue -p 3 add geodesybump: increase the priority of queued job(s)Example:
difxqueue bump clockdel: remove job(s) from the queueExample 1:
difxqueue del clockExample 2:
difxqueue del mt911 2 3list: list all jobs within a passExample:
difxqueue list mt911listall: list all incomplete jobs in the queue; note that this is not restricted even to any particular project. If one or more projects is specified, all jobs, complete or not, for those projects will be listed. If no segment code is appended to a project name, then all matching proposal codes will be listed.Example 1:
difxqueue listallExample 2:
difxqueue listall BX123 BY321Example 3:
difxqueue listall BR138Alog: list all correlations that have happened for a given project. This simply searches the DIFXLOG database table and dumps it to the screen in a readable fashion.Example 1:
difxqueue log BX123prod: print production queue list, possibly sending to a fileExample 1:
difxqueue prodExample 2:
difxqueue prod queue.txtset: set the status of queued job(s)Example 1:
difxqueue set tc015d COMPLETEExample 2:
difxqueue set tc015d QUEUED 3 4slide: decrease the priority of queued job(s)Example:
difxqueue slide mt911 6
args : action dependent arguments, usually a pass name and possible list of job numbers
Note that exept for the listall and prod actions, the current
working directory must contain the .joblist file for a project.
difxsniff \(\mathrm{(package: SniffPlots)}\)
Program difxsniff is a reimplementation of the VLBA analysts’
program sniff.pd to be more appropriate for software correlation
where the sniffer data is generated at the same time as the FITS files.
It uses the same underlying set of plotting programs (plotwt,
plotbp, and plotapd) as sniff.pd did. It should be run in a
project directory as it will create a subdirectory (if not existing
already) which by default is called sniffer/refant within the
current directory. All files created by difxsniff will be placed in
this directory, overwriting existing files with the same filenames.
Unlike sniff.pd, difxsniff is a purely non-interactive command
line program. Note that although .FITS files are provided to
difxsniff, it is the associated files ending in .apd, .wts,
.acb and .xcb that are actually read.
Usage:
difxsniff\([\) options \(]\) refants FITS1 \([\ \cdots\) FITSN \(]\)options can be:
-hor–help: print usage information and exit
refants is a list reference antennas, separated by spaces
FITS is a FITS file created by
difx2fits; multiple FITS files can be specified togetherExample 1:
difxsniff LA *.FITSExample 2:
difxsniff NL FD *.FITS
difxspeed \(\mathrm{(package: vex2difx)}\)
Program difxspeed does processing benchmarking, possibly over a
range of parameters, of DiFX. To ensure that data playback (reading from
files, Mark5 modules or network) are not limiting performance, the FAKE
mode of DiFX (see §[sec:fake]) is used; thus the
output data are meaningless. difxspeed takes a .speed
(§[sec:speed]) file as input. This file contains
various parameters, many of which are identical to those in the .v2d
(§[sec:v2d]) files. An important difference with the
parameters specified in .speed files is that multiple values can be
provided for many of the parameters. In the benchmarking process, a
separate run of DiFX for each combination of the supplied parameters is
performed. The first combination is run twice, with the first being
labeled a dummy run. This is because the timing of the first execution
can vary depending on recent usage of the correlator.
Usage:
difxspeed\([\) options \(]\) inputFile \([\) numIterations \(]\)options can be:
-hor–help: print usage information and exit
inputFile is the
.speedfile describing the series of benchmarks to runnumIterations is the number of times to execute all test combinations
Each run of difxspeed will append a new column of data to a file
called inputFile.out ; if the file does not exist, a new file
will be created. Documentation of this output file format can be found
in §[sec:speed.out].
difxusage \(\mathrm{(package: nrao\_difx\_db)}\)
This is an NRAO-only program owing to its ties to the VLBA database.
Program difxusage mines the VLBA database for correlator usage
statistics. Usage is as follows:
Usage:
difxusage\([\) options \(]\) mjdStart mjdStopoptions can be:
-hor–help: print usage information and exit-vor–verbose: be more verbose in execution-lor–list: print all matching jobs-aor–all: select jobs in all states-cor–complete: select only complete jobs (default)-kor–killed: select only killed jobs-uor–unknown: select only unknown jobs
mjdStart is the start time (in Modified Julian Days) to look for jobs
mjdStop is the stop time (in Modified Julian Days) to look for jobs
Note that environment variable VLBA_DB must be set to point to the
postgres database in question.
difxvmf \(\mathrm{(package: calcif2)}\)
Note: this is coming in DiFX 2.7 series…
difxvmf takes a DiFX fileset (including the .im file) and
modifies the wet and dry troposphere values based on the Vienna Mapping
Functions. This program retrieves the needed external data from
http://vmf.geo.tuwien.ac.at. The .im file will be replaced with an
updated version.
Usage:
difxvmf\([\) options \(]\) filebase1 \([\) filebase2 …\(]\)options can be:
-hor–help: print usage information and exit-vor–verbose: be more verbose in execution-wor–usewx: use metrology data from each site rather than defaults
filebasen is the
.inputfile or prefix to be processed; multiple can be provided.
If –usewx is specified, files of the form
project.station.weather will be looked for in the
local directory and used to supply metrology data, overriding defaults.
Environment variables:
DIFX_VMF_DATA: a writable directory for caching downloaded VMF dataDIFX_VERSION: to enforce DiFX version compatibility
difxwatch \(\mathrm{(package: difxmessage)}\)
Program difxwatch can be used to monitor progress of ongoing DiFX
jobs and kill jobs that appear to be hung.
Usage:
difxwatch\([\) options \(]\)options can be:
-hor–help: print usage information and exit–version: show program’s version number and exit-iidletime or–idle-timeidletime : maximum number of seconds a job is allowed to be idle before it is killed.
DiFX Operator Interface
The DiFX Operator Interface (DOI) is a java-based application to monitor and control the correlation of DiFX jobs. Jobs to be correlated can be selected with a file browser or retrieved with a database request. A separate manual [doi] will be made available with instructions for its use. The only specific detail that will be mentioned here is the contorted path the starting of a job takes:
The job to run is selected.
The DOI determines which resources (Mark5 units and processor nodes) are required.
If the intended output file already exists, a dialog will ask the operator whether to overwrite this file or not.
The DOI allocates resources.
The DOI assembles a
DifxStartMessageXML document and multicasts it with the correlator head node as the recipient.The
mk5daemonprocess running on the head node captures this message.mk5daemonfork()s; the child process changes its userId todifxand spawns anmpirunprocess viasshto ensure the proper environment variables are set.The
mpirunprocess starts a copy ofmpifxcorron each of the Mark5 units and processing nodes that is requested.mk5daemonfork()s again; the child process changes its userId todifxand spawns adifxlogprocess viasshto ensure the proper environment variables are set.All processes continue until job end is reached or the job is killed.
When the first fork()ed
mk5daemonprocess ends, thedifxlogprocess stops automatically, causing the second fork()ed process also to stop.The DOI receives messages suggesting the job has ended and frees the allocated resources.
e2ecopy \(\mathrm{(package: nrao\_difx\_db)}\)
Program e2ecopy copies files one directory to another, changing the
ownership to $DIFX_ARCH_USERNAME in the process. This program must
be setuid root; the person installing the program must run
chmod +s e2ecopy after installation if make install is not run
by root. Normally this program is run by difxarch (see
§1.16). This is a VLBA-centric program, but could
be used by others.
Usage:
e2ecopy\([\) options \(]\) fromDir toDir file1 \([\ \cdots\) fileN \(]\)options can be:
-hor–help: print usage information and exit-vor–verbose: be more verbose
fromDir : the source directory of the file(s) to copy
toDir : the destination directory
file : a file to copy (multiple files may be provided)
Note: “e2e” is NRAO terminology for “End to End”, a philosophy of providing user software covering the full project lifecycle from proposal handling to archive access. In this particular case the name arose due to the location of the archive staging area at NRAO.
errormon \(\mathrm{(package: difxmessage)}\)
Program errormon listens for multicast messages of the difxError
variety and simply prints their contents to the terminal. It is
effectively the same as difxlog except that log data is sent to
stdout rather than a systematically named file.
Usage:
errormon\([\) options \(]\) \([\)maxSeverity\(]\)options can be:
-hor–help: print usage information and exit
maxSeverity : maximum severity level to display (default = 8)
See § [sec:difxalertmessage] for a list of severity codes. If no maxSeverity is provided, the default level of 8 will cause no selection to occur; all messages will be printed.
A similar program, errormon2, does nearly the same thing, but
defaults to a less verbose output, and sends output to stderr rather
than stdout (so use with grep or other *nix tools is more
cumbersome. It also writes its output to a log file.
See documentation for difxlog to see the list of alert levels.
extractSingleVDIFThread \(\mathrm{(package: vdifio)}\)
This program has been superceded by vmux (see
Sec. 1.105).
extractVDIFThreads \(\mathrm{(package: vdifio)}\)
This program has been superceded by vmux (see
Sec. 1.105).
fakemultiVDIF \(\mathrm{(package: vdifio)}\)
Note: the input file for fakemultiVDIF must have a single thread or
unpredictable results will occur.
fileto5c \(\mathrm{(package: mark5daemon)}\)
filterVDIF \(\mathrm{(package: vdifio)}\)
generateVDIF \(\mathrm{(package: vdifio)}\)
genmachines \(\mathrm{(package: mpifxcorr)}\)
Program genmachines uses the information in a .input file and a
file containing information about the members of the compute cluster
(such as the file pointed to by $DIFX_MACHINES) to produce a
.machines file (§[sec:machines]) needed by
mpifxcorr. Note that genmachines is not intended to be run by
hand anymore as startdifx does this, if necessary. If playback
directly off Mark5 units is to be done, genmachines will send a
multicast request to all Mark5 units on the correlator requesting an
inventory of loaded Mark5 modules. The mk5daemon process on each
unit will respond with another multicast message containing the loaded
modules and the status of the unit, i.e., whether busy or available to
be used. This information is collected by genmachines which will
look for availability of all the modules and detect conflicts (i.e., two
needed modules loaded in the same unit). If all needed modules are found
and enough resources remain for the computations, a .machines file
and a .threads file are written. Note that the .machines file
contains a certain number of comment lines so that the use of Unix
command wc -l can be used to determine exactly how many processes
will be started. It is suggested to run this program immediately before
starting the software correlator to minimize the chance that the Mark5
units change their status or that information about the modules
whereabouts becomes stale; it is thus discouraged to run with
.input.
Usage:
genmachines\([\) options \(]\) input1 \([\ \cdots\) inputN \(]\)options can be:
-hor–help: print usage information and exit-vor–verbose: be more verbose-oor–overheadcoresohc : leave at least ohc on each compute node unscheduled-mfile or–machinesfilefile : use file instead of$DIFX_MACHINES-nor–no-threads: don’t write a.threadsfile.-dor–difxdb: lookup module locations in a database.
input is a
.inputfile; multiple files can be specified, each producing its own.machinesfile
getshelf \(\mathrm{(package: nrao\_difx\_db)}\)
This is an NRAO-only program owing to its ties to the VLBA database.
Program getshelf retrieves the shelf location of specified modules
from the legacy VLBA database and prints them to the screen. While
possibly useful, this program is not required for the software
correlation process.
Usage 1:
getshelf\([\) options \(]\) module1 \([\) module2 \([ \cdots ] ]\)Usage 2:
getshelf\([\)options\(]\) shelfFileoptions can be:
-hor–help: print usage information and exit-vor–verbose: print the database query string as well
moduleN is the volume serial number (VSN) of a module; multiple modules can be specified
shelfFile is a
.shelffile (§[sec:shelf]), as may be written bydb2vexDefault is the current working directory if none is provided.Example 1:
getshelf NRAO+267Example 2:
getshelf bx123a.skd.shelf
jobdisks \(\mathrm{(package: mpifxcorr)}\)
Program jobdisks looks through job files to see which modules
(disks) are needed for correlation. It reads through .input files,
as used by mpifxcorr, to get the needed information. There are two
modes of operation. By default, a matrix of all modules for all stations
is displayed, with a – symbol indicating that a particular station
is not used in a particular job. An asterisk (``) indicates a module
change. The second mode, instigated with command line argument ``-c,
summarizes only module changes. Running without any arguments will cause
jobdisks to look at job files within the current directory,
prioritizing on .input files if any exist and falling back on
.fx files otherwise. Listings for a subset of jobs can be made by
specifying particular files.
Usage:
jobdisks\([\) options \(]\) \([\)file1\(] \cdots [\) fileN \(]\)options can be:
-hor–help: print usage information and exit-cor–changes: print module changes only
file is a
.fxor.inputfile; mixed types are not supported. Multiple input files may be supplied.Example 1:
jobdisksExample 2:
jobdisks job1420*.inputExample 3:
jobdisks *.fxExample 4:
jobdisks -c
Known bugs:
Program should make sure datastream type is MODULE.
joblist \(\mathrm{(package: mpifxcorr)}\)
Program joblist prints useful information about DiFX correlator jobs
to stdout. Six columns of output are produced:
Job file base filename
File indicator, showing a particular character for each one of the files associated with that job that is found within a pair of square brackets, \([\ ]\):
.calcfile (§[sec:calc]).machinesfile (§[sec:machines]).threadsfile (§[sec:threads]).imfile (§[sec:im]).difxfile (§[sec:difx])
Band code of first scan in file
Observation duration of correlation (in minutes)
Recording mode triplet; three integers(data rate(Mbps), number of baseband channels & quantization bits) separated by dashes
Comma separated list of antennas
One line is printed for each .input file found in the list of
directories provided (or current directory if not listed).
Usage:
joblist\([\) options \(]\) \([\) dir1 \(] \cdots [\) dirN \(]\)options can be:
-hor–help: print usage information and exit
dir is a directory for which to print job information (default is current shell directory). Multiple directories can be specified.
Example 1:
joblistExample 2:
joblist $JOB_ROOT/*
jobstatus \(\mathrm{(package: mpifxcorr)}\)
Warning: As of DiFX 2.0.1, this utility has not yet been updated to work with DiFX 2 output format.
Program jobstatus lists the current correlation progress for each DiFX job in one or more directories. This program is normally run without any command line arguments from within the project directory. For each job, the base filename is listed with 5 or 6 additional columns of data. These columns are
Observation duration (minutes)
Record mode triplet (Mpbs-nChan-nBit)
Number of stations in job
Speed up factor (ratio of correlation time to observe time), or zero if correlation has not yet begun.
Percentage complete
Number of minutes remaining (only if Percentage complete isn’t 0% or 100%)
Below these lines, five more lines containing information about the group of jobs as a whole is are presented. The contents of these lines are:
Total job time : Minutes of observe time in listed jobs
Fraction complete : Percentage in time through the entire project
Job time remaining : Minutes of observation left to be correlated
Wall time remaining : Minutes of real time needed to complete jobs
Average speedup : Ratio of total correlation time to run time, up to current point
Note that the speedup and time remaining values are estimates and don’t include model calculation, conversion to FITS, and job startup time.
Usage:
jobstatus\([\) options \(]\) \([\) dir1 \(] \cdots [\) dirN \(]\)options can be:
-hor–help: print usage information and exit
dir is a directory for which to print job information (default is current shell directory). Multiple directories can be specified.
Example 1:
jobstatusExample 2:
jobstatus $JOB_ROOT/*
Known bugs:
This program has not been updated to work with DiFX 2.0 output
listcpus \(\mathrm{(package: mk5daemon)}\)
Python program listcpus uses ssh to connect to each machine listed
in a file (usually $DIFX_MACHINES) and peaks at the list of CPUs on
that machine and prints to stdout. Only the first column of this file is
used and any content after a # is ignored. For each CPU on the
machine, the model name, which usually also contains the CPU speed, is
listed. For multi-core CPUs, each core will appear as its own CPU.
Usage:
listcpus\([\) options \(]\)options can be:
-hor–help: print usage information and exit-vor–verbose: increase output verbosity-mfile or–machinesfile : use file instead of$DIFX_MACHINESfor list of machines to probe
Multiple directories can be specified.
Example 1:
listcpusExample 2:
listcpus -m myCPUs.list
makefits \(\mathrm{(package: difx2fits)}\)
This program is tuned for NRAO use; some modifications may be required
for use at other sites. In particular, this program requires that jobs
were queued to be run in $DIFX_QUEUE_BASE//experiment/.
Program makefits is basically a wrapper for difx2fits
(§1.14) that does some sanity checking and
ensures that files end up in the proper places with the proper names.
This program is intrinsically pass-based and it bases its
functionality on the .joblist
(§[sec:joblistfile]) file that is written by
vex2difx (§1.102). One must run this program
on the software correlator head node (swc000 in the current VLBA
DiFX implementation). Upon successful completion, FITS-IDI files are
created in the same directory in the correlator job staging area
($DIFX_QUEUE_BASE/projectName) and the sniffer output files are
left in a subdirectory of the current working directory. An additional
output file is left in the current working directory called
passName.fitslist . This file has a list of the FITS files that
are to be archived once the data for this pass are deemed valid.
The checks that makefits performs will by default not allow an
incomplete set of FITS files to be produced. This can be overridden with
a special command line argument (below). This is part of an
accountability chain that aims to ensure that nothing gets omitted.
Usage:
makefits\([\) options \(]\) passNameoptions can be:
-hor–help: print usage information and exit-vor–verbose: increase output verbosity–override-version: ignore potential difx version conflicts–allow-partial: bypass check for complete set of correlated output and proceed
Multiple directories can be specified.
Example:
makefits clock
makemark4 \(\mathrm{(package: difxdb)}\)
This program is tuned for NRAO use; some modifications may be required for use at other sites.
Program makemark4 is essentially a wrapper for difx2mark4
(§1.15) that does some sanity checking and
ensures that files end up in the proper places with the proper names.
This program is intrinsically pass-based and it bases its
functionality on the .joblist
(§[sec:joblistfile]) file that is written by
vex2difx (§1.102). One must run this program
on the software correlator head node (swc000 in the current VLBA
DiFX implementation). Upon successful completion, Mark4 file sets are
created in the same directory in the correlator job staging area
($DIFX_QUEUE_BASE/projectName). An additional output file is
left in the current working directory called passName.mark4list
. This file has a list of the Mark4 file sets that are to be archived
once the data for this pass are deemed valid.
Usage:
makemark4\([\) options \(]\) passNameoptions can be:
-hor–help: print usage information and exit-vor–verbose: increase output verbosity
Example:
makemark4 rdv95
m5bstate \(\mathrm{(package: mark5access)}\)
Program m5bstate will perform state counts on a baseband data file.
Usage:
m5bstatefile format nFrames \([\) offset \(]\)file is the file to decode
format is the format of the data
nFrames is the number of data frames (typically a few kB in size) to decode
offset is the number of bytes into the file to start decoding
Example:
m5bstate sample.vlba VLBA1_2-256-8-2 100
Notes:
See documentation for
m5bfor details on specifying the data format.Only real-sampled data with 1 or 2 bits per sample is supported at this time.
In the case of VDIF data, only single thread data with \(2^n\) channels is supported. For equivalent functionality in the multi-thread VDIF case see
vdifd.
m5d \(\mathrm{(package: mark5access)}\)
Program m5d is a very simple example program using the mark5access
decoding library. It turns out to be useful enough as a stand-alone
program to be separately documented. This program takes as command line
input the name of a file containing (or thought to be containing) VLBI
baseband data, the expected format of the data, and the number of
samples per baseband to decode. Optionally a starting file offset can be
supplied. If the data can be decoded correctly, information about the
data will be printed to the screen along with a table of decoded data.
The output values, -3, -1, 1, or 3 for valid data, are printed in
nchan columns. Data that cannot be decoded (either due to data
replacement headers, data fill pattern replacing the actual data after
unloading from a Mark5 module, or identified via the VDIF invalid bit)
will show as 0. It should be invoked with the following parameters:
Usage:
m5dfile format n \([\) offset \(]\)file is the file to decode
format is the format of the data
n is the number of samples to decode
offset is the number of bytes into the file to start decoding
Example 1:
m5d sample.vlba VLBA1_2-256-8-2 24Example 2:
m5d sample.mk4 MKIV1_4-128-2-1 600 200Example 3:
m5d sample.5b Mark5B-512-16-2 1200
The format parameter is constructed from four parts as type-rate-nchan-nbit where:
type is the type of format and should be one of
VLBA1_1,VLBA1_2,VLBA1_4,MKIV1_1,MKIV1_2,MKIV1_4,Mark5B, orVDIFrate is the data rate in Mbps
nchan is the number of baseband channels
nbit is the number of bits per recorded sample
See the usage examples above for some explicit values. Note for the
VLBA and MKIV format types the fanout is appended as this
affects the decodability of the files.
Notes:
In the case of VDIF data, only single thread data with \(2^n\) channels is supported. For equivalent functionality in the multi-thread VDIF case see
vdifd.
m5findformats \(\mathrm{(package: mark5access)}\)
Program m5findformats attempts to determine which format a baseband
data file may be. Currrently it searches over 16 to 2048 Mbps data rates
in factors of 2 and checks only for MKIV, VLBA and Mark5B types.
Usage:
m5findformatsfilenamefilename is the name of the baseband data file.
Run with no command line arguments to get help information.
m5fold \(\mathrm{(package: mark5access)}\)
Program m5fold takes a baseband data stream and integrates the power
(formed by squaring the voltage) in a number of time bins that equally
divide a given period. This is a simplifed version of “folding” such as
is used in pulsar processing. A typical use of such functionality would
be to investigate the waveform of the switched power injected into the
receiver for calibration. This program has found considerable utility in
determining time offsets between the sample clock and formatter time
(modulo the period of the calibration cycle). In the case of 2-bit
sampling a non-linear correction is applied before results are written
to a file. This correction takes the form
where \(P\) is a value proportional to true power and \(\hat{P}\) is the value obtained by calculating \(\left<\hat{v}^2\right>\) when the bitstream is reproduced with values \(\hat{v} \in ( -v_{\rm high}, -1, 1, v_{\rm high} )\). This non-linear correction can be turned off by setting nbin to a negative value. Note that this program is not useful for 1-bit quantized data. The program should be used as follows:
Usage:
m5foldinfile format nbin nchunk freq outfile \([\) offset \(]\)infile is the file to decode
format is the format of the data‘
nbin is the number of bins to calculate per period; if negative, power correction is not performed and the absolute value of nbin is used
nchunk is the number of 10000 sample chunks to operate on
freq is the reciprocal of the period to be observed (Hz)
outfile is the name of the output file
offset (optional) is the number of bytes into the file to start decoding
Example:
m5fold sample.vlba VLBA1_2-256-8-2 128 10000 80 sample.fold
See the documentation for m5d for information on specifying the data
format.
The output file will contain nchan+1 columns where nchan is the number of baseband channels in the data stream. The first column contains the time (seconds) within the period. Each remaining column is folded power for one baseband channel. If nbin is positive and the data is 2-bit quantized, the scaling is such that \(\left<v^2\right> = \sigma^2\) yields a power reading of 1.0, for sampler threshold \(\sigma\). Optimal signal to noise ratio occurs for a value of about 1.03. For non 2-bit quantization, the power will be in units of reconstituted \({\rm counts}^2\).
In the case of VDIF data, only single thread data with \(2^n\)
channels is supported. For equivalent functionality in the multi-thread
VDIF case see vdiffold.
m5pcal \(\mathrm{(package: mark5access)}\)
Program m5pcal can be used to extract pulse cal tones from baseband
data in Mark4, VLBA, Mark5B and single-thread VDIF formats.
Usage:
m5pcal\([\) options \(]\) infile format freq1 \([\) freq2 \([ \ldots ] ]\) outfileoptions can be:
-hor–help: print usage information and exit-cn or–chunksizen : use a fixed rather than automatic chunk size-vor–verbose: increase output verbosity-qor–quiet: decrease output verbosity-nn : loop over n chunks of data (default is 1000)-NN : perform N outer loops, each yielding a result set-oo or–offseto : jump o bytes into file-ii or–intervali : use pulse cal comb interval of i MHz (default is 1)-ee or–edgee : don’t use channels closer than e MHz from band edges when computing delay (default is 1/8 of bandwidth)
infile is the file to decode
format is the format of the data‘
freq1 …is/are the frequencies (MHz) relative to baseband of the first tone to detect; there should be one freq specified per baseband channel
outfile is the name of the output file
Note: The position of the first tone in a baseband channel (freq1 for
baseband 1, and so on) must not be larger than the tone interval (set
with -i i). All tones are extracted from each baseband channel.
The tone interval is allowed to exceed the bandwidth of a baseband
channel in which case freqN will effectively select just a single tone
from the baseband.
See the documentation for m5d for information on specifying the data
format.
m5slice \(\mathrm{(package: mark5access)}\)
m5spec \(\mathrm{(package: mark5access)}\)
Program m5spec is an example program using the mark5access decoding
library that is a bit more advanced than the m5d program is. It
forms total power spectra for each baseband channel in the data,
including cross spectra for polarization pairs, assuming data is in
alternating polarization pairs (if not, the cross spectra should make no
sense, but they are formed anyway). The results are written to a text
file with the following columns: Column 1 is the frequency offset from
baseband for each channel; Columns 2 to nchan+1 are the total power
spectra for each baseband channel; Columns nchan+2 to
\(4 \times\)nchan+1 contain, in pairs, the amplitude and phase
of the cross spectra for each pair of channels. It should be invoked
with the following parameters:
Usage:
m5specinfile format npoint n outfile \([\) offset \(]\)infile is the file to decode
format is the format of the data
npoint is the number of points to calculate for each spectrum
n is the number of FFT frames to include in the calculation
outfile is the name of the output file
offset (optional) is the number of bytes into the file to start decoding
Example:
m5spec sample.vlba VLBA1_2-256-8-2 256 1000 vlba.spec
See the documentation for m5d for information on specifying the data
format.
In the case of VDIF data, only single thread data with \(2^n\)
channels is supported. For equivalent functionality in the multi-thread
VDIF case see vdifspec.
m5test \(\mathrm{(package: mark5access)}\)
Program m5test is an example program using the mark5access decoding
library that works its way through a VLBI baseband data stream
attempting to decode data and header information to look for problems.
Every million samples (per baseband channel) a summary line containing
frame number, decoded date and time, and counts of valid and invalid
frames are shown. After 20 invalid frames are encountered the program
will stop. Otherwise the program will run until end of file or until
interrupted by the user. Usage is as follows:
Usage:
m5testinfile format \([\) offset \(]\)infile is the file to decode
format is the format of the data
offset (optional) is the number of bytes into the file to start decoding
Example:
m5test sample.vlba VLBA1_2-256-8-2
See the documentation for m5d for information on specifying the data
format.
m5time\(\mathrm{(package: mark5access)}\)
Program m5time decodes the time of the beginning of a Mark4, VLBA,
or Mark5B datastream and prints the result in integer MJD and UT hours,
minutes, seconds to the screen.
Usage:
m5timeinfile formatinfile is the file to decode
format is the format of the data‘
m5timeseries \(\mathrm{(package: mark5access)}\)
Program m5timeseries produces a power measurments for each of
channel of a baseband data file, averaging over a specified time
interval.
Usage:
m5timeseriesinfile format tint ntime outfile \([\) offset \(]\)infile is the file to decode
format is the format of the data‘
tint is the integration time per sample in milliseconds
ntime is the number of samples to generate
outfile is the name of the output file
offset (optional) is the number of bytes into the file to start decoding
Example:
m5timeseries sample.vlba VLBA1_2-256-8-2 6.25 8000 sample.series
The output file contains nchan+2 columns of data where nchan is the number of channels in the data file. The first column is sample number. The second column is time since beginning of series, in seconds. The remaining columns are power measurements for the channels.
m5tsys \(\mathrm{(package: mark5access)}\)
mk5cat \(\mathrm{(package: mk5daemon)}\)
This program sends data on a module to standard out. See additional
documentation under mk5cp which operates on similar principles
(mk5cat is mk5cp writing to stdout Note that the executable for mk5cat
is identical to that for mk5cp and only the name of the program actually
differs.
Usage:
mk5cat\([\) options \(]\) { bank \(\mid\) VSN } scansoptions can be:
-hor–help: print usage information and exit-vor–verbose: increase verbosity, e.g., print directory to screen
bank is either
AorBVSN is a valid 8-character VSN of a loaded module
scans is one or more scan numbers (starting at 1) with scan numbers separated by commas.
Many of the other baseband data utilities documented here such as
m5d, m5spec and vmux can take input from stdin and thus
can be mated with mk5cat. Usually a single hyphen (-) as the
name of the input file indicates this to these programs.
Example:
mk5cat B PT_BB241_No0111 | m5spec - Mark5B-2048-16-2 128 10000 methanol.spec
mk5control \(\mathrm{(package: mk5daemon)}\)
mk5control is a program that sends XML messages of type
DifxCommand to the mk5daemon programs that run on the software
correlator cluster members. This program is a superset of mk5take
and mk5return, allowing any allowed command to be sent.
Usage:
mk5control\([\) options \(]\) command unit1 \(\cdots\) unitNoptions can be:
-hor–help: print usage information and exit.
command is the (non-case-sensitive) command to be executed; see list below.
unit is the number of a correlator Mark5 unit, a range,
allfor all software correlator cluster members,mark5for all Mark5 units, orswcfor all software correlator compute nodes.Example 1:
mk5control stopmark5a 07 08 09 11 14Example 2:
mk5control resetmark5 14-24Example 3:
mk5control startmark5a mark5
The list of supported command types is below. All commands are not case sensitive.
GetVSNRequest aMark5StatusXML document to be multicast from the unitResetMark5ExecuteSSResetandssopen; this cures many/most mark5 hangsClearClear the stat of the Mark5 unit and get the VSNs, can be dangerous if other programs are currently using the Streamstor cardRebootReboot the machinePoweroffShut down the machineStopMk5DaemonStop themk5daemonprogram; you probably never need to do thisGetDirExtract the directory from the modules in both banks and save to files in $MARK5_DIR_PATHGetDirASame as above, but look only at bank AGetDirBSame as above, but look only at bank BStopDirStop a directory read that is in progressKillMpifxcorrSend sigkill (likekill -9) to all processes on machine called mpifxcorrCopyCopy data from a module to files in a provided directory. At least three parameters must be provided that match the parameters ofmk5cp. Because of the way mk5control parses the command line, the wordcopyand the parameters must all be enclosed in quotes.StopCopyStop a data copy processGetVerRequest send of aDifxMessageMk5VersionXML messagemountXX Cause Linux device/dev/sdXX to be mounted on/mnt/usbumountCause/mnt/usb to be unmountedTestUsed in debugging — for developers only
mk5cp \(\mathrm{(package: mk5daemon)}\)
Program mk5cp copies baseband VLBI data from a module to a file
somewhere on the operating system filesystem, perhaps an external USB
disk. This program is often started using mk5control to tell the
instance of mk5daemon running on the desired Mark5 unit to start the
copy. Status information is multicast via a Mark5StatusMessage
document.
Usage:
mk5cp\([\) options \(]\) { bank \(\mid\) VSN } scans outputDirectoryoptions can be:
-hor–help: print usage information and exit-vor–verbose: increase verbosity: print directory to screen
bank is either
AorBVSN is a valid 8-character VSN of a loaded module
scans is one or more scan numbers (starting at 1) with scan numbers separated by commas. No spaces are allowed in the list. A range can be specified with a hyphen (see examples). Alternatively, a scan name, or portion thereof, can be specified. When a partial scan is provided, any scan name matching that partial scan name will be copied. The scans parameter can also take either a time range (two floating point modified Julian Days connected with an underscore, or a byte range (must be a multiple of 4) via two integers separated by an underscore, or a byte start and a length can be specified with two integers separated by a plus sign.
outputDirectory is the directory to which files will be copied. Make sure the destination directory exists before running this program and make sure sufficient free space remains on that filesystem. If the outputDirectory parameter is set to
-(the hyphen), data will go to stdout. Another utility calledmk5catis derived from this behavior.Example 1:
mk5cp NRAO-123 4 /mnt/usb/WaltersProjectExample 2:
mk5cp A 1,2,3 /tmpExample 3:
mk5cp B BC120A /mnt/usb/BC120A/PTExample 4:
mk5cp NRAO+255 6-12 /tmpExample 5:
mk5cp NRAO+255 123123100_124123100 /tmpExample 6:
mk5cp NRAO+255 54327.13124_54327.15124 /tmpExample 7:
mk5cp NRAO+322 123123100+1000000 /tmp
mk5daemon \(\mathrm{(package: mk5daemon)}\)
mk5daemon is a program that started automatically at boot time on
all of the software correlator cluster nodes (not only the Mark5 units!)
that performs a number of operations in support of the software
correlator.
The functions that mk5daemon performs are:
Logging
All received multicast messages, significant internal functions, and interactions of the
Mark5Aprogram are logged to human readable log files. These log files are restarted at the beginning of each day. By default these log files are saved in/tmp.High level control of Mark5 units
The
Mark5Aprogram (written by Haystack) is the principle program used to access the Mark5 systems at the VLBA stations and the hardware correlator. DiFX directly accesses the Streamstor card via a library level programming interface. Since only one program is allowed to do this (or face a crash of varying degree of seriousness), access to the Streamstor card must be carefully managed. One function ofmk5daemonis to maintain knowledge of who “owns” the Streamstor card at a given time to prevent conflicts. The starting and stopping of theMark5Aprogram can be requested by two messages of typeDifxCommand:startmark5aandstopmark5a. When these commands are received bymk5daemon, the requested action is taken unless Streamstor conflict is likely. This type of command and others can be sent tomk5daemonwith themk5controlprogram (§1.59).Low level control of Mark5C units
This program exposes a VLBI Standard Interface (VSI) over TCP port 2620 that very closely emulates equivalent functionality of the Mark5C Data Recording System (DRS) program provided by Haystack observatory. The implementation of the DRS command set is not complete but is sufficient for monitor and control at record time. At the time of writing this program is used in lieu of DRS at the two Mark5C units provided by USNO.
CPU, memory, and network monitoring
Every 10 seconds,
mk5daemonextracts data from the/procdirectory to get information about the CPU load, memory usage, and network traffic. These numbers are multicast in aDifxLoadmessage and logged.Module VSN and state determination
Receipt of a multicast
getvsncommand will result inmk5daemonmulticasting out aMark5Statusmessage containing information on the VSNs of the inserted modules as well as the state of the Mark5 unit. WhenMark5Ais running, a socket is opened to this program and thebank_set?query is issued, which returns the VSNs, regardless of the activity. WhenMark5Ais not running,mk5daemoneither directly determines the VSNs through a Streamstor API library call if the Mark5 unit is idle, or doesn’t respond if the Mark5 unit is busy. With eachMark5Statusmessage that is multicast frommk5daemonthe state of the Mark5 unit is included. See §[sec:xml] for details on these XML messages.Starting of mpifxcorr
If
mk5daemonis started with the-Hor–head-nodeoption, it will be allowed to start new correlations. A correlation will be started when adifxStartmessage is received if it passes some minor sanity checks. Sincemk5daemonruns as root, it has the capability of changing file ownerships. By default, the output files frommpifxcorranddifxlogwill have their ownership and permissions changed to match those of the.inputfile.
Normally mk5daemon is started automatically, either by
/etc/rc.local or by a script in /etc/init.d . The command line
options supported are:
Usage:
mk5daemon\([\) options \(]\)options can be:
-hor–help: print usage information and exit-qor–quiet: be less verbose and don’t mulitcast state-Hor–head-node: give head-node permissions-mor–isMk5: force mk5daemon on this host to act as Mark5 regardless of hostname (default is mark5fx??)-uuserID or–useruserID : use userID when executing remote commands (default is ’difx’)-llogPath or–log-pathlogPath : put logs in directory logPath, not/tmp
Please be sure not to have multiple instances of mk5daemon running
at any one time on any individual Mark5 or correlator unit!
mk5dir \(\mathrm{(package: mark5daemon)}\)
Program mk5dir extracts the directory from a module. Normally one
would not call this program directly but would use the getdir option
of mk5control. By default this program will change the disk module
state to played. There is a danger that if this is done with an
SDK 9 unit and the disk is later needed in an SDK 8 unit that it will no
longer be readable in the later without a full reset of its VSN. As of
April 2014 this program supports decoding of all directory types
described by Mark5 Memos 81
(http://www.haystack.mit.edu/tech/vlbi/mark5/mark5_memos/081.pdf) and
100 (http://www.haystack.mit.edu/tech/vlbi/mark5/mark5_memos/100.pdf).
Usage:
mk5dir\([\) options \(]\) { bank \(\mid\) VSN }options can be:
-hor–help: print usage information and exit-vor–verbose: increase verbosity: print directory to screen-for–force: force directory read even if current-For–fast: get format details from new-style directory (Mark5B format only)-nor–nodms: get directory but don’t change disk module state-sor–safedms: only change disk module state if SDK 8 unit or new dir type-dor–dmsforce: proceed with change of module state even if this makes module unreadable in SDK 8 units-bb or–beginb : begin with scan number b-ee or–ende : end with scan number b-wfile or–writefile : write directory file file to the module
bank is one of
A,BorABVSN is a valid 8-character VSN of a loaded module
The resultant directory file will be saved in a file called
VSN.dir in the directory pointed to by environment variable
MARK5_DIR_PATH .
This program responds to the value of environment variable
DEFAULT_DMS_MASK. This variable should be an integer representing
the state of three bits. mk5dir only responds to the setting of bit
1 (value 2); if this bit is set, the disk module state will not be
updated on directory reading. It is recommended to set this environment
variable at recording stations so auto-erasure of modules does not
occur.
In the mode where a specified .dir file is to be written to a Mark5
module directory the VSN must be provided explicitly (i.e., selecting by
bank is not allowed)
mk5erase \(\mathrm{(package: mark5daemon)}\)
Program mk5erase replaces the functionality of SSErase. It is
used to either erase or condition a Mark5 module. It supports SDK9 and
earlier revisions of the Conduant API and either legacy or new (see
Mark5 memop 81) module directories. Conditioning results are multicast
upon conclusion of conditioning, to be received bt condition_watch.
Conditioning (which is started with the -c option) causes an entire
read/write cycle of the entire module to be performed. This can require
a good fraction of 24 hours to complete. Status updates and progress are
sent every 10 seconds as well. By default the original directory version
will be restored on the module, with zero scans. The version of
directory to use can be forced with either the -l or -n options.
Usage:
mk5erase\([\) options \(]\) VSNoptions can be:
-hor–help: print usage information and exit-vor–verbose: increase verbosity: print directory to screen-for–force: force directory read even if current-cor–condition: Do full conditioning, not just erasing-ror–readonly: Perform read-only conditioning-wor–writeonly: Perform write-only conditioning-dor–getdata: Save the performance data to a file called VSN.timedata-lor–legacydir: Put an empty legacy directory on the module when complete-nor–newdir: Put an empty new style directory on the module when complete-0or–nodir: Put a zero-size directory on the module when complete
VSN is a valid 8-character VSN of a loaded module
Note that this program will not run without specifying a legal mounted
module VSN. If you wish to erase a module that has no VSN set, use the
vsn program first.
Control-C can be used to safely abort conditioning early. The directory
will be left in an indeterminate state, so use caution when doing this;
if conditioning is stopped before completion use the vsn program to
assess and possibly modify the current module state.
mk5mon \(\mathrm{(package: difxmessage)}\)
Program mk5mon is a program that listens for mark5Status
messages multicast from the Mark5 units and displays the information;
updating the display as new messages are received.
Usage:
mk5mon
Make sure the terminal is at least 110 characters wide and is at least as tall in characters as there are Mark5 units that may transmit information. To quit, use ctrl-C. The columns being displayed are:
Mark5 unit name
VSN of module in Bank A
VSN of module in Bank B
State of the Mark5 unit
Playback rate, if playing, in Mbps
Playback position, in bytes from beginning of module
Scan number of data being played, if playing
Scan name of data being played, if playing
mk6cp \(\mathrm{(package: mark6sg)}\) [sec:mk6cp]
mk6cp is a wrapper around mk6gather which makes the operation of
copying multiple files from a Mark6 module easier.
Usage:
mk6cp\([\) options \(]\) filematch1 \([\) filematch2 …\(]\) destinationoptions can be:
-hor–help: print usage information and exit-vor–verbose: increase output verbosity-ror–resume: don’t copy files that exist in the destination path
filematch1 is a shell-style pattern for matching scan names
destination is the output path to place files; needs to start with
.or/or end with/
mk6gather \(\mathrm{(package: mark6sg)}\) [sec:mk6gather]
mk6gather extracts data from a Mark6 module, assembling as necessary
the data scattered across the disks in the module.
Usage:
mk6gather\([\) options \(]\) templateoptions can be:
-hor–help: print usage information and exit-vor–verbose: increase output verbosity-aor–append: append to existing file; this continues a previous gather-ooutfile or–outputoutfile : send output to outfile (default isgather.out)-bbytes or–bytesbytes : stop writing after bytes are written-sbytes or–skipbytes : skip first bytes of file
template specifies a file list to match, in a shell-style wildcard pattern, such as the name of a scan (e.g.,
BB407A_LA_No0001)
If more than one file is to be copied, it is best to use the mk6cp
([sec:mk6cp]) program instead.
mk6ls \(\mathrm{(package: vdifio)}\) [sec:mk6ls]
mk6ls looks in the standard Mark6 mountpoint locations (either in
/mnt/disks/*/*/data or in the path pattern set by environment
variable $MARK6_ROOT for VDIF formatted files. Found files will be
probed and summarized.
Three printing levels are supported:
short : Simply prints the file names. The portion of the filename corresponding to the mountpoint location is excised.
long : Prints the file names (same as for short), the number of actual files making up the virtual Mark6 file (e.g., scattered across multiple mountpoints), the full size of the virtual file, and an indication of the completeness of the virtual file set.
full : Prints the same information as for long followed by details of the files, such as details of the Mark6 file version, block sizes, packet numbers. This mode is mainly useful for developers with access to the vdifio source code.
Usage:
mk6ls\([\) options \(]\)-hor–help: print usage information and exit-sor–short: print long form output (default)-lor–long: print long form output-for–full: print full information for each file
mk6mon \(\mathrm{(package: difxmessage)}\)
mk6summary \(\mathrm{(package: mark6sg)}\)
mk6vmux \(\mathrm{(package: vdifio)}\)
mpifxcorr
The core of the DiFX software correlator is the program called
mpifxcorr. This program uses Message Passing Interface (MPI) to
exploit parallel computing to make correlation practical on a cluster of
ordinary computers. This program runs on all the machines listed in the
.machines file that is passed to mpirun, the program that starts
mpifxcorr. It should be initiated from the cluster head node from
within the project directory. The usage line below is appropriate for
use with OpenMPI (§[sec:mpi]) and within the DiFX
context; other incantations may provide better results depending on the
setup. See the OpenMPI documentation for more details.
Usage:
mpirun -npnProcess–bynode –hostfile\([\) otherOptions \(]\) machinesFilempfixcorrinputFile \([\) options \(]\)nProcess is the number of processes to start; found with
wc -lmachineFilemachinesFile the
.machineFileinputFile the
.inputto run; the full path to this file needs to be given, so prepending the file with‘pwd‘/is typicalotherOptions can be any additional option to
mpirun;startdifxuses the–mca btl \hat{\ }udapl,openib –mca mpi_yield_when_idle 1to suppress some warning messages and be less aggressive on networkingoptions are additional options that
mpifxcorrcan take which include:-MmonHostname:\([\)monSkip\(]\) : hostname of a machine serving as a monitor data server, with optional value indicating how many records to skip between sends.-rstartSec : start startSec seconds into the job, writing a new set of files into the visibility directory (.difx/)–vgoscomplex: flips sideband of all Complex VDIF stations by complex conjugating the unpacked Complex VDIF data
Within the DiFX framework, the user should never have to directly start
mpifxcorr as this is done more simply with startdifx or via the
DiFX Operator Interface in conjunction with mk5daemon.
oms2v2d \(\mathrm{(package: vex2difx)}\) [sec:oms2v2d]
The VLBI scheduling program sched generates a file with extension
.oms which is used to populate some fields in the VLBA database.
These fields are usually used to feed the dynamic scheduler but can also
be used to reduce the tedium of transferring information from the
sched input file (.key) to the vex2difx input file .v2d.
For simple experiments this resulting .v2d file can be used
unedited, but for more typical experiments editing will be required. The
resulting file will have the same file prefix as the input file and will
end with .v2d.
Usage:
oms2v2d\([\) options \(]\) file.omsfile
.omsis an.omsfile written byschedoptions can be:
-hor–help: print usage information and exit-for–force: allow overwrite of output file
Example:
oms2v2d bx123.oms
Note: sched now produces a .tv2d file (template vex2difx) that
can contain useful information for some projects (especially
multi-phase-center projects), however, this file is not tied to any
particular version of DiFX (or more importantly, version of
vex2difx) and thus cannot be guaranteed to be legal. It is thus
suggested to use oms2v2d and transfer over needed information by
hand after the fact.
padVDIF \(\mathrm{(package: vdifio)}\) [sec:padVDIF]
Program padVDIF takes an input VDIF file and inserts additional
packets as needed to create a contiguous without gaps. Newly inserted
frames will have the invalid bit set.
Usage:
padVDIFinfile outfile Mbps \([\) newStartMJD \(]\)infile is the input VDIF file
outfile is the output VDIF file
Mbps is the data rate in megabits/second
newStartMJD is the MJD (with fractional component) to overwrite the times with
Example:
padVDIF raw.vdif smooth.vdif 2048
plotapd \(\mathrm{(package: SniffPlots)}\) [sec:plotapd]
Program plotapd takes a text file containing sniffer fringe fit
results and makes plot files. Separate plots for Amplitude, Phase and
Delay (hence the name suffix “apd”are made for each baseline in the
resultant file. A plot of delay rate is also produced.
This is an interactive command line program; running plotapd will
prompt the user for inputs. The program difxsniff will run
plotapd and its sister programs automatically, so usually it won’t
be necessary to run by hand.
The PGPLOT_FONT environment variable must be set, otherwise all plot
text will be missing.
plotbp \(\mathrm{(package: SniffPlots)}\) [sec:plotbp]
Program plotbp takes a text file containing sniffer bandpass
output files and creates plots. This program can produce both auto- and
cross-correlation data plots.
This is an interactive command line program; running plotbp will
prompt the user for inputs. The program difxsniff will run
plotbp and its sister programs automatically, so usually it won’t be
necessary to run by hand.
The PGPLOT_FONT environment variable must be set, otherwise all plot
text will be missing.
plotwt \(\mathrm{(package: SniffPlots)}\) [sec:plotwt]
Program plotwt takes a text file containing sniffer data weights.
In this context, a data weight of 0 indicates complete loss of data and
a weight of 1 indicates complete data. Each plotted data point will
usually span many correlator integration periods. A solid dot will be
plotted for the mean of these points, and “error bars” will indicate the
range of weights over the averaging period. Note that in some cases
where awkward integration times are used it may be possible for the
weight to occasionally exceed 1, as long as the long running average
never does.
This is an interactive command line program; running plotwt will
prompt the user for inputs. The program difxsniff will run
plotwt and its sister programs automatically, so usually it won’t be
necessary to run by hand.
The PGPLOT_FONT environment variable must be set, otherwise all plot
text will be missing.
printDiFX.py \(\mathrm{(package: vis2screen)}\) [sec:printDiFX]
Program printDiFX prints a summary of the visibility information in
a DiFX output file. It loops through all the records printing some basic
info about frequencies, baselines, polarizations, times etc., plus a
couple of selected visibility values from the start and middle of the
record, to the screen.
Usage:
printDiFXdifxfile inputfiledifxfile is the full name of the visibility file in the
.difxdirectoryinputfile is the path to the input file used to generate this difx output
Example:
printDiFX example_1.difx/DIFX_55523_025239.s0000.b0000 example_1.input
printVDIF \(\mathrm{(package: vdifio)}\) [sec:printVDIF]
Program printVDIF loops through a VDIF file inspecting each packet
header and printing some basic summary info (time etc.) to the screen.
Usage:
printVDIFvdiffile Mbpsvdiffile is the recorded VDIF file to inspect
Mbps is the data rate in megabits/second
Example:
printVDIF example.vdif 256
printVDIFgaps \(\mathrm{(package: vdifio)}\)
printVDIFheader \(\mathrm{(package: vdifio)}\)
Program printVDIFheader is a powerful diagnostic tool that prints
details of each VDIF header found in a VDIF file. All fields of each
header, including information in known Extended VDIF Headers (see
Sec. [sec:vdifedv]) are printed. Three different
print formats are allowed: compact short version, detailed long,
and hexidecimal hex.
Usage:
printVDIFheaderinputFile \([\) inputFrameSize \([\) printLevel \(] ]\)inputFile is the input multi-thread VDIF file, or
-for stdininputFrameSize is the size of one thread’s data frame, including header (for RDBE VDIF data this is 5032)
printLevel describes what is to be printed; one of
short,long, orhex
Run this program with no arguments to get some additiona explanation.
psrflag \(\mathrm{(package: difxio)}\)
psrflag is a program that reads one or more DiFX filesets and
produces a flag table readable by AIPS UVFLG that contains flags for
times when the fringe rate, on a per-baseline basis, resonates with the
pulse period, allowing DC bias to correlate. The pulsar .binconfig
file is used to determine the harmomic content of the pulsar profile.
Usage:
psrflag\([\) options \(]\) inputFile …inputFile is the input multi-thread VDIF file; multiple may be provided
options can be:
-hor–help: print help information and quit-vor–verbose: be more verbose in execution
record5c \(\mathrm{(package: mark5daemon)}\)
recover \(\mathrm{(package: mk5daemon)}\)
recover is a program that wraps the XLRRecover call for convenient
use. This replaces the functionality of the recover= command of the
Mark5A program.
Usage:
recover\([\) options \(]\) type banktype is the type of recovery to attempt. See below.
bank should be either
AorBand is the bank containing the module to address.options can be:
-hor–help: print usage information and exit-for–force: allow overwrite of output file-vor–verbose: be more verbose in execution
Example:
recover -v 2 A
There are three possible modes of operation that are selected with the type argument:
Fix directory if power failed during recording
Allow access to data that might have been overwritten
Unerase the module
These recovery attempts will not always be successful.
reducepoly \(\mathrm{(package: difxio)}\)
Program reducepoly takes one or more DiFX file sets as input and
will modify each fileset’s delay model (.im file) to have polynomial
representations with fewer terms. All polynomials in the ,im file
will be reduced, including the baseline vectors, atmospheric components,
azimuth, and elevation. The original .im files will be overwritten.
The main purpose of this program is to evaluate the impact of using
different polynomial orders,
Usage:
reducepoly\([\) options \(]\) inputFile …inputFile is the input multi-thread VDIF file, or
-for stdin; multiple may be providedoptions can be:
-hor–help: print help information and quit-2: reduce polynomials to two terms-3: reduce polynomials to three terms-4: reduce polynomials to four terms-5: reduce polynomials to five terms
searchVDIF \(\mathrm{(package: vdifio)}\)
splitVDIFbygap \(\mathrm{(package: vdifio)}\)
startdifx \(\mathrm{(package: mpifxcorr)}\)
Starting mpifxcorr generally requires a lengthy command. This
inspired the creation of startdifx which vastly simplifies use of
the DiFX correlator. In addition to spawning the mpifxcorr
processes, startdifx can orchestrate some of the preparatory work
(for example running calcif2 and genmachines) and optionally run
difx2fits to create a .FITS file for each job. This program is
meant to work within the DiFX environment and would probably require
modification to be useful in other situations.
Usage:
startdifx\([\) options \(] [\) startDelay \(]\) input1 \([\) input2 \(\cdots\) \(]\)options can be:
-hor–help: print usage information and exit-vor–verbose: be more verbose-qor–quiet: be quieter-for–force: proceed on files even if correlator output already exists and is up to date-aor–automachines: rungenmachinesonly if no.machinesfile exits-gor–genmachines: rungenmachinesunconditionally (default)-nor–nomachines: don’t rungenmachines-dor–dont-calc: don’t runcalcifeven if needed – will skip file-mor–message: startmpifxcorrby sending difxStart message tomk5daemon-For–fits: rundifx2fitson output of each job separately-lor–localhead: use the current host as the headnode, not$DIFX_JEAD_NODE–override-version: ignore potential difx version conflicts-Aagent or–agent=agent : call mpirun through agent with filebase as only argument-MmachinesFile or–machines-file=machinesFile : use machinesFile instead of the one expected based on the job name
startDelay is an optional number of seconds to jump into the job upon start
inputN is a
.inputfile, or its prefixExample 1:
startdifx job1420.000.inputExample 2:
startdifx -f -n job1420.000 job1421.000Example 3:
startdifx -F *.input
Environment variables respected:
DIFX_MESSAGE_GROUP: multicast group to use (when using -m option), overriding default 224.2.2.1DIFX_MESSAGE_PORT: multicast port to use (when using -m option), overriding default 50200DIFX_HEAD_NODE: when using -m option, this must be set, which specifies which machine will serve as the head nodeDIFX_MPIRUNOPTIONS: used to pass options to the mpirun commandDIFX_CALC_PROGRAM: can be used to change the delay model program (default iscalcif2); only needed if model calculations have not been doneDIFX_CALC_OPTIONS: used to override options to the delay model program
statemon \(\mathrm{(package: difxmessage)}\)
Program statemon listens for multicast messages of the
difxStatus variety and simply prints their contents to the terminal.
This is mainly useful for debugging mpifxcorr and any programs
responsible for launching it.
Usage:
statemon\([\) options \(]\)options can be:
-hor–help: print usage information and exit
Environment variables respected:
DIFX_MESSAGE_GROUP: multicast group to use, overriding default 224.2.2.1DIFX_MESSAGE_PORT: multicast port to use, overriding default 50200
stopmpifxcorr \(\mathrm{(package: mpifxcorr)}\)
If software correlation is in progress and it is desired to stop it, it
is best to gently stop it rather than killing it abruptly. In most
circumstances this can be accomplished with stopmpifxcorr. This
program must be run on the machine running the manager process of the
software correlator (usually this is swc000 for the VLBA). If
multiple mpifxcorr processes are found running on a machine,
stopmpifxcorr will not proceed unless the -f option is used.
Usage:
stopmpifxcorr\([\) options \(]\)options can be:
-hor–help: print usage information and exit-for–first-pid: send stop message to the numerically first process ID found-qor–quiet: don’t produce much output
stripVDIF \(\mathrm{(package: vdifio)}\) [sec:stripVDIF]
Program stripVDIF strips network headers from a VDIF format basebad
data file (e.g., captured from wireshark) and dumps a pure VDIF stream.
Usage:
stripVDIFinfile outfile \([\) skipbytesfront \([\) skipbytesback \([\) skipbytesinitial \(] ] ]\)infile is the input VDIF file
outfile is the output VDIF file
skipbytesfront is the number of bytes to skip over before each frame (default is 54)
skipbytesback is the number of bytes to skip over after each frame (default is 4)
skipbytesinitial is the number of bytes to skip over only once after opening the file (default is 28)
Example:
stripVDIF vdif.wireshark vdif.pure 54 4 28
tabulatedelays \(\mathrm{(package: difxio)}\)
Program tabulatedelays takes one or more DiFX filesets and produces
a text file containing, for each scan, a table of interferometer delays
(\(\mu\)s) and rate (\(\mu\)s/s) based on the values in the
.im file at a cadence of one entry every 8 seconds. Based on command
line options values other than the delay can be extracted and tabulated.
The details of the output files are documented in comments at the top of
the output file.
Usage:
tabulatedelays\([\) options \(]\) inputFile …inputFile is the input multi-thread VDIF file, or
-for stdin; multiple may be providedoptions can be:
-hor–help: print help information and quit–az: print azimuth (deg) and azimuth rate (deg/s) instead of delay, rate–el: print elevation (deg) and azimuth rate (deg/s) instead of delay, rate–dry: print the dry component of atmospheric delay (\(\mu\)s) instead of delay, rate–wet: print the wet component of atmospheric delay (\(\mu\)s) instead of delay, rate–uvw: print the antenna-based baseline coordinates \((x, y, z)\) (meters) instead of delay, rate–clock: print the clock offset (\(\mu\)s) and rate (\(\mu\)s/s) instead of delay, rate–perint: print values at the center of every integration rather than every 8s–addclock: include clock offset/rate in printed delay/rate values
This program reads through one or more difx datasets and evaluates delay
polynomials in the .im files on a regular time grid (every 8
seconds). Delays and rates are both calculated. Output should be self
explanatory. Plotting utilities such as gnuplot can be used directly on
the output.
When operating without –perint, the entirety of the delay
polynomials are plotted, even exceeding the time range of the scans to
which they belong. Comments in the output separate scans cleanly. When
–perint is used, only the time covered by the scans is output.
Sign conventions:
Delay: a positive delay indicates wavefront arrival at the station before wavefront arrival at earth center. The delay includes contribution from wet and dry atmosphere components.
Rate: simply the time derivative of Delay.
Clock Offset: sign convention is opposite that of
.vex "clock_early"parameter; a positive clock offset indicates slow station clock. The sum of Clock Offset and Delay is the total correlator delay.Clock Rate: simply the time derivative of Clock Offset.
testdifxinput \(\mathrm{(package: difxio)}\)
This program was intended mainly for helping debug parsing of .input
files and associated other files. It turned out to be useful as a
general tool to investigate the contents of such files. When multiple
input files are provided on the command line merging of the resultant
data structures is attempted. Two output files are created when run:
input.out and calc.out. These files should closely resemble the
input files if the parsing was done properly.
Usage:
testdifxinput\([\) options \(]\) inputFilePrefix1 \([\) inputFilePrefix2 \(\cdots ]\)options can be:
-vor–verbose: be a bit more verbose-hor–help: print help information and quit
inputFilePrefix\(n\) is the base name of an input file
testdifxmessagereceive\(\mathrm{(package: difxmessage)}\)
Test program testdifxmessagereceive captures multicast DiFX messages
and prints them to the screen. Both the raw XML is shown and the decoded
values. It is mostly useful as a tool for examining the correctness of
the multicast messages that are broadcast and is not intended to be part
of an operational system. There is a special binary mode which instead
listens for the multicast high time resolution autocorrelations. In this
mode, only a terse summary of what is received is printed (see the
source code for more information).
Usage:
testdifxmessagereceive\([\) options \(]\) \([\) type \(]\)options can be:
-hor–help: print help information-bor–binary: generate output based on binary records
type is the kind of message to capture (not for use with binary records):
DifxLoadMessage
DifxAlertMessage
Mark5StatusMessage
DifxStatusMessage
DifxInfoMessage
DifxDatastreamMessage
DifxCommand
DifxParameter
DifxStart
DifxStop
Mark5VersionMessage
Mark5ConditionMessage
DifxTransientMessage
If type is not provided, all message types will be captured.
testmod \(\mathrm{(package: mk5daemon)}\)
testmod is a program that is used to perform read and write tests on
Mark5 modules. It is meant as a replacement of the ResetModule
program that relies on the Mark5A program which is being phased out
of VLBA operations. Read-only tests can be performed without risk of
erasing astronomical data recorded on the disks. The more invasive
write-read tests (which are the default) will erase all preexisting
data! A matrix of numbers similar to what is produced by ResetModule
or SSerase in condition mode, but with statistics from a much
smaller volume of reading/writing is produced. Usually badly performing
disks will occur in pairs with both bad disks belonging to the same bus
(e.g., disks 0 and 1, 2 and 3, 4 and 5, or 6 and 7). Badly performing
drives should have their directory files .dir (see
§ [sec:dir]) updated by hand to include RT at the end
of the top line.
Usage:
testmod\([\) options \(]\) bankoptions can be:
-hor–help: print usage information and exit-vor–verbose: produce more informative/diagnostic output;-v -vfor even more-for–force: continue to produce files despite warnings-ror–readonly: Perform read-only test-Ror–realtime: Switches to real-time mode (see below)-dor–skipdircheck: Disable directory checking (see below)-Sor–speed: Disable correctness testing to better test throughput-nn or–nrepn : Repeat the test n times (default is 2)-ss or–blocksizes : Read and write bytes at a time (default is 10 MB)-bb or–nblockb : Perform b reads per test (default is 50)-pp or–pointerp : Start read-only tests at byte positionp-ofile or–dirfilefile : Write the module directory to file file
bank is the Mark5 bank containing the disk to be studied (
AorB)
Many modules being tested are perhaps damaged in some way and may
require the -R and/or -d options above. Is is generally safe to
use these options, but the diagnostic power of this program may be
reduced in cases where some drives are intrinsically slow, but still
produce valid data.
testseqnumbers \(\mathrm{(package: difxmessage)}\)
Program testseqnumbers is a utility to listen for DiFX multicast
messages and identify any that come with a sequence number that is not
sequential. This is a good way to identify possible packet loss or
duplication on a DiFX cluster network.
Usage:
testseqnumbers\([\) options \(]\)options can be:
-hor–help: print usage information and exit-vor–verbose: produce more output;-v -vfor even more
If run without the -v option, only unexpected packets will be noted.
If run with one -v flag, each received packet will be identified
with a period being written to the screen. If run with 2 -v flags,
each packet received will have its source and sequence number printed.
vdif2to8 \(\mathrm{(package: vdifio)}\)
Program vdifb2to8 takes a 2-but sampled VDIF stream and reencodes it
as 8-bit samples. It should work on any form of VDIF with 2 bits per
sample (2+2 bits complex). It is anticipated that any 8-bit decoder will
be performed with linear level spacings, unlike the case for 2-bit
samples which use a high to low ratio of 3.3359 to minimized
quantization noise. Given this anticipation the levels chosen in the
output 8-bit stream correspond to levels of 35.5 and 118.5 counts for
the low and high states respectively. These levels lead to a ratio as
close to 3.3359 as possible. The 0.5 offset comes about from assuming
that the 256 output states are centered on zero and thus range from
-127.5 …-0.5, 0.5 …127.5. Extra bytes in the input stream (not
corresponding to valid VDIF frames) will be excised but invalid frames
(as marked with the invalid bit) will be retained and encoded to 8 bits.
Usage:
vdif2to8inputFile frameSize outputFileinputFile is the file to read (2 bits per sample)
frameSize is the size of the VDIF frames including frame headers (5032 for VLBA or VLA VDIF data)
outputFile is the output file (8 bits sper sample)
Example:
vdif2to8 input.vdif 5032 output.vdif
vdifbstate \(\mathrm{(package: vdifio)}\)
Program vdifbstate will perform state counts on a multi-thread VDIF
baseband data file.
Usage:
vdifbstatefile frameSize dataRate threadlist nFrames \([\) offset \(]\)file is the file to decode
frameSize is the size of the VDIF frames including frame headers (5032 for VLBA or VLA VDIF data)
dataRate is the file data rate, measured in Mbps, not including frame headers
threadList is a comma-separated list of thread ids to decode
nFrames is the number of data frames (typically a few kB in size) to decode
offset is the number of bytes into the file to start decoding
Example:
vdifbstate sample.vdif 5032 1024 1,2,3,4 100
Notes:
See documentation for
m5bfor details on specifying the data format.Only real-sampled data with 1 or 2 bits per sample is supported at this time.
If a non-power-of-two number of threads is requested, extra channels will be invented to pad out the next power of two. Data in these extra channels has undefined qualities.
vdifChanSelect \(\mathrm{(package: vdifio)}\)
vdifd \(\mathrm{(package: vdifio)}\)
Program vdifd takes a multi-thread VDIF file and decodes some
samples. It is implemented as a python script that makes use of vmux
and m5d to do most of the work. It is thus a good program to study
to understand how vmux can be used. This program takes as command line
input the name of a file containing (or thought to be containing)
multi-thread VDIF baseband data, information about the VDIF stream, and
the number of samples per baseband to decode. Optionally a starting file
offset can be supplied. If the data can be decoded correctly,
information about the data will be printed to the screen along with a
table of decoded data. The output values, -3, -1, 1, or 3 for valid
data, are printed in nchan columns. Data that cannot be decoded
(either due to data replacement headers, data fill pattern replacing the
actual data after unloading from a Mark5 module, or identified via the
VDIF invalid bit) will show as 0. It should be invoked with the
following parameters:
Usage:
vdifdfile frameSize dataRate threadlist n \([\) offset \(]\)file is the file to decode
frameSize is the size of the VDIF frames including frame headers (5032 for VLBA or VLA VDIF data)
dataRate is the file data rate, measured in Mbps, not including frame headers
threadList is a comma-separated list of thread ids to decode
n is the number of samples to decode
offset is the number of bytes into the file to start decoding (default is 0)
Example:
vdifd sample.vdif 5032 1024 1,2,3,4 24
In the above example, a stream consisting of 4 channels, each with 64 MHz bandwidth and 2 bits per sample and with thread ids 1, 2, 3 and 4 are to be decoded. If the thread ids were to be permuted, the decoded output data would be permuted in the same manner.
Notes:
At this time only 2-bit real-valued data can be properly decoded.
If a non-power-of-two number of threads is requested, extra channels will be invented to pad out the next power of two. Data in these extra channels has undefined qualities.
vdiffold \(\mathrm{(package: vdifio)}\)
Program vdiffold takes a baseband data stream and integrates the
power (formed by squaring the voltage) in a number of time bins that
equally divide a given period. This is a simplifed version of “folding”
such as is used in pulsar processing. A typical use of such
functionality would be to investigate the waveform of the switched power
injected into the receiver for calibration. This program has found
considerable utility in determining time offsets between the sample
clock and formatter time (modulo the period of the calibration cycle).
In the case of 2-bit sampling a non-linear correction is applied before results are written to a file. This correction takes the form
where \(P\) is a value proportional to true power and \(\hat{P}\) is the value obtained by calculating \(\left<\hat{v}^2\right>\) when the bitstream is reproduced with values \(\hat{v} \in ( -v_{\rm high}, -1, 1, v_{\rm high} )\). This non-linear correction can be turned off by setting nbin to a negative value. Note that this program is not useful for 1-bit quantized data.
The program should be used as follows:
Usage:
vdiffoldinfile frameSize dataRate threadlist nbin nchunk freq outfile \([\) offset \(]\)infile is the file to decode
frameSize is the size of the VDIF frames including frame headers (5032 for VLBA or VLA VDIF data)
dataRate is the file data rate, measured in Mbps, not including frame headers
threadList is a comma-separated list of thread ids to decode
nbin is the number of bins to calculate per period; if negative, power correction is not performed and the absolute value of nbin is used
nchunk is the number of 10000 sample chunks to operate on
freq is the reciprocal of the period to be observed (Hz)
outfile is the name of the output file
offset (optional) is the number of bytes into the file to start decoding
Example:
vdiffold sample.vdif 5032 1024 1,2,3,4 128 10000 80 sample.fold
The output file will contain nchan+1 columns where nchan is the number of baseband channels in the data stream. The first column contains the time (seconds) within the period. Each remaining column is folded power for one baseband channel. If nbin is positive and the data is 2-bit quantized, the scaling is such that \(\left<v^2\right> = \sigma^2\) yields a power reading of 1.0, for sampler threshold \(\sigma\). Optimal signal to noise ratio occurs for a value of about 1.03. For non 2-bit quantization, the power will be in units of reconstituted \({\rm counts}^2\).
Notes:
At this time only 2-bit real-valued data can be properly decoded.
If a non-power-of-two number of threads is requested, extra channels will be invented to pad out the next power of two. Data in these extra channels has undefined qualities.
The output columns are in the same order as the thread id list. Thus, you can rearrange the output order by changing the order of the thread list. This enables reordering of data so that polarization pairs occur consecutively, allowing more sensible cross-correlation columns.
vdifspec \(\mathrm{(package: vdifio)}\)
Program vdifspec forms total power spectra for each baseband channel
in the data, including cross spectra for polarization pairs, assuming
data is in alternating polarization pairs (if not, the cross spectra
should make no sense, but they are formed anyway). The results are
written to a text file with the following columns: Column 1 is the
frequency offset from baseband for each channel; Columns 2 to
nchan+1 are the total power spectra for each baseband channel;
Columns nchan+2 to \(4 \times\)nchan+1 contain, in pairs,
the amplitude and phase of the cross spectra for each pair of channels.
It should be invoked with the following parameters:
Usage:
vdifspecinfile frameSize dataRate threadlist npoint n outfile \([\) offset \(]\)infile is the file to decode
frameSize is the size of the VDIF frames including frame headers (5032 for VLBA or VLA VDIF data)
dataRate is the file data rate, measured in Mbps, not including frame headers
threadList is a comma-separated list of thread ids to decode
npoint is the number of points to calculate for each spectrum
n is the number of FFT frames to include in the calculation
outfile is the name of the output file
offset (optional) is the number of bytes into the file to start decoding
Example:
vdifspec sample.vdif 5032 1024 1,2,3,4 256 1000 vlba.spec
Notes:
At this time only 2-bit real-valued data can be properly decoded.
If a non-power-of-two number of threads is requested, extra channels will be invented to pad out the next power of two. Data in these extra channels has undefined qualities.
The output columns are in the same order as the thread id list. Thus, you can rearrange the output order by changing the order of the thread list. This enables reordering of data so that polarization pairs occur consecutively, allowing more sensible cross-correlation columns.
vex2difx
vex2difx is a program that takes a .vex files (such as one
produced by sched with various tables based on observe-time data
appended, probably by db2vex in the case of VLBA operations) and a
.v2d configuration file (see §[sec:v2d]) and
generates one or more .input and .calc file pairs for use with
the DiFX correlator. Note specifically that .ovex files, as used at
many/most Mark4 correlators, are not supported. vex2difx, along with
calcif2, supercedes the functionality of vex2config and
vex2model, two programs that were widely used but never fully
integrated into the VLBA’s software chain. Don’t forget that oms2v2d
can be used to create a valid baseline .v2d file from the .oms
file made by sched, perhaps saving some time.
The following guiding principles drove the design of vex2difx:
The output files should never need to be hand edited
Simple experiments should not require complicated configuration
All features implemented by mpifxcorr should be accessible
All experiments expressible by vex should be supported
The configuration file should be human and machine friendly
Command line arguments should not influence the processing of the vex file
Note that not all of these ideals have been completely reached as of now. It is not the intention of the developer to guess all possible future needs of this program. Most new features will be easy to implement so send a message to the difx-users mailing list or file a JIRA [jira] bug tracking ticket for requests.
Usage:
vex2difx\([\) options \(]\) v2dFileoptions can be:
-hor–help: print usage information and exit-oor–output: write a configuration file called v2dFile.params(see §[sec:params]) as output-vor–verbose: produce more informative/diagnostic output;-v -vfor even more-dor–delete-old: delete old output from same .v2d file-for–force: continue to produce files despite warnings-sor–strict: treat some warnings as errors and quit (default)
v2dFile is a
.v2dfile (see §[sec:v2d]) that controls the operation of this program; the filename cannot contain underscore charactersExample:
vex2difx bx123.v2d
VDIF issues
Unlike for the other formats, VDIF does not make use of the $TRACKS
section for numeric assignment of channel. Instead the channels as
listed in the $FREQ section are sorted alphabetically by their link
name (usually something like Ch01. The alphabetical list is matched
against the thread-channel order where the threads are listed in numeric
order; the “thread index” takes precedence over the “channel index”. The
track_frame_format parameter of the $TRACKS section is still
required.
Mark5B issues
The Mark5B format, including its 2048 Mbps extension, is now supported
by vex2difx. The .vex file track assignments for Mark5B format
has never been formally documented. vex2difx has adopted the track
assignment convention used by Haystack. Formally speaking, Mark5B has no
tracks. Instead it stores up to 32 bitstreams in 32 bit words. The
concept of fanout is no longer used with Mark5B. Instead, the equivalent
operation of spreading one bitstream among 1 or more bits in each 32 bit
word is performed automatically. Thus to specify a Mark5B mode, only
three numbers are needed: Total data bit rate (excluding frame headers),
number of channels, and number of bits per sample (1 or 2). The number
of bitstreams is the product of channels and bits.
The $TRACKS section of the vex file is used to convey the bitstream
assignments. Individually, the sign and magnitude bits for each channel
are specified with fanout_def statements. In unfortunate
correspondence with existing practice, 2 is the first numbered bitstream
and 33 is the highest. In 2-bit mode, all sign bits must be assigned to
even numbered bitstreams and the corresponding magnitude bit must be
assigned to the next highest bitstream. To indicate that the data is in
Mark5B format, one must either ensure that a statement of the form
track_frame_format = MARK5B;
must be present in the appropriate $TRACKS section or
format = MARK5B
must be present in each appropriate ANTENNA section of the .v2d
file. As a concrete example, a $TRACKS section may resemble:
$TRACKS;
def Mk34112-XX01_full;
fanout_def = A : &Ch01 : sign : 1 : 02;
fanout_def = A : &Ch01 : mag : 1 : 03;
fanout_def = A : &Ch02 : sign : 1 : 04;
fanout_def = A : &Ch02 : mag : 1 : 05;
fanout_def = A : &Ch03 : sign : 1 : 06;
.
.
.
fanout_def = A : &Ch15 : mag : 1 : 31;
fanout_def = A : &Ch16 : sign : 1 : 32;
fanout_def = A : &Ch16 : mag : 1 : 33;
track_frame_format = MARK5B;
enddef;
Media specification
vex2difx allows .input file generation for two types of media. A
single .input file can have different media types for different
stations. Ensuring that media has been specified is important as
antennas with no media will be dropped from correlation. The default
media choice is Mark5 modules. The TAPELOG_OBS table in the input
vex file should list the time ranges valid for each module. Jobs will be
split at Mark5 module boundaries; that is, a single job can only support
a single Mark5 unit per station. All stations using Mark5 modules will
have DATA SOURCE set to MODULE in .input files. If
file-based correlation is to be performed, the TAPELOG_OBS table is
not needed and the burden of specifying media is moved to the .v2d
file. The files to correlate are specified separately for each antenna
in an ANTENNA block. Note when specifying filenames, it is up to the
user to ensure that full and proper paths to each file are provided and
that the computer running the datastream for each antenna can see that
file. Two keywords are used to specify data files. They are not mutually
exclusive but it is not recommended to use both for the same antenna.
The first is file. The value assigned to file is one or more
(comma separated) file names. It is okay to have multiple file keywords
per antenna; all files supplied will be stored in the same order
internally. The second keyword is filelist which takes a single
argument, which is a file containing the list of files to read. The file
pointed to by filelist only needs to be visible to vex2difx, not
the software correlator nodes. This file contains a list of file names
and optionally start and stop MJD times. Comments can be started with a
# and are ended by the end-of-line character. Like for the file keyword,
the file names listed must be in time order, even if start and stop MJD
values are supplied. An example file as supplied to filelist is
below:
# This is a comment. File list for MK for project BX123
/data/mk/bx123.001.m5a 54322.452112 54322.511304
/data/mk/bx123.002.m5a 54322.512012 54322.514121 # a short scan
/data/mk/bx123.003.m5a 54322.766323 54322.812311
If times for a file are supplied, the file will be included in the
.input file DATA TABLE only if the file time range overlaps with the
.input file time range. If not supplied, the file will be included
regardless of the .input file time range, which could incur a large
performance problem.
A few sample ANTENNA blocks are shown below:
ANTENNA MK
{
filelist=bx123.filelist.mk
}
ANTENNA OV { file=/data/ov/bx123.001.m5a,
/data/ov/bx123.002.m5a,
/data/ov/bx123.003.m5a }
ANTENNA PT { file=/data/pt/bx123.003.m5a } # recording started late here
Pulsars
Some information, including example .v2d sections, on setting up
pulsar correlation can be found in §[sec:pulsars].
You may find additional information at https://github.com/difx/difx/wiki/vex2difx .
vexpeek \(\mathrm{(package: vex2difx)}\)
Program vexpeek takes a vex file as input and sends to stdout the
experiment name, segment, and a list of antennas and the MJD times that
they were included in the observation. This program is mainly intended
to be called from python program db2vex which needs to know a little
about the file before appending the CLOCK and TAPELOG_OBS tables. The
VLBA operations system relies on such functionality but there is no
reason other operations couldn’t use this. This program uses the same
parsing infrastructure as vex2difx so the warnings that may be
produced and sent to stderr in running vex2difx will also do so
with vexpeek. Thus, when db2vex is run some of these error
messages may be seen.
Usage:
vexpeek\([\)options\(]\) vexFileoptions can be:
-hor–help: print usage information and exit-vor–verbose: print decoded version of vexFile to screen-for–format: add per-antenna format description to output-bor–bands: print list of bands used by vexFile-uor–diskusage: add per-antenna disk usage to output-sor–scans: print list of scans and antennas used by each
vexFile is the vex format file to be inspected.
vlog \(\mathrm{(package: vex2difx)}\)
Program vlog takes as input a calibration file (cal.vlba,
§[sec:cal]). This file is parsed to produce four files
containing formatted arrays that are convenient for use in the
construction of FITS tables: flag, pcal, tsys, and
weather (§[sec:flag],
§[sec:pcal], §[sec:tsys] and
§[sec:weather]). This program is named after AIPS
task vlog that does nearly the same thing.
Usage:
vlogcalFile \([\)antennaList\(]\)calFile is the
cal.vlbafile produced bytsmto be processed.antennaList is an optional comma-separated list of antennas to process. If omitted, all antennas with calibration data will be processed.
Running with no command line arguments will print usage information to the terminal and exit.
vmux \(\mathrm{(package: vdifio)}\)
Program vmux takes a VDIF file with multiple threads of one channel
each and multiplexes the data into a new VDIF file consisting of a
single thread containing all the channels. In the case that a
non-power-of-two number of channels are contained in the input file (or
equivalently, a non-power-of-two number of threads are specified on the
command line), the next power of two will be selected as the number of
channels in the output thread and any unused channel slots will contain
random data.
Usage:
vmux\([\)options\(]\) inputFile inputFrameSize framesPerSecond threadList outputFile \([\)offset \([\)chunkSize\(] ]\)inputFile is the input multi-thread VDIF file, or
-for stdininputFrameSize is the size of one thread’s data frame, including header (for RDBE VDIF data this is 5032)
framesPerSecond is the number of frames per second in the input file for each thread (and is thus the number of output frames per second as well)
threadList is a comma-separated list of integers in range 0 to 1023; the order of the numbers is significant and dictates the order of channels in the output data
outputFile is the name of the output, single-thread VDIF file, or
-for stdoutoffset is an optional offset into the input file (in bytes)
chunkSize is (roughly) how many bytes to operate on at a time (default is 2000000)
options can be:
-hor–help: print usage information and exit-vor–verbose: be more verbose in execution-qor–quiet: be less verbose in execution-ff or–fanoutf : set fanout factor to f-eor–EDV4: convert VDIF extended data version to EDV4 (default)-nor–noEDV4: don’t convert VDIF extended data version to EDV4
The concept of fanout applies to certain variants of VDIF data where one logical sampled channel is interleaved multiple threads. Certain modes of the DBBC3 makes use of this mode. See Sec. [sec:vdiffanout] for details.
VDIF Extended Data Version (EDV) 4 is used to contain per-channel validity flags within a multi-channel VDIF file. See Sec. [sec:vdifedv4] for details.
vsn \(\mathrm{(package: mk5daemon)}\)
Program vsn is used to check or set the Volume Serial Number (VSN),
the write protect state, and the Disk Module State (DMS) of a module.
This program can not be used when the Mark5 unit is being used for
something else. You must be logged into the Mark5 unit that contains the
module to inspect/change. In addition to displaying the VSN of the
module, this utility will list information about each disk in the
module. The columns displayed are:
Disk number: in the range 0 to 7.
Drive model: the model number of the disk.
Serial number: the serial number of the disk (in parentheses).
Drive model revision number: addition model information.
SMART capable: 1 indicates SMART information is available; 0 otherwise.
SMART state: (only valid if SMART capable) 1 indicates good health.
Note that the drive model, serial number and revision number can have spaces making it hard to tell when one field start and the next begins. Thus the serial number is contained within parentheses to make this clear.
Usage:
vsn\([\) options \(]\) bank \([\) newVSN \(]\)options can be:
-hor–help: print usage information and exit-for–force: proceed without asking-vor–verbose: be more verbose in operation-por–played: set DMS to played-ror–recorded: set DMS to recorded-eor–erased: set DMS to erased-wor–writeprotect: set write protection-uor–unwriteprotect: clear write protection
-sor–smart: Get S.M.A.R.T. data from disks and write to file VSN.smart)bank is the Mark5 unit bank to look at (must be
AorB)newVSN is the new name to assign to the module and must be a legal VSN
Example 1: show VSN:
vsn AExample 2: set VSN:
vsn A NRAO+456
If you get a message such as “Watchdog caught a hang-up executing …” that means access to the module failed. This could indicate a bad module. The module should be reinserted (perhaps in a different bank or unit) and the unit rebooted before coming to a firm conclusion.
vsum \(\mathrm{(package: vdifio)}\)
Program vsum prints a summary of one or more VDIF files, printing
such information as list of threads found, collectively, in the first
and last few MB of the file, the VDIF epoch and other time information,
and packet size. If the data is not recognized as VDIF, an error code
will be printed; see below for the list of codes and their meanings.
Legacy format VDIF data is not supported.
Usage:
vsum\([\) options \(]\) file1 \([\) file2 \([ \ldots ] ]\)file1 …is/are the VDIF file(s) to summarize
options can be:
-hor–help: print usage information and exit-sor–shortsum: print one-line summary per file-6or–mark6: interpret provided file names as Mark6 scans–allmark6: summarize all files found on mounted Mark6 modules
If the -6 or –mark6 or –allmark6 option is used, it is
assumed that the files are to be found in their expected location, which
could be altered by an environment variable. See
sec [sec:mark6path] for more details.
If the summary operation failed for a file, one of the following error codes will be returned:
-1 |
The file size could not be determined |
-2 |
The file could not be opened |
-3 |
Memory allocation failure |
-4 |
File read failed |
-5 |
Frame size could not be determined |
-6 |
First frame could not be found |
-7 |
Seek to near-end-of-file failed |
-8 |
Read at end of file failed |
-9 |
A valid frame at the end of the file could not be found |
The -s or –shortsum option produces output that can be used
directly by vex2difx as a file list.
zerocorr \(\mathrm{(package: mark5access)}\)
Program zerocorr is intended to cross correlate data with zero time
delay (and a window determined by the spectral resolution) between two
recordings made at the same station. It is possible to correlate data
observed in different formats and even mis-matched bandpasses, through
the construction of a file describing the details of the single sub-band
that is to be correlated.
Usage:
zerocorr\([\) options \(]\) confFileoptions can be:
-hor–help: print usage information and exit-vor–verbose: be more verbose in operation
confFile is a file describing the correlation parameters
Example:
zeroconf td006.zc
See documentation on confFile (or .zc file) in section
[sec:zc]. Output data documentation can be found for
.vis files in section [sec:vis] and for .lag
files in section [sec:lag].