Development Reference
Humdrum Software Development
The Humdrum Toolkit is necessarily limited in scope and there are many
functions that users will wish to add.
In developing adjunct software tools, it is imperative
that the software conform to the following design conventions:
-
Programs should be general-purpose and adapt to a
wide variety of input circumstances.
-
If possible, programs should be able to process any Humdrum input
rather than be limited to a given type of input interpretation.
-
Command names should be limited to 8 characters in length in order
to ensure portability to DOS systems.
-
Command names should preferably be the same as the output
interpretation produced by the command.
-
Command names should not be unduly abbreviated
since infrequently used software is less easily
remembered than frequently used system commands.
-
The command syntax should conform to standard POSIX conventions.
-
Errors and warnings should be prefaced by
the name of the program or command that issues the message, e.g.
vox: ERROR: voice 3 begins with a null token.
-
Error messages should be sent to "stderr" rather than to the
standard output.
-
Wherever possible, `filter' programs should produce outputs that
are identical in structure to the input.
More specifically, input line numbers should correspond to output
line numbers -- where appropriate.
-
Comments, interpretations, barlines, and double barlines should
be echoed in the output as the default condition
(except in the case of formatted non-Humdrum outputs).
-
For many programs,
the user should be able to skip the processing of certain types of
tokens (such as barlines) by specifying a
-s
flag -- followed by a user-defined regular expression.
Tokens matching the regular expression should be echoed
unprocessed in the output stream.
-
Programs should handle spine-path changes in a fashion
appropriate to the nature of the command.
-
Comments and interpretations should be identified by explicitly matching
the exclamation mark or asterisk in the
first
column of the input data token.
Exclamation marks and asterisks are legitimate data signifiers
when not occurring in the first column of an input token.
-
Where possible, outputs should
not
be formatted with descriptive labels etc.
The preferred output format is to have all outputs
conform to the Humdrum syntax.
This ensures that all outputs can themselves be used as inputs to
other Humdrum programs.
-
Programs should generally avoid assumptions
concerning context-dependent inputs.
Inputs should be assumed to be context-free.
-
Programs should be able to handle inputs
with unexpected user extensions or representational addenda
-- such as the presence of spurious or unknown characters.
-
Programs that search or examine inputs for certain features, properties,
or errors should return a
null
output if nothing is found.
Messages indicating that `nothing was found' should be avoided.
"Silence is golden."
Standard Program Skeleton
Much of the Humdrum software was originally developed using the
AWK programming language.
AWK was designed by
Alfred Aho, Brian Kernighan, and Peter Weinberger.
It is syntactically very similar to the C programming language,
but is easier to use and promotes better software productivity.
AWK provides powerful text manipulation features that make it
admirably suited to the creation of Humdrum software.
AWK is also a very easy language to learn,
and is an excellent first language for novice programmers.
The Humdrum Toolkit includes programming skeletons that may provide a
useful starting place for software development using AWK.
Two skeleton files are provided with the toolkit:
skeleton.ksh
and
skeleton.awk.
The kornshell file (.ksh) parses the command line,
issues appropriate error messages if the command is improperly invoked,
displays a help screen if necessary, and assembles the command
parameters to invoke an awk script for the command (.awk).
The skeleton.awk skeleton contains a main loop that
is normally executed for each record of input.
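The idea of a main loop that runs once per input record might be sketched as follows. This is an illustrative Python outline, not the contents of skeleton.awk. The helper names classify_record and main_loop are hypothetical, and the record types are assumed from the Humdrum syntax: "!!" for global comments, "!" for local comments, "*" for interpretations, and everything else treated as data.

```python
def classify_record(record):
    """Label one record by its leading characters (hypothetical helper)."""
    if record.startswith("!!"):
        return "global comment"
    if record.startswith("!"):
        return "local comment"
    if record.startswith("*"):
        return "interpretation"
    return "data"  # includes barlines in this simplified sketch


def main_loop(records):
    """Visit every record once, in order, and tally the record types."""
    counts = {}
    for record in records:
        kind = classify_record(record)
        counts[kind] = counts.get(kind, 0) + 1
    return counts
```

A real command would replace the tally with whatever processing is appropriate for each record type.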
A series of useful functions is included in the AWK skeleton program.
These functions include:
- Parse_command.
This function checks the input passed from the
corresponding kornshell script for the command.
The Parse_command function contains a list of valid options and
assigns the passed parameters to the appropriate option variables.
- Store_indicators.
This function allows the spine-path indicators for the
current record to be stored in the array path_indicator
so that they may be used later.
- Store_new_interps.
This function stores the new interpretations found in an
interpretation record for each spine.
- Process_indicators.
This function takes the spine-path indicators that were stored
in the array 'path_indicator' in the function 'store_indicators'
and manipulates the arrays 'path_indicator' and 'current_interp'
according to the contents of the array 'path_indicator'.
- Ins_array_pos.
This function inserts new positions in the
arrays 'path_indicator', 'current_interp', and 'current_key' and
copies elements so that everything is preserved.
- Del_array_pos.
Performs the opposite of function 'ins_array_pos'.
- Exchange_spines.
This function exchanges two spines by exchanging the corresponding
elements in current_interp.
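The array bookkeeping described for the last three functions can be sketched in Python. This is only an approximation of the behavior described above, not the code in skeleton.awk. It assumes that each array holds one entry per spine, uses zero-based Python lists (AWK arrays are conventionally indexed from 1), and fills a newly inserted position with an empty string as a placeholder.

```python
def ins_array_pos(arrays, position):
    """Open a new slot at `position` in every array.

    Later elements shift right, so nothing is lost.
    """
    for array in arrays:
        array.insert(position, "")  # placeholder value (assumption)


def del_array_pos(arrays, position):
    """Remove the slot at `position` from every array."""
    for array in arrays:
        del array[position]


def exchange_spines(current_interp, first, second):
    """Swap the interpretations recorded for two spines."""
    current_interp[first], current_interp[second] = (
        current_interp[second],
        current_interp[first],
    )
```

With current_interp set to ["**kern", "**harm", "**text"], calling exchange_spines(current_interp, 0, 2) leaves ["**text", "**harm", "**kern"]. Passing all of the per-spine arrays to ins_array_pos or del_array_pos in a single call keeps them the same length.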