humsed [-E] 'sed_instruction[;sed_instruction;...]' [inputfile ...] [ > outputfile]
humsed [-E] [-f scriptfile] [inputfile ...] [ > outputfile]
The humsed command is fashioned after the UNIX sed stream editor. In contrast to sed, humsed editing instructions are applied only to Humdrum data records; Humdrum interpretations and comments are not affected by humsed.
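The restriction to data records can be illustrated with plain sed standing in for humsed. This is a sketch only: it assumes that data records are exactly those lines that do not begin with "!" (comments) or "*" (interpretations), and the sample input is invented for illustration.

```shell
# Sketch only: plain sed stands in for humsed.
# Assumption: data records are lines that do not start with "!" or "*".
input=$(printf '%s\n' '!! A comment with an A' '**kern' '4A' '8B' '*-')

# The address /^[!*]/! applies the substitution to data records only.
result=$(printf '%s\n' "$input" | sed '/^[!*]/!s/A/X/')
printf '%s\n' "$result"
# Prints:
#   !! A comment with an A
#   **kern
#   4X
#   8B
#   *-
```

The comment keeps its "A" even though it matches the pattern, which is the behaviour described above for humsed.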
The humsed command accepts one or more sed instructions. Instructions are specified on the command line within a pair of single quotes. Where more than one editing instruction is specified, successive instructions are separated by a semicolon. Alternatively, instructions may be executed from a scriptfile using the -f option. If instructions are provided both on the command line and via a scriptfile, the command-line instructions are performed before the scriptfile instructions.
Permissible instructions include s for substitution, y for transliteration, d for deletion, i for insertion, a for append, r for file-read, and w for file-write.
Each instruction may be preceded by an optional regular expression that limits the scope of the editing instruction only to those data records matching the regular expression. For example, the user may replace all occurrences of "X" with "Y" -- provided the signifier "Z" also occurs in the same data record. In the case of the delete (d) instruction, failing to specify a preceding regular expression will result in the deletion of all data records in the input.
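The "X to Y only where Z occurs" case can be sketched with plain sed. With humsed itself the instruction would presumably be written '/Z/s/X/Y/g'. The guard /^[!*]/! is an assumption that mimics the restriction to data records, and the input lines are invented.

```shell
# Sketch only: plain sed stands in for humsed.
# Assumption: lines starting with "!" or "*" are not data records.
result=$(printf '%s\n' 'XZ' 'XQ' '! XZ' | sed '/^[!*]/!{
/Z/s/X/Y/g
}')
printf '%s\n' "$result"
# Prints:
#   YZ
#   XQ
#   ! XZ
```

Only the first line is changed: the second has no "Z", and the third is a comment.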
For further information concerning the syntax and use of humsed editing instructions, refer to the documentation for the UNIX sed command.
Options are specified in the command line.
    -h              displays a help screen summarizing the command syntax
    -E              invoke Extended Regular Expression syntax
    -f scriptfile   execute editing instructions from the file scriptfile
With the -E option, humsed invokes the "extended" regular expression syntax, rather than the normal or "basic" regular expression syntax. With extended regular expressions, the following additional operations are supported: one-or-more (+), zero-or-one (?), logical OR (|), precedence grouping ( ), and alphanumeric token start and end anchors < >.
Note that not all systems support extended regular expressions for the sed command; on such systems the -E option for humsed is ineffective and may result in an error.
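The one-or-more, alternation, and grouping operators can be tried with a sed that accepts -E. The sketch below uses plain sed rather than humsed, and assumes the local sed supports -E. The guard and the sample records are illustrative.

```shell
# Sketch only: plain sed -E stands in for humsed -E.
# Assumption: the local sed supports -E (extended regular expressions).
result=$(printf '%s\n' '**kern' '4c' '8cc' '4r' '*-' |
  sed -E '/^[!*]/!s/(c+|r)/N/')
printf '%s\n' "$result"
# Prints:
#   **kern
#   4N
#   8N
#   4N
#   *-
```

The interpretation **kern contains an "r" but is left alone, because the guard skips lines that are not data records.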
The -f option allows the user to specify a scriptfile that contains a set of editing instructions. Instructions in scriptfile are executed after any command-line editing instructions.
Simple substitution:
humsed 's/A/X/g' ragtime
The above command replaces the upper-case letter A by the upper-case letter X. Without the g (global) modifier, only the first occurrence of an "A" in each data record would be modified. The use of g applies the substitution instruction to all occurrences in a data record.
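The effect of the g modifier can be seen on a single record. The sketch uses plain sed, whose substitution syntax humsed borrows, and the record itself is invented.

```shell
# Sketch only: one invented data record, edited with plain sed.
record='4A 4AA'
first_only=$(printf '%s\n' "$record" | sed 's/A/X/')
global=$(printf '%s\n' "$record" | sed 's/A/X/g')
printf '%s\n' "$first_only" "$global"
# Prints:
#   4X 4AA
#   4X 4XX
```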
Substitution commands can be preceded by another regular expression that limits the selection of records that are affected by the substitution. For example, the following command eliminates all measure numbers in a **kern representation:
humsed '/=/s/[0-9]*//g' jellyroll
Rather than simply eliminating all numerical data, the initial regular expression (/=/) limits the substitution operation to those data records that contain the **kern barline signifier (=).
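A small worked example may help here. The sketch below again uses plain sed with an assumed guard for non-data lines, and a made-up **kern fragment.

```shell
# Sketch only: plain sed stands in for humsed.
# Assumption: lines starting with "!" or "*" are not data records.
result=$(printf '%s\n' '**kern' '=1' '4c' '=2' '8d' '*-' | sed '/^[!*]/!{
/=/s/[0-9]*//g
}')
printf '%s\n' "$result"
# Prints:
#   **kern
#   =
#   4c
#   =
#   8d
#   *-
```

The durations 4 and 8 survive because their records contain no barline signifier.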
More complicated substitutions may involve compound (two or more) instructions. Instructions are separated by a semicolon, and are executed in succession for each data record. Consider the following command:
humsed 's/4[A-G]/8&/g;s/84/8/g' chicago > fastbass
This command changes all quarter-note pitches (in a **kern representation) below middle C to eighth-note durations, while leaving quarter-notes at or above middle C unchanged. The first substitution instruction (s/4[A-G]/8&/g) searches for all strings beginning with the number 4, followed by one of the upper-case letters A to G. It then prepends the number 8; thus the token 4F will be replaced by 84F. (Note that the ampersand (&) in the substitution denotes the matched string found by the target regular expression.) The second substitution (s/84/8/g) replaces the string 84 by the string 8. In short, tokens such as 4F and 4CC# will be modified to 8F and 8CC# respectively -- whereas tokens such as 2F and 4cc# will remain unmodified. (Note that this command is inadequate if 24th notes (sixteenth-note triplets) are present in the input -- since they will be transformed to 28th notes.)
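The two-step substitution and its weakness with 24th notes can be traced on a few tokens. The sketch uses plain sed on input that holds only data records. The second command is one possible stricter variant and is not part of the original documentation: it assumes sed -E is available and requires that the 4 not be preceded by another digit.

```shell
# Sketch only: plain sed on invented tokens (data records only).
tokens=$(printf '%s\n' '4F' '4CC#' '2F' '4cc#' '24F')

# The compound instruction from the example above.
converted=$(printf '%s\n' "$tokens" | sed 's/4[A-G]/8&/g;s/84/8/g')
printf '%s\n' "$converted"
# Prints: 8F, 8CC#, 2F, 4cc#, 28F (one per line); note the damaged 24F.

# Hypothetical stricter variant: the 4 must start the record or follow a non-digit.
stricter=$(printf '%s\n' "$tokens" | sed -E 's/(^|[^0-9])4([A-G])/\18\2/g')
printf '%s\n' "$stricter"
# Prints: 8F, 8CC#, 2F, 4cc#, 24F (one per line).
```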
The transliteration instruction (y) provides a short-cut for multiple single-character substitutions. For example, the following command replaces A with 0, B with 1, C with 2, etc. for the letters A to J:
humsed 'y/ABCDEFGHIJ/0123456789/' dixieland
Substitutions are organized by mapping each element in the first character string with the corresponding element in the second string. The first and second character strings must contain the same number of characters.
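The character-by-character mapping can be checked on a few invented records. As before, plain sed stands in for humsed and the guard for non-data lines is an assumption.

```shell
# Sketch only: plain sed stands in for humsed.
# Assumption: lines starting with "!" or "*" are not data records.
result=$(printf '%s\n' '!! ABC' 'ABC' 'JIH' |
  sed '/^[!*]/!y/ABCDEFGHIJ/0123456789/')
printf '%s\n' "$result"
# Prints:
#   !! ABC
#   012
#   987
```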
The delete instruction is preceded by a regular expression, followed by the single letter d. The following command deletes all data records containing the lower-case letter "r":
humsed '/r/d' swing
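The sketch below shows the same deletion with plain sed on an invented fragment. The guard for non-data lines is an assumption that imitates humsed's restriction to data records.

```shell
# Sketch only: plain sed stands in for humsed.
# Assumption: lines starting with "!" or "*" are not data records.
result=$(printf '%s\n' '!! rest example' '**kern' '4c' '4r' '8d' '*-' |
  sed '/^[!*]/!{
/r/d
}')
printf '%s\n' "$result"
# Prints:
#   !! rest example
#   **kern
#   4c
#   8d
#   *-
```

The comment and the **kern interpretation both contain an "r" but are retained; only the data record 4r is removed.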
The file-write instruction (w) provides a way of copying selected material to a specified output file. Consider the following command:
humsed '/;/w pauses' bigband
This command identifies all data records in the file "bigband" that contain a semicolon (the **kern pause signifier) and copies them into the file "pauses". Recall that humsed operates only on Humdrum data records, so the w command will cause only data records to be output. Hence the resulting file "pauses" will not be a valid Humdrum file. (If the user wishes the extracted material to be in a valid Humdrum format, this could be done using the Humdrum yank command: yank -m ';' 0 bigband > pauses)
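The file-write behaviour can be tried in a scratch directory. The sketch uses plain sed with an assumed guard for non-data lines. The contents of "bigband" are invented and the temporary directory is only there to keep the example self-contained.

```shell
# Sketch only: plain sed stands in for humsed; file contents are invented.
tmp=$(mktemp -d)
printf '%s\n' '**kern' '4c' '2d;' '4e' '1f;' '*-' > "$tmp/bigband"

# Copy data records containing ";" to the file "pauses".
(cd "$tmp" && sed '/^[!*]/!{
/;/w pauses
}' bigband > /dev/null)

pauses=$(cat "$tmp/pauses")
printf '%s\n' "$pauses"
# Prints:
#   2d;
#   1f;
rm -rf "$tmp"
```

The file "pauses" holds bare data records with no interpretations, which is why it is not a valid Humdrum file.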
The humsed command can also be used to read (r) material from a specified file whenever a certain condition occurs in the input stream. For example, the following command could be used to search for **kern pause signifiers (;) and add a global comment indicating the presence of a pause.
humsed '/;/r comment' bebop
-- where the file "comment" contains the following global comment:
!! A pause.
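The file-read instruction can be sketched the same way. Plain sed stands in for humsed, the guard for non-data lines is an assumption, and the contents of "bebop" are invented.

```shell
# Sketch only: plain sed stands in for humsed; "bebop" contents are invented.
tmp=$(mktemp -d)
printf '%s\n' '!! A pause.' > "$tmp/comment"
printf '%s\n' '**kern' '4c' '2d;' '4e' '*-' > "$tmp/bebop"

# After each data record containing ";", emit the contents of "comment".
result=$(cd "$tmp" && sed '/^[!*]/!{
/;/r comment
}' bebop)
printf '%s\n' "$result"
# Prints:
#   **kern
#   4c
#   2d;
#   !! A pause.
#   4e
#   *-
rm -rf "$tmp"
```

The comment appears on the line following the matching record, since sed appends the file contents after the record has been written.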
awk (UNIX), regexp (4), regexp (6), rid (4), sed (UNIX), vi (UNIX), yank (4)
Note that humsed can produce output that no longer conforms to the Humdrum syntax. Inserting or deleting tabs can disrupt the spine structure, and substitutions may leave empty lines or extra spaces.