Vocal Text


text — vocal text representation


The text representation permits the representation of sung or spoken language. Only languages transcribable into the Roman alphabet can be represented using text. For non-Roman alphabets and non-language vocables, refer to the IPA representation.

The text representation distinguishes three classes of data tokens: text-tokens, silence-tokens, and barlines.

Text-tokens are of three types: syllabic, melismatic, and polysyllabic. A syllabic text-token means that a single syllable is associated with a single moment (such as a single pitch). A melismatic text-token means that a single syllable is associated with several successive moments (such as more than one pitch). A polysyllabic text-token means that several syllables are associated with a single moment (such as a single pitch).

The simplest of these is the melismatic text-token — which consist merely of a single vertical bar (|). By itself, the vertical bar means that the previous syllable is sustained through the current moment. Syllabic and polysyllabic text-tokens are more complicated.

In the text representation, text-tokens are syllable-oriented, so words must be broken-up into the component syllables. Words may be of three basic types: single-syllable words, multi-syllable words, and hyphenated words (e.g. the word "half-mast"). As a result, four types of syllables are distinguished: (1) a single-syllable word, (2) a word-initiating syllable, (3) a word-completing syllable, and (4) a mid-word syllable. In text the hyphen (-) is used to signify syllable boundaries; the tilde (~) is used to signify boundaries between hyphenated words (necessarily also a syllable boundary). The following table illustrates the how these signifiers are used:

text a single-syllable word text- a word-initiating syllable -text a word-completing syllable -text- -d-word syllable text~ a single-syllable word beginning a hyphenated multi-word ~text a single-syllable word completing a hyphenated multi-word ~text~ a single-syllable word continuing a hyphenated multi-word ~text- a word-initating syllable completing a hyphenated multi-word -text~ a word-completing syllable — part of a hyphenated multi-word ———— —————————————————————-

A syllabic text-token consists of any one of the above tokens.

Occasionally, it is useful to encode several syllables or words for a single moment (or pitch) — as in the case of recitativo passages. In the text representation, such multi-word or polysyllabic text-tokens are encoded via Humdrum multiple-stops. That is, several words or syllables are encoded on the same data record, with each successive syllable separated by a space. The following encoding shows two syllables/words for each note:


**pitch **text C4 I saw A4 a li- G4 -ly beau- F4 -ti- -ful , *- *- ———– ————-

In the text representation, punctuation marks — such as commas, periods, and exclamation marks — are segregated from the syllable texts by a space. The period character is represented by the asterisk (*) in order to avoid the illegal construction of a period (null token) in a multiple stop. Five remaining punctuation marks are represented literally in text: comma (,), exclamation mark (!), question mark (?), colon (:), and the semicolon (;). When used in a melismatic context, an explicitly notated (printed) dash is represented by a double vertical bar (||). This signifies both the presence of a printed dash and the fact that the previous syllable continues to sound. In addition, double quotation marks (") and parentheses (()) are also permitted. Once again, all such forms of punctuation must be segregated from the alphabetic text by the presence of a space. It is important to remember that periods are represented by asterisks.

The text representation is intended to represent canonical language information rather than verbatim transcriptions of a printed score. Textual abbreviations and printing conventions such as "etc.," "#" and "$" must be expanded as: et ce- -te- -ra, num- -ber, and dol- -lar respectively (depending on the language). (If the user wishes to represent text as printed, an explicit underlay or overlay representation would be more appropriate.)

Where greater subtlety is required for pronunciation, refer to the IPA representation.

The original ASCII character set is regrettably hostile to non-English languages, especially with regard to accents. With the extended (international) ASCII set, French, German, and other accents are properly handled, but for English users, a given computer system may prove inflexible for non-English materials. In such cases, the text representation provides the following aliases to designate common accents. Accent signifiers are assumed to modify the immediately preceding Roman letter.

/ acute accent \ grave accent \^ circumflex 0 small superscript circle 1 macron 2 Umlaut 5 cedilla 6 hacek 7 enya (tilde) —- ————————–

Accent signifiers in text

Musical phrasing can be explicitly represented in text via the open brace ({) and closed brace (}) for phrase-beginnings and phrase-endings respectively. Slurs can be represented via the open bracket ([) and closed bracket (]). Stressed and unstressed syllables can be explicitly denoted by prepending the plus sign (+) or the underscore (_) respectively — at the beginning of the associated text.

A handful of vocables can be represented via text including humming (signified by the upper-case letter `M' by itself) and laughing (signified by the upper-case letter `L' by itself). Tandem interpretation also permit encoding of the depth of vibrato and the degree of ennunciation. For example, shallow vibrato is indicated by *vibrato1 whereas deep vibrato is indicated by *vibrato9. Sloppy ennunciation is indicated by *ennun1 whereas highly precise ennunciation is indicated by *ennun9. Intermediate values can be used as needed.

Apart from text-tokens text also permits the encoding of silence-tokens and barlines. Silence-tokens consist simply of the percent sign (%) appearing in isolation. This indicates the termination or absence of any vocal sound.

Barlines are represented using the "common system" for barlines — see barlines.

Much vocal music is strophic in form — containing two or more "verses" sung to the same musical setting. Abbreviated representations may take advantage of the Humdrum strophe mechanism to avoid duplicate encoding of melodies, etc. See strophe.


It is recommended that files containing predominantly text data should be given names with the distinguishing `.txt' extension.


The complete system of signifiers used by text is summarized in the following table.

A-Z upper-case letters A to Z
a-z lower-case letters a to z
\% silence token (character by itself)
L laughing voice (character by itself)
M humming voice (character by itself)
- multi-syllable-word syllable boundary
~ hyphenated-word word separator; printed dash
| melisma indicator (syllable sustain)
|| printed hyphen indicating melisma
. null token (not period)
* period (must be preceded by a space)
? question mark (must be preceded by a space)
colon (must be preceded by a space) ; semicolon (must be preceded by a space) ! exclamation mark (must be preceded by a space) ( open parenthesis ) closed parenthesis { beginning of a phrase } end of a phrase [ beginning of a slur ] end of a slur = barline == double barline / acute accent \ grave accent \^ circumflex 0 small circle 1 macron 2 Umlaut 5 cedilla 6 hacek 7 enya (tilde) —— ————————————————

Summary of text Signifiers


A sample document is given below:


**text *English =1 Once up- || -on a . time | there =2 was a lit- -tle girl . whose =3 *- ———–

Notice that the first measure contains two Humdrum double-stops (Once up- and -on a). The double vertical bar (||) indicates that the word "upon" is explicitly divided by a notated hyphen. The word time is melismatic — spanning two time-events.


Currently, no special-purpose Humdrum commands produce text as output or process text encoded data as input.


The following tandem interpretations can be used in conjunction with text:

language indicator *Deutsch strophe indicator *strophe verse number *S/3 choral text *ensemble solo text *solo meter signatures *M6/8 underlay position *unter overlay position *ueber vibrato depth *vibrato7 ennunciation *ennun5 ——————– ————-

Tandem interpretations for text


` assemble, **IPA, strophe, thru`