Reference Records

Reference records are formal ways of encoding “library-type” information pertaining to a Humdrum document. Reference records provide standardized ways of encoding bibliographic information that is suitable for computer-based access to metadata about the digital score.

Humdrum reference records lines start with three exclamation marks (!!!), followed by a multi-letter reference code, then an optional number, then an optional dash and sub-key, then an optional @ or @@ and language qualification, then a colon ending the reference key, and finally some textual content for the record:

!!!COM: The composer
!!!OTL: The title
!!!OTL2: The second title
!!!OTL@DE: The title translated into German
!!!OTL@@JP: The title in original language of Japanese.
!!!OTL-rev: The revised title (where "rev" is an informal sub-categorization)
!!!OTL3-sub@@HAW: The third subtitle in the original language of Hawaiian.

Standardized reference records use upper-case letters and the # character. Informal or user-created reference records should preferably use lower-case letters, or start with a lower-case letter in order to avoid future conflicts with future standardized reference records. Reference record keys must only use letters and the # character as well as the underscore (_). Spaces are specifically not allowed, so use _ as a replacement for spaces if needed when creating non-standardized reference records. Dashes are also allowed, but note that these are typically used for sub-categorizations of a core reference record type.

Over 80 reference codes are pre-defined in Humdrum and are grouped by category below. You can also click on a reference record name in the following list to jump directory to the description of a particular reference record, or mouseover the name to see a brief description of the reference record.

Authorship Information

Composer’s name.

In some cases, opinions differ regarding the best spelling of a composer’s name. If so, all common spellings should be given — each alternative separated from the previous by a semicolon:

!!!COM: Chopin, Fryderyk; Chopin, Frédéric

With respect to accents, the default encoding should be UTF-8. Refer to the discussion concerning the RLN reference record for alternate encodings.

If a work was composed by more than one composer, then each composer’s name should appear on a separate COM record with a number designation prior to the colon. For example,

!!!COM1: Smith,  A.  
!!!COM2: Jones,  B.

The preferred order of the name should be family-name, given-name. If the name should not be alphabetized or shortened by the last name, then enclose the portion of the name for alphabetization or shortening in curly braces. Placing the last name in

!!!COM: {Josquin} des Prez
!!!COM: Prez, {Josquin} des
!!!COM: Ciccone, {Madonna} Louise

See also COL reference records.

Sub-categorizations

Reference IDs for composers can be given as a sub-categorization of COM entries, such as COM-viaf-id and COM-viaf-url to encode the VIAF (Virtual International Authority File) IDs for the composer:

!!!COM: Beethoven, Ludwig van
!!!COM-viaf-id: 32182557
!!!COM-viaf-url: http://viaf.org/viaf/32182557

Beethoven’s grandfather would be:

!!!COM: Beethoven, Ludwig van
!!!COM-viaf-id: 80351209
!!!COM-viaf-url: http://viaf.org/viaf/80351209

To encode the Wikipedia page for a composer in a reference record, and with an optional language qualification:

!!!COM: Beethoven, Ludwig van
!!!COM-wikipedia: https://en.wikipedia.org/wiki/Ludwig_van_Beethoven
!!!COM-wikipedia@DE: https://de.wikipedia.org/wiki/Ludwig_van_Beethoven

Attributed composer.

This may include attributions known to be false. Multiple attributions to different composers should be placed into separate COA entries with an added number

!!!COA1: Composer 1
!!!COA2: Composer 2

Alternate names for the attributed composer can be added to a line, separating the alternate names by a semicolon (;).

Note that if a document contains both COA and COM records, then the attributed composer is implicitly assumed to be false.

Suspected composer.

This reference code indicates the belief of the editor or producer of the document as to the true identity of the composer(s). If more than one composer is suspected, each name should appear on a separate COS record with a number designation prior to the colon.

Composer’s abbreviated, alias, or stage name.

Such as Madonna for Madonna Louise Ciccone, or Lady Gaga for Stefani Joanne Angelina Germanotta:

!!!COM: Germanotta, Stefani Joanne Angelina
!!!COL: Lady Gaga

Composer’s corporate name.

Corporate names may include the names of popular groups (especially when the actual composer is not known). Corporate names may also include business names, such as Muzak. If the corporate name is in reference to one of multiple composers in COM entry, then any enumeration number attached to the COM record should match that of the COC record.

Composer’s dates.

The birth and death dates should be encoded using the zeit format, which provides a highly refined time representation, including methods for representing uncertainty, approximation, and boundary dates (e.g. prior to …, after …).

Composer’s birth location.

Where the composer was born.

Composer’s death location.

Where the composer died.

Composer’s nationality.

For clarity, a language qualifier can be used to indicate the language of the nationality word, although English is expected to be the language for an unqualified CNT record. A double at-sign (@@) can optionally be used to indicate the word for the nationality in the composer’s nations’s language.

!!!CNT: German
!!!CNT@@DE: Deutscher
!!!CNT@EN: English
!!!CNT@RU: немец

However, note that the word for a nationality becomes complex when expressed in many languages, such as Deutscher for a male German, or Deutsche for a female German, both of which are adjectives. Deutch is the noun, but typically this means the German language rather than the German nationality.

In cases where the composer changes nationality, successive nationalities should be listed in chronological order separated by semicolons:

!!!COM: Stravinsky, Igor
!!!CNT: Russian; French; American

When the composer is born with dual-citizenship, then separate the (first) nationalities with a comma, with the more strongly associated nationality first (such as for country in which the composer was born or where the composer lived longer).

Lyricist’s name.

If more than one lyricist was involved in the work, then each lyricist’s name should appear in a separate LYR record with an enumeration number prior to the colon. If the composer was also the lyricist, this should be explicitly encoding using the independent LYR record rather than implicitly assumed.

Librettist’s name.

The name of the librettist. If more than one librettist was involved in the work, then each librettist’s name should appear on a separate LIB record with an enumeration number prior to the colon. If the composer was also the librettist, this should be explicitly encoding using an independent LIB record rather than implicitly assumed.

Sub-categorizations

Librettists and other roles other than the composer do not have a specific reference record for birth and death dates. Such information can be added as a sub-categorization of the person’s reference record. Here is an example of adding dates for the librettist:

!!!LIB: Da Ponte, Lorenzo
!!!LIB-CDT: 1749/03/10-1838/08/17

Music arranger’s name.

If more than one arranger was involved in the work, then each arranger’s name should appear in a separate LAR record with an enumeration number prior to the colon.

Orchestrator’s name.

If more than one orchestrator was involved in the work, then each orchestrator’s name should appear in a separate LOR record with an enumeration number prior to the colon.

Original language of vocal/choral text.

The language should be encoded as an ISO 639 language code as described in the language qualification section of this page.

When multiple original languages are used, separate each language code with a semicolon:

!!!TXO: LA; OC

Language of the encoded vocal/choral text.

The language should be encoded as an ISO 639 language code as described in the language qualification section of this page.

When the text is encoded in both an original language along with translations (encoded in separate adjacent spines), prefix the original language(s) with @@ and the translated languages with @:

!!!TXL: DE
!!!TXO: @@DE, @EN, @FR

The langauge codes should be capitalized.

Language tandem interpretations

The language of a text can also be enocoded within the data by using the *lang: tandem interpretation to specify the current language of the text:

**text
*lang:EN
This
is
in
English,
*lang:DE
aber
das
ist
auf
deutsch.
*-

Translator of the text.

The name of the translator of any vocal, choral, or dramatic text. If more than one translator was involved in the work, then each translator’s name should appear on a separate TRN record with an enumeration number before the colon.

Recording Information

Humdrum representations may encode information pertaining to sound recordings (such as sound-based analyses). For information derived from sound recordings the following reference records may be pertinent.

Album title.

Manufacturer or sponsoring company.

The company or organization responsible for the release, distribution, and/or manufacturing of the recording.

Recording company’s catalogue number.

The album’s numerical designation.

Release date.

The release date should be encoded using the **date format.

Place of recording.

Name of the producer.

Date of recording.

The date of recording should be encoded using the **date format.

Track number.

Performance Information

Humdrum representations may encode performance-activity information rather than (or in addition to) score-related information. If the representation encodes a given performance (such as a MIDI performance), then the following reference records may be pertinent.

!!!MGN: Name of the performance group (ensemble), such as the Juilliard String Quartet.

MPN Performer's name. If more than one performer was involved in the work, then each performer's name should appear on a separate MPN record with a number designation prior to the colon. Note that this record can be used for soloists, and !!!MGN: is used for the name of a group, such as the Chicago Symphony Orchestra, and !!!MCN: is for the conductor of the group.

MPS Suspected performer. If more than one performer is suspected, each name should appear on a separate MPS record with a number designation prior to the colon.

MRD Date of performance. The performance date should be encoded using the **date format described in the Humdrum Reference Manual.

MLC Place of performance. (Local language should be used.)

MCN Name of the conductor of the performance.

MPD Date of first performance. The date of first performance should be encoded using the **date format described in the described in the Humdrum Reference Manual.

Work Identification Information

OTL Title. The title of the specific section or segment encoded in the current file. Titles must be rendered in the original language, e.g. Le sacre du printemps. (Title translations are encoded using other reference records.)

OTP Popular Title. This reference record encodes well-known or alias titles such as "Pathetique Sonata".

OTA Alternative title. This reference record encodes earlier or alternate titles.

OPR Title of larger (or parent) work from which the encoded piece is a part. For example, "Gute Nacht" (OTL) from Winterreise (OPR).

OAC Act number. For operas and musicals, this reference record encodes the act number as an Arabic (rather than Roman) numeral. The number may be preceded by the word "Act" as in Act 3.

OSC Scene number. For operas and musicals, this reference record encodes the scene number as an Arabic (rather than Roman) numeral. The number may be preceded by the word "Scene" as in Scene 3.

OMV Movement number. For multi-movement works such as sonatas and symphonies, this reference record encodes the movement number as an Arabic (rather than Roman) numeral. The number may be preceded by the word "Movement" or "mov." etc., as in mov. 3.

OMD Movement designation or movement name. Typically movements may be named according to the tempo (e.g. "Allegro ma no troppo") or according to a style, genre or form (e.g. "Fugue"), or according to a programmatic title (e.g. "In Full Flower").

OPS Opus number. The number may be preceded by the word "Opus" as in Opus 23. Once again, Arabic numerals are used.

ONM Number. The number may be preceded by the abbreviations "No." or "Nr." as in No. 4.

OVM Volume. The volume number may be preceded by the abbreviation "Vol." as in Vol. 2. Arabic numbers are used.

ODE Dedication. Name of person or organization to whom the work is dedicated. If the work was dedicated to more than one person, then each dedicatee's name should appear on a separate ODE record with a number designation prior to the colon.

OCO Commission. Name of person or organization that commissioned the work. If the work was commissioned by more than one person, then each commissioner's name should appear on a separate OCO record with a number designation prior to the colon.

OCL Collector. Name of person who collected or transcribed the work. If the work was collected by more than one person, then each collector's name should appear on a separate OCL record with a number designation prior to the colon.

ONB Free format note related to the title or identity of the encoded work. Nota bene. If more than one such note is encoded, each should appear on a separate ONB record with a number designation prior to the colon.

ODT Date of composition. The date (or period) of composition should be encoded using the **date or **Zeit formats described in the Humdrum Reference Manual. The **date and **Zeit formats provides a highly refined representation, including methods for representing uncertainty, approximation, and boundary dates (e.g. prior to ..., after ...).

OCY Country of composition. Local names should be used, such as `Espana'.

OPC City, town or village of composition. Local names should be used, such as `Den Haag.'

Group Information

GTL Group Title. A logical collection of works such as the "London Symphonies" by Haydn, or the four concertos by Vivaldi forming "The Seasons".

GAW Associated Work. Some works are associated with other works, such as plays, novels, paintings, films, or other musical works. E.g. Mendelssohn's Overture to Shakespeare's Midsummer Night's Dream. This reference allows associated works to be explicitly identified by author and title. E.g.

` !!!GAW: Stéphane Mallarmé, L’Après-midi d’un faune; [The Afternoon of a Faun].`

GCO Collection designation. This is a free-form text record that can be used to identify a collection of pieces, such as works appearing in a compendium or anthology. E.g. Norton Scores, Smithsonian Collection, Burkhart Anthology.

Imprint Information

PUB Publication status. This reference record identifies whether the document has ever been "published". One of the following English terms may appear: published or unpublished.

PED Publication editor. Name of the editor of the source used for the digital edition if a specific edition is being encoded.

PPR First publisher. Name of the first publisher of the work, or the name of the publisher if a specific edition is being encoded.

PDT Date first published. The date of publication should be encoded using the **date format described in the Humdrum Reference Manual.

!!!PTL: Publication title. Title of the publication (volume) from which the work is encoded. To encode linebreaks in the title, use the pipe character (|).

PPP Place first published. The location where the source edition for the digital encoding was first published (location of first edition).

PC# Publisher's catalogue number. This should not be confused with better known scholarly catalogues, such as those of Köchel, Hoboken, etc.

SCT Scholarly catalogue abbreviation and number. E.g. BWV 551

SCA Scholarly catalogue (unabbreviated) name. E.g.Koechel 117.

SMS Manuscript source name. For unpublished sources, the manuscript source name.

SML Manuscript location. For unpublished sources, the location of the manuscript source.

SMA Acknowledgement of manuscript access. This reference information may be used to encode a free format acknowledgement or note of thanks to a given manuscript owner for scholarly or other access.

YEP Publisher of electronic edition. This reference identifies the publisher of the electronic document.

YEC Date and owner of electronic copyright. This reference identifies the year and owner of the copyright for the electronic document.

YER Date electronic edition released.

YEM Copyright message. This record conveys any special text related to copyright. It might convey a simple warning (e.g. "All rights reserved."), convey registration or licensing information, or indicate that the document is shareware.

YEN Country of copyright. This reference identifies the country in which the electronic document was created, or where the copyright was established. In effect, it identifies the country under whose laws the copyright declaration is to be interpreted.

YOR Original document. This reference identifies any original source or sources from which encoded document was prepared. Note that original documents may themselves be copyrighted, and that permission may be required in order to create an electronic derivative document. Original documents may also have lapsed copyrights.

YOO Original document owner. If the electronic document was prepared from a copyrighted original document, this reference identifies the copyright owner of the original document. Note that unless the electronic and original documents have the same owner, some licensing agreement or other legal arrangement is necessary in order to create an electronic derivative document.

YOY Original copyright year. If the electronic document was prepared from a copyrighted original document, this reference identifies the year of copyright for the original document. Note that some licensing agreement or other legal arrangement is necessary in order to create an electronic derivative document.

YOE Original editor. The editor of the original document from which the electronic edition was prepared. Note that some licensing agreement or other legal arrangement may be necessary in order to create an electronic derivative document.

EED Electronic Editor. Name of the editor of the electronic document. If more than one editor was involved in the work, then each editor’s name should appear on a separate EED record with a number designation prior to the colon, such as !!!EED2: for the second editor.

ENC Encoder of the electronic document. This reference identifies the name of the person or persons who encoded the electronic document. (Not to be confused with the electronic editor.) If more than one encoder was involved in the work, then each encoder's name should appear on a separate ENC record with a number designation prior to the colon, such as !!!ENC2: for the second encoder.

!!!END: Encoding date of the electronic document. This reference gives the date on which the person or persons who encoded the electronic document (see ENC), or the date on which it was substantially encoded. Similar to the EEV, but does not change as the document is ammended.

EMD Document modification description. This record type is used to chronicle all modifications made to the original electronic document. EMD records should indicate the date of modification, the name of the person making the modification, and a brief description of the type of modification made. For each successive modification, a separate EMD record should appear with a number designation prior to the colon.

EEV Electronic edition version. This reference identifies the specific editorial version of the work. e.g. Version 1.3g, or by date in **date format. EEV record can appear in a given electronic document.

EFL File number. Some files are part of a series or group of related files. This record indicates that the current document is file x in a group of y files. The two numbers are separated by a slash as in:

` !!!EFL: 1/4`

EST Encoding status. This record indicates the current status of the document as it is being produced. Free-format text may indicate that the encoding is in-progress, list tasks remaining, or indicate that the encoding is complete. EST records are normally eliminated prior to distribution of the document.

VTS Checksum validation number. This reference encodes the checksum number for the file -- excluding the VTS record itself. When this record is eliminated from the file, any POSIX.2 standard cksum command can be used to determine whether the file originates with the publisher, or whether it has been modified in some way. (See the Humdrum veritas command described in Section 4.) Note that this validation process is easily circumvented by malicious individuals. For true security, the checksum value should be compared with a printed list of checksums provided by the electronic publisher.

Analytic Information

ACO Collection designation. This is a free-form text record that can be used to identify a collection, set, or group of related works, such as works appearing in a compendium or anthology. E.g. Norton Scores, Smithsonian Collection, Jones Anthology.

AFR Form designation. This is a free-form text record that can be used to identify the form (if appropriate) of the work. E.g. fuga, sonata-allegro, passacaglia, rounded binary, rondo.

AGN Genre designation. This is a free-form text record that can be used to identify the genre of the work. E.g. opera, string quartet, barbershop quartet.

AST Style, period, or type of work designation. This is a free-form text record that can be used to characterize the style, period, or type of work. This reference can include any term or terms deemed appropriate by the producer of the document. Designations might include keywords or keyphrases such as: Baroque, bebop, Ecole Notre Dame, minimalist, serial, reggae, slendro, heterophony, etc.

AMD Mode Classification. A combined numerical/name system for mode identification -- used especially for medieval monophonic and later polyphonic works. Modes are indicated by numbers from 1 to 12, followed by a semicolon, followed by the corresponding written name (with an initial upper-case character). Permissible mode numbers and names are:


1; Dorian 2; Hypodorian 3; Phyrgian 4; Hypophyrgian 5; Lydian 6; Hypolydian 7; Mixolydian 8; Hypomixolydian 9; Ionian 10; Hypoionian 11; Aeolian 12; Hypoaeolian —– —————-

Other non-standard mode names can be used at the discretion of the electronic editor.

AMT Metric Classification. Meters for a file may be classified as one of the following eight categories: simple duple, simple triple, simple quadruple, compound duple, compound triple, compound quadruple, irregular, or various.

AIN Instrumentation. This reference is used to list all of the instruments (including voice) used in the work. Instruments should be encoded using the abbreviations specified by the *I tandem interpretation described in Appendix II. Instrument codes must appear in alphabetical order separated by spaces. (Note that alphabetical ordering is essential in order to facilitate searches for specific combinations or subsets of instruments using the grep command.) E.g.

` !!!AIN: clars corno fagot flt oboe`

An optional part count for each instrument code can be prefixed by a space before each instrument code, such as this example for a string quartet:

` !!!AIN: 1 cello 1 viola 2 violn`

ARE Geographical region of origin. This reference identifies the geographical location from which the work originates. Location designations are encoded using the local language. The location begins with the continent designation, and becomes successively more refined. If the information is available, refinement can continue to suburban district or even street address.

` !!!ARE: Europa, Mitteleuropa, Deutschland, Wuerttemberg, Sindelfingen !!!ARE: America, North America, United States of America, Ohio, Columbus`

ARL Geographical location of origin. Like the ARE record, this reference record identifies the geographical location from which a work originates. Location designations are encoded using latitude and longitude values -- suitable for creating maps. The first numerical value indicates latitude (positive values indicating North, negative values indicating South). The second numerical value indicates longitude (positive values only, indicating distance East from the central meridian). A slash separates the latitude and longitude values. A series of trailing characters is used to indicate the degree of accuracy of the location information: % (continent), @ (country), # (province or state), : (town or village). For large regions such as countries or provinces, the geographical center of the region is used.

` !!!ARL: 51.5/10.5@`

Historical and Background Information

HAO Aural History. This is a free-form text record used to relay any story or stories about the origin, purpose or background of the work. This reference record is especially useful in ethnomusicological materials, where a particular story accompanies a song. The story may be encoded using several successive HAO records.

HTX Free-form Translation of Vocal Text. This is a free-form text record used to relay a non-literal translation of a vocal text. This reference record is again especially useful in ethnomusicological materials.

Representation Information

RLN ASCII language setting. This reference identifies the "language" code in which the file was encoded. This is applicable only to computer platforms which provide "extended ASCII" text capabilities (e.g. Danish or Spanish characters).

RDF User-defined signifiers. All Humdrum representations provide some signifiers (ASCII characters) that remain undefined. Users are free to use these undefined signifiers as they choose. When undefined signifiers appear in a given document, the RDF**interp code should be used to specify what the signifiers denote. Notice that the code RDF is followed by the name of the interpretation to which the signifier definition applies. In the following example, the letters "X" and "x" symbols that are defined within a hypothetical **piano representation. E.g.

` !!!RDFpiano: X=hands cross, left over right !!!RDFpiano: x=hands cross, right over left`

RDT Date encoded. This reference uses the Humdrum **date format to identify the date(s) when the document was encoded.

RNB Representation note. This reference provides a free-format text that conveys some document-specific note related to matters of representation.

RWG Representation warning. This reference may be used to encode explicit warnings concerning the encoded material.

Electronic Citation

Electronic editions of music might be cited in printed or other documents by including the following information. The "author" (e.g. COM), the "title" -- either original title (OTL) or translated title (e.g. XEN). The editor (EED), publisher (YEP), date of publication and copyright owner (YEC), and electronic version (EEV), In addition, a full citation ought to include the validation checksum (VTS). This number will allow others to verify that a particular electronic document is precisely the one cited. A sample electronic citation might be:

Franz Liszt, Hungarian Rhapsody No. 8 in F-sharp minor (solo piano).
Amsterdam: Rijkaard Software Publishers, 1994; H. Vo\o'r\(hc'i\o's\(hc'ek (Ed.),
Electronic edition version 2.1, checksum 891678772.

In Humdrum files it does not matter where reference records appear. Since it is common for users to inspect the beginning of a file in order to check whether the file is being properly processed, the number of reference records at the beginning of the file should be kept to a minimum. A good habit is to place the composer, title of the work, and copyright records at the beginning of the file, and to relegate all other reference records to the end of the file.

Accommodating Different Languages

A perpetual problem with reference information pertains to the language in which information is represented. Humdrum provides comprehensive methods for dealing with multiple languages and translations.

Reference information can be encoded using the original language of the source material or document as well a translations into other languages. The ISO 639 standard can be used to indicate the language of a specific reference record. This standard provides two- and three-letter codes for over 4000 living and historical languages. For example, the ISO standard codes for English, French and Swahili are EN, FR and SW respectively. The codes should be used in upper-case form, and two-letter codes should be used when available; otherwise, use the three-letter language code.

Humdrum allows any type of reference record to be encoded in any or all of these languages. In general, the primary reference information is encoded using the original language of the source material. Consider the case of a Friuli folksong (Friuli is a dialect of Italian). The title of the song should be encoded in the original language, but it may also be useful to provide translations into Italian and English.

The principal reference code indicating title is OTL. Following the OTL reference code, the language of encoding can be indicated by appending an at-sign (@), followed by the language code in upper-case. Here is the title given in three languages:

!!!OTL@@FUR: Ai preit la biele stele
!!!OTL@IT: Ho pregato la buona stella
!!!OTL@EN: The Good Star

Notice the use of the double at-sign (i.e., @@, which indicates that Friuli is the original language for this reference record type. The title can also be encoded without any language designation:

!!!OTL: Ai preit la biele stele

This implies that the title is already rendered in the primary reference language (but what language that language is is left unspecified).