[sword-devel] Parsing osisID and osisRef
dmsmith at crosswire.org
Fri Nov 20 06:28:09 MST 2009
SWORD's ability to parse arbitrary input into a list of verses is awesome. It is far more powerful than what is needed for an osisID or an osisRef.
The structure of these for biblical references is very well defined.
Here is a partial BNF for it. (I've extended the BNF with [ ] to mark optional parts, which may appear anywhere in a production, instead of using ε for the empty production.)
# An osisRef can be a space separated list of osisRefs,
# a single osisID, or two osisIDs separated by a dash
osisRef ::= <osisRef> " " <osisRef>
| <osisID> [ "-" <osisID> ]
# An osisID is a reference with optional work prefix and/or grain.
osisID ::= [ <workPrefix> ":" ] <reference> [ "!" <grain> ]
# A reference has a book name and can be followed by a chapter and a verse, separated by a period '.'
reference ::= <bookname>
| <bookname> "." <number>
| <bookname> "." <number> "." <number>
# Book names are normalized to a particular list, including the deuterocanonical books.
bookname ::= "Gen" | "Exod" | "Lev" | ... skipping for brevity... | "Rev"
# Numbers are nonzero and never have leading zeros
number ::= <nzdigit> [ <digits> ]
nzdigit ::= "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
digit ::= "0" | <nzdigit>
digits ::= <digit> [ <digits> ]
workPrefix ::= .....
grain ::= .....
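
To make the grammar concrete, here is a rough sketch of how the osisID production could be parsed with nothing but string searches. It is not existing SWORD code: the OsisID struct, the parseNumber helper and the parseOsisID signature are hypothetical, the book list is only the handful of names shown above, and since workPrefix and grain are left open in the BNF the sketch simply takes whatever precedes the first ":" and follows the first "!".

```cpp
#include <set>
#include <string>

// Hypothetical result type. chapter/verse stay 0 when absent, which is
// unambiguous because the grammar never allows 0 as a number.
struct OsisID {
    std::string work;   // text before ":" (empty if absent)
    std::string book;
    int chapter = 0;
    int verse = 0;
    std::string grain;  // text after "!" (empty if absent)
};

// Illustrative subset only; the real list holds every normalized book name.
static const std::set<std::string> kBooks = {"Gen", "Exod", "Lev", "Rev"};

// number ::= <nzdigit> [ <digits> ]
// Returns the value, or 0 if s is empty, has a leading zero, contains a
// non-digit, or is longer than 6 digits (a guard against int overflow).
static int parseNumber(const std::string &s) {
    if (s.empty() || s[0] == '0' || s.size() > 6) return 0;
    int value = 0;
    for (char c : s) {
        if (c < '0' || c > '9') return 0;
        value = value * 10 + (c - '0');
    }
    return value;
}

// osisID ::= [ <workPrefix> ":" ] <reference> [ "!" <grain> ]
bool parseOsisID(const std::string &input, OsisID &id) {
    id = OsisID();
    std::string rest = input;

    // Assumption: the work prefix is everything before the first ":".
    std::string::size_type pos = rest.find(':');
    if (pos != std::string::npos) {
        id.work = rest.substr(0, pos);
        rest.erase(0, pos + 1);
    }
    // Assumption: the grain is everything after the first "!".
    pos = rest.find('!');
    if (pos != std::string::npos) {
        id.grain = rest.substr(pos + 1);
        rest.erase(pos);
    }

    // reference ::= <bookname> [ "." <number> [ "." <number> ] ]
    pos = rest.find('.');
    id.book = rest.substr(0, pos);
    if (kBooks.count(id.book) == 0) return false;
    if (pos == std::string::npos) return true;      // book only

    rest.erase(0, pos + 1);
    pos = rest.find('.');
    id.chapter = parseNumber(rest.substr(0, pos));
    if (id.chapter == 0) return false;
    if (pos == std::string::npos) return true;      // book.chapter

    id.verse = parseNumber(rest.substr(pos + 1));   // a third "." fails here
    return id.verse != 0;                           // book.chapter.verse
}
```

With this sketch, "Gen.1.1", "Exod.20" and "Rev" are accepted, while "Gen.01.1", "Genesis.1.1" and "Gen.1.1.1" are rejected.
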
I'd like to write parseOsisRef and parseOsisID and use them within osis2mod.
Right now, I have to munge the osisRefs and osisIDs to a form that ParseVerseList will understand.
The code will be much simpler and much faster than ParseVerseList. Here are some of the specialties of ParseVerseList that don't need to be handled.
a) It understands internationalized book names
b) It understands all kinds of abbreviations for book names
c) It allows roman numerals in book names.
d) It does not require a book name for a reference, but uses the last seen reference's book name as a basis.
e) Likewise, it does not require a chapter number for a verse reference, but uses the last seen reference's book and chapter as a basis.
f) It allows special constructs such as "v 3", "c 4", "9f" and "12ff" for verse, chapter, next verse and to the end of the chapter, respectively. (There are other special constructs.)
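
Since none of a) through f) apply, every osisID in an osisRef is complete on its own and no state has to be carried from one reference to the next. As a rough illustration (again hypothetical, not existing SWORD code), the osisRef production then reduces to splitting on whitespace and on "-". The OsisRange struct and this parseOsisRef signature are my own guesses, and I am assuming the first "-" in an element separates the two osisIDs.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One element of an osisRef: a single osisID, or a range of two.
// The osisIDs are kept as raw strings; validating them is a separate step.
struct OsisRange {
    std::string first;
    std::string last;   // equal to first when there is no "-"
};

// osisRef ::= <osisRef> " " <osisRef> | <osisID> [ "-" <osisID> ]
// Returns false for empty input or when a range has an empty side,
// e.g. "Gen.1.1-". Runs of whitespace are tolerated, which is slightly
// more lenient than the single space in the grammar.
bool parseOsisRef(const std::string &input, std::vector<OsisRange> &out) {
    out.clear();
    std::istringstream elements(input);
    std::string element;
    while (elements >> element) {               // split on whitespace
        OsisRange range;
        // Assumption: the first "-" separates the two osisIDs.
        std::string::size_type dash = element.find('-');
        if (dash == std::string::npos) {
            range.first = range.last = element;
        } else {
            range.first = element.substr(0, dash);
            range.last = element.substr(dash + 1);
        }
        if (range.first.empty() || range.last.empty()) return false;
        out.push_back(range);
    }
    return !out.empty();
}
```

For "Gen.1.1-Gen.1.3 Exod.20" this yields two elements: the range Gen.1.1 to Gen.1.3, and the single reference Exod.20. An element with a second "-" is not caught here; it would be rejected when its right-hand side is checked as an osisID.
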