Chris Little chrislit at crosswire.org
Sun Oct 3 21:28:50 MST 2010

On 10/3/2010 6:19 PM, Robert Hunt wrote:
> Dear all,
> For the last two weeks I've been investigating the idea of creating a small
> open repository under the OpenScriptures banner for storing and
> maintaining (and even documenting) XML lists of versification schemes
> and international booknames, versification mappings, USFM and OSIS
> booknames and abbreviations, etc. I've already received a positive
> response from the author of Bibledit about starting with some of his
> lists. I realise that any particular format will never please everyone,
> but I'm interested in your comments on its potential usefulness. I
> found that I needed such things personally for a project so rather than
> reinventing them yet again, I figured that every Bible program must need
> them so why not make them available (if they're not already).
> I guess my questions are:
>     1/ Are these sorts of lists already freely available in a suitable
>     place (preferably independent of a particular program)? If so, no
>     need for me to proceed.

There's quite a lot of data currently available. Little of it is in XML 
or any other human-friendly format, but you're welcome to mine our data 
and put it in a more presentable form. However, if you work with real 
data (extracting v11n data from actual Bibles), you'll quickly discover 
that the number of v11n systems is nearly equal to the number of 
different translations (excluding those that use the KJV v11n exactly, 
since that system actually is quite common).
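
To make "extracting v11n data from actual Bibles" concrete, here is a minimal 
Python sketch. It is not taken from any of our tools and does not produce our 
canon format; the function names (extract_v11n, diff_v11n) and the 
(book, chapter, verse) input shape are assumptions made purely for 
illustration. It records the highest verse number seen in each chapter, then 
lists the chapters where two such tables disagree.

```python
from collections import defaultdict


def extract_v11n(refs):
    """Build {book: [max verse of ch. 1, max verse of ch. 2, ...]}.

    refs is an iterable of (book, chapter, verse) tuples with 1-based
    integer chapter and verse numbers (a hypothetical input shape).
    Chapters that never appear in the input get a verse count of 0.
    """
    max_verse = defaultdict(dict)
    for book, chapter, verse in refs:
        chapters = max_verse[book]
        chapters[chapter] = max(chapters.get(chapter, 0), verse)
    return {
        book: [chapters.get(ch, 0) for ch in range(1, max(chapters) + 1)]
        for book, chapters in max_verse.items()
    }


def diff_v11n(a, b):
    """Return (book, chapter, verses_in_a, verses_in_b) for each mismatch."""
    diffs = []
    for book in sorted(set(a) | set(b)):
        ca, cb = a.get(book, []), b.get(book, [])
        for ch in range(1, max(len(ca), len(cb)) + 1):
            va = ca[ch - 1] if ch <= len(ca) else 0
            vb = cb[ch - 1] if ch <= len(cb) else 0
            if va != vb:
                diffs.append((book, ch, va, vb))
    return diffs


if __name__ == "__main__":
    # Toy references only, not taken from any real translation.
    first = extract_v11n([("A", 1, 3), ("A", 1, 5), ("A", 2, 2)])
    second = extract_v11n([("A", 1, 5), ("A", 2, 3), ("B", 1, 1)])
    for mismatch in diff_v11n(first, second):
        print(mismatch)
```

Running a comparison like this across many real texts is where the small 
differences between translations start to show up.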

Most of our data is at 
https://crosswire.org/svn/sword-tools/trunk/versification/ (including 
the basicv11ns subdirectory). The v11nsystem.pl script accepts a variety of 
input formats and generates a v11n definition file in the format that we use.

An explanation of our canon definition format, found in the XML files at 
the above address, is at 

CCEL has data (v11n & mapping) that they received from Wycliffe, 
presented in an XML (OSIS-like) format: 
http://www.ccel.org/refsys/refsys.html. However, their data is extremely 
inaccurate. (I don't know who is to blame for the inaccuracy & errors.)

There's also v11n & mapping data available as part of the STEP spec: 

As for localized book names, Logos has a ton of this data, which they 
collected through community contributions back around 2000, when they 
were gearing up to release Logos Series X. They may or may not be 
amenable to sharing this.

>     Assuming they're not:
>     2/ Where might the information be gleaned from (with suitable
>     permissions)?
>     3/ Apart from the above (versification schemes & mappings, USFM/OSIS
>     booknames/filename/abbreviation standards, international
>     booknames/abbreviations), what other lists do you suggest might be
>     useful?
>     4/ Would your program be interested in taking advantage of such XML
>     lists?
>     5/ If not, would another format be helpful?

XML is fine; we can make converters. Our interest in using this kind of 
data would be dependent on its utility and its accuracy.


