[sword-devel] Re: [sword-devel]Squeak Sword UI

Chris Little sword-devel@crosswire.org
Thu, 16 Jan 2003 10:03:11 -0700 (MST)

On Thu, 16 Jan 2003, Jimmie Houchin wrote:

> You mean I can't reverse engineer them from the text markup itself?

You can try, but you will probably fail.  It's not markup that poses a
problem.  The markup is just other people's standards that we make use of.  
The problem is reading the binary indexes and following them from one
index into another index or another part of the same index (repeating as
necessary) and finally into the data, possibly needing to decrypt,
possibly needing to decompress along the way.  Now repeat for all of the
different categories of modules we support and each different format of
module in that category.
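
To make the index-chasing concrete, here is a rough sketch of a two-level
lookup: one index points at an entry, that entry points into a second index
of compressed blocks, and the block is decompressed before the record is
sliced out. Everything in it is an assumption for illustration only: the
record layouts (BLOCK_ENTRY, VERSE_ENTRY), the function name read_entry, and
the use of plain zlib are hypothetical stand-ins, not the actual Sword file
formats, and decryption is left out entirely.

```python
import struct
import zlib

# Hypothetical layouts, chosen only for this sketch (not the real formats).
BLOCK_ENTRY = struct.Struct("<II")   # offset and compressed size of a block
VERSE_ENTRY = struct.Struct("<IIH")  # block number, offset in block, length


def read_entry(verse_idx: bytes, block_idx: bytes, data: bytes, n: int) -> bytes:
    """Follow entry n through both indexes and return its raw bytes."""
    # First index: which block holds the entry, and where inside that block.
    block_no, start, length = VERSE_ENTRY.unpack_from(verse_idx, n * VERSE_ENTRY.size)
    # Second index: where the compressed block sits in the data file.
    blk_off, blk_size = BLOCK_ENTRY.unpack_from(block_idx, block_no * BLOCK_ENTRY.size)
    # Decompress the whole block, then slice out the one record.
    block = zlib.decompress(data[blk_off:blk_off + blk_size])
    return block[start:start + length]


# Toy data: two compressed blocks holding three records in total.
compressed = [zlib.compress(b"alphabeta"), zlib.compress(b"gamma")]
data = b"".join(compressed)

block_idx = b""
offset = 0
for c in compressed:
    block_idx += BLOCK_ENTRY.pack(offset, len(c))
    offset += len(c)

verse_idx = b"".join(
    VERSE_ENTRY.pack(block_no, start, length)
    for block_no, start, length in [(0, 0, 5), (0, 5, 4), (1, 0, 5)]
)
```

Even in this toy form, a reader needs three files and two record layouts to
get at one entry, and each module category and format would need its own
variant of this.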

Learning C++ and reading the code will be simpler and faster.  Some
aspects, like decryption (which I grant you may choose not to support)
and decompression (which you must support if you want to support any
module released or updated in the last year and a half), would be quite
impossible without using our code as a basis.  (I'd imagine Squeak
has some kind of zlib functionality included, but our files aren't simply

> The Java frontend, does it use the Sword libraries or will there be Java 
> code for reading Sword modules?

JSword is implementing routines to read Sword modules natively in Java.  
I don't know their progress, but it's a big task.  There's also an
advantage in porting from C++ to Java in that the two are similar enough
that porting is made simpler and (if nothing else) someone who can write
Java can probably read C++ reasonably well.  (I know that's a BIG
assumption of Java programmers--no offense to the JSword team, just Troy.)

> It is quite possible I am underestimating the task of reading/parsing 
> Sword modules. I thought the modules/text/rawtext/***/ot 
> modules/text/rawtext/***/nt were simply text files which the Sword 
> libraries parsed to create what was sent to the front end.

With the possible exception of RawLD, RawText is the simplest file format.  
It's not much challenge to write a driver to read this, but as I mentioned 
above, nothing new is released in this format and any time a module is 
updated it is released in zText format.  Some day when I get a free weekend, 
I'll convert everything out of RawText and into zText--or more likely into 
OSIS-marked zText to kill two birds with one stone.

Even RawText is more complex than most people who open the files up 
assume.  The verse ordering within them is arbitrary and there is no 
indication of where one verse begins and another ends.  (This latter fact 
is not so important for Bibles since it's usually pretty easy to tell that 
verses break at a newline.)  The .vss files associated with each ot/nt 
file are ordered according to the order of canon and they indicate the 
location and length of each verse record.
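
As a minimal sketch of what that means for a reader: the data file is an
undelimited blob in arbitrary order, and the index is what puts records in
canonical order by giving each one a location and a length. The entry layout
used here (a 4-byte little-endian offset followed by a 2-byte length) and the
name read_verse are assumptions for illustration; check the Sword source for
the real .vss layout and for how a verse reference maps to an entry number.

```python
import struct

# Hypothetical entry layout: 4-byte offset + 2-byte length, little-endian.
ENTRY = struct.Struct("<IH")


def read_verse(index_bytes: bytes, data_bytes: bytes, n: int) -> bytes:
    """Return the n-th record (in index order) from the data blob."""
    offset, length = ENTRY.unpack_from(index_bytes, n * ENTRY.size)
    return data_bytes[offset:offset + length]


# Toy data: records stored out of canonical order, with no delimiters.
data = b"second versefirst versethird verse"

# The index lists the records in canonical order: (offset, length) pairs.
index = b"".join(
    ENTRY.pack(offset, length)
    for offset, length in [(12, 11), (0, 12), (23, 11)]
)
```

With this toy data, read_verse(index, data, 0) yields b"first verse" even
though those bytes come second in the data blob, which is why the text file
cannot be parsed reliably without its index.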

Simply supporting RawText would provide a decent proof of concept, but 
if you're only supporting Sword modules because of the existing library of 
books then it isn't optimal to only support a quickly disappearing segment 
of that library of books.