[sword-devel] ThML importer

Chris Little sword-devel@crosswire.org
Tue, 26 Feb 2002 09:06:45 -0800

> When will this be available in the windows frontend?

Do you have anything with which to bribe Troy? :)

> One of the wonderful things that this will do is open the
> doors to non-programmers to contribute by tagging PD books with ThML.

Yes, indeed.  There are tons of books on CCEL that people can help
tag.  And there are tons of books not on CCEL that could be marked
up and contributed.

The importer itself (thml2gbs) is in CVS now (in the utilities
sub-directory).  There's also a tool called imp2gbs that takes a file in
a format where a key is listed on one line like "$$$/book/chapter/verse"
and the intended contents of that entry are listed on the following
lines.  You can also use thml2gbs to output in this format (just run it
with no args to get the syntax for this function).  There is a bug in
imp2gbs where the final entry is not written to the module, but I'll
get this fixed before we release 1.5.3.
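
For anyone who wants to prepare or check files in that format by hand,
here is a rough sketch of a reader for it.  This is not the imp2gbs
code, just an illustration in Python.  It assumes that "$$$" is only a
marker and that the rest of the line is the key, and that any text
before the first key line can be ignored.  The name parse_imp is made
up for this example.

```python
def parse_imp(text):
    """Split imp-style text into a list of (key, content) pairs.

    Assumption: a line starting with "$$$" opens a new entry and the
    remainder of that line is the key; all following lines up to the
    next "$$$" line are that entry's content.
    """
    entries = []
    key = None
    lines = []
    for line in text.splitlines():
        if line.startswith("$$$"):
            # A new key closes the entry collected so far, if any.
            if key is not None:
                entries.append((key, "\n".join(lines)))
            key = line[3:]
            lines = []
        elif key is not None:
            lines.append(line)
        # Lines before the first key are skipped (assumption).
    # The last entry has no following key line, so store it here.
    if key is not None:
        entries.append((key, "\n".join(lines)))
    return entries


sample = (
    "$$$/book/chapter/verse\n"
    "first line\n"
    "second line\n"
    "$$$/book/chapter/verse2\n"
    "last entry\n"
)

for entry_key, content in parse_imp(sample):
    print(entry_key, "->", repr(content))
```

Note the explicit step after the loop: the last entry is not followed
by another key line, so a reader like this has to store it separately.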

> and just curious, are the converted files compressed? What is 
> the size ratio between a flat ThML file and one converted 
> into Sword format?

Only RawGenBook is written now.  zGenBook may be forthcoming for 1.5.4.
The size of the raw ThML file and the Sword module should be roughly the
same (give or take a few KB).

We don't handle the style sheets yet, but handling them should be easy
to add in 1.5.4.