[sword-devel] KJV2006 Project

Troy A. Griffitts scribe at crosswire.org
Tue Feb 14 07:43:13 MST 2006


DM,
	I am really excited about your desire to update the KJV2003 text.  Much 
work went into that project, and a special thank-you goes to all the 
volunteers who helped tag the NT!

	http://crosswire.org/sword/kjv2003/status/

	The most recent raw data, along with all things KJV2003, lives on our 
server at: ~sword/html/kjv2003

	It might be a mess in there.  All my utilities for scanning the text 
and computing which verses are complete, etc. are in there.  I think I 
have utilities in there to convert the data to a sword module, and their 
output is probably the latest KJV module data we publish.

	What I, personally, would like to see done:

	Fix invalid markup: <note/><note/> is the most glaring.

	OT: The KJV2003 project was primarily aimed at the NT.  An OT with 
Strong's numbers was freely distributable already.  The data we used for 
the OT is also in that directory.  I believe it was downloaded from MPJ's 
site ebible.org and likely originated from a Bible Foundation text.

	OT: Almost every "'s" has been dropped in our current module.  I think 
it's just a problem with my conversion scripts, which try to place OSIS 
<w> tags AROUND each word/phrase, rather than AFTER a word/phrase, as the 
original markup has it.
	OT: It personally frustrates me that the body of the text we use for 
the KJV OT has lowercased all personal pronouns for God.  Can we find a 
better text?  I'm fairly certain the KJV in most of its printed 
incarnations had these uppercase.

	NT: Articles: All simple definite articles are left as empty tags in 
the verses.  The logic was that in English we have both an indefinite 
and a definite article, whereas Greek only has a definite article:

	  a house	   OIKOS
	the house	hO OIKOS

	So, for consistency, English nouns were tagged the same whether they 
had a definite article in Greek, or were anarthrous.  The desired 
output would be something like:

TR: <w src="1">hO</w><w src="2">OIKOS</w>
KJV: <w src="1 2">the house</w>

Currently it is:
<w src="1"></w> <w src="2">the house</w>


I think the correcting script logic is something like:

	Do I have an empty tag with strongs 3588 (article)?
		Does the morphology of <w src="[mysource]+1"> begin with "N-" (noun) and 
equal my morphology in other respects?
			If so, combine src numbers and drop the empty tag.

	The scholars who worked on tagging this text were told that this would 
be the logic applied, so they tagged accordingly.
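
	A rough sketch of that correcting logic in Python, in case it helps 
as a starting point.  This is not the real script.  It assumes each <w> 
carries OSIS-style lemma="strong:G3588" and morph="robinson:T-NSM" 
attributes next to src (the examples above only show src), that 
"equal in other respects" means everything after the two-character 
"T-" / "N-" prefix matches, and that a verse arrives as one string.  The 
name merge_articles is made up for illustration.

```python
import re

# One <w ...>text</w> element.  The lemma/morph attribute names are assumptions.
W_RE = re.compile(r'<w\b([^>]*)>(.*?)</w>', re.S)
ATTR_RE = re.compile(r'([\w:]+)="([^"]*)"')
ARTICLE_RE = re.compile(r'(?<!\d)0*3588(?!\d)')  # Strong's number of the article


def morph_code(attrs):
    """Return the morph code without an optional 'robinson:' style prefix."""
    return attrs.get("morph", "").split(":")[-1]


def merge_articles(verse):
    """Fold empty article tags into the noun tagged with the next src number."""
    words = [{"attrs": dict(ATTR_RE.findall(m.group(1))),
              "text": m.group(2), "span": m.span()}
             for m in W_RE.finditer(verse)]
    drop, new_src = set(), {}
    for i, art in enumerate(words):
        a = art["attrs"]
        # Only empty tags carrying Strong's 3588 and a single src number qualify.
        if art["text"].strip() or not ARTICLE_RE.search(a.get("lemma", "")):
            continue
        if not a.get("src", "").isdigit():
            continue
        wanted = str(int(a["src"]) + 1)
        art_morph = morph_code(a)
        for j, noun in enumerate(words):
            n = noun["attrs"]
            noun_morph = morph_code(n)
            if (wanted in n.get("src", "").split()
                    and noun_morph.startswith("N-")
                    and noun_morph[2:] == art_morph[2:]):
                new_src[j] = "%s %s" % (a["src"], n["src"])
                drop.add(i)
                break
    # Rebuild the verse, skipping dropped tags and rewriting merged src values.
    out, pos = [], 0
    for i, w in enumerate(words):
        start, end = w["span"]
        out.append(verse[pos:start])
        pos = end
        if i in drop:
            while pos < len(verse) and verse[pos] == " ":
                pos += 1  # swallow the space left behind by the empty tag
            continue
        tag = verse[start:end]
        if i in new_src:
            tag = re.sub(r'src="[^"]*"',
                         lambda _m, s=new_src[i]: 'src="%s"' % s, tag, count=1)
        out.append(tag)
    out.append(verse[pos:])
    return "".join(out)
```

	Only src is combined here, as described above; whether the article's 
lemma and morph values should be carried over to the merged tag as well 
is left open.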



	A better starting point than the raw data of the NT from 
~sword/html/kjv2003 is probably a modified mod2osis output of our 
current module.  You can apply the attached patch to ensure that no 
filters are working on the text and you get raw data output.

	Thank you again for your willingness to help.  This is a very much 
needed effort!

	-Troy.






DM Smith wrote:
> The KJV Bible is the most downloaded Sword module at CrossWire.
> It is often the first impression that people get when looking at all the 
> different Sword front-ends.
> 
> There are some problems with the KJV that have been reported and need to 
> be fixed.
> 
> Anyone else interested in working on upgrading the KJV2003?
> 
> As I am just finishing the installer for Sword, I would like to start 
> this effort.
> 
> _______________________________________________
> sword-devel mailing list: sword-devel at crosswire.org
> http://www.crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page

-------------- next part --------------
A non-text attachment was scrubbed...
Name: mod2osis.kjv2003.patch
Type: text/x-patch
Size: 1490 bytes
Desc: not available
Url : http://www.crosswire.org/pipermail/sword-devel/attachments/20060214/96b13f6b/mod2osis.kjv2003.bin

