[sword-devel] KJV2006 and divineName
dmsmith555 at yahoo.com
Fri Apr 21 04:38:04 MST 2006
Ted Walther wrote:
> On Sat, Mar 25, 2006 at 06:45:58PM -0700, Troy A. Griffitts wrote:
>> <w lemma="strong:1">word1 word2</w> <w lemma="strong:2">word3 word4
>> word5</w> <w lemma="strong:3">word6</w> <w lemma="strong:4">word7 word8</w>
>> Most printed Bibles with Strong's numbers merely insert numbers into
>> the text, implying that the previous word or some number of words are
>> related to that number. Our NT human tagging allowed us to be exact,
>> even non-contiguous. We don't have this level of markup in the OT.
>> Dude, I'm so excited about all this work you're putting into this data!
>> I'm sure so many projects (inside and outside of CrossWire) will be
>> blessed by this!
> Indeed. I was just getting my project kicked off based on the KJV2003
> when I noticed some problems with the Strong's number markup in the OT.
> I don't really want to delay my project, but if KJV2006 is less than a
> few months away, I can wait.
The next beta release should be the last one, with no changes between it
and the final release. Look for an announcement of the final beta "any
day now."
You can get the current work at www.crosswire.org/~dmsmith/kjv2006.
When it is released really depends upon the ability of the Windows
version of SWORD to handle it. There are 4 software changes that need to
be made before it is released: 3 are in the SWORD API and one is in
osis2mod. These changes are being worked on, but my guess is that they
will be completed after the Spring semester, which is soon.
If these are not changed in the code, then I can use XSLT to transform
the master document into one that works around these problems.
> Really, I don't think the connective words like "And the" should be
> included as part of the Strong's number. They should be outside.
The approach has been fairly simple. The KJV uses italics in the printed
copies to indicate what was added to the Greek or Hebrew. These are
marked with <transChange>And the</transChange>. The remaining words are
understood to be translated from the Greek and Hebrew, so they should be
surrounded with Strong's numbers.
There are some empty Strong's numbers, because not everything in the
original needed to be translated.
As Troy noted, the OT was tagged programmatically: the Strong's number
that fell at a given point in the text was made to surround everything
since the previous number that was not italic.
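To make that tagging pass concrete, here is a rough sketch in Python. It
is not the tool that was actually used. The token format, the function
names, and the choice to emit an empty <w/> when a number has no words
of its own are all assumptions made for illustration.

```python
from itertools import groupby
from xml.sax.saxutils import escape


def render_segment(items, number):
    """Render the tokens collected since the previous Strong's number.

    items  -- list of ("word", text) or ("italic", text) tuples (hypothetical format)
    number -- the Strong's number that closes the segment, or None for trailing text
    """
    parts = []
    for kind, group in groupby(items, key=lambda token: token[0]):
        text = escape(" ".join(value for _, value in group))
        if kind == "italic":
            # Italic text was added by the translators, so it stays outside <w>.
            parts.append(f"<transChange>{text}</transChange>")
        elif number is None:
            # Trailing words with no following number are left untagged.
            parts.append(text)
        else:
            parts.append(f'<w lemma="strong:{number}">{text}</w>')
    # Assumption: a number with no non-italic words becomes an empty element.
    if number is not None and not any(kind == "word" for kind, _ in items):
        parts.append(f'<w lemma="strong:{number}"/>')
    return parts


def tag_verse(tokens):
    """Wrap every non-italic run since the previous number with the next number."""
    parts, segment = [], []
    for kind, value in tokens:
        if kind == "strong":
            parts.extend(render_segment(segment, value))
            segment = []
        else:
            segment.append((kind, value))
    parts.extend(render_segment(segment, None))
    return " ".join(parts)


tokens = [
    ("italic", "word1"), ("word", "word2"), ("word", "word3"), ("strong", "1"),
    ("word", "word4"), ("strong", "2"), ("strong", "3"), ("word", "word5"),
]
print(tag_verse(tokens))
```

A pass like this cannot tell which of the enclosed words a number really
belongs to, which is why connective words end up inside the tags.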
The NT was not programmatically tagged but done by people using a
software tool. The result in the NT is that, verse by verse, every
Strong's number in the TR is present in the KJV NT. The empty ones are
not necessarily at a good location.
All this to say, fixing the tagging is a manual, analytical exercise. It
should be done, but is outside the scope of this effort.
> I've noticed some verses have the first word not surrounded by
> appropriate tags giving the strongs number, yet other words are labelled
> as "NIH" which is very convenient. Could we have all words put inside
> tags like that for easier parsing?
At this time, the established goals of this effort have been reached. If
there are specific, identified mistakes we can fix those, up until the
release. However, there are other things that can be done that have not
been done. Any other changes will be for a following effort. If there
is a clear algorithm that can be applied, one that does not swap one set
of problems for another, I think it would make sense to make that change.
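As a starting point for identifying specific cases, a small script could
list the text in a verse that sits outside any tag. The sketch below is
only an illustration: it assumes the verse content is available as a
well-formed fragment without namespaces, and the function name
untagged_text is made up here.

```python
import xml.etree.ElementTree as ET


def untagged_text(verse_xml):
    """Return the pieces of text that are not inside any child element."""
    # Wrap the fragment in a dummy root so it parses as a single document.
    root = ET.fromstring(f"<verse>{verse_xml}</verse>")
    loose = []
    # Text before the first child element.
    if root.text and root.text.strip():
        loose.append(root.text.strip())
    # Text that follows each child element (its "tail").
    for child in root:
        if child.tail and child.tail.strip():
            loose.append(child.tail.strip())
    return loose


sample = (
    'word0 <w lemma="strong:1">word1</w> '
    '<transChange>word2</transChange> word3 '
    '<w lemma="strong:2">word4</w>'
)
print(untagged_text(sample))
```

Punctuation between tags would be reported too, so the output would
still need a manual review.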
I will be checking the final beta into SVN and then it will be open to
fixes such as these. When enough have accumulated then we can
re-release. (At least that is my thought.)