[sword-devel] KJV2006 progress report
dmsmith555 at yahoo.com
Wed Mar 8 06:16:35 MST 2006
Troy A. Griffitts wrote:
> Hey guys,
> Excited about all the work that is going on. I long for the day
> when I have more time to participate in the progress. Just a few
> comments which I hope will be helpful...
>>> 2) replace <p/> (not allowed under OSIS) with <pb/>
> Yes, <p/> is meant to be a paragraph marker, as is <milestone
> type="x-p">. I don't think <pb/> matches appropriately.
Yes, I meant <lb/>. I'll check to see if it means the same as <milestone
type="x-p"/> since that is where the ¶ appears in the text.
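
A minimal sketch of the substitution under discussion, assuming the OSIS
source is handled as plain text and the empty paragraph element appears
only as <p/> or <p />. The function name replace_empty_p and the default
replacement are illustrative assumptions; which marker is correct is
exactly the open question in this thread.

```python
import re

# Matches an empty paragraph element such as <p/> or <p />.
# It does not match <pb/>, <p>, or a <p .../> that carries attributes.
EMPTY_P = re.compile(r"<p\s*/>")


def replace_empty_p(osis_text, replacement='<milestone type="x-p"/>'):
    """Replace every empty <p/> and return (new_text, number_replaced)."""
    return EMPTY_P.subn(replacement, osis_text)
```

Returning the count alongside the text makes it easy to review how many
markers a run actually touched.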
> Chris and I disagree on this one a bit. I disagree that every chapter
> in the KJV should be considered a new paragraph. There are serious
> interpretation errors implied with chapter/verse markers, as I think
> we all agree. Placing a <p> at the start of each chapter implies the
> translators of the text truly agree that the paragraph does begin at
> the chapter start. He may be correct about the KJV printed Bible.
> Maybe they do imply with whitespace that they think a paragraph marker
> begins at each chapter. Maybe not. Maybe their use of the pilcrow,
> ¶, paragraph symbol ma-thingy is their 'markup' for showing where they
> believe a new paragraph begins. I just don't know, but I would be
> hesitant to imply SOMEONE thinks each chapter starts a new paragraph.
> I believe this discussion first came up when looking at markup
> received from Lockman for the NASB. In my conversion, there was NO
> WAY to logically deduce where paragraph markers began. Sometimes they
> were implied by start of chapter. Sometimes they were very much NOT
> implied by start of chapter (e.g. Rev 13:1, which has a paragraph
> break midway through verse 1). I wanted to stay true to the text, so
> I use paragraph milestones. I would rather stay true to the author's
> intent for the Biblical Text than have well-formed compliant markup.
> I could try to make an educated decision, but it is not my place to usurp
> authority to make decisions regarding their text-- especially when it
> might carry the weight of the Lockman translation committee if given
> to someone else. These are the equivalent of modern day scribal errors
> done with the same honest motives as ancient scribes.
> Sorry for the long commentary on my markup ethics.
I think making this change is kind of like capitalizing personal
pronouns for God ;) It is not part of the original. The original KJV and
current printed copies display a ¶ symbol at these locations. It seems
that it is a translation of a Hebrew scripture tradition. To wrap up the
text with <p>...</p> differs from the original. The original KJV did not
have paragraphs beyond these marks.
>>> 8) deleted all <resp> elements as this has never been part of the
>>> OSIS standard. resp is a global attribute. I could merge it with the
>>> preceding <note type="x-strongsMarkup">....</note> However, I think
>>> these "notes" should be removed as well.
> I agree whole-heartedly with Chris on this one. Please preserve all
> data in the KJV2003 text. We're still hoping to do a proof pass of
> the strongs markup, and the notes still need to be reviewed on many of
> these entries. Anyone is welcome to strip them out if they don't need
> them for their purposes.
I'll change the <resp> elements to <milestone type="x-resp" resp="..."/>
if that works for you. Or something else that is ignored by the
> It is important not to remove any <w> data. We can currently assert
> that for every word in the original base greek text (in the KJV2003
> directory on the server), there is exactly one greek word tag in the
> corresponding verse of the KJV text. It may not be in the correct
> place, surrounding the correct words, but there is still guaranteed a
> 1 to 1 relationship.
I understand your goal and I agree with it. I have checked in the
KJV2003 as it would have been submitted to osis2mod. I am making all
changes by program so that it is a repeatable process from the "original
KJV2003". Each change can be reviewed for correctness. Today, there are
81 of these.
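
One way such a repeatable process could be organized, as a rough sketch
only: each change is a named function from text to text, applied in a
fixed order to the untouched original. The name apply_steps and the
(name, function) step format are assumptions for illustration, not a
description of the actual conversion program.

```python
def apply_steps(original, steps):
    """Apply named text transforms in order.

    `steps` is a list of (name, function) pairs, each function taking the
    full text and returning the changed text. Returns the final text and a
    log of (name, changed) pairs so every step can be reviewed on its own.
    """
    text = original
    log = []
    for name, transform in steps:
        updated = transform(text)
        log.append((name, updated != text))
        text = updated
    return text, log
```

Because the input is always the original file, rerunning the whole list
after fixing one step reproduces every other change unchanged.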
However, there are verses where the taggers' notes expressed frustration
with the tagging tool, and the OSIS in these verses is just terrible:
sometimes words are split into letters and parts of words are tagged with
<w>. In some places a <w src="n"> is repeated several times without
the use of splitID.
I think what you really want is that, for a verse that today has
src="1" through src="x", a <w> tag be preserved for each of these. And
if there are multiple <w src="y">, none having a splitID, with one
surrounding text and the others not, then the ones not surrounding text
are in error.
If there are multiple <w src="s"></w> not having a splitID and none
having text, then only one needs to be preserved.
I have seen a situation where we have in a verse <w src="x"
lemma="a">...</w> and <w src="x" lemma="b">...</w>. That is, the same
word "src" is repeated, no splitID is present and two different strongs
numbers are used. I can't do anything with these programmatically, but I
would guess that this is an error.
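
The rules above can be sketched roughly as follows. This assumes each
verse is available as a small well-formed XML fragment without
namespaces and that src holds a single value. The function name
classify_duplicate_words is hypothetical, and nothing is deleted here:
the sketch only reports candidates.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def classify_duplicate_words(verse_xml):
    """Return (removable, needs_review) lists of src values for one verse.

    removable    - one entry per empty duplicate <w> that could be dropped
    needs_review - src values repeated with different lemma values
    """
    verse = ET.fromstring(verse_xml)
    by_src = defaultdict(list)
    for w in verse.iter("w"):
        # A splitID marks a deliberate split, so those elements are skipped.
        if w.get("src") is not None and w.get("splitID") is None:
            by_src[w.get("src")].append(w)

    removable, needs_review = [], []
    for src, words in by_src.items():
        if len(words) < 2:
            continue
        if len({w.get("lemma") for w in words}) > 1:
            # Same src, different Strong's numbers: left for a human.
            needs_review.append(src)
            continue
        empty = [w for w in words if not "".join(w.itertext()).strip()]
        if len(empty) < len(words):
            # At least one copy surrounds text, so every empty copy is suspect.
            removable.extend(src for _ in empty)
        else:
            # No copy has text: keep one, report the rest.
            removable.extend(src for _ in empty[1:])
    return removable, needs_review
```

Duplicates that all surround text and share a lemma are not covered by
the rules above, so the sketch leaves them alone.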
> The source of our base Greek text (Maurice Robinson's stuff) has been
> updated with many corrections since we started the project, and we
> should be able to programmatically determine the delta between our
> version and the latest, and create a 'patch' for the KJV2003 text-- or
> at least a hit list of verses people need to review and adjust tags.
Since I have tagged the KJV2003 in SVN this should be possible. I think
that it may be possible with KJV2006 when I am done, though XSLT may
need to be applied first.
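
A hit list of that kind could be produced along these lines, assuming
both versions of the base text have already been loaded into
dictionaries keyed by verse reference. The name verses_to_review and the
dictionary layout are assumptions for illustration only.

```python
def verses_to_review(old, new):
    """List verse references whose text differs between two versions.

    `old` and `new` map a verse reference to that verse's text. A verse
    present in only one version is also reported.
    """
    refs = list(old) + [ref for ref in new if ref not in old]
    return [ref for ref in refs if old.get(ref) != new.get(ref)]
```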
> Again, thank you so much for all the work! I'm really excited this
> text is moving forward. There is so much valuable data captured and
> making it more usable, programmatically, should enhance a number of
> Bible projects who depend on a good free English text sync'd to the
You are welcome! My pleasure!