Talk:Modules in the beta repository

From CrossWire Bible Society

Revision as of 19:20, 17 June 2011 by David Haslam (Talk | contribs)

I thought I'd move discussion about the implementation of modules here. It was cluttering the other page and when we'd get a new version or address a problem, we'd reset the row. Here we can keep the info until we are done.--Dmsmith 05:35, 22 June 2008 (MDT)

Talk pages are intended for discussing how to improve the wiki page, not as a forum for discussing the topic it covers. David Haslam 21:56, 21 February 2009 (UTC)
It's probably best if we keep non-talk pages largely free of discussion and try to keep them clean so that they can be good reference pages for whenever we need to look details up. So all discussion should probably be kept to either talk pages or mailing lists--and preferably the latter except in the case of collaborative whiteboarding like we have here.... That is to say, this is still not Wikipedia, Wikis were not invented by Jimbo & Co., and there's no good reason for which we should limit ourselves to Wikipedia style. --Osk 03:19, 23 February 2009 (UTC)

ABU

In Matt 5 there is a WoC display problem. The WoC starts in verse 3 and ends at the end of the last chapter. Fortunately, the WoC start and finish in this module are on chapter boundaries. If it had started in chapter 5 and finished in chapter 7 then the display of chapter 6 would never highlight the WoC.

The SWORD Engine currently terminates WoC at a verse boundary, regardless of how it is encoded. This is because it does not carry WoC state from one verse to the next. No frontend will display it correctly, because it is not a frontend problem.

I (DM) see several solutions (there may be others):

The easiest is to change the module. If not that, I'd suggest changing osis2mod, which probably is the best solution, resulting in easier module definition. I don't like the SWORD Engine change, because it is incomplete.



There have been 3 versions:

The version 1.1 had for Matt 5:3-4:
(Note: my comment on this was that the sIDs and the eIDs were not properly encoded.)

5.3:
<q marker="" sID="q1" who="Jesus"/><br/>
   Happy the poor in spirit;
   for theirs is the kingdom of heaven.
<q eID="q1++" marker="" who="Jesus"/>
5.4:
<q marker="" sID="q1" who="Jesus"/>
   Happy they that mourn;
   for they shall be comforted.
<q eID="q1++" marker="" who="Jesus"/>

Version 1.2 has:

5.3:
<q marker="" who="Jesus" sID="q7"/>
   Happy the poor in spirit;
   for theirs is the kingdom of heaven.
5.4:
   Happy they that mourn;
   for they shall be comforted.

Version 1.3 has:

<q marker="" who="Jesus">
   Happy the poor in spirit;
   for theirs is the kingdom of heaven.
5.4:
   Happy they that mourn;
   for they shall be comforted.

Of the above, 1.1 is, in my opinion, the best. It can work in the search result.

The following variant of 1.1 will work for all frontends and given how simple the ABU is, it should produce well-formed valid XML. The way to think about this is that <q marker="" who="Jesus"> is not a quotation marker but is a WoC marker as in <woc>...</woc> that has to be started and stopped in each verse and surround each word/phrase that Jesus uttered.

5.3:
<q marker=""  who="Jesus">
   Happy the poor in spirit;
   for theirs is the kingdom of heaven.
</q>
5.4:
<q marker=""  who="Jesus">
   Happy they that mourn;
   for they shall be comforted.
</q>

Dmsmith 19:14, 21 June 2008 (MDT)
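
The point about per-verse containers can be illustrated with a small check. This is only a sketch, not anything from SWORD or osis2mod: the helper name `well_formed_in_isolation` is made up for illustration, and wrapping the content in a `<verse>` element is an assumption about how a frontend might see a single verse. It shows that a verse from version 1.3 (container opened but not closed in the same verse) is not well-formed on its own, while the per-verse variant is.

```python
import xml.etree.ElementTree as ET


def well_formed_in_isolation(verse_content: str) -> bool:
    """Return True if the content of one verse parses as XML on its own.

    Hypothetical helper: the <verse> wrapper stands in for whatever
    unit a frontend renders when it shows a single verse.
    """
    try:
        ET.fromstring(f"<verse>{verse_content}</verse>")
        return True
    except ET.ParseError:
        return False


# Matt 5:3 as in version 1.3: the container is opened here but closed much later.
v13_matt_5_3 = (
    '<q marker="" who="Jesus">'
    "Happy the poor in spirit; for theirs is the kingdom of heaven."
)

# Matt 5:3 in the per-verse variant of 1.1: opened and closed in the same verse.
variant_matt_5_3 = v13_matt_5_3 + "</q>"

print(well_formed_in_isolation(v13_matt_5_3))
print(well_formed_in_isolation(variant_matt_5_3))
```

The same check passes for the milestoned form, since an empty `<q sID="..."/>` element is well-formed by itself; the difference there is that the renderer then has to track the open quote across verses.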


There's some confusion here.

Osk 22:35, 21 June 2008 (MDT)


OK, I've got my numbers off. The version that had q++, I thought was 1.2. I guess I never saw 1.2. I've corrected my statement above to your information.

Regarding your comment about BCV, the <div> element is milestoneable.

If BSP is the proper way to encode an OSIS Bible, then I think:

  1. <verse> should always be milestoned.
  2. osis2mod should preserve the verse element (start and end) in the text and get rid of the pre-verse hack. With BSP, this will occur more and more.

Whether we encode OSIS Bible texts as BCV or BSP, the resulting module needs to work for Bible applications. In the SWORD engine, the verse is the indexable unit. All of our applications display verses in isolation, at least in the search result, and some elsewhere.

I think the following is the best short term solution (which is a minor variation of 1.1):

  1. If quotation marks are to be displayed in the module, mark the beginning of the quote in chapter 5 with <q sID="xxx"/>, and the end of the quote in Matt 7 with <q eID="yyy"/>. Add marker attribute to be UTF-8 curly quotes if desired. Also, if the quote is interrupted, such that quotation marks should appear in the span of the Sermon on the Mount, then put the same there.

    If quotations are not needed then these are entirely unnecessary for our code as it stands today, but they might come in handy if we had each quote in the scripture marked with who as we could analyze the text for who said what.
  2. Within each verse surround the actual words of Christ with <woc>...</woc>. Obviously, if these cross a BSP boundary, then they stop and start on either side of the boundary. Finally, to make valid OSIS replace those with <q who="Jesus" marker=""> and </q> respectively. The milestoned version (i.e. your 1.1) should have worked for all SWORD apps as of 1.5.9. But it didn't.
    It does not work for JSword because it uses XSLT to do the processing, which cannot handle it.
  3. If the module should not show quote marks, use OSISQToTick=false (from memory, so I may have goofed this). This makes the empty marker="" unnecessary.

Ultimately, it is the responsibility of osis2mod to placate the SWORD Engine by transforming modules to what it wants to hear. I think the best long term solution is for osis2mod to handle all properly encoded documents, such as 1.2 and 1.3. (Version 1.1 was a placation.) Obviously, if one can tediously encode 1.1, that processing can be put into osis2mod.

Dmsmith 05:35, 22 June 2008 (MDT)

---

One of the longstanding principles of our employment of OSIS has been that we should accept any valid OSIS, but that we need not maintain valid OSIS in our data. So osis2mod should accept anything that is valid, but the contents of the modules themselves need not be valid OSIS. (I'm more concerned with actual markup here. The cases where people actually want to use UTF-16 encoded OSIS or single quotes instead of double in attribute values aren't significant enough for me to care.)

We should probably define a non-standard method of encoding <q> that fits within <verse> elements but that can be easily derived (by osis2mod) from either standard, valid encoding. I suspect we should just use <q/> (though <milestone/> is another possibility). Given the following input:

<verse>
  cdata
  <q osisID="q1" sID="q1" who="Jesus" marker=""/>
    cdata
</verse>
<verse>
    cdata
  <q osisID="q1" eID="q1" who="Jesus" marker=""/>
  cdata
</verse>

We could generate:

<verse>
  cdata
  <q osisID="q1" sID="q1" who="Jesus" marker=""/>
    cdata
  <q type="x-continuation" eID="" who="Jesus" marker=""/>
</verse>
<verse>
  <q type="x-continuation" sID="" who="Jesus" marker=""/>
    cdata
  <q osisID="q1" eID="q1" who="Jesus" marker=""/>
  cdata
</verse>

And given the following input:

<verse sID="v1"/>
  cdata
  <q osisID="q1" who="Jesus" marker="">
    cdata
<verse eID="v1"/>
<verse sID="v2"/>
    cdata
  </q>
  cdata
<verse eID="v2"/>

We could generate:

<verse sID="v1"/>
  cdata
  <q osisID="q1" who="Jesus" marker="">
    cdata
  <q type="x-continuation" eID="" who="Jesus" marker=""/>
<verse eID="v1"/>
<verse sID="v2"/>
  <q type="x-continuation" sID="" who="Jesus" marker=""/>
    cdata
  </q>
  cdata
<verse eID="v2"/>

I believe these will validate as OSIS and would work in BibleCS without modification. I suspect they will work in HTMLHREF frontends with little (if any) modification, but I haven't looked at the code lately. It seems to me that XSLT ought to be able to convert milestone elements to container element starts/ends for JSword, but I haven't touched XSLT in about 6 years.

Osk 20:49, 26 June 2008 (MDT)
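
The first transformation proposed above (milestoned `<q>` split at verse boundaries) can be sketched as follows. This is a rough illustration under stated assumptions, not osis2mod code: the function name `add_continuations` is hypothetical, verses are passed in as plain strings holding each verse's content, only one quote is assumed to be open at a time, and the continuation markers hard-code who="Jesus" exactly as in the example rather than copying attributes from the opening milestone.

```python
import re

# Any empty <q .../> element.
MILESTONE = re.compile(r'<q\b[^>]*/>')
# Start/end milestones carry a non-empty sID/eID. The \b keeps "osisID"
# from being mistaken for "sID".
Q_START = re.compile(r'<q\b[^>]*\bsID="[^"]+"[^>]*/>')
Q_END = re.compile(r'<q\b[^>]*\beID="[^"]+"[^>]*/>')

# Continuation markers as written in the proposal (attributes hard-coded).
CONT_START = '<q type="x-continuation" sID="" who="Jesus" marker=""/>'
CONT_END = '<q type="x-continuation" eID="" who="Jesus" marker=""/>'


def add_continuations(verses):
    """Insert continuation milestones wherever a quote spans a verse boundary.

    verses: list of strings, each the content of one verse, in order.
    Returns a new list of the same length.
    """
    out = []
    open_quote = False
    for content in verses:
        starts_open = open_quote
        # Track whether a quote is still open at the end of this verse.
        for tag in MILESTONE.findall(content):
            if Q_START.fullmatch(tag):
                open_quote = True
            elif Q_END.fullmatch(tag):
                open_quote = False
        if starts_open:
            content = CONT_START + content
        if open_quote:
            content = content + CONT_END
        out.append(content)
    return out
```

Fed the two verses from the first example, the first verse gains a trailing continuation end marker and the second a leading continuation start marker. A verse lying wholly inside the quote gets both.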


I think we agree that osis2mod is the proper place to solve the problem and I think we agree on what osis2mod should accept. While I am not overly concerned whether the resulting OSIS is valid, I'd caveat that by saying that each element and attribute should be valid as defined in OSIS.

In my opinion, osis2mod should mark up the text in such a way that we know what is created by osis2mod. That way we can reasonably reconstruct the original. We can use x- attribute values to type and subType for this purpose. In this example, the start and end of the Sermon on the Mount could be marked as original, with x-start and x-end.

Currently the milestoned version of the WoC is not handled by MS, GS or BD. That it is handled by SW indicates that the problem may be in MS or GS not using the HTMLHREF filter or that there is a minor difference between it and what SW actually uses. BD is altogether a different issue. The point is that a filter change is a 1.5.12 change.

We have a couple of choices in changing osis2mod (as discussed above):

  1. Change it to produce <q> as a container. This is how ESV and KJV are encoded and it works with 1.5.9 and with JSword/BD.
  2. Change it to produce <q> as a milestone and also fix HTMLHREF to accept it and not release the modules until 1.5.12 is released. I might be able to figure out how to get JSword/BD to handle it.

In both of these the location of the start and end markers are the same. The difference is the form.

Regarding XSLT, the guarantee of XSLT is that all output is well-formed. I have not found a way to output a begin element without also outputting an end element. The processing that is necessary is to collect the content and elements between two arbitrary points (represented by milestones) and style it. I simply don't know how to do it or whether it can be done, let alone whether it can perform well.

I may be able to pre-filter and do a transformation.
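
One possible shape for such a pre-filter, sketched in Python rather than in JSword's Java: once every verse carries balanced start and end milestones, a textual pass can turn each pair into a container before the XSLT runs. The function name `milestones_to_containers` is hypothetical, and the sketch assumes that each start milestone in a verse has its matching end milestone in the same verse and that the pairs are properly nested.

```python
import re

# Start milestone: capture the attributes on either side of sID so they can
# be kept on the container. The \b keeps "osisID" from matching as "sID".
START = re.compile(r'<q\b([^>]*?)\s*\bsID="[^"]*"([^>]*?)\s*/>')
# End milestone: attributes are dropped, since an end tag carries none.
END = re.compile(r'<q\b[^>]*\beID="[^"]*"[^>]*/>')


def milestones_to_containers(verse_content: str) -> str:
    """Rewrite <q sID=.../> ... <q eID=.../> as <q ...> ... </q>.

    Assumes the milestones are balanced and nested within this one verse.
    """
    def open_tag(match):
        # Join the remaining attributes with single spaces.
        attrs = " ".join((match.group(1) + match.group(2)).split())
        return f"<q {attrs}>" if attrs else "<q>"

    text = START.sub(open_tag, verse_content)
    return END.sub("</q>", text)


print(milestones_to_containers(
    '<q marker="" sID="q1" who="Jesus"/>Happy the poor in spirit;'
    '<q eID="q1" marker="" who="Jesus"/>'
))
```

This does not answer the XSLT question itself; it only moves the work to a step where unbalanced intermediate output is not a concern.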

On a different note: Regarding the inclusion of <verse> and the pre-verse handling:

Right now the pre-verse handling is solely for the sake of headings. These are pulled out of the stream and stored as verse heading attributes. It cannot handle other kinds of pre-verse content, such as white-space (e.g. new lines) and notes.

With regard to whitespace before a verse, generally it is appended to the prior verse. If it occurs after the heading, osis2mod yanks it and treats it as if it came before the heading.

With regard to notes it is perfectly valid OSIS to have a note attached to a heading by immediately following it. SWORD cannot handle it. One way to handle it is for osis2mod to move the note into the heading. There used to be a problem with having notes in a heading, but it may be fixed now.

--Dmsmith 08:46, 27 June 2008 (MDT)


I don't really care about whether we can release ABU (and KJC or any other modules that use <q/> in the current lineup) for 1.5.11. If there are quirks or incompatibilities, we should figure out a way to fix them, implement them as necessary (in the module, in the importer, and in the library), and then release ABU, etc. I think there are ample grounds for doing all of the minor updates to filters (TEI stuff, OSISRuby, whatever is necessary for words of Christ, etc.) and releasing 1.5.12 with just that much improvement. Osk 18:32, 30 June 2008 (MDT)


Works for me. I'd be happy to work on osis2mod. If you get to it before me, that's OK too. I think we should wait a bit to see what we decide on how to handle the Wycliffe modules. Specifically, all the pre-verse stuff; whether we should change container elements to milestones; and whether we leave in the verse elements (converted to milestones, too). --Dmsmith 20:37, 30 June 2008 (MDT)


The printing of the words of Christ in red was developed around 1900. No Bible (including the 1769 KJV) had Christ's words marked in red prior to then.

Because the KJV 1769 is the edition that WoC red letter was first applied to, and the marks are now almost universally accepted as the intention of the translators in 1611, it's hard to argue against WOC being marked up in the KJV. However, applying circa 1900 WOC mark ups to earlier editions that don't have the words of Christ marked up at all clearly is, in my opinion, revising the edition. Editions so processed should be released as contemporary updates to those editions, not the original. At least the fact that WOC is a later update should be mentioned in the conf file.

Why? In the mid to late 1800's there was discussion on exactly which words in the Bible were Christ's. Some editions did go to great lengths to quotify or boldify the words of Christ, but not all of them agreed with the red words marked in the KJV. See Ferrar Fenton's "Modern English Bible" John Chapter 3, where Fenton goes so far as to put a section heading IN text stating that the words of Christ STOP at John 3:15. John 3:16-21 is marked 'commentary by the Evangelist'.

http://web.archive.org/web/20031001145619/http://www.ferrarfenton.com/pdf/john.pdf

Other editions (such as the ABS 1911) intentionally left the interpretation of scripture to the reader. While I don't have the 1865 ABU edition at hand, the later ABU and 1911 ABS editions based on this edition have no quotation marks or any indication exactly where the words of Christ start or stop.

To correctly represent the translation, keeping the same mark up (OR LACK OF) is the duty of the one doing the digitization.

When I turn OFF the WOC for pre-1900 editions, I'd expect quote marks that didn't exist in the original editions to disappear. Also, if quote marks did exist in the 1865 edition, I'd expect them to remain whether or not the text is painted red. (That is, the translator's intended meaning clearly depends on the quote marks.) Changing the color should be able to toggle, but adding/removing quote marks makes the edition useless for study of exactly what the translator intended. Mikey 22:38, 18 January 2009 (UTC)

Breton (module name) ?

The Breton NT is to be known as Version Koad 21. Should the Sword module name reflect this in some way? David Haslam 08:39, 23 July 2008 (MDT)

Who at CrossWire is lobbying for permissions for this translation? David Haslam 13:34, 11 January 2009 (UTC)

Japanese Bibles

"Taisho" and "Bungo" are the same Bible. I have made the same observation in the JapBungo page. David Haslam 10:44, 15 November 2008 (UTC)

WEB

BibleTime 1.6 crashes on Mark 4. I've scanned through most of the other gospels in BibleTime and they are fine, but when I try to enter Mark 4 at any verse (BT always displays the whole chapter) it crashes. I've got the latest WEB beta (2.1) but not the latest BT (the BT supplied through PCLinuxOS is 1.6 on KDE 3.5.9).

There is something unique in Mark 4 in WEB 2.1 that BT doesn't like. Mikey 02:55, 17 January 2009 (UTC)

I could use some additional information (such as what specifically is causing the problem). I'll take a look at the source for this section and update this note accordingly if I see anything unusual, but unless there are reports of problems in other frontends, I would assume this is a BT bug. --Osk 18:28, 18 January 2009 (UTC)
Modules WEB & HNV must have been removed from CrossWire Beta. When? David Haslam 17:35, 9 December 2010 (UTC)

Abbreviations now that GnomeSword is renamed as Xiphos?

Any thoughts on where we go regarding the GS abbreviation, following the release of Xiphos (formerly known as GnomeSword)? David Haslam 11:29, 14 February 2009 (UTC)

No-one else suggested one, so I have plumped for XP as the abbreviation. David Haslam 21:53, 21 February 2009 (UTC)
I don't think that continuing to use GS will confuse anyone. Using XP will confuse most people most of the time. Likewise for X. --Osk 03:03, 23 February 2009 (UTC)
I changed all instances (there were 3) of "XP" to "Xi". "Xi" is the abbreviation Karl has introduced for the feature matrix, so we can take it as semi-official (FWIW). --Osk 02:04, 11 March 2009 (UTC)
I fully concur. David Haslam 19:34, 11 March 2009 (UTC)

Table column width

The table in the Beta Dictionary Modules section now has a very wide column for Encoding problems, as a result of a recent edit. Most users find horizontal scrolling in wiki pages a PITB. Please come up with an improvement to how the new information is presented. David Haslam 11:08, 27 March 2009 (UTC)

This annoyance has not yet been addressed. David Haslam 09:23, 19 September 2009 (UTC)

The problem was the <pre> block (which I changed to use &lt; and a wiki-style leading space). These will not wrap. Probably the best thing to do would be to put the quote in a <ref> so that it would go below the table. Instead, I reformatted the blocks to be narrower. It doesn't scroll on my wide laptop screen. --Dmsmith 22:37, 1 November 2009 (UTC)

Solved indeed! Thanks. David Haslam 15:18, 2 November 2009 (UTC)

BPBible

I'm guessing this is because you saved the search results. If you have problems like these, probably the BPBible forums are the best place to ask about it (http://bpbible.com/forums) Benpmorgan 23:58, 16 June 2009 (UTC)
Yes - I had saved the search results while using the Rotherham. Initially there was no mental connection for me to the phrase "Topic Tags" in the menu options. It took me some time to figure what was happening. David Haslam 12:29, 17 June 2009 (UTC)

"Requires 1.5.12" - should this be changed to 1.6 ?

The section on Japanese Bibles (that have Ruby annotation) states that they require 1.5.12 – Should this now be changed to SWORD 1.6 ? David Haslam 12:53, 8 July 2009 (UTC)

Status of modules in the Xiphos repository?

On August 9, 2009, the PorBiblia module became available for download from the Xiphos repository. In general, should new modules in the Xiphos repository be classed as beta or as released? David Haslam 16:41, 13 August 2009 (UTC)

Henry Tomkins Anderson's New Testament

The Anderson NT scanned text in the Internet Archive has multiple OCR errors. These require checking and correcting against a printed book, either a library copy of the original 1865 or subsequent edition, or a modern reprint.

The British Library has one copy of the 1867 edition.

A search of http://www.worldcat.org/ found that the Cincinnati, Standard Pub. Co. [©1918] edition is held in several academic libraries in the USA. The New Testament, tr. from the Sinaitic manuscript discovered by Constantine Tischendorf at Mt. Sinai. OCLC Number: 3755626.

A search of http://www.abebooks.co.uk/ found that the Anderson NT 1865 is available as a print-on-demand paperback. ISBN: 1418188247 / 1-4181-8824-7. Scholarly Publishing Office, University of Michigan Library, 2006.

David Haslam 20:06, 15 August 2009 (UTC)

I have just ordered a copy of the 2006 reprint from a reputable UK bookseller (in stock) via Amazon. David Haslam 20:27, 15 August 2009 (UTC)
The online version of Anderson at http://lookhigher.net/ seems to be a better quality digitized source. The OCR errors in the existing beta module were not observed in this. It should be simple enough to capture this from the website. David Haslam 20:36, 15 August 2009 (UTC)
It is based on the 1918 edition. David Haslam 20:41, 15 August 2009 (UTC)
The book was delivered during August. It's a facsimile edition. Text area is 69mm x 111mm, centred on pages 135mm x 234mm. David Haslam 09:22, 19 September 2009 (UTC)

The most notable and idiosyncratic OCR (or spell-check) error in the CrossWire beta module is surely this one in Acts 16:37, "But Paul said to them: Having publicly scourged us uncondemned, us who are Etonians, they threw us into prison: and do they now put us out secretly? No, verily: but let them come and lead us out."
Hmmmm.... I never knew Saint Paul went to Eton College ! David Haslam 14:49, 17 August 2009 (UTC)

There is a digital facsimile PDF download for the Anderson NT at [1] (23.7 MB). David Haslam 21:15, 21 October 2009 (UTC)

I may have confused Anderson's 1865 translation from the Greek text with a much later Anderson translation (1918) based on the Sinaitic MS. These are two different translations, being translations of different Greek texts. David Haslam 12:21, 25 May 2011 (MDT)
See http://www.bible-researcher.com/anderson1.html for historical background. David Haslam 12:27, 25 May 2011 (MDT)

Different abbreviation used here for BibleCS than in Choosing a SWORD program

An annoying inconsistency that tripped me up during the last few edits! Choosing a SWORD program uses SPW as the abbreviation for The SWORD Project for Windows. This page uses BibleCS. David Haslam 14:37, 29 October 2009 (UTC)

Wycliffe / BL modules

Does anyone know which (if any) of the modules that came from Wycliffe contain any characters in the Supplementary Private Use Areas? David Haslam 10:57, 25 November 2009 (UTC)

Modules not now in CrossWire Beta

By using the Bible Desktop or Xiphos UIs, one can easily view the modules currently in CrossWire Beta. Some of those that were listed in the main page must have been removed. I have started this section in order to cut'n'paste the table rows for those modules no longer visible in CrossWire Beta. David Haslam 16:22, 9 December 2010 (UTC)

|-valign="top" <!-- Name | Scope | Content Problems | Conf Problems | Display Problems | Ready -->
|TNT
|
|
|
|
|
|hold for 1.6.1 (OSISVariants)
|-valign="top" <!-- Name | Scope | Content Problems | Display Problems | Ready -->
|WEB
|
|
|WoC tends to go on and on; quote beginning in Matt.3.15 never ends, resulting in the rest of the NT being red (BibleCS, Xi) or with intermittent black (BD)--[[User:Osk|Osk]]
----
Booktitles are wrong (Mark called Luke, Luke John etc)[[User:Refdoc|refdoc]]:[[User_Talk:Refdoc|talk]]
|
|crashes bibletime when trying to view Mark 4 (HNV doesn't have same issue) on Bibletime 1.6
----
Software crashes are not typically the fault of the module, and in the absence of a more specific report, there's not much to go on here. BibleCS and BD have no problem with Mark.4. --[[User:Osk|Osk]] 07:13, 11 January 2009 (UTC)GS is fine [[User:Refdoc|refdoc]]:[[User_Talk:Refdoc|talk]] 01:50, 19 January 2009 (UTC)
|
|-valign="top" <!-- Name | Scope | Content Problems | Conf Problems | Display Problems | Ready -->
|[[Breton]]
|
|Full New Testament<br>http://pagesperso-orange.fr/testamant.nevez/

|
|Please identify the translation as Version '''Koad 21'''
|
|Re-import, including headings.<ref>Permission received (2009-12-15) from Luc Bernicot, chairman of Société Biblique d'Anjou. Updated text edition received 2010-05-14.</ref>
|-valign="top" <!-- Name | Scope | Content Problems | Conf Problems | Display Problems | Ready -->
|[[HunKar]]
|
|
|
|
|
|
|-valign="top" <!-- Name | Scope | Content Problems | Display Problems | Ready -->
|HNV
|
|
|
|
|
|

Tables sorted and dates added

I have also sorted each Bibles table in module name order, and added the SwordVersionDate to modules for which this was missing. David Haslam 17:08, 9 December 2010 (UTC)

Experimental modules

Strictly speaking, CrossWire experimental modules are not in the beta repository. I have left these in this page for the time being, rather than moving them to a new wiki page. However, I have restructured the page to make this more apparent. David Haslam 19:40, 9 December 2010 (UTC)

Nevertheless, this section is now totally out of date, as there are far more modules in CrossWire Experimental than listed here. A separate page is therefore warranted. David Haslam 19:52, 9 December 2010 (UTC)
Added new page Modules in the experimental repository. Section removed from this page. David Haslam 17:36, 12 December 2010 (UTC)

The number of beta modules on the main page

Providing every module name is a link (wiki or external), the number of beta modules listed in the main page can be found by editing the section called Beta Modules, selecting all and copying (to a new file) in Notepad++, then using its Find dialog to count the occurrences of "|[". When I did this today, there were 105 matches, so the discrepancy is that 4 modules are not yet tabulated here. David Haslam 20:06, 9 December 2010 (UTC)
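
The same count can be done without an editor. The snippet below is a minimal sketch that mirrors the Find-dialog approach by counting occurrences of "|[" in the copied wikitext; the function name and the sample rows are illustrative only, and like the manual method it relies on every module name being a link.

```python
def count_linked_modules(wikitext: str) -> int:
    """Count occurrences of '|[' in wikitable source, one per linked module name."""
    return wikitext.count("|[")


# Illustrative sample: two linked module names and one unlinked name.
sample = "\n".join([
    '|-valign="top"',
    "|[[Breton]]",
    "|",
    '|-valign="top"',
    "|[[HunKar]]",
    "|",
    '|-valign="top"',
    "|HNV",
])

print(count_linked_modules(sample))
```

An unlinked name such as HNV in the sample is not counted, which is the same limitation the manual method has.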

The actual number of modules in a repository is displayed in the JSword books manager, as used in BD. David Haslam 20:21, 9 December 2010 (UTC)

The Translation Trust

David Haslam has personal contact with the director of the Translation Trust, the sponsors and joint copyright holders of the modern Turkish Bible. David Haslam 13:37, 19 April 2011 (UTC)
