[sword-devel] Improvements to osis2mod to handle XML comments and <header> correctly

DM Smith dmsmith at crosswire.org
Mon Apr 5 11:07:29 MST 2010


I don't see osis2mod going away. I see a web service as a way for those 
with Windows and a fast Internet connection to do module 
submission. Since it will do binary uploads, it will be faster than 
email, where base64 encoding inflates binary attachments to about 4/3 
of their original size. It may be sufficient for Linux and slow 
connections too. BTW, I'd like to see shell and bat scripts that provide 
the same "tool chain" as the web service.
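
To make the "tool chain" idea concrete, here is a rough sketch of the two 
stages (validate with xmllint, then build with osis2mod). It is only an 
illustration, not an existing script: the function names, the file names 
and the local schema path are made up, and the only command-line details 
it relies on are xmllint's --noout and --schema options and osis2mod's 
"output path, then OSIS file" argument order.

```python
import subprocess


def build_toolchain(osis_file, module_dir, schema="osisCore.2.1.1.xsd"):
    """Return the command lines for each stage, in the order they should run.

    `schema` is assumed to be a local copy of the OSIS schema (hypothetical path).
    """
    return [
        # Stage 1: validation. --noout suppresses echoing the parsed document.
        ["xmllint", "--noout", "--schema", schema, osis_file],
        # Stage 2: construction. osis2mod takes the output path, then the OSIS file.
        ["osis2mod", module_dir, osis_file],
    ]


def run_toolchain(steps):
    """Run each stage; return the name of the first tool that fails, or None."""
    for argv in steps:
        if subprocess.run(argv).returncode != 0:
            return argv[0]
    return None
```

A shell or bat script would do the same thing: stop at the first stage that 
fails and report which one it was, so the submitter knows where to start 
fixing.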

There are two different points at which we are concerned with 
compatibility: with the engine and with the OSIS standard. By and large, 
it is best for osis2mod to build modules that are compatible with as old 
a version of the SWORD engine as possible. We stayed compatible with 
1.5.6 for the longest time. Recently we had to move up to 1.6 
(1.5.12?) in order to support v11n. With that we also changed the 
handling of pre-verse material. I anticipate that until the next big 
thing comes along osis2mod will stay compatible with 1.6.

The other is the OSIS standard. As time goes by we add more and more 
support for the whole of OSIS XML. This thread started with a request to 
add support for other legal constructs, i.e. div in header and also 
comments.
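
For anyone who wants to see the two constructs side by side, here is a 
minimal sketch of the intended behaviour: comments are skipped, and a div 
inside the header is not taken as the start of the text. This is not the 
osis2mod code or the submitted patch; the regex tokenizer and the function 
name are my own illustration, and it ignores CDATA and other corner cases.

```python
import re

# Comments are listed first so that a "<div>" inside a comment is never seen as a tag.
TOKEN = re.compile(r"<!--.*?-->|<[^>]*>|[^<]+", re.DOTALL)
TAG = re.compile(r"<(/?)([\w:.-]+)")


def first_body_div(xml):
    """Return the offset of the first <div> outside comments and <header>, or -1."""
    in_header = False
    for match in TOKEN.finditer(xml):
        token = match.group(0)
        if token.startswith("<!--"):
            continue  # comments are skipped, like whitespace
        tag = TAG.match(token)
        if tag is None:
            continue  # plain text, or markup this sketch does not care about
        closing, name = tag.group(1) == "/", tag.group(2)
        if name == "header" and not token.endswith("/>"):
            in_header = not closing  # enter on <header>, leave on </header>
        elif name == "div" and not closing and not in_header:
            return match.start()
    return -1
```

Fed a document with a div inside a comment, another inside revisionDesc, 
and a third after the header, this returns the offset of the third one.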

One of the features of tei2mod is that it suggests a minimal conf. We 
should do the same for osis2mod and the web service. It'd also be cool 
if they could take a supplied conf and parse it to determine runtime 
parameters.
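
As a sketch of what "parse it to determine runtime parameters" could mean, 
the snippet below reads a supplied conf and pulls out the entries a build 
would care about. It is an assumption-laden illustration, not tei2mod or 
osis2mod behaviour: the function names and the returned dictionary are 
hypothetical, "#" lines are simply ignored, and the only conf keys it looks 
at are DataPath, ModDrv and Versification.

```python
def read_conf(text):
    """Parse a module conf into (module name, {key: first value seen})."""
    module = None
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines; "#" lines are ignored (assumption)
        if line.startswith("[") and line.endswith("]"):
            module = line[1:-1]
        elif "=" in line:
            key, value = line.split("=", 1)
            entries.setdefault(key.strip(), value.strip())
    return module, entries


def runtime_parameters(text):
    """Derive build parameters from a conf (hypothetical mapping)."""
    module, entries = read_conf(text)
    return {
        "module": module,
        "output_dir": entries.get("DataPath"),
        # Compressed drivers are the z* ones, e.g. zText or zCom.
        "compressed": entries.get("ModDrv", "").lower().startswith("z"),
        # KJV is the versification used when the conf does not name one.
        "v11n": entries.get("Versification", "KJV"),
    }
```

The tool would still need to turn these values into its own options, and a 
command-line option should probably override whatever the conf says.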

In Him,
     DM

On 04/05/2010 01:49 PM, Brian J. Dumont wrote:
> I would be concerned about this and how it might be implemented.  I have
> packaged a number of very large modules and regularly compile (osis2mod
> or xml2mod as the case may be) updates to see how it's changing as I
> update.
>
> I have a relatively slow broadband connection.  It would be painful as a
> web service every hour or so, and I fear that if there is both a web
> service and a program, that they may get out of sync.
>
> Brian
>
> On 04/05/2010 01:24 PM, Daniel Owens wrote:
>    
>> Yes, I agree, and if there were a feedback mechanism for the module
>> creator to let them know how to start fixing an OSIS file or conf
>> file, it would save Chris (or whoever else approves modules) time on
>> the basic stuff.
>>
>> Daniel
>>
>> On 4/5/2010 11:09 AM, DM Smith wrote:
>>      
>>> This is a great idea. Rather than emailing source to modules at
>>> crosswire dot org, one could upload it via a web service. We could
>>> have stages of validation (xmllint) and construction (osis2mod). Such
>>> a service could evaluate the quality of the submission.
>>>
>>> In Him,
>>>      DM
>>>
>>> On 04/05/2010 12:01 PM, Weston Ruter wrote:
>>>        
>>>> Why not turn osis2mod into a web service? Then it wouldn't matter
>>>> how it is implemented since it would be abstracted away by the web
>>>> service interface. It could use the best XML libraries available
>>>> today and written in the programming language of choice, both of
>>>> which would make maintenance and the addition of new features much
>>>> easier.
>>>>
>>>> Weston
>>>>
>>>> On Mon, Apr 5, 2010 at 5:55 AM, Manfred Bergmann
>>>> <manfred.bergmann at me.com>  wrote:
>>>>
>>>>      Hi DM.
>>>>
>>>>      Am 05.04.2010 um 13:21 schrieb DM Smith:
>>>>
>>>>      >  Regarding using a "real" parser, it is a good idea. But we
>>>>      >  don't want SWORD to be dependent on an external parser.
>>>>
>>>>      What's the reason for that?
>>>>      I could understand if it would mean for the user to install
>>>>      certain libraries manually but when the sources can be integrated
>>>>      into the project and has the appropriate licence then why not?
>>>>
>>>>
>>>>      Manfred
>>>>
>>>>      >
>>>>      >  On 02/04/2010 05:31 AM, John Zaitseff wrote:
>>>>      >>  Dear SWORD developers,
>>>>      >>
>>>>      >>  Firstly, thanks for developing the SWORD library!  I have been using
>>>>      >>  this library, in conjunction with the BibleTime front-end, for many
>>>>      >>  years.
>>>>      >>
>>>>      >>  I have recently started to develop some OSIS documents of my own.
>>>>      >>  In doing so, I found that the XML parser in osis2mod is somewhat
>>>>      >>  fragile---something that you are, no doubt, aware of.
>>>>      >>
>>>>      >>  In particular, osis2mod does not handle XML comments at all, nor
>>>>      >>  does it correctly parse the <header> element.  Being able to handle
>>>>      >>  XML comments is, I think, quite important---I like to document the
>>>>      >>  SVN revision ID, for example, in an XML comment.
>>>>      >>
>>>>      >>  Furthermore, the osis2mod XML parser looks for the first <div> in
>>>>      >>  the document, no matter where that occurs.  In particular, if the
>>>>      >>  OSIS document includes a <revisionDesc> tag in the header, it will
>>>>      >>  have <p> tags as well---which will be translated by transformBSP()
>>>>      >>  into <div> tags---and get used as the starting point for the
>>>>      >>  document!
>>>>      >>
>>>>      >>  For this reason, I have generated a quick patch that will solve
>>>>      >>  these particular problems.  Could you please apply it to the SVN
>>>>      >>  head for utilities/osis2mod.cpp.  Comments are handled similar to
>>>>      >>  spaces: they are skipped.  And handleToken() now looks for the first
>>>>      >>  <div> after the </revision> end tag.
>>>>      >>
>>>>      >>  In general, I think that (perhaps eventually) the proper way to
>>>>      >>  parse XML is to use a library like libxml---which is designed
>>>>      >>  specifically for this purpose.
>>>>      >>
>>>>      >>  Yours truly,
>>>>      >>
>>>>      >>  John Zaitseff
>>>>          



