[sword-devel] Questions about usfm2osis.pl

johnduffy at cgcf.net
Thu Oct 29 02:59:44 MST 2009


No need to work on that for my sake for the moment, as I suspect that the
apostrophes etc. are not all encoded differently.


John Duffy

-----Original Message-----
From: Chris Little [mailto:chrislit at crosswire.org] 
Sent: 29 October 2009 00:25
To: SWORD Developers' Collaboration Forum
Subject: Re: [sword-devel] Questions about usfm2osis.pl

johnduffy at cgcf.net wrote:
> Chris,
> Sorry - I meant poetry, not quotation marks.  I've been using Unicode
> quotation marks in UTF-8 instead of <<, < etc.  I read somewhere that
> Unicode quotation marks in UTF-8 would not require OSIS markup, which would
> avoid the problem of nested quotes or quotes going across chapter or section
> divisions.  The original text also does not use quotation marks
> consistently, which requires this approach: it often inserts an additional
> opening mark at each new paragraph within a quoted section, while only
> having one closing quotation mark at the very end.  I trust that using
> Unicode characters will therefore avoid problems in OSIS and module creation.
> John Duffy

You should be fine with that.

usfm2osis.pl will mark <<, <, etc. with <q>, but will leave any marks 
encoded with actual quotation marks alone. Continuation quotation marks 
like you describe shouldn't be a problem for usfm2osis.pl. It looks for 
them and should mark them correctly.
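
As a rough illustration of the behaviour described above, here is a minimal
Python sketch. It is not the Perl code in usfm2osis.pl; the function name
mark_ascii_quotes and the level attributes on <q> are assumptions made for
the example, and it does not attempt the continuation-mark handling. It
converts the ASCII markers <<, <, > and >> and leaves real Unicode quotation
marks untouched.

```python
import re

# Hypothetical mapping for illustration; the real script may emit
# different attributes or milestone forms of <q>.
_ASCII_QUOTES = {
    "<<": '<q level="1">',
    ">>": "</q>",
    "<": '<q level="2">',
    ">": "</q>",
}

# Two-character markers are listed first so that "<<" is not read as "<" twice.
_TOKEN = re.compile(r"<<|>>|<|>")


def mark_ascii_quotes(text: str) -> str:
    """Replace ASCII quote markers with <q> tags; other characters pass through."""
    return _TOKEN.sub(lambda m: _ASCII_QUOTES[m.group(0)], text)
```

Because the pattern only matches the ASCII markers, a line such as
“Go.” (typed with U+201C and U+201D) comes back unchanged.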

It would be fairly trivial to make usfm2osis.pl also look for and mark 
quotation marks (provided there are unique marks for right and left 
quotes and provided that apostrophe is encoded differently from right 
and left single quotes), but I haven't seen anyone clamoring for this 
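
The two provisos above can be made concrete with a small Python sketch. This
is an assumption-laden illustration, not a feature of usfm2osis.pl: the
function names and the level attributes are hypothetical. It refuses to mark
Unicode quotation marks when U+2019 appears inside a word, since there it is
almost certainly an apostrophe rather than a closing single quote.

```python
import re

LEFT_DOUBLE, RIGHT_DOUBLE = "\u201c", "\u201d"
LEFT_SINGLE, RIGHT_SINGLE = "\u2018", "\u2019"

# U+2019 with word characters on both sides is treated as an apostrophe.
# Heuristic only: a word-final apostrophe (a plural possessive) is not caught.
_INWORD = re.compile(r"\w+" + RIGHT_SINGLE + r"\w+")

_QUOTE = re.compile("[\u201c\u201d\u2018\u2019]")

# Hypothetical output forms, chosen for the example.
_UNICODE_QUOTES = {
    LEFT_DOUBLE: '<q level="1">',
    RIGHT_DOUBLE: "</q>",
    LEFT_SINGLE: '<q level="2">',
    RIGHT_SINGLE: "</q>",
}


def find_inword_right_quotes(text: str) -> list:
    """Return the words in which U+2019 is used as an apostrophe."""
    return _INWORD.findall(text)


def mark_unicode_quotes(text: str) -> str:
    """Mark Unicode quotes with <q>, but only if U+2019 is unambiguous."""
    if find_inword_right_quotes(text):
        raise ValueError("U+2019 is also used as an apostrophe; cannot mark safely")
    return _QUOTE.sub(lambda m: _UNICODE_QUOTES[m.group(0)], text)
```

Running find_inword_right_quotes over a text first shows whether the
apostrophes share a code point with the closing single quote, which is the
condition that decides whether automatic marking is safe.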

