[sword-devel] .conf files encoding/tags

DM Smith dmsmith555 at yahoo.com
Sat Oct 6 10:52:51 MST 2007


On Oct 3, 2007, at 7:46 PM, Chris Little wrote:

> .conf files are entirely plain text except for the About field, which is
> RTF. RTF is only used in the About field and only RTF (or no markup) may
> be used in the About field.
>
> The use of RTF here is basically a legacy issue carried from BibleCS.
>
> It shouldn't be a big deal for you because I believe we only use 4
> different tags:
> \qc (center the following)
> \par (paragraph break)
> \pard (paragraph break + reset formatting)
> \uXXXXX? (non-CP1252 characters expressed as UTF-16)

I checked the rtfhtml filter and it only handles:
\pard, \par, \qc

\qc is treated as a <center>, and the parser records that it is "in center".
\pard is treated as a </center> if in center, and as nothing otherwise. It is
not treated as a paragraph break.
\par is a paragraph start. No </p> is ever output.
\ followed by a space is ignored, as is \ at the end of a line.
All other \ are output as plain text, including the \.
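
To make those rules concrete, here is a rough Python sketch of the behaviour as I described it. It is not the filter's actual code. The function name is made up, I am assuming the paragraph start is emitted as <p>, that control words are matched by prefix, and that "ignored" means the backslash (and the following space) is simply dropped.

```python
def rtf_about_to_html(text: str) -> str:
    """Hypothetical model of the rtfhtml rules described above (not the real filter)."""
    out = []
    in_center = False  # set by \qc, cleared by \pard
    i, n = 0, len(text)
    while i < n:
        if text[i] != "\\":
            out.append(text[i])
            i += 1
        elif text.startswith("pard", i + 1):  # must be tested before "par"
            if in_center:
                out.append("</center>")
                in_center = False
            i += 5  # otherwise \pard produces nothing
        elif text.startswith("par", i + 1):
            out.append("<p>")  # assumed tag; no </p> is ever emitted
            i += 4
        elif text.startswith("qc", i + 1):
            out.append("<center>")
            in_center = True
            i += 3
        elif text.startswith(" ", i + 1):
            i += 2  # assumption: drop both the backslash and the space
        elif i + 1 == n or text[i + 1] == "\n":
            i += 1  # backslash at end of line: drop it, keep the newline
        else:
            out.append("\\")  # any other backslash passes through as plain text
            i += 1
    return "".join(out)


print(rtf_about_to_html("\\qc Title\\pard Body\\par More"))
print(rtf_about_to_html("K\\u00305?tab\\u00305?"))
```

Under these assumptions the \uXXXXX? sequences fall into the last branch and come out unchanged, backslash and all.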

>
> Eeli Kaikkonen wrote:
>> I browsed through the beta area module .conf files. It's great to see
>> so many new ones with new features.
>>
>> One rant I have. Why on earth is stupid braindead rtf or other
>> strange formatting used in .conf files? See this example from TurNTB:
>>
>> About=New Turkish Bible translation, jointly translated and published by
>> K\u00305?tab\u00305? Mukaddes \u00350?irketler (www.kitabimukaddes.com)
>> and Yeni Ya\u00351?am Yay\u00305?nlar\u00305? (www.yyyayinlari.com). We
>> are grateful for the permission by Yeni Ya\u00351?am
>> Yay\u00305?nlar\u00305? to distribute this translation.
>>
>> How are the frontends supposed to display this correctly? This is year
>> 2007 and this kind of project should use utf8 in conf files also. \par
>> is quite easy to replace but why not use <br> instead, that kind of html
>> tagging is used in many places even without real html browsers. Rtf is
>> M$ proprietary format.
>
> So you propose that we abandon a markup system already in place and
> convert everything to a different arbitrarily selected markup format? As
> a result, every user will have to update every module, or the about text
> will be mis-rendered. Existing strategies for rendering RTF will have to
> be re-written to handle <insert arbitrarily chosen markup language>. And
> all to solve a problem that doesn't exist.
>
> Suggesting that we should not use RTF because it comes from MS is silly.
> RTF is quite well documented and widely used. It's cross-platform and
> wasn't even developed by MS. A full description may be found at
> http://www.biblioscape.com/rtf15_spec.htm.
>
> Converting the RTF we use to HTML requires about three lines of Perl,
> which you can port to the language of your choice:
> $about =~ s/\\u(\d+)\?/pack("U", $1)/eg; # assumes no surrogate pairs
> $about =~ s/\\qc ?(.*?)(\\pard|$)/<center>$1<\/center>$2/g;
> $about =~ s/\\pard? ?/<br\/>/g;
>
>
> --Chris
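
For anyone who wants the port Chris mentions, a minimal Python sketch of those three substitutions might look like the following. The function name is hypothetical. Like the Perl, it only handles unsigned decimal \uNNNNN? values with no surrogate pairs, and it emits <br/> for both \par and \pard.

```python
import re


def about_rtf_to_html(about: str) -> str:
    """Hypothetical Python port of the three Perl substitutions quoted above."""
    # \uNNNNN? -> the character with that decimal code point (no surrogate pairs)
    about = re.sub(r"\\u(\d+)\?", lambda m: chr(int(m.group(1))), about)
    # \qc ... up to the next \pard (or end of string) -> <center>...</center>
    about = re.sub(r"\\qc ?(.*?)(\\pard|$)", r"<center>\1</center>\2", about)
    # \par or \pard, plus one optional trailing space -> <br/>
    about = re.sub(r"\\pard? ?", "<br/>", about)
    return about


print(about_rtf_to_html("K\\u00305?tab\\u00305? Mukaddes \\u00350?irketler"))
print(about_rtf_to_html("\\qc Title\\pard Body\\par More"))
```

Applied to the TurNTB text, the first call should give readable Turkish (dotless i and S-cedilla) rather than the escaped form.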



