[sword-devel] .conf files encoding/tags

Chris Little chrislit at crosswire.org
Wed Oct 3 16:46:09 MST 2007


.conf files are plain text throughout, except for the About field, which
may contain RTF. The About field is the only place RTF is used, and RTF
(or no markup at all) is the only markup permitted there.

The use of RTF here is basically a legacy issue carried over from BibleCS.

It shouldn't be a big deal for you because I believe we only use 4 
different tags:
\qc (center the following)
\par (paragraph break)
\pard (paragraph break + reset formatting)
\uXXXXX? (a non-CP1252 character, given as its decimal UTF-16 code unit;
the trailing ? is the fallback shown by readers without Unicode support)
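
To show how the four tags fit together, here is a made-up About line. The
module title and text are hypothetical and not taken from any real module;
only the tags themselves come from the list above.

```
About=\qc Example Module\pard\par First paragraph of the description.\par Second paragraph, with a dotless i written as \u00305?.
```

A frontend would be expected to center the title, then show the two
paragraphs with the escaped character rendered as a dotless i.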

Eeli Kaikkonen wrote:
> I browsed through the beta area module .conf files. It's great to see
> so many new ones with new features.
> 
> One rant I have. Why on earth is stupid braindead rtf or other
> strange formatting used in .conf files? See this example from TurNTB:
> 
> About=New Turkish Bible translation, jointly translated and published by
> K\u00305?tab\u00305? Mukaddes \u00350?irketler (www.kitabimukaddes.com)
> and Yeni Ya\u00351?am Yay\u00305?nlar\u00305? (www.yyyayinlari.com). We
> are grateful for the permission by Yeni Ya\u00351?am
> Yay\u00305?nlar\u00305? to distribute this translation.
> 
> How are the frontends supposed to display this correctly? This is year
> 2007 and this kind of project should use utf8 in conf files also. \par
> is quite easy to replace but why not use <br> instead, that kind of html
> tagging is used in many places even without real html browsers. Rtf is
> M$ proprietary format.

So you propose that we abandon a markup system already in place and 
convert everything to a different arbitrarily selected markup format? As 
a result, every user will have to update every module, or the about text 
will be mis-rendered. Existing strategies for rendering RTF will have to 
be re-written to handle <insert arbitrarily chosen markup language>. And 
all to solve a problem that doesn't exist.

Suggesting that we should not use RTF because it comes from MS is silly.
RTF is quite well documented and widely used. It's cross-platform, and
although MS developed it, the specification is published openly. A full
description may be found at
http://www.biblioscape.com/rtf15_spec.htm.

Converting the RTF we use to HTML requires about three lines of Perl, 
which you can port to the language of your choice:
$about =~ s/\\u(\d+)\?/pack("U", $1)/eg; # \uN? -> char N; assumes no surrogate pairs
$about =~ s/\\qc ?(.*?)(\\pard|$)/<center>$1<\/center>$2/g; # center up to \pard or end
$about =~ s/\\pard? ?/<br\/>/g; # \par and \pard -> line break
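
For anyone porting, here is a rough Python equivalent of the three
substitutions above. It is only a sketch: the function name
about_rtf_to_html is made up for illustration, and like the Perl it
assumes non-negative decimal \u values and no surrogate pairs.

```python
import re


def about_rtf_to_html(about: str) -> str:
    """Convert the small RTF subset used in About fields to HTML.

    Hypothetical helper; handles only \\qc, \\par, \\pard and \\uN?.
    """
    # \uN? -> the character whose decimal code is N.
    # Assumption: N is non-negative and never half of a surrogate pair.
    about = re.sub(r"\\u(\d+)\?", lambda m: chr(int(m.group(1))), about)

    # \qc centers everything up to the next \pard (or the end of the string).
    # The \pard itself is kept so the next step can turn it into a break.
    about = re.sub(r"\\qc ?(.*?)(\\pard|$)", r"<center>\1</center>\2", about)

    # Both \par and \pard become a line break; one trailing space is eaten.
    about = re.sub(r"\\pard? ?", "<br/>", about)
    return about
```

Pass it the raw value of the About field as read from the .conf file; the
result is an HTML fragment rather than a full document.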


--Chris




