[sword-devel] Markup Options (was Re: Config file for thml module)

Greg Hellings greg.hellings at gmail.com
Tue Nov 30 07:48:37 MST 2010

On Tue, Nov 30, 2010 at 2:14 AM, Trevor Jenkins
<trevor.jenkins at suneidesis.com> wrote:
> On Mon, 29 Nov 2010, Greg Hellings <greg.hellings at gmail.com> wrote:
>> On Mon, Nov 29, 2010 at 9:30 PM, DM Smith <dmsmith at crosswire.org> wrote:
>> > <h1>ROFL</h1> is semantic markup. It is a level 1 heading. Given that this is one of HTML's title markup, one probably can deduce that it is a title. Note that IE, FireFox, Opera, Safari are not consistent in how they render <h1>.
>> >
>> Ah, but it wasn't my title. My title was "Response".  I used <h1>
>> because I wanted very large (in most h1 cases, also bold) font that
>> popped out at the reader to let them know how loudly I was laughing.
> Ah but that's a presentational issue in itself. I resist the HTML-isation
> of email by using a text only mailer, so no email-borne virii here thanks,
> so any attempt to rely on the implicit emphasis of how *your* emailer
> presents the element. So I for one can't tell how loudly you're laughing.

My email was sent in plain text; I was just inserting the formatting
as it would appear in SWORD's imp format, HTML and XHTML to demonstrate
my point.  And yes, presentation is decidedly important, and
presentational issues are what was being discussed.

> By the way, a specific emphasis element would therefore have been more
> appropriate both as markup and for the content you wanted to provide. One
> of the best books on writing is "Bugs in Writing" by Lyn Dupré and there
> is still virtue in "Designing and Writing Online Documentation" by William
> Horton.

Oh yes, an emphasis element would have been excellent.  Let's go with
that.  <em> is for emphasis, right? So let me put that in.  Whoops!
That gives me italics - which is not what I wanted.  But at least now
someone's screen reader is shouting at them.  Or is it speaking to
them in passionate tones, and <strong> would make it shout?  I can
never remember.  And, since I wanted it bold and large, I still have
to use the CSS I included earlier to increase the font size for screen

But all of that is moot and, like most Golden Hammer Only discussions,
ignores the user's objective and/or the actual state of affairs.  In
my particular case I have a set of about 120 texts in a proprietary
SGML dialect which more or less marks up only the structure of the
text - not all the way down to OSIS's ability to get to word-level
semantics, but just down to the paragraph level plus the occasional
markup for a foreign text, image or tabular display.  This is
displayed in the original software after formatting is inserted using
a stylesheet mechanism (the stylesheets actually use a full-fledged
sed-like programming language to insert RTF commands with the markup).
My project is to get these texts rendering in an application on Linux
(the proprietary software exists in Windows and Mac already) and
looking the same way they look in their source application.

The RTF-based stylesheets can easily be transformed into CSS and that
stage of the process is almost complete.  They match the monolithic
HTML pages that I can generate very well, and the material displays
nicely in a web browser.  The SGML can easily be transformed into XML
using the 'osx' command from opensp, and then translated into any
format I desire with XSLT.  Now comes the moment of choice: do I
transform into OSIS or into ThML for import to SWORD?  My mind wants
to go with OSIS, because the result can keep more of the structure of
the original SGML without losing that to the imp format, and it can be
kept in an XML format that is highly expressive.
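
A minimal sketch of what the RTF-to-CSS step might look like, written in
Python purely for illustration.  The stylesheets themselves are not shown
in this message, so the mapping table and the function name rtf_to_css
are assumptions; only the RTF control words (\b, \i, \ul, \qc, \qr, \qj
and \fsN, where N is a size in half-points) are standard.

```python
import re

# Assumed mapping from a few standard RTF control words to CSS declarations.
# The real stylesheets are not shown here, so this table is illustrative only.
RTF_TO_CSS = {
    "b": "font-weight: bold",
    "i": "font-style: italic",
    "ul": "text-decoration: underline",
    "qc": "text-align: center",
    "qr": "text-align: right",
    "qj": "text-align: justify",
}

# A control word is a backslash, lowercase letters, and an optional number.
CONTROL_WORD = re.compile(r"\\([a-z]+)(-?\d+)?")


def rtf_to_css(rtf_fragment):
    """Translate the RTF control words in a fragment into a CSS declaration string."""
    declarations = []
    for word, number in CONTROL_WORD.findall(rtf_fragment):
        if word == "fs" and number:
            # RTF expresses font size in half-points, so halve it for CSS points.
            declarations.append("font-size: %gpt" % (int(number) / 2))
        elif word in RTF_TO_CSS and number != "0":
            # A trailing 0 (as in \b0) switches the property off, so skip it.
            declarations.append(RTF_TO_CSS[word])
    return "; ".join(declarations)


if __name__ == "__main__":
    print(rtf_to_css(r"\qc\b\fs48"))
```

Control words that the table does not know about are silently ignored
here; a real conversion would more likely want to report them.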

However, a few problems arise.  Namely, I have no control over the
display of the text in my target applications (Xiphos and Bibletime)
when I use OSIS.  You, being in the semantics-please camp, might think
that's a good thing.  You've stymied my attempt, as a content creator,
to actually accomplish my goal of influencing rendering.  But as the
content creator, I'm not willing to be stymied because the task I was
given was to make these texts render the same in SWORD as they do in
the source app.  So I have a few options.

1) I can transform to OSIS, then create an XSLT that provides the
rendering I desire into the HTML displays.  However, Troy opposes the
use of XSLT and other XML processing technologies in the library, and
SWORD currently cannot render out valid OSIS - thus neither Bibletime
nor Xiphos can be my target for inserting the XSL.

2) I can provide an external CSS stylesheet along with my module.
Then I could still use OSIS and, assuming well-defined use of HTML+CSS
classes being produced from OSIS by the engine, I could style the
module the way I desired.  This would not require terribly much work
to be done on the OSISHTMLHREF filter, but both Jaak and Karl, when I
spoke with them, were unwilling to allow inclusion of an external CSS
file in a module.  Why? I may have misunderstood, but it seemed they
were both of the opinion that presentation and appearance are of
paramount importance, and they want to control the presentation of
material in the applications.

3) I can render into imp format and stick ThML presentational markup
all throughout the text.  This, as someone who does strongly
appreciate the benefits of separating structural markup from
presentational commands, is the least desirable of the options.  But
it is the only one which currently allows me to reach my objective.
So what I do is translate the original RTF stylesheets into an XSL
which inserts the CSS representation of that markup into the source
SGML elements.  Those SGML elements are then translated into ThML by
means of the same XSL that produces the HTML pages (with one
modification for <scripRef> tags instead of <a> tags) and inserts the
CSS styling into the style attribute on the ThML elements.
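
The approach in option 3 can be sketched roughly as follows.  This is
Python standing in for the XSL described above, not the actual
transformation: the source element names (heading, para, xref), the
style table and the function name transform are all hypothetical, since
the SGML dialect is proprietary and not shown here.  The sketch shows the
two ideas from the text: CSS is written into a style attribute on each
output element, and the only difference between the HTML and ThML
outputs is whether a cross-reference becomes <a> or <scripRef>.

```python
import xml.etree.ElementTree as ET

# Hypothetical CSS per source element, standing in for the converted stylesheets.
STYLES = {
    "heading": "font-size: 24pt; font-weight: bold",
    "para": "text-indent: 1em",
}

# Hypothetical mapping from source element names to output element names.
TAGS = {"heading": "h1", "para": "p"}


def transform(source_xml, target="thml"):
    """Rename source elements and inline their CSS into a style attribute."""
    # The one modification between outputs: the tag used for cross-references.
    link_tag = "scripRef" if target == "thml" else "a"
    root = ET.fromstring(source_xml)
    for elem in root.iter():
        style = STYLES.get(elem.tag)  # look up by the source name before renaming
        if style:
            elem.set("style", style)
        if elem.tag == "xref":
            elem.tag = link_tag
        else:
            elem.tag = TAGS.get(elem.tag, elem.tag)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        "<doc><heading>Response</heading>"
        '<para>See <xref passage="John 3:16">John 3:16</xref>.</para></doc>'
    )
    print(transform(sample))
```

Attributes on the hypothetical xref element are copied through
unchanged, which would not be enough for a real HTML target, where the
link needs an href.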

Your Golden Hammer is, indeed, an excellent hammer.  And I wish I
could use OSIS for my task.  But the truth of the matter is that SWORD
and its applications are not currently able to handle the task I am
asking of them using OSIS.  Thus ThML, for all its drawbacks, is the
superior technology _for me and my task_.

Perhaps after these modules are already released in their current
state, I will be able to get back to my task of improving the *OSIS
filters in the engine and providing a mechanism to retrieve full OSIS
documents from the API instead of just OSIS fragments.  But in the
meantime, OSIS is insufficient for my task and ThML is capable.  Thus
I use ThML.

