[sword-devel] Getting stuff done (Re: External links)

Greg Hellings greg.hellings at gmail.com
Tue Nov 25 17:06:00 MST 2008

On Tue, Nov 25, 2008 at 5:45 PM, Matthew Talbert <ransom1982 at gmail.com> wrote:
>> The point I was making was not that you can't encode it, but you lose the
>> semantic significance of it. The user can tell that <i>test</i> was added,
>> but the program can't - unless that is the only way <i> is ever used - which
>> it isn't. If you use italic formatting for anything else, you have lost
>> information - not presentation information - but the actual meaning is now
>> inaccessible to the program, as it can't necessarily tell what a particular
>> <i> means. If I want to mark translator added words in violet, or even allow
>> omitting them altogether, this is now not easily possible.
> I've been around long enough to know there is some disagreement here,
> but not long enough to really understand the issues. So my question
> isn't intended to create an argument, I just want to understand.
> If encoding in OSIS means that presentation information intended to be
> there by the publisher is lost, then why is that the preferred format?
> I would think that it would be really important to a publisher (or
> just to a module creator like me) that things are presented as they
> want them to be. Are you saying OSIS doesn't really allow that? If so,
> then shouldn't something else be used?

That is the basis of the move behind all of XML, not just OSIS.
Technically, ThML is also XML, so it *should* be under the same
restrictions.  However, since it comes from a history of HTML, it's
usually treated more like HTML.  The logic is as follows:

A content creator (called a module creator in SWORD lingo) should only
be concerned that they properly encode the INFORMATION in a module/web
page/database/etc.  As such, they are given a set of tags which encode
the information - i.e. <transChange> or <w> or <verse> or <poetry>,
etc.  Their job is to properly encode the information inherent in
their content.

A presenter (called a front-end developer in SWORD lingo) knows the
abilities, limitations and other considerations of their own medium.
One of the classic examples is displaying a web page on a screen versus
passing it to a text-to-speech system.  As long as the web page is
encoded properly (e.g. the use of <em> instead of <i> and <strong>
instead of <b> and so on), then the text-to-speech can alter the
speech so that it properly represents what the content creator
intended.  In SWORD parlance, a <transChange> will often be presented
to a user in italics - <i> or <em> if presented to HTML, the proper
RTF markers if presented in BibleCS, some other form if presented in
plain text, etc.  Another example is the issue of multiple companion
modules - presenting those on the iPhone front-end will have a very
different look than in Gnomesword, and they may get yet another
presentation (or possibly even be ignored altogether) on the sword web.
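
To make that concrete, here is a minimal Python sketch of one piece of
semantically marked-up text being presented three different ways.  It
is only an illustration: the fragment is a simplified, made-up
OSIS-like snippet, and the style tables and the render() function are
hypothetical names of my own, not anything from the SWORD API or its
filters.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified OSIS-like fragment (not from a real module).
SNIPPET = (
    '<verse>And God saw the light, that '
    '<transChange type="added">it was</transChange> good.</verse>'
)

# Each front-end owns its own table: semantic tag -> (prefix, suffix).
# A value of None means "omit this element's content entirely".
HTML_STYLE = {"transChange": ("<i>", "</i>")}
PLAIN_STYLE = {"transChange": ("[", "]")}
HIDE_ADDED = {"transChange": None}


def render(element, style):
    """Recursively render an element using one front-end's style table.

    Tags missing from the table are passed through with no decoration.
    No HTML escaping is done; this is a sketch, not a real filter.
    """
    parts = []
    rule = style.get(element.tag, ("", ""))
    if rule is not None:
        prefix, suffix = rule
        parts.append(prefix)
        parts.append(element.text or "")
        for child in element:
            parts.append(render(child, style))
        parts.append(suffix)
    # Text following the element belongs to the parent, so always keep it.
    parts.append(element.tail or "")
    return "".join(parts)


if __name__ == "__main__":
    root = ET.fromstring(SNIPPET)
    for style in (HTML_STYLE, PLAIN_STYLE, HIDE_ADDED):
        print(render(root, style))
```

The module text never changes; only the table handed to render() does.
That is only possible because the markup says "translator added this"
rather than "make this italic".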

The argument is that the content creator doesn't know where, how or to
whom the data will be presented.  All they know is what their data
MEANS.  Therefore, they are given the tools to specify what it means,
while the people who are presenting that information are in charge of
determining HOW it looks/sounds/feels.

One issue is that many people feel they are the creators of
the content and should have more control over its presentation.  And
thus there are systems, like ThML, which are hybrids - they allow
both the semantic AND the presentational markup.  Personally I feel
that a system should be one or the other.  I usually aim for XML's
delineation between semantics in one place and presentation in
another, but that often results in a very tedious system to get
started in.  However, once it's fully and properly in place, using,
fixing and modifying it are, I find, usually easier than with a
system where the content and presentation are tightly intertwined.
As Karl has noted, though, the intertwined kind of system has a lower
cost of entry, as all of the data and presentation are already
integrated and thus can be presented rather quickly and
straightforwardly.

So there are purists, whose systems, when done right and completed,
often work brilliantly (XHTML - HTML written as XML - is usually
displayed brilliantly and almost identically across browsers).  And
then there are pragmatists, whose systems can sometimes be a bit
jumbled, convoluted and quirky (for example: browser differences in
presenting non-XML forms of HTML) but make it easy for people to get
the presentation they intend.


> Matthew
> _______________________________________________
> sword-devel mailing list: sword-devel at crosswire.org
> http://www.crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page
