[sword-devel] XHTML Rendering of OSIS Reference Doc - Whitespace

Nic Carter niccarter at mac.com
Wed May 8 19:16:49 MST 2013

Ok, I'm gonna top post and reply to various bits, sorry :)

Is there any information that we need to keep in those divs that means we need to include either them or a placeholder for them? I just read that <a> anchor tags don't exist in XHTML like they did in HTML 4.01, but you can have <a> tags with no "href" part and it is called a "placeholder for a hyperlink", but you could add class and id info to that and that could perhaps keep the div information in the SWORD module?
But my suggestion is that we remove that div material, even though that does mean that we can't do the theoretical lossless OSIS->SWORD->OSIS conversion. But IMHO that's a good thing cause it encourages people to use the source text rather than our modules. If it's an official statement that we purposely break the possibility of lossless conversion (yes, perhaps some people on the list shudder when I suggest that, but we don't actually support that right now), that could be A Good Thing(TM)? :D

On to the next bit I wanted to comment on, I have had similar thoughts in my head about what the BibleTime guys have thought, and what Troy linked to from 2005. I think it would be really helpful to have some sort of state saved between subsequent calls to renderText() so we know what tags are currently open. Or if the state lets us know that there are currently no open tags (eg: we are retrieving a verse in isolation), the filter will figure that out and prepend and append the correct open & close tags to properly mark up the verse. The actual implementation of this could be whatever, like what Peter suggests below or as Greg said previously?
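
A minimal sketch of that state idea, in Python rather than engine code, so nothing here is SWORD API. `balance_verse` and `MILESTONE` are made-up names, and the regex assumes the sID/eID attribute comes first in the milestone tag, which real OSIS does not guarantee. It takes whatever is known to be open before the verse and adds the opening and closing milestones the verse needs to stand alone:

```python
import re

# Hypothetical helper, not part of SWORD. Matches milestone forms such as
# <div sID="gen1"/> and <div eID="gen1" />; assumes sID/eID is the first attribute.
MILESTONE = re.compile(r'<(\w+)\s+(sID|eID)="([^"]+)"\s*/>')

def balance_verse(raw, open_before=()):
    """Return (balanced_text, still_open) for one verse of raw OSIS.

    open_before lists the (tag, id) containers known to be open when the
    verse starts; pass nothing when the verse is fetched in isolation.
    """
    still_open = list(open_before)
    unmatched_closes = []
    for tag, kind, ident in MILESTONE.findall(raw):
        key = (tag, ident)
        if kind == "sID":
            still_open.append(key)
        elif key in still_open:
            still_open.remove(key)
        else:
            # Closed here but never seen opened: it needs an opener in front.
            unmatched_closes.append(key)
    openers = list(open_before) + list(reversed(unmatched_closes))
    prefix = "".join(f'<{t} sID="{i}"/>' for t, i in openers)
    suffix = "".join(f'<{t} eID="{i}"/>' for t, i in reversed(still_open))
    return prefix + raw + suffix, still_open
```

The second return value is the state to carry into the next call when verses are rendered one after another.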

QUESTION: Is the aim for the next engine release to "solve the whitespace issues", or to rewrite the filters, or to do the 2nd in order to solve the first, or to hack the filters to solve the first? :)

In summary, how about we just don't include the <div> tags. (Which is exactly what Troy said in his email.)

But will this solve the whitespace issues in modules like the ESV? Does it use those tags? I thought the whitespace issues in it are more to do with insane numbers of <br /> tags? At least, I manually parse the output of the filters and replace occurrences of 3 x <br /> with 2 instead. And that eliminates almost all of the yuck whitespace irregularities that I see in the ESV (and other modules?). Oh, and if verse 0 contains only "<br />" then I don't show that, as that seems to also happen very regularly...  :)
And this isn't a speed hit on a handheld device (that I notice?), so I'm not fussed with doing it. :)
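
For anyone wanting to try the same clean-up, here is a rough sketch in Python (not the code from my frontend; `tidy_breaks` and `is_blank_intro` are made-up names). It assumes "3 x <br />" can be read as "a run of three or more", possibly with whitespace between the tags:

```python
import re

# A run of three or more <br /> tags, allowing whitespace between them.
BR_RUN = re.compile(r'<br\s*/>(?:\s*<br\s*/>){2,}')

# Text that consists of nothing but <br /> tags and whitespace.
ONLY_BR = re.compile(r'(?:\s*<br\s*/>\s*)+')

def tidy_breaks(html):
    """Collapse each run of three or more <br /> tags down to two."""
    return BR_RUN.sub('<br /><br />', html)

def is_blank_intro(html):
    """True when a verse 0 entry holds only <br /> tags, so it can be skipped."""
    return ONLY_BR.fullmatch(html) is not None
```

Both are a single regex pass over the rendered text of one verse.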

Oh, and in regard to breaking compatibility with current modules, would there be a problem with creating a new valid value for SourceType of "OSIS2" or something like that, so if it's a current (potentially broken) module, we use the existing filters, otherwise we create new modules that will work with new filters? :)
I know that is a mess, but are we in a position of mess where the filters need to be rethought and redone more in line with what Greg suggests (and what Troy proposed in 2005)?
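
Sketched in Python, the switch would be about this small. "OSIS2" is only the value floated above, not an existing SourceType, and `pick_filter_set` with its two return labels is made up for illustration:

```python
def pick_filter_set(source_type):
    """Choose a filter set from a module's SourceType value.

    Only the proposed "OSIS2" value selects the new filters; every
    current module keeps going through the existing ones.
    """
    if source_type.strip().upper() == "OSIS2":
        return "new"
    return "existing"
```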

My random 2 cents!
Thanks, ybic
	nic...  :)

ps: Disclaimer: Nic is not an expert. Nic is not an expert. Nic is not an expert. Altho he does get to play with the output of the current filters. And the output works "well enough" for him. :)

On 09/05/2013, at 7:57 AM, Peter von Kaehne <refdoc at gmx.net> wrote:

> On Wed, 2013-05-08 at 15:15 -0500, Greg Hellings wrote: 
>> Off the cuff here, it seems the issue is the difference in semantics
>> of <div> between OSIS - where it marks a structural division within a
>> text which can be of many different levels and layers and in XHTML
>> where it represents a box of block-style layout which defaults to
>> being the full width of its container.
> That is true for the default behaviour of div. There is though no
> particular need to stick to default behaviour, is there?
> If every div carries enough class information there is nothing stopping
> a frontend from making it inline via CSS. And the cut-off, where div changes
> from being a block to being inline is one each frontend could choose
> itself. 
>> Our thought was to store information along with each verse which
>> includes a pre- and post- verse markup. This would need to become part
>> of the OSIS import process, and it would track the "semantically" open
>> elements such as <div sID="gen1" /> which, by XML standards are no
>> longer open but the OSIS semantics designate that div is open until
>> <div eID="gen1" /> is encountered. This would be in addition to the
>> actually open XML elements.
> If you make this a part of the module we will break continuity and
> compatibility of old modules in a big style.
> Why not make this a - maybe switchable - function of the engine, handled
> on the fly? This would make a lot of sense when returning arbitrary
> chunks - parse the chunk and ensure it is balanced, not just in an XML
> sense but also in an OSIS sense. Or at least the info for the missing
> bits is created and passed on upwards. This would allow keeping modules
> as they are. 
> Peter 
> _______________________________________________
> sword-devel mailing list: sword-devel at crosswire.org
> http://www.crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page
