dmsmith555 at yahoo.com
Thu Jun 19 06:05:47 MST 2008
Regarding searching, couldn't one encode the text with <w> elements
having an attribute with the searchable representation of the word and
change the SWORD engine to index these as well (at the same position
as the displayed word)?
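
To make the idea concrete, here is a rough sketch in Python rather than in the engine itself. It is not how SWORD indexes today. The attribute name "search", the sample markup and the build_index helper are all made up for illustration. The only point is that the displayed form and the searchable form are indexed at the same word position.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical markup: the "search" attribute is an assumption for this
# sketch, not an attribute the engine is known to read.
SAMPLE = (
    '<verse>'
    '<w search="uebervoll">ueberuoll</w> '
    '<w>von</w> '
    '<w search="Wasser">Waſſer</w>'
    '</verse>'
)

def build_index(xml_text, attr="search"):
    """Map each lower-cased word form to the set of word positions it occupies."""
    index = defaultdict(set)
    root = ET.fromstring(xml_text)
    for pos, w in enumerate(root.iter("w")):
        shown = (w.text or "").strip()
        # Index the word as displayed.
        index[shown.lower()].add(pos)
        # Index the searchable representation at the same position.
        alt = w.get(attr)
        if alt:
            index[alt.lower()].add(pos)
    return index

index = build_index(SAMPLE)
```

With this, a lookup for either "ueberuoll" or "uebervoll" lands on position 0, so phrase and proximity searches would still line up with the displayed text.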
On Jun 19, 2008, at 4:42 AM, Peter von Kaehne wrote:
> Chris Little wrote:
>> Long s (ſ U+017F) still appears today as the left half of the
>> letter ß,
>> which comes from a ligature of ſ and z (or ʒ).
> That is actually not even very old. My dad still learnt the gothic
> writing regularly at school, while I learnt it from him out of
> as a teen.
> The ß is exactly this - and its name in German is "SZ" (and
> secondarily a "sharp S")
>>> Taking the glyph shaping into account I would think the version we
>>> is wrong as it tries to imitate this by using v and u - I have not
>>> enough, but I think overall a straight use of U only for both
>>> would be more correct - but equally difficult to read for some.
>> I've never seen or heard of anyone encoding texts like these with an
>> assumption of glyph shaping by the renderer. Most people encode the text
>> as it appears. Some will modernize so that u means [u] and v means [v]
>> and [w] is rendered by w.
> Positional glyphshaping is simply not anymore part of modern script -
> but Gothic script, used in print and handwriting regularly until the
> 1930/40s, had still remnants of glyphshaping - mostly around the s.
> People get confused by it and also have difficulties separating form from
> content - and a lot (or most) encodings I found on the net appear to be
> done by enthusiasts and not by necessity by people with any
> semblance of
> scholarship. Nothing wrong with that.
>> I believe the 1611 KJV is usually encoded with u, v, w, i, & j
>> as they appear on the page, but s & ſ are folded as s. I suppose we
>> could mirror the original orthography entirely and create a
>> derivative. I don't know whether it would be possible to do the
>> modernization at run time via ICU transliteration.
> I think it would be equally valid to use u/v/w and i/j as pronounced
> (then and now) or u and i as printed. What I do not like though at all
> is an attempt to be archaic and mistakenly print sometimes a V and
> sometimes a U simply because concepts like glyphshaping are not
> recognised. If glyphshaping is desired - I am sure someone clever could
> design a nice ttf font doing just this on either basis - in Arabic and
> Farsi we use one code point for 4 shapes and there is little stopping
> anyone doing the same for older German or English texts.
> There are serious implications wrt search for trying to imitate the
> print-image and ignoring the content and the concept of glyphshaping -
> lots of German words are combined words and if a v sound ends up in the
> middle of a word and is rendered in a u shape as consequence we still
> want to find it in both instances - voll (full) and uebervoll
> - encode one as voll and the next one as ueberuoll would look archaic
> but be unsearchable - much better to have this dealt with at a font
> level where a middle v becomes a U shape if necessary, but leave the
> encoding intact and searchable.
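
For texts that are already encoded as printed, a search-side fold is one possible workaround. This is a different technique from the font-level shaping suggested above, and only a rough sketch: the fold table and the search_key and find helpers are made up for illustration, and folding v onto u and j onto i will over-match in modern text.

```python
# Fold the letter pairs that old print treats as positional variants
# onto a single search key. The choice of pairs is an assumption.
FOLD = str.maketrans({"ſ": "s", "v": "u", "j": "i"})

def search_key(word):
    """Lower-case a word and fold long s, u/v and i/j together."""
    return word.lower().translate(FOLD)

def find(words, query):
    """Return positions of words whose folded form contains the folded query."""
    key = search_key(query)
    return [i for i, w in enumerate(words) if key in search_key(w)]

words = ["voll", "ueberuoll", "Waſſer"]
hits_voll = find(words, "voll")
hits_wasser = find(words, "Wasser")
```

Here a search for "voll" finds both "voll" and "ueberuoll", and "Wasser" finds "Waſſer", without touching the encoded text.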
> Incidentally - and I have no answer on that - German glyph shaping in the
> late 19th early 20th century worked on syllables - so voll and uebervoll
> would be the same, but these old texts seem to use word level - see eg
> Water Wasser. I learnt to write it as Wasſer - one shape being used at
> the end of a syllable, the other at the beginning or middle. But the
> image I
> linked to wrote Waſſer - shaping at word level. I am not sure how
> consistently this was used.
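
The two shaping rules described here can be sketched as follows. This is a simplification for illustration only: it handles just lower-case s, the function names are made up, and the syllable split has to be supplied by the caller (a real renderer would need hyphenation data for that).

```python
def shape_word_level(word):
    """Use long s everywhere except in the final position of the word."""
    if not word:
        return word
    body, last = word[:-1], word[-1]
    return body.replace("s", "ſ") + last

def shape_syllable_level(syllables):
    """Use round s at the end of each syllable, long s elsewhere."""
    # Applying the word-level rule to each syllable keeps a syllable-final s round.
    return "".join(shape_word_level(s) for s in syllables)

word_level = shape_word_level("Wasser")
syllable_level = shape_syllable_level(["Was", "ser"])
```

For "Wasser" the word-level rule gives Waſſer, as in the linked image, and the syllable-level rule with the split Was + ser gives Wasſer, as taught in school.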