dmsmith555 at yahoo.com
Thu Jun 19 05:33:04 MST 2008
To summarize what I have heard on the thread:
While the printer may have used leading 'V' for 'U' (and 'v' for 'u'),
the reader would have known that those were 'U' and 'u' respectively.
Likewise for i and j.
IMHO, if it is of value to retain the visual representation of the
original, then the "long s" (U+017F) should also be used. In looking at
Genesis 1 (from Wolfgang's link), it appears to follow these rules:
Leading lower case 's'
terminal 'ss' but not terminal 's'
IMHO, if it does the one then it should do the other. Should we filter
the text to do this and submit it back to the keeper of the text, so
that we can then include it as such in our repository?
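A minimal sketch of what such a filter could look like, in Python. It
assumes a simplified rule (lower case 's' becomes long s whenever another
letter follows it, so a terminal 's' is left alone). The function names
are made up for illustration, and the real rule would have to be checked
against the printed text:

```python
import re

LONG_S = "\u017f"  # LATIN SMALL LETTER LONG S

def apply_long_s(text: str) -> str:
    # Assumed rule: a lower case 's' followed by any letter becomes long s.
    # [^\W\d_] matches a Unicode letter (word character, not digit or '_').
    return re.sub(r"s(?=[^\W\d_])", LONG_S, text)

def fold_long_s(text: str) -> str:
    # The reverse direction is lossless: every long s is simply an 's'.
    return text.replace(LONG_S, "s")

sample = "das ist gross"
shaped = apply_long_s(sample)
print(shaped)
print(fold_long_s(shaped) == sample)
```

Folding back to plain 's' always recovers the input, so a filtered text
could still be compared against the keeper's copy.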
Chris, as to your suggestion of using transliteration: If the text is
regular in its usage, then, very likely, it could be possible to
transliterate a "modern" German text into this old form. Not sure of
the other way. That looks lossy. (I.e. when would one know that V, v,
J, j are U, u, I, i?)
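To illustrate why the reverse direction looks lossy, here is a small
Python sketch. The rule it uses (word-initial 'u' is printed as 'v') is
an assumption made for the example, not a description of the actual
text, and both function names are hypothetical:

```python
import re

def to_old_form(text: str) -> str:
    # Assumed rule: a word-initial u/U is printed as v/V.
    return re.sub(r"\b[uU]", lambda m: "v" if m.group() == "u" else "V", text)

def naive_to_modern(text: str) -> str:
    # Naive inverse: turn every word-initial v/V back into u/U.
    return re.sub(r"\b[vV]", lambda m: "u" if m.group() == "v" else "U", text)

modern = "und von uns"
old = to_old_form(modern)
print(old)                   # vnd von vns
print(naive_to_modern(old))  # und uon uns
```

The forward rule is mechanical, but the inverse cannot tell the 'v' of
"von" from the 'v' that stands for 'u' without a word list or similar
knowledge.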
As to keeping our modules consistent with the source: I don't think
this is a proper methodology. I think we should offer correct texts.
If we can do this by working with the keeper of the source, that's
great. But there may not be an active or responsive keeper. In that
case, I think we should take a different approach.
I think that we should model the ideals of Linux's RPM. There they
take a stored copy of pristine source and potentially apply patches
before building the release artifact. The pristine source, the
patches, any necessary scripts and any other build inputs are stored
in an SRPM (source RPM). It is by executing rpmbuild against the SRPM
that the RPM is built.
In our case, this would model: store a copy of the pristine source,
separate patches to the source, and all scripts needed to construct
the module. As good community members, we would offer the changes that
we applied back to the keeper.
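A rough sketch of that idea in Python, under the assumption that each
patch can be expressed as a simple text substitution. The Patch class
and build_module_text function are hypothetical names for illustration,
not existing SWORD tooling:

```python
from dataclasses import dataclass
from typing import Sequence

@dataclass(frozen=True)
class Patch:
    description: str  # why the change is made, for the keeper of the text
    old: str          # text expected in the source
    new: str          # corrected text

def build_module_text(pristine: str, patches: Sequence[Patch]) -> str:
    # The pristine source is never modified; patches are applied to a copy.
    text = pristine
    for patch in patches:
        if patch.old not in text:
            # Fail loudly, as a build does when a patch no longer applies.
            raise ValueError(f"patch does not apply: {patch.description}")
        text = text.replace(patch.old, patch.new)
    return text

pristine = "Jm anfang"
patches = [Patch("modernize initial J", "Jm", "Im")]
print(build_module_text(pristine, patches))
```

Because the patches are kept apart from the source, the same list is
what we would send back to the keeper.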
On Jun 18, 2008, at 11:51 PM, Chris Little wrote:
> Peter von Kaehne wrote:
>> DM Smith wrote:
>>> I have curiosity questions. Is it that it is actually the letter 'V'
>>> or is it the letter 'U', but the glyph looks like the letter 'V'?
>> I think the glyphs are used interchangeably - a V shape is often used
>> for a u at the beginning of a word, but equally where a v is correct -
>> while a U shape might stand where v or u would be. I and J are
>> certainly one letter, used interchangeably - indeed the Greek name for
>> the I, iota, is used in German for the name of the J, Jot.
> Printers of this era typically had 4 different glyphs total for the
> U, V, W, u, v, & w. U & V/u & v were basically positional variants of
> one another (but in earlier days were used even more interchangeably
> basically according to the typesetter's whim or the printer's personal
> spelling convention). W and w were simply set as VV and vv.
> Going back much further, all six derive from Latin V, which served as
> consonant [v], vowel [u], and semi-vowel [w].
> The corresponding situation also holds for I/J, which derive from
> I and served as both vowel [i] and semi-vowel [j].
>> I found this image file which is quite interesting:
>> Look at the behaviour of the S - at the beginning and inside words it
>> looks like an f, but at the end it looks like an s.
> Long s (ſ U+017F) still appears today as the left half of the letter
> ß, which comes from a ligature of ſ and z (or ʒ).
>> Not looked long enough and at enough original text to say whether
>> there are more letters getting shaped, but I would not be surprised.
> Have a look at the umlauts (e.g. in grösser at the end of the 4th
> These would require, e.g., U+0364 (http://www.decodeunicode.org/u+0364
>> Taking the glyph shaping into account I would think the version we
>> have is wrong as it tries to imitate this by using v and u - I have not
>> enough, but I think overall a straight use of U only for both letters
>> would be more correct - but equally difficult to read for some.
> I've never seen or heard of anyone encoding texts like these with an
> assumption of glyph shaping by the renderer. Most people encode the text
> as it appears. Some will modernize so that u means [u] and v means [v]
> and [w] is rendered by w.
>>> For example, in the original KJV, the letter 's' often was a long
>>> swoopy 'f' looking character. I would imagine that if we were to
>>> encode the original KJV, that we would not use the letter 'f' but a
>>> code point for the character that looks like it.
>>> (Some day, I'd like to have the original KJV as a SWORD module.)
> I believe the 1611 KJV is usually encoded with u, v, w, i, & j encoded
> as they appear on the page, but s & ſ are folded as s. I suppose we
> could mirror the original orthography entirely and create a modernized
> derivative. I don't know whether it would be possible to do the
> modernization at run time via ICU transliteration.