[sword-devel] The poor man's interlinear

Mon Sep 10 07:31:47 MST 2012

That is bizarre.

If there is a conversion script, then the way to test it is not to run 66 Bible books through it and do a side-by-side comparison, but to create a list of all "corner cases" and check for these.

Essentially there are x characters in the legacy version, which combine into y combined characters, all of which have a Unicode equivalent. Where is the problem?
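
A corner-case list of this kind could be sketched roughly as follows. Everything here is an assumption for illustration: `toy_convert` is a hypothetical stand-in for the real conversion script, and the cases are invented; the real list would hold the x characters and y combinations of the actual font, with expected values checked by hand. Both sides are normalised to NFC so that precomposed and decomposed forms of the same character compare equal.

```python
import unicodedata


def toy_convert(text):
    # Hypothetical stand-in for the real legacy-to-Unicode script:
    # this toy legacy font writes an acute accent as "'" after the letter.
    return text.replace("'", "\u0301")


# Invented corner cases: (legacy input, hand-verified Unicode output).
CORNER_CASES = [
    ("e'", "\u00e9"),   # base letter + accent, expected as precomposed e-acute
    ("'", "\u0301"),    # accent with no base letter
    ("abc", "abc"),     # plain text must pass through unchanged
]


def failing_cases(convert, cases):
    """Return (legacy, expected, got) for every case the converter gets wrong."""
    failures = []
    for legacy, expected in cases:
        got = unicodedata.normalize("NFC", convert(legacy))
        if got != unicodedata.normalize("NFC", expected):
            failures.append((legacy, expected, got))
    return failures


if __name__ == "__main__":
    for legacy, expected, got in failing_cases(toy_convert, CORNER_CASES):
        print(f"FAIL {legacy!r}: expected {expected!r}, got {got!r}")
```

The point of the design is that the expected values come from a person, not from the script, so the check is not circular.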

If he really thinks his conversion script causes problems that he cannot find by analysing the script carefully, then he should do the following:

original in custom encoding -> Unicode -> reconvert to custom encoding.

Then run a diff of the original against the reconverted text.
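
A minimal sketch of that round trip, assuming the conversion can be expressed as a simple character table (the real script is likely more involved, and the table below is invented purely for illustration):

```python
import difflib

# Invented legacy-code -> Unicode table; the real one belongs to the script.
LEGACY_TO_UNICODE = {
    "a": "\u0905",
    "A": "\u0906",
    "k": "\u0915",
}


def convert(text, table):
    # Map each character through the table; unknown characters pass through.
    return "".join(table.get(ch, ch) for ch in text)


def round_trip_diff(original, forward):
    """Convert to Unicode and back, then diff the result against the original."""
    reverse = {uni: legacy for legacy, uni in forward.items()}
    restored = convert(convert(original, forward), reverse)
    return list(difflib.unified_diff(
        original.splitlines(), restored.splitlines(),
        "original", "reconverted", lineterm=""))


if __name__ == "__main__":
    for line in round_trip_diff("kaA\nAk", LEGACY_TO_UNICODE):
        print(line)
```

An empty diff means nothing was lost on the way. A round trip only exposes places where the mapping loses information, for example two legacy codes collapsing into one Unicode character; a mapping that is wrong but cleanly reversible would still pass.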

Peter 

-------- Original Message --------
> Date: Mon, 10 Sep 2012 02:18:01 -0700 (PDT)
> From: David Haslam <dfhmch at googlemail.com>
> To: sword-devel at crosswire.org
> Subject: Re: [sword-devel] The poor man's interlinear

> Further update ....
> 
> I referred Jonathan's reply to my friend in MissionAssist, with the
> following accompanying remarks.
> 
> Somehow, I think he's missed the main point.
> 
> i.e. You already have a legacy to Unicode conversion, yet because of the
> complexities of the original documents and how the legacy script works,
> the purpose of the required comparison is to discover any "corner cases"
> which the conversion algorithm didn't yet address.
> 
> Is this a right understanding of the task?
> 
> -----------------
> 
> My friend observes:
> 
> Your correspondent has not understood what we are trying to do. We are
> looking for mistakes in the algorithm.
> 
> We have already converted the encoding. It is a given that the source file
> is error-free as displayed. What we have to check is that our encoding
> conversion mapping (algorithm if you like) is accurate. Machine methods of
> doing this have so far turned out to be circular in methodology. The only
> sure, but slow, method is sight checking. Until we find other means we
> would like to be able to view the text on lines stacked one above the
> other.
> Viewing Word files side-by-side at 75% zoom is hard on the eyes when the
> fonts in question are not standard Latin. Given that our source files
> always use customised TTF fonts we would need tools that operate on plain
> text to have import filters to handle RTF/DOC/DOCX/PDF as a minimum.
> 
> Many thanks for your interest in our problem.
> 
> -----------
> 
> David