[sword-devel] Spelling (was Versification/Encoding issues)
d.haslam at ukonline.co.uk
Fri Jan 9 02:31:28 MST 2009
I suggest using Tesseract to help the Irish New Testament project.
We should try to establish personal contact with Pastor Craig Ledbetter.
I think CrossWire could provide some useful technical help.
Peter von Kaehne wrote:
> Mike Hart wrote:
>> That's interesting, because ancle is one of the words I corrected in
>> JSFB -- the OCR had ancle, but the PDF itself, my paper KJV copy, and
>> my JPS complete Tanach (individual volumes) had ankle... I can't say
>> what verse it was, at the time I was hunting for e's that had been
>> OCR'd into c's (search for 'regular expression'
>> [bcdfghjklmnpqrstvwxy]c[bcdfgjklmnpqrstvwx] in kwrite)
> You should have a look at Troy's work with tesseract. Rather than search
> and replace in a badly OCR'd text, he seems to have figured out how to
> "educate" tesseract with one or two sample pages until it does the right
> thing. That might be way easier and with a better outcome in the long
> term for you too.
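
The consonant-c-consonant search Mike describes above can be sketched in Python. This is only a rough illustration, not what anyone in the thread actually ran: the pattern is copied from his message, while the function name suspect_words and the sample sentence are made up here. It only flags candidates; each hit still has to be checked against the page image, since a flagged word may be a legitimate spelling.

```python
import re

# Pattern quoted in the thread: a 'c' sitting between two consonants,
# which is where an OCR'd 'e' misread as 'c' tends to show up.
SUSPECT_C = re.compile(r"[bcdfghjklmnpqrstvwxy]c[bcdfgjklmnpqrstvwx]")

# Hypothetical helper (not from the thread): list each suspicious word once,
# in order of first appearance, so a human can review it.
def suspect_words(text):
    seen = []
    for word in re.findall(r"[A-Za-z']+", text):
        if SUSPECT_C.search(word.lower()) and word not in seen:
            seen.append(word)
    return seen

# Made-up sample line for illustration only.
sample = "His ancle bones received strength; the chariot stood by thcm."
print(suspect_words(sample))
```

Words such as "received" and "chariot" are not flagged, because their 'c' is not both preceded and followed by a consonant from the two character classes.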
View this message in context: http://www.nabble.com/Versification-Encoding-Issues-tp21341395p21368903.html
Sent from the SWORD Dev mailing list archive at Nabble.com.