[sword-devel] Spelling (was Versification/Encoding issues)

David Haslam d.haslam at ukonline.co.uk
Fri Jan 9 02:31:28 MST 2009


I suggest using Tesseract to help the Irish New Testament project.
See
http://www.crosswire.org/wiki/Non-CrossWire_Text-Development_Projects#Individual_Works

We should try to establish personal contact with Pastor Craig Ledbetter.
http://www.biblebc.com/Projects/irish_new_testament_project.htm

I think CrossWire could provide some useful technical help.

-- David



Peter von Kaehne wrote:
> 
> Mike Hart wrote:
>> That's interesting, because ancle is one of the words I corrected in
>> JSFB -- the OCR had ancle, but the PDF itself, my paper KJV copy, and
>> my JPS complete Tanach (individual volumes) had ankle...  I can't say
>> what verse it was, at the time I was hunting for e's that had been
>> OCR'd into c's  (search for 'regular expression'
>> [bcdfghjklmnpqrstvwxy]c[bcdfgjklmnpqrstvwx] in kwrite)
> 
> You should have a look at Troy's work with tesseract. Rather than search
> and replace a text badly ocred he seems to have figured out how to
> "educate" tesseract with one or two sample pages until it does the right
> thing. That might be way easier and with a better outcome in the long
> term for you too.
> 
> Peter
> 
> _______________________________________________
> sword-devel mailing list: sword-devel at crosswire.org
> http://www.crosswire.org/mailman/listinfo/sword-devel
> Instructions to unsubscribe/change your settings at above page
> 
> 
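
For anyone who wants to try the search-and-correct approach Mike describes in the quoted message, here is a minimal sketch in Python. It assumes the text is plain ASCII English. The pattern is copied from his message as written, while the function name and the sample sentence are made up for illustration. It only lists candidate words. It does not correct them, and legitimate spellings (such as the archaic "ancle") will show up as well, so each hit still needs checking against the printed source.

```python
import re

# Pattern copied from the quoted message: a 'c' sitting between two consonants.
# The second character class omits 'h', so ordinary "ch" digraphs are not flagged.
SUSPECT_C = re.compile(r"[bcdfghjklmnpqrstvwxy]c[bcdfgjklmnpqrstvwx]")

def suspect_words(text):
    """Return words containing a consonant-c-consonant run (possible e -> c OCR errors)."""
    words = re.findall(r"[A-Za-z]+", text)
    return [w for w in words if SUSPECT_C.search(w.lower())]

# Hypothetical sample line with two deliberate OCR-style errors.
sample = "And whcn he came, his ancle bones received strcngth in the church."
print(suspect_words(sample))
```

Reporting whole words rather than just the three matched letters makes it easier to judge each hit by eye before changing anything.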

-- 
View this message in context: http://www.nabble.com/Versification-Encoding-Issues-tp21341395p21368903.html
Sent from the SWORD Dev mailing list archive at Nabble.com.



