[sword-devel] Neural Networks and Optical Character Recognition
greg.hellings at gmail.com
Fri Apr 25 14:17:45 MST 2008
On Fri, Apr 25, 2008 at 2:51 PM, Leandro DUTRA
<leandro.gfc.dutra at gmail.com> wrote:
> 2008/4/25, David Haslam <d.haslam at ukonline.co.uk>:
> > I was thinking that any historic text that used something other than a modern
> > Latin typeface, such a technique might have better chances of success.
> Indeed, but has it been proven?
A basic search of Google Scholar turns up a massive list of hits
relating to both neural networks and machine learning in the realm of
OCR. In fact, OCR is often cited as one of the more basic applications
of ANN theory. I'm currently engaged in NLP research at the
University of Texas at Dallas under a pair of advisers who specialize
in machine learning. If you listen to them, there's not a single
problem in this world that can't be solved with machine learning
techniques (I guess you might call it their "golden hammer"). While I
don't share their unreserved enthusiasm for it, OCR is a prime example
of where it is highly useful.
> And what are the chances that such a technique will show up in free
> software OCR such as Google's Tesseract?
I haven't looked at Tesseract myself, but there are plenty of FOSS
implementations of machine learning algorithms. I have used both Weka
and Mallet - and they both contain an ANN application. Integrating
one of them into an OCR pipeline should not be incredibly difficult.
It might require a little more image processing than I am used to, but
I have done a minor amount of that - enough to be able to work with
people who know more and not be a hindrance.
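
To give a rough idea of what "an ANN for OCR" means at its very smallest, here is a toy sketch in Python. It is not taken from Tesseract, Weka or Mallet: it is a plain single-layer perceptron (the simplest kind of ANN, not the multilayer networks those toolkits offer), and the 3x3 glyph bitmaps, function names and training settings are all made up for illustration. A real system would need proper image segmentation and far more training data.

```python
# Toy sketch: a single-layer perceptron that tells two 3x3 glyphs apart.
# The bitmaps, names and training settings are illustrative assumptions.

GLYPHS = {
    "I": [0, 1, 0,
          0, 1, 0,
          0, 1, 0],
    "L": [1, 0, 0,
          1, 0, 0,
          1, 1, 1],
}


def predict(weights, bias, pixels):
    """Return 1 if the weighted pixel sum plus bias is positive, else 0."""
    activation = bias + sum(w * p for w, p in zip(weights, pixels))
    return 1 if activation > 0 else 0


def train(samples, epochs=20, rate=1.0):
    """Classic perceptron rule: nudge the weights by the prediction error."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for pixels, target in samples:
            error = target - predict(weights, bias, pixels)
            bias += rate * error
            weights = [w + rate * error * p for w, p in zip(weights, pixels)]
    return weights, bias


# Label "I" as class 0 and "L" as class 1, then train on the clean glyphs.
samples = [(GLYPHS["I"], 0), (GLYPHS["L"], 1)]
weights, bias = train(samples)

# An "L" with one pixel missing from its foot.
noisy_l = [1, 0, 0,
           1, 0, 0,
           1, 1, 0]
label = "L" if predict(weights, bias, noisy_l) else "I"
print(label)
```

The point is only that the classifier learns pixel weights from examples instead of relying on hand-written rules, which is why the approach could carry over to unusual historic typefaces, given enough labelled samples.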