[sword-devel] Re: [sword-support] An awesome Bible

Victor Porton sword-devel@crosswire.org
Thu, 01 May 2003 04:09:43 +0600 (YEKST)


On 30-Apr-2003 Chris Little wrote:
> On Thu, 1 May 2003, Victor Porton wrote:
> 
>> On 30-Apr-2003 Chris Little wrote:
>> > If anyone is interested in our adding this Bible, please go ahead and
>> > write to the author for permission and (hopefully) some source files
>> > other than the PDFs.  I don't think it's going to be possible to base
>> > a module off of the PDFs themselves.  I don't see any problem with
>> > our providing this text.  And if whoever asks doesn't want to or feel
>> > capable of making the module himself, I would be happy to do so.
>> 
>> Could we use scanner software on PDF files (or page images produced from
>> PDF by GhostScript)?
> 
> It's not an issue of getting the data out of the PDFs.  They're not 
> protected or anything, so Acrobat can export the contents to XHTML, XML, 
> HTML, or other formats.  But the PDFs are formatted with double columns 
> plus a footnote area plus sidebars, so it makes the resulting exported 
> text sufficiently inconsistent that it would have to be proof-read in its
> entirety.  If we had a source document, even just a Word document, it 
> would make it much easier.

This is exactly my point: some scanner-related (OCR) programs, such as
FineReader, can "decipher" these complex layouts and turn them into,
e.g., a good Word document.
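
To make the GhostScript step concrete, here is a minimal sketch in Python of
rendering each PDF page to an image that an OCR program could then read. The
file names, the output pattern, and the 300 dpi resolution are assumptions for
illustration only; nothing here has been tried on this Bible's PDFs, and the
OCR step itself is left to whatever program is used.

```python
import subprocess


def ghostscript_render_command(pdf_path, out_pattern="page-%03d.png", dpi=300):
    """Build the GhostScript command that renders every page to a PNG.

    out_pattern uses GhostScript's printf-style page counter (%03d);
    the default names and resolution are illustrative assumptions.
    """
    return [
        "gs",
        "-dNOPAUSE",            # do not prompt between pages
        "-dBATCH",              # exit when the input is finished
        "-dSAFER",              # restrict file access while interpreting
        "-sDEVICE=png16m",      # 24-bit colour PNG output device
        "-r%d" % dpi,           # rendering resolution in dpi
        "-sOutputFile=%s" % out_pattern,
        pdf_path,
    ]


def render_pages(pdf_path, out_pattern="page-%03d.png", dpi=300):
    """Run GhostScript; requires the `gs` binary to be on the PATH."""
    subprocess.run(
        ghostscript_render_command(pdf_path, out_pattern, dpi),
        check=True,
    )
```

Calling render_pages("book.pdf") (a hypothetical file name) should leave one
PNG per page in the current directory, ready to be fed to the OCR program.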

-- 
Victor Porton (porton@ex-code.com)