[sword-devel] pdf2xml

Peter von Kaehne refdoc at gmx.net
Wed Nov 18 13:21:13 MST 2009


A user of our software pointed me to this piece of software, which I
think might occasionally come in handy:

http://www.mobipocket.com/dev/pdf2xml/

It appears to handle embedded fonts and apparently produces very well
structured output. Unfortunately I have not managed to compile it,
although I expected that to be straightforward.

There is no makefile, just a short explanation of how to compile, which
I seem to have completely misunderstood.

What I ran at the top level of the source directory was:

gcc -I xpdf -I xpdf/fofi -I xpdf/goo -I xpdf/xpdf -I image/zlib \
    -I image/png -I /usr/include pdf2xml.cpp

But this was obviously wrong. It produced a huge pile of error messages
suggesting that it could not find all kinds of libpng-related stuff. I
have, of course, installed libpoppler-dev and libpng-dev.
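
A possible next step, as a sketch only and untested against this source
tree: use g++ rather than gcc (the gcc driver does not link the C++
standard library by default), ask pkg-config for the libpng include
flags, and compile with -c so that missing headers can be sorted out
before linking. The assumptions here are that pkg-config is installed
and knows a module called "libpng", and that the -I paths above match
the real layout of the source. The variable names are my own.

```shell
# Sketch only: build the compile command and print it for inspection.
# Assumes pkg-config has a "libpng" module; an empty result means it was not found.
png_cflags=$(pkg-config --cflags libpng 2>/dev/null || true)

# Same include paths as in the command above, minus the redundant -I /usr/include.
# -c compiles to an object file without linking.
cmd="g++ -I xpdf -I xpdf/fofi -I xpdf/goo -I xpdf/xpdf -I image/zlib -I image/png $png_cflags -c pdf2xml.cpp -o pdf2xml.o"

# Print the command rather than running it, so it can be checked first.
echo "$cmd"
```

If the printed command looks right, it can be run by hand. Linking
would be a separate step and would need the remaining object files and
libraries, which I have not worked out.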

I am aware of a whole bunch of public domain texts which are published
only in PDF format but would suddenly become accessible to us if this
tool works as described.

Peter


