[sword-devel] Character Frequency
dfhmch at googlemail.com
Fri Jul 8 01:17:54 MST 2011
Good stuff Peter,
I guess that for some projects we've worked on, running the character
frequency analysis on the OSIS files means doing it at the last stage in the
process, just before the module build.
For projects that begin at USFM (or earlier), it would be great to develop a
tool that analyses the character frequency of the text itself (for the whole
Bible), ignoring all the USFM tags, etc.
One simple way to do this would be to have a script that does the following:
(a) merges all the USFM files into a single text file
(b) removes all the USFM tags (& the English stuff such as IDs & text in
(c) does the character frequency counting
For my part, (a) & (b) could easily be done by means of a TextPipe filter.
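
For anyone without TextPipe, steps (a) to (c) could be roughed out in Python
along the following lines. This is only a sketch under some assumptions: the
function names (merge_usfm, strip_usfm, char_frequency), the "*.usfm" file
pattern and the short list of metadata markers (\id, \ide, \rem) are all made
up for illustration, and footnote or cross-reference contents (callers,
reference numbers) and USFM 3 word attributes are not handled and would still
be counted.

```python
import re
import sys
import unicodedata
from collections import Counter
from pathlib import Path

# Lines that are metadata rather than Bible text (assumed, incomplete list).
METADATA_LINE = re.compile(r"^\\(id|ide|rem)\b.*$", re.MULTILINE)
# \c 1 and \v 1 (or a range such as \v 1-2): drop the marker and its number.
CHAPTER_VERSE = re.compile(r"\\[cv]\s+\d+(?:-\d+)?")
# Any other marker, opening or closing, e.g. \p, \q1, \add, \add*, \+nd.
MARKER = re.compile(r"\\\+?[a-z]+\d*\*?")


def merge_usfm(directory, pattern="*.usfm"):
    """Step (a): join all matching files in a directory into one string."""
    paths = sorted(Path(directory).glob(pattern))
    return "\n".join(p.read_text(encoding="utf-8") for p in paths)


def strip_usfm(text):
    """Step (b): remove metadata lines, chapter/verse numbers and markers."""
    text = METADATA_LINE.sub("", text)
    text = CHAPTER_VERSE.sub("", text)
    return MARKER.sub("", text)


def char_frequency(text):
    """Step (c): count every non-whitespace character after NFC normalization."""
    text = unicodedata.normalize("NFC", text)
    return Counter(ch for ch in text if not ch.isspace())


if __name__ == "__main__" and len(sys.argv) > 1:
    merged = merge_usfm(sys.argv[1])
    for ch, n in char_frequency(strip_usfm(merged)).most_common():
        print(f"U+{ord(ch):04X}\t{ch}\t{n}")
```

Run with the directory holding the USFM files as the only argument; it prints
one line per character (code point, character, count), most frequent first.
Normalizing to NFC is a design choice here so that precomposed and decomposed
forms of the same letter are counted together; for checking combining marks
individually, NFD might be the better choice.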