public final class OSISUtil extends Object
See Also: The GNU Lesser General Public License for details.
Modifier and Type | Class and Description |
---|---|
static class |
OSISUtil.OSISFactory
A generic way of creating empty Elements of various types
|
Modifier and Type | Field and Description |
---|---|
static String |
ATTRIBUTE_CELL_ALIGN |
static String |
ATTRIBUTE_CELL_COLS |
static String |
ATTRIBUTE_CELL_ROWS |
static String |
ATTRIBUTE_DIV_BOOK |
static String |
ATTRIBUTE_FIGURE_SRC |
static String |
ATTRIBUTE_OSISTEXT_OSISIDWORK |
static String |
ATTRIBUTE_Q_WHO |
static String |
ATTRIBUTE_SPEAKER_WHO |
static String |
ATTRIBUTE_TABLE_BORDER |
static String |
ATTRIBUTE_TABLE_ROLE |
static String |
ATTRIBUTE_TEXT_OSISIDWORK |
static String |
ATTRIBUTE_W_LEMMA |
static String |
ATTRIBUTE_W_MORPH |
static String |
ATTRIBUTE_WORK_OSISWORK |
static String |
CELL_ALIGN_CENTER |
static String |
CELL_ALIGN_END |
static String |
CELL_ALIGN_JUSTIFY |
static String |
CELL_ALIGN_LEFT
Possible cell alignments
|
static String |
CELL_ALIGN_RIGHT |
static String |
CELL_ALIGN_START |
static String |
DEF_TYPE
Constant for the def (dictionary definition) type
|
static String |
DIV_PRE
Constant to help narrow down what we use div for.
|
private static Set<String> |
EXTRA_BIBLICAL_ELEMENTS |
private static OSISUtil.OSISFactory |
factory |
static String |
GENERATED_CONTENT
Constant for JSword generated content.
|
static String |
HI_ACROSTIC
Constant for acrostic highlighting
|
static String |
HI_BOLD
Constant for rendering bold text
|
static String |
HI_EMPHASIS
Constant for rendering emphatic text
|
static String |
HI_ILLUMINATED
Constant for rendering illuminated text.
|
static String |
HI_ITALIC
Constant for rendering italic text.
|
static String |
HI_LINETHROUGH
Constant for rendering strike-through text
|
static String |
HI_NORMAL
Constant for rendering normal text.
|
static String |
HI_SMALL_CAPS
Constant for rendering small caps
|
static String |
HI_SUB
Constant for rendering subscripts
|
static String |
HI_SUPER
Constant for rendering superscripts
|
static String |
HI_UNDERLINE
Constant for rendering underlined text
|
static String |
HI_X_BIG
Constant for rendering big text
|
static String |
HI_X_CAPS
Constant for rendering upper case text
|
static String |
HI_X_SMALL
Constant for rendering small text
|
static String |
HI_X_TT
Constant for rendering tt text
|
static String |
LEMMA_MISC |
static String |
LEMMA_STRONGS
Constant for a Strong's numbering lemma
|
static String |
LIST_ORDERED
Constant to help narrow down what we use "list" for.
|
static String |
LIST_UNORDERED |
private static org.slf4j.Logger |
log
The log stream
|
private static char |
MORPH_INFO_SEPARATOR |
static String |
MORPH_ROBINSONS |
static String |
MORPH_STRONGS
Constant for Strong's numbering morphology
|
static String |
NOTETYPE_REFERENCE
Constant for the cross reference note type
|
static String |
NOTETYPE_STUDY
Constant for the study note type
|
static String |
OSIS_ATTR_CANONICAL |
static String |
OSIS_ATTR_EID |
static String |
OSIS_ATTR_LANG |
static String |
OSIS_ATTR_LEVEL |
static String |
OSIS_ATTR_OSISID |
static String |
OSIS_ATTR_REF |
static String |
OSIS_ATTR_SID |
static String |
OSIS_ATTR_SUBTYPE |
static String |
OSIS_ATTR_TYPE |
static String |
OSIS_ELEMENT_ABBR |
static String |
OSIS_ELEMENT_CELL |
static String |
OSIS_ELEMENT_CHAPTER |
static String |
OSIS_ELEMENT_DIV |
static String |
OSIS_ELEMENT_FIGURE |
static String |
OSIS_ELEMENT_FOREIGN |
static String |
OSIS_ELEMENT_HEADER |
static String |
OSIS_ELEMENT_HI |
static String |
OSIS_ELEMENT_ITEM |
static String |
OSIS_ELEMENT_L |
static String |
OSIS_ELEMENT_LB |
static String |
OSIS_ELEMENT_LG |
static String |
OSIS_ELEMENT_LIST |
static String |
OSIS_ELEMENT_NAME |
static String |
OSIS_ELEMENT_NOTE |
static String |
OSIS_ELEMENT_OSIS |
static String |
OSIS_ELEMENT_OSISTEXT |
static String |
OSIS_ELEMENT_P |
static String |
OSIS_ELEMENT_Q |
static String |
OSIS_ELEMENT_REFERENCE |
static String |
OSIS_ELEMENT_ROW |
static String |
OSIS_ELEMENT_SEG |
static String |
OSIS_ELEMENT_SPEAKER |
static String |
OSIS_ELEMENT_SPEECH |
static String |
OSIS_ELEMENT_TABLE |
static String |
OSIS_ELEMENT_TITLE |
static String |
OSIS_ELEMENT_VERSE |
static String |
OSIS_ELEMENT_W |
static String |
OSIS_ELEMENT_WORK |
private static String |
OSISID_PREFIX_BIBLE
Prefix for OSIS IDs that refer to Bibles
|
static String |
POS_TYPE
Constant for the pos (part of speech) type.
|
static String |
Q_BLOCK
Constant to help narrow down what we use "q" for.
|
static String |
Q_CITATION
Constant to help narrow down what we use "q" for.
|
static String |
Q_EMBEDDED
Constant to help narrow down what we use "q" for.
|
static String |
SEG_CENTER
Constant to help narrow down what we use seg for.
|
static String |
SEG_COLORPREFIX
Constant to help narrow down what we use seg for.
|
static String |
SEG_JUSTIFYLEFT
Constant to help narrow down what we use seg for.
|
static String |
SEG_JUSTIFYRIGHT
Constant to help narrow down what we use seg for.
|
static String |
SEG_SIZEPREFIX
Constant to help narrow down what we use seg for.
|
private static char |
SPACE_SEPARATOR |
private static String |
strongsNumber |
private static Pattern |
strongsNumberPattern |
static String |
TABLE_ROLE_LABEL
Table roles (on table, row and cell elements) can be "data", the default,
or label.
|
static String |
TYPE_X_PREFIX
Constant for x- types
|
static String |
VARIANT_CLASS |
static String |
VARIANT_TYPE
Constant for the variant type segment
|
Modifier | Constructor and Description |
---|---|
private |
OSISUtil()
Prevent instantiation
|
Modifier and Type | Method and Description |
---|---|
static org.jdom2.Element |
createOsisFramework(BookMetaData bmd)
Helper method to create the boilerplate headers in an OSIS document from
the current metadata object
|
static List<org.jdom2.Content> |
diffToOsis(List<Difference> diffs)
Convert a Difference list into OSIS content that reports the differences.
|
static OSISUtil.OSISFactory |
factory()
An accessor for the OSISFactory that creates OSIS objects
|
private static void |
getCanonicalContent(org.jdom2.Element parent,
String sID,
Iterator<org.jdom2.Content> iter,
StringBuilder buffer) |
static String |
getCanonicalText(org.jdom2.Element root)
Get the canonical text from an osis document consisting of a single
fragment.
|
static Collection<org.jdom2.Content> |
getDeepContent(org.jdom2.Element div,
String name)
Find all the instances of elements of type name under the element div. |
static List<org.jdom2.Content> |
getFragment(org.jdom2.Element root)
Dig past the osis and osisText element, if present, to get the meaningful
content of the document.
|
static String |
getHeadings(org.jdom2.Element root)
The text of the headings.
|
static String |
getLexicalInformation(org.jdom2.Element root,
boolean includeMorphology)
Concatenates Strong's numbers and morphology information.
|
static String |
getMorphologiesWithStrong(org.jdom2.Element root)
An '@'-separated list of morphologies and Strong's numbers
|
static String |
getNotes(org.jdom2.Element root)
The text of non-reference notes.
|
static String |
getPlainText(org.jdom2.Element root)
A simplified plain text version of the data in this Element with all the
markup stripped out.
|
static String |
getReferences(Book book,
Key key,
Versification v11n,
org.jdom2.Element root)
A space-separated string containing the osisIDs from the reference elements.
|
static String |
getStrongsNumbers(org.jdom2.Element root)
A space-separated string containing Strong's numbers.
|
private static String |
getTextContent(List<org.jdom2.Content> fragment) |
static Verse |
getVerse(Versification v11n,
org.jdom2.Element ele)
Walk up the tree from the W to find out what verse we are in.
|
private static boolean |
isCanonical(org.jdom2.Content content) |
private static void |
recurseChildren(org.jdom2.Element ele,
StringBuilder buffer)
Helper to extract the Strings from a nest of JDOM elements
|
private static void |
recurseDeepContent(org.jdom2.Element start,
String name,
List<org.jdom2.Content> reply)
Find all the instances of elements of type name under the element start. |
private static void |
recurseElement(Object sub,
StringBuilder buffer)
If we have a String just add it to the buffer, but if we have an Element
then try to dig the strings out of it.
|
static List<org.jdom2.Content> |
rtfToOsis(String rtf) |
private static final char SPACE_SEPARATOR
private static final char MORPH_INFO_SEPARATOR
public static final String HI_ACROSTIC
public static final String HI_BOLD
public static final String HI_EMPHASIS
public static final String HI_ILLUMINATED
public static final String HI_ITALIC
public static final String HI_LINETHROUGH
public static final String HI_NORMAL
public static final String HI_SMALL_CAPS
public static final String HI_SUB
public static final String HI_SUPER
public static final String HI_UNDERLINE
public static final String HI_X_CAPS
public static final String HI_X_BIG
public static final String HI_X_SMALL
public static final String HI_X_TT
public static final String SEG_JUSTIFYRIGHT
public static final String SEG_JUSTIFYLEFT
public static final String SEG_CENTER
public static final String DIV_PRE
public static final String SEG_COLORPREFIX
public static final String SEG_SIZEPREFIX
public static final String TYPE_X_PREFIX
public static final String NOTETYPE_STUDY
public static final String NOTETYPE_REFERENCE
public static final String VARIANT_TYPE
public static final String VARIANT_CLASS
public static final String GENERATED_CONTENT
public static final String POS_TYPE
public static final String DEF_TYPE
public static final String LEMMA_STRONGS
public static final String LEMMA_MISC
public static final String MORPH_ROBINSONS
public static final String MORPH_STRONGS
public static final String Q_BLOCK
public static final String Q_CITATION
public static final String Q_EMBEDDED
public static final String LIST_ORDERED
public static final String LIST_UNORDERED
public static final String TABLE_ROLE_LABEL
public static final String CELL_ALIGN_LEFT
public static final String CELL_ALIGN_RIGHT
public static final String CELL_ALIGN_CENTER
public static final String CELL_ALIGN_JUSTIFY
public static final String CELL_ALIGN_START
public static final String CELL_ALIGN_END
public static final String OSIS_ELEMENT_ABBR
public static final String OSIS_ELEMENT_TITLE
public static final String OSIS_ELEMENT_TABLE
public static final String OSIS_ELEMENT_SPEECH
public static final String OSIS_ELEMENT_SPEAKER
public static final String OSIS_ELEMENT_ROW
public static final String OSIS_ELEMENT_REFERENCE
public static final String OSIS_ELEMENT_NOTE
public static final String OSIS_ELEMENT_NAME
public static final String OSIS_ELEMENT_Q
public static final String OSIS_ELEMENT_LIST
public static final String OSIS_ELEMENT_P
public static final String OSIS_ELEMENT_ITEM
public static final String OSIS_ELEMENT_FIGURE
public static final String OSIS_ELEMENT_FOREIGN
public static final String OSIS_ELEMENT_W
public static final String OSIS_ELEMENT_CHAPTER
public static final String OSIS_ELEMENT_VERSE
public static final String OSIS_ELEMENT_CELL
public static final String OSIS_ELEMENT_DIV
public static final String OSIS_ELEMENT_OSIS
public static final String OSIS_ELEMENT_WORK
public static final String OSIS_ELEMENT_HEADER
public static final String OSIS_ELEMENT_OSISTEXT
public static final String OSIS_ELEMENT_SEG
public static final String OSIS_ELEMENT_LG
public static final String OSIS_ELEMENT_L
public static final String OSIS_ELEMENT_LB
public static final String OSIS_ELEMENT_HI
public static final String ATTRIBUTE_TEXT_OSISIDWORK
public static final String ATTRIBUTE_WORK_OSISWORK
public static final String OSIS_ATTR_OSISID
public static final String OSIS_ATTR_SID
public static final String OSIS_ATTR_EID
public static final String ATTRIBUTE_W_LEMMA
public static final String ATTRIBUTE_FIGURE_SRC
public static final String ATTRIBUTE_TABLE_BORDER
public static final String ATTRIBUTE_TABLE_ROLE
public static final String ATTRIBUTE_CELL_ALIGN
public static final String ATTRIBUTE_CELL_ROWS
public static final String ATTRIBUTE_CELL_COLS
public static final String OSIS_ATTR_TYPE
public static final String OSIS_ATTR_CANONICAL
public static final String OSIS_ATTR_SUBTYPE
public static final String OSIS_ATTR_REF
public static final String OSIS_ATTR_LEVEL
public static final String ATTRIBUTE_SPEAKER_WHO
public static final String ATTRIBUTE_Q_WHO
public static final String ATTRIBUTE_W_MORPH
public static final String ATTRIBUTE_OSISTEXT_OSISIDWORK
public static final String OSIS_ATTR_LANG
public static final String ATTRIBUTE_DIV_BOOK
private static final String OSISID_PREFIX_BIBLE
private static final org.slf4j.Logger log
private static OSISUtil.OSISFactory factory
private static String strongsNumber
private static Pattern strongsNumberPattern
public static OSISUtil.OSISFactory factory()
public static List<org.jdom2.Content> getFragment(org.jdom2.Element root)
root - the element from which to get a fragment
public static String getCanonicalText(org.jdom2.Element root)
Get the canonical text from an OSIS document consisting of a single fragment. This means that the top-level element's tag name is osis. It can contain either an osisText or an osisCorpus. If it is an osisCorpus, then it contains an osisText. However, as a simplification, since JSword constructs the whole document for the fragment, osisCorpus can be ignored.
The osisText element contains a div element that is either a container or a milestone. Again, JSword is providing the div element and it will be provided as a container. It is this div that "contains" the actual fragment.
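The nesting described above (osis, then osisText, then a container div holding the fragment) can be sketched as follows. This is only an illustration under stated assumptions: it uses the JDK's built-in DOM parser instead of JDOM2 so that it runs without extra libraries, and the class FragmentSketch and its methods are hypothetical names, not part of JSword.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of JSword.
final class FragmentSketch {
    private FragmentSketch() {
    }

    // Parse an XML string with the JDK DOM parser and return the root element.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Return the first direct child element with the given tag name, or null.
    static Element firstChild(Element parent, String name) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && name.equals(n.getNodeName())) {
                return (Element) n;
            }
        }
        return null;
    }

    // Step past osis and osisText (when present) into the wrapping div,
    // then return that element's direct children as the fragment.
    static List<Node> fragment(Element root) {
        Element current = root;
        if ("osis".equals(current.getNodeName())) {
            Element text = firstChild(current, "osisText");
            if (text != null) {
                current = text;
            }
        }
        if ("osisText".equals(current.getNodeName())) {
            Element div = firstChild(current, "div");
            if (div != null) {
                current = div;
            }
        }
        List<Node> content = new ArrayList<>();
        for (Node n = current.getFirstChild(); n != null; n = n.getNextSibling()) {
            content.add(n);
        }
        return content;
    }

    public static void main(String[] args) {
        Element root = parse("<osis><osisText><div>"
                + "<verse osisID=\"Gen.1.1\">In the beginning</verse>"
                + "</div></osisText></osis>");
        for (Node n : fragment(root)) {
            // prints "verse: In the beginning"
            System.out.println(n.getNodeName() + ": " + n.getTextContent());
        }
    }
}
```

The sketch ignores osisCorpus, in line with the simplification described above.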
A verse element may either be a container or a milestone. Sword OSIS books differ in whether they provide the verse element. Most do not. The few that do are using the container model, but it has been proposed that milestones are the best practice.
The fragment may contain elements that are not a part of the original text. These are things such as notes.
Milestones require special handling. A beginning milestone element has an sID attribute, while the ending milestone has an eID with the same value. Everything between the start and the corresponding end is the content of the element. Also, milestones for a given element, say div, have to be properly nested, as if they were container elements.
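A minimal sketch of the milestone rule, assuming the start and end milestones are siblings under one parent and that note elements are the only non-canonical content to skip. It uses the JDK DOM parser rather than JDOM2, and MilestoneSketch and milestoneText are hypothetical names chosen for this illustration; the real method may handle more cases.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of JSword.
final class MilestoneSketch {
    private MilestoneSketch() {
    }

    // Parse an XML string with the JDK DOM parser and return the root element.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Collect the text of the siblings that lie between the milestone with
    // sID == id and the milestone with eID == id, skipping note elements.
    static String milestoneText(Element parent, String id) {
        StringBuilder buffer = new StringBuilder();
        boolean inside = false;
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                Element e = (Element) n;
                if (id.equals(e.getAttribute("sID"))) {
                    inside = true; // start milestone: begin collecting
                    continue;
                }
                if (id.equals(e.getAttribute("eID"))) {
                    break; // matching end milestone: stop collecting
                }
                if ("note".equals(e.getNodeName())) {
                    continue; // notes are not part of the original text
                }
            }
            if (inside) {
                buffer.append(n.getTextContent());
            }
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        Element div = parse("<div><verse sID=\"Gen.1.1\"/>In the beginning"
                + "<note>a note</note> God created<verse eID=\"Gen.1.1\"/> trailing</div>");
        // prints "In the beginning God created"
        System.out.println(milestoneText(div, "Gen.1.1"));
    }
}
```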
root - the whole osis document.
public static String getPlainText(org.jdom2.Element root)
root - the whole osis document.
public static String getStrongsNumbers(org.jdom2.Element root)
root - the whole osis document.
public static String getMorphologiesWithStrong(org.jdom2.Element root)
root - the osis element in question
public static String getLexicalInformation(org.jdom2.Element root, boolean includeMorphology)
root - the osis element in question
includeMorphology - whether to include morphology
public static String getReferences(Book book, Key key, Versification v11n, org.jdom2.Element root)
book - the book to which the references refer
key - the verse containing the cross references
v11n - the versification
root - the osis element in question
public static String getNotes(org.jdom2.Element root)
root - the whole OSIS document
public static String getHeadings(org.jdom2.Element root)
root - the whole OSIS document
private static void getCanonicalContent(org.jdom2.Element parent, String sID, Iterator<org.jdom2.Content> iter, StringBuilder buffer)
private static boolean isCanonical(org.jdom2.Content content)
public static Collection<org.jdom2.Content> getDeepContent(org.jdom2.Element div, String name)
Find all the instances of elements of type name under the element div.
div - the element to trawl
name - the element name to search
public static Verse getVerse(Versification v11n, org.jdom2.Element ele) throws BookException
v11n - the versification
ele - The start point for our verse hunt.
Throws: BookException
public static org.jdom2.Element createOsisFramework(BookMetaData bmd)
bmd - the book's meta data
public static List<org.jdom2.Content> diffToOsis(List<Difference> diffs)
diffs - List of Difference objects
private static void recurseDeepContent(org.jdom2.Element start, String name, List<org.jdom2.Content> reply)
Find all the instances of elements of type name under the element start. For internal use only.
start - the node under which searches occur
name - element name to search
reply - the list to modify with matching content
private static void recurseElement(Object sub, StringBuilder buffer)
sub - a sub element or text node
buffer - the buffer to build on match
private static void recurseChildren(org.jdom2.Element ele, StringBuilder buffer)
ele - The JDOM Element to dig into
buffer - The place we accumulate strings.
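The recursion described for recurseElement and recurseChildren (append text directly, dig into elements for their strings) can be sketched as follows. This is an illustration only: it uses the JDK DOM types instead of JDOM2, and PlainTextSketch, collect and plainText are hypothetical names, not the JSword implementation.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

// Hypothetical illustration class; not part of JSword.
final class PlainTextSketch {
    private PlainTextSketch() {
    }

    // Parse an XML string with the JDK DOM parser and return the root element.
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Text nodes are appended as they are; elements are searched recursively.
    // Other node types (comments, processing instructions) are ignored.
    static void collect(Node node, StringBuilder buffer) {
        if (node instanceof Text) {
            buffer.append(node.getNodeValue());
        } else if (node instanceof Element) {
            for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
                collect(c, buffer);
            }
        }
    }

    // Return the text of an element with all markup stripped out.
    static String plainText(Element root) {
        StringBuilder buffer = new StringBuilder();
        collect(root, buffer);
        return buffer.toString();
    }

    public static void main(String[] args) {
        Element verse = parse("<verse>In the <hi type=\"bold\">beginning</hi> God</verse>");
        // prints "In the beginning God"
        System.out.println(plainText(verse));
    }
}
```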