[osis-core] Linguistic Annotation Module Design Document - flat data model.

Todd Tillinghast osis-core@bibletechnologieswg.org
Mon, 3 Nov 2003 11:42:26 -0700


Kirk and Steve,

It occurs to me that the flat structure, with a number of attributes on
<morpheme>, is likely to be limiting and to under-represent the richness
of the data model that is needed.  Did you consider a model where the
data is expressed using child elements?
Like:
<w>
   <morpheme>text_value<gender type="neuter"/><stem type="qual"/></morpheme>
</w>
OR
<w>
   <morpheme><value>text_value</value><gender type="neuter"/><stem type="qual"/></morpheme>
</w>

The opportunity this provides is that you can do the following:
(this is a _fake_ example, but I am sure you can demonstrate a more
realistic case.)
<w>
   <morpheme>
      <value>text_value</value>
      <gender type="neuter">
         <attachment type="abc" mechanism="direct"/>  
         <application type="passive" class="inanimate"/> 
      </gender>
      <stem type="qual"/>
   </morpheme>
</w>

The point is that there may be variable and rich data structures that
cannot be modeled with simple attributes on a single element.

(Note: none of the elements would contain PCDATA other than the one
holding the actual text, and none would have mixed content, i.e.
mixed="false".)
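
A minimal sketch of how the <value> form of the _fake_ example above might
be declared, assuming W3C XML Schema is the schema language. The element
and attribute names are taken from the example; the xs:string types and the
minOccurs="0" settings are my assumptions, not a proposal for real values.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical sketch: names follow the fake example, types are assumed. -->
  <xs:element name="morpheme">
    <!-- mixed="false": only child elements are allowed, no loose text -->
    <xs:complexType mixed="false">
      <xs:sequence>
        <!-- the only element that carries PCDATA -->
        <xs:element name="value" type="xs:string"/>
        <xs:element name="gender" minOccurs="0">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="attachment" minOccurs="0">
                <xs:complexType>
                  <xs:attribute name="type" type="xs:string"/>
                  <xs:attribute name="mechanism" type="xs:string"/>
                </xs:complexType>
              </xs:element>
              <xs:element name="application" minOccurs="0">
                <xs:complexType>
                  <xs:attribute name="type" type="xs:string"/>
                  <xs:attribute name="class" type="xs:string"/>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
            <xs:attribute name="type" type="xs:string"/>
          </xs:complexType>
        </xs:element>
        <xs:element name="stem" minOccurs="0">
          <xs:complexType>
            <xs:attribute name="type" type="xs:string"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Because <gender> is an element with its own content model, it can grow
children such as <attachment> without touching the attributes on
<morpheme>.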


In the issue stated below regarding nouns, pronouns, and person you
could express:

<w>
   <morpheme>
      <value>text_value</value>
      <noun>
         <person type="first"/>
      </noun>
   </morpheme>
</w>
and give <pronoun> a different data model.
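
As a rough sketch of that idea, again assuming W3C XML Schema: <noun> and
<pronoun> can be alternatives inside <morpheme>, each with its own content
model. The <noun> model mirrors the example above; the <pronoun> model is
left as an open placeholder (xs:any) because I do not know what it should
contain.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical sketch: types and occurrence limits are assumed. -->
  <xs:element name="morpheme">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="value" type="xs:string"/>
        <!-- exactly one of noun or pronoun, each with its own model -->
        <xs:choice>
          <xs:element name="noun">
            <xs:complexType>
              <xs:sequence>
                <xs:element name="person" minOccurs="0">
                  <xs:complexType>
                    <xs:attribute name="type" type="xs:string"/>
                  </xs:complexType>
                </xs:element>
              </xs:sequence>
            </xs:complexType>
          </xs:element>
          <xs:element name="pronoun">
            <xs:complexType>
              <xs:sequence>
                <!-- placeholder: the real pronoun model is not defined here -->
                <xs:any processContents="lax"
                        minOccurs="0" maxOccurs="unbounded"/>
              </xs:sequence>
            </xs:complexType>
          </xs:element>
        </xs:choice>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
```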


Please forgive the artificial nature of the examples and fill in real
case values based on your more extensive understanding of the subject
matter.

Todd