[osis-users] Topic Maps

Troy A. Griffitts scribe at crosswire.org
Fri Jan 15 18:38:38 MST 2010


Thanks Patrick.  So had we planned a subjectIdentifier attribute on 
either <w> or <name> (which, as Peter pointed out, we likely added for 
proper name indication)?

Steve, do you remember our discussion when we added the marker 
attribute to the <q> element, when we talked about a generalized 
defaulting mechanism which would allow the header to contain things like:

  <default>//q[@level="1"]/@marker='"'</default>
  <default>//q[@level="2"]/@marker="'"</default>
  <default>//w[@lemma="([^:]*)"]/@lemma="strong:\1"</default>

Anyway, I was just wondering what happened to this idea?  I'm not sure 
I'd want to implement a full-blown XQuery parser like what would be 
required in my example above, but some basic defaulting mechanism would 
still be nice.

Patrick, in your example, I'd like to be able to say something like:

<default>//w[@subjectIdentifier="(.*)"]/@subjectIdentifier="http://crosswire.org/names/\1"</default>

so I could simply use in my doc:

<w subjectIdentifier="jerusalem1">Jerusalem</w>


But this is merely to clean up my markup in the event our docs are ever 
opened in an editor by a human, and to potentially prevent errors when 
hand editing.  Sorry, I just like to factor stuff out when possible.
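
For what it's worth, a basic defaulting mechanism along these lines would 
not need a full XQuery parser.  Here is a minimal Python sketch, not 
anything OSIS defines: the rule format (DefaultRule), the function 
apply_defaults, and the sample lemma value H3389 are all hypothetical.  It 
ignores XML namespaces, and it uses ([^:/]*) rather than (.*) for 
subjectIdentifier so that values that are already full URLs are left alone.

```python
import re
import xml.etree.ElementTree as ET
from typing import NamedTuple, Optional


class DefaultRule(NamedTuple):
    """Hypothetical, simplified stand-in for a <default> header entry."""
    tag: str                 # element name, e.g. "q" or "w"
    where: dict              # attribute values the element must already have
    attr: str                # attribute to default or rewrite
    pattern: Optional[str]   # None: fill in when missing; regex: rewrite on full match
    value: str               # literal default, or a template using \1 for the regex group


# Assumed translations of the four <default> examples into the simplified rule format.
RULES = [
    DefaultRule("q", {"level": "1"}, "marker", None, '"'),
    DefaultRule("q", {"level": "2"}, "marker", None, "'"),
    DefaultRule("w", {}, "lemma", r"([^:]*)", r"strong:\1"),
    DefaultRule("w", {}, "subjectIdentifier", r"([^:/]*)",
                r"http://crosswire.org/names/\1"),
]


def apply_defaults(root, rules=RULES):
    """Apply each rule to every matching element under root, in place."""
    for rule in rules:
        for elem in root.iter(rule.tag):
            # Skip elements that do not satisfy the rule's attribute conditions.
            if any(elem.get(k) != v for k, v in rule.where.items()):
                continue
            current = elem.get(rule.attr)
            if rule.pattern is None:
                # Plain default: only fill in an attribute that is absent.
                if current is None:
                    elem.set(rule.attr, rule.value)
            elif current is not None:
                # Rewrite rule: expand the template only on a full match.
                match = re.fullmatch(rule.pattern, current)
                if match:
                    elem.set(rule.attr, match.expand(rule.value))
    return root


doc = ET.fromstring(
    '<div><q level="1">outer <q level="2">inner</q></q>'
    '<w lemma="H3389" subjectIdentifier="jerusalem1">Jerusalem</w></div>'
)
apply_defaults(doc)
```

Because the rewrite rules only fire on values without a prefix, running 
apply_defaults a second time leaves the document unchanged, which seems 
like the behaviour you would want from a default.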


Patrick Durusau wrote:
> The question is one of how much information do you want to store in the 
> identifier that appears when you mark a reference to a subject?

Yes, having this level of indirection that a subjectIdentifier provides 
serves a great purpose and is perfect if I'm 'at' an element I want to 
dig deeper into.  But my current objective is to find all place names in 
a document, which would require me to dereference each identifier, 
querying the referent for the 'type' of each subject, e.g., "geo-city".

Hence my poorly applied lemma/morph scheme:

<w lemma="placenames:jerusalem1" 
morph="placenamestype:geo-city">Jerusalem</w>

makes processing for my immediate objective easier.  You mentioned above 
that the question is 'how much information' to store in the identifier 
itself... So does this suggest a solution like:

<w subjectIdentifier="geo/city/jerusalem1">Jerusalem</w>

This would give me what I need to easily process the data (even if we 
had to specify the full:
subjectIdentifier="http://crosswire.org/names/geo/city/jerusalem1")
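
With the type carried in the identifier path, finding every place name 
becomes a string test rather than a dereference.  A rough Python sketch, 
assuming the type/subtype/name layout from my example above (the helper 
names subject_type and find_by_type, and the jordan1 and david1 
identifiers, are made up for illustration, and XML namespaces are ignored):

```python
import xml.etree.ElementTree as ET

# Assumed base URL, taken from the full-form example above.
BASE = "http://crosswire.org/names/"


def subject_type(identifier, base=BASE):
    """Return the type path (e.g. 'geo/city') of an identifier, or '' if it has none."""
    path = identifier[len(base):] if identifier.startswith(base) else identifier
    type_path, _sep, _name = path.rpartition("/")
    return type_path


def find_by_type(root, type_prefix):
    """Collect (identifier, text) for every <w> whose type is type_prefix or below it."""
    hits = []
    for w in root.iter("w"):
        ident = w.get("subjectIdentifier")
        if ident is None:
            continue
        t = subject_type(ident)
        if t == type_prefix or t.startswith(type_prefix + "/"):
            hits.append((ident, w.text))
    return hits


doc = ET.fromstring(
    '<p><w subjectIdentifier="geo/city/jerusalem1">Jerusalem</w>'
    '<w subjectIdentifier="http://crosswire.org/names/geo/river/jordan1">Jordan</w>'
    '<w subjectIdentifier="person/david1">David</w>'
    '<w>and</w></p>'
)
places = find_by_type(doc, "geo")
```

Here places holds the Jerusalem and Jordan entries, in either the short or 
the full URL form, and skips the person and the unmarked word.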


Thanks for the discussion on this!


I feel your pain.  My primary laptop died in December and I purchased a 
netbooky hp dm3 thingy to hold me over until I could order a 
replacement.  I just finished MOVING all of my data over to this new 
little thing's large (by comparison to my old system) 320Gig drive and 
days later the new drive crashed.  Now I'm booting Ubuntu on the new 
computer with my old 100Gig drive plugged into the USB port (old drive 
is PATA, new computer is SATA) until my real laptop replacement gets 
here.  And all my data on the 320Gig new drive is lost!  I was picking 
and choosing folders from my old drive and did moves instead of copies 
so I could remember what I had already grabbed.  Stupid me.  Did you 
find an affordable data recovery service?


Troy





> 
> Take your example:
> 
> <w 
> subjectIdentifier="http://www.crosswire.org/names/jerusalem">Jerusalem</w>
> 
> Elsewhere, there is a topic in a topic map that has that same 
> subjectIdentifier property, and it records that the subject it 
> represents is an instance of type place, along with names for it in 
> other languages and any other information you want to record about that 
> subject.
> 
> The key is the use of a subjectIdentifier to identify the subject. Why?
> 
> Because someone else, in another Bible project may have:
> 
> <w 
> subjectIdentifier="http://www.otherproject.org/geonames/israel/jerusalem">Jerusalem</w> 
> 
> 
> Now what?
> 
> Well, any topic can have a *set* of subjectIdentifier properties which 
> signals that both subjectIdentifiers identify the same subject.
> 
> (Note I have used the XTM syntax for the attributes but it would be 
> possible to declare equivalent subject identifiers even if they were in 
> different formats or structures. I am working on an example using XQuery 
> to make that point. Probably won't be ready for a week or so. My main 
> system died last night but due to disk mirroring and paying a lot of 
> money, I got it back late this afternoon.)
> 
> That will allow you to disambiguate all the names as well as to add far 
> more information than you could possibly put in an attribute. Such as 
> marking the morphology of a lemma and displaying for a user the 
> distribution of that lemma over a book or range of books. (Assuming you 
> represented all of those as occurrences or even associations with 
> explicit roles if you liked.)
> 
> Yes, I have been thinking about topic maps and biblical texts a lot. ;-)
> 
> Hope you are having a great day!
> 
> Patrick
> 



