<div dir="ltr">So there would have to be a tokenizer and a parser that determine the meaning of each token from its context.<br><br><div class="gmail_quote">On Thu, Sep 30, 2010 at 1:16 PM, DM Smith <span dir="ltr">&lt;<a href="mailto:dmsmith@crosswire.org">dmsmith@crosswire.org</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">

  
    
  
  <div bgcolor="#ffffff" text="#000000">
    It&#39;s not quite as simple as working with the fully spelled-out
    names. SWORD allows other alternatives as well. For example, perhaps
    the following would work just as well for Apostle-Works:<br>
    A-W<br>
    AW<br>
    Wrks<br>
    Wrk<br>
    Wks<br>
    Wk<br>
    and any proper prefix of Apostle-Works that does not conflict with
    another book&#39;s abbreviations:<br>
    Apostle-Work<br>
    Apostle-Wor<br>
    Apostle-Wo<br>
    Apostle-W<br>
    Apostle-<br>
    Apostle<br>
    Apostl<br>
    ...<br>
    Ap<br>
    <br>
    How about prefixes on both sides of the dash?<br>
    Ap-Works<br>
    Apo-Works<br>
    Ap-Wo<br>
    <br>
    How about abbreviations of just one side or the other:<br>
    Apo-Wrks<br>
    Apostle-Wrk<br>
    A-Wks<br>
    <br>
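    The prefix rules above can be sketched roughly as follows. This is only an illustration, not SWORD&#39;s actual matching code: the book table and the function names resolve_prefix and resolve_two_sided are hypothetical, and only &quot;Apostle-Works&quot; comes from this thread.<br>

```python
# Hypothetical book table: OSIS id mapped to a localized full name.
# "Apostle-Works" comes from the thread; the other names are illustrative.
BOOKS = {
    "Acts": "Apostle-Works",
    "Rom": "Romans",
    "Rev": "Revelation",
}


def resolve_prefix(text, books=BOOKS):
    """Return the OSIS id whose name starts with text, or None.

    None is returned when no name matches or when more than one does,
    because an ambiguous prefix conflicts with another book.
    """
    wanted = text.casefold()
    if not wanted:
        return None
    matches = [osis for osis, name in books.items()
               if name.casefold().startswith(wanted)]
    return matches[0] if len(matches) == 1 else None


def resolve_two_sided(text, books=BOOKS, sep="-"):
    """Resolve input such as "Ap-Wo" by prefix-matching each side of sep."""
    parts = text.casefold().split(sep)
    matches = []
    for osis, name in books.items():
        name_parts = name.casefold().split(sep)
        if len(name_parts) != len(parts):
            continue
        # Every typed part must be a non-empty prefix of the matching name part.
        if all(p and n.startswith(p) for p, n in zip(parts, name_parts)):
            matches.append(osis)
    return matches[0] if len(matches) == 1 else None
```

    With this table, &quot;Ap&quot;, &quot;Ap-Wo&quot; and &quot;A-W&quot; all resolve to Acts, while &quot;R&quot; is rejected because it fits both Romans and Revelation. Contracted forms such as &quot;Wrks&quot; are not prefixes, so they would still need an explicit abbreviation list.<br>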
    In Him,<br><font color="#888888">
        DM</font><div><div></div><div class="h5"><br>
    <br>
    <br>
    On 09/30/2010 01:24 PM, Weston Ruter wrote:
    <blockquote type="cite">
      <div dir="ltr">I think the fundamental problem here is that the
        SWORD reference parser is too simple. Namely, the parser needs
        to not blindly split on a hyphen character but rather tokenize
        the input stream and contextually determine what each token is
        as it processes the tokens in sequence. For example, if I had
        the following passage span (assuming the language has
        &quot;Apostle-Works&quot; as the book name for &quot;Acts&quot;):<br>
        <br>
        Apostle-Works 4:32 - Romans 3:21<br>
        <br>
        In this case, the parser would come across that first hyphen and
        could contextually determine it&#39;s not a passage span separator
        hyphen since the following token &quot;Works&quot; is not recognized as
        a book, and also that &quot;Apostle&quot; is not a full book in itself but
        &quot;Apostle-Works&quot; is. Otherwise, there could be a pre-processor
        that does a first pass inspecting the token stream and replacing
        localized book name token sequences with their internal OSIS
        names and then just split on the hyphen as usual.<br>
        <br>
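        The pre-processor variant described above might look something like this minimal sketch. It assumes a hypothetical locale table (LOCALIZED) and made-up function names, and it is not the SWORD parser&#39;s real API. Longer names are tried first so that a full name such as &quot;Apostle-Works&quot; wins over any shorter name it contains.<br>

```python
import re

# Hypothetical locale table: localized book name mapped to OSIS id.
LOCALIZED = {
    "Apostle-Works": "Acts",
    "Romans": "Rom",
}


def build_name_pattern(names):
    """Compile an alternation of names, longest first so full names win."""
    ordered = sorted(names, key=len, reverse=True)
    alternation = "|".join(re.escape(n) for n in ordered)
    # Word boundaries keep a name from matching inside a longer word.
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def normalize_books(text, table=LOCALIZED):
    """First pass: replace localized book names with their OSIS ids."""
    lookup = {name.casefold(): osis for name, osis in table.items()}
    pattern = build_name_pattern(table)
    return pattern.sub(lambda m: lookup[m.group(0).casefold()], text)


def split_span(text, table=LOCALIZED):
    """Second pass: split on the hyphen, which is now only a range marker."""
    normalized = normalize_books(text, table)
    return [part.strip() for part in normalized.split("-")]
```

        For the example span, split_span(&quot;Apostle-Works 4:32 - Romans 3:21&quot;) yields the two endpoints &quot;Acts 4:32&quot; and &quot;Rom 3:21&quot;. Abbreviations and prefixes of book names are not handled in this sketch.<br>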
        Does that sound right?<br>
        <br>
        <div class="gmail_quote">On Thu, Sep 30, 2010 at 9:52 AM, DM
          Smith <span dir="ltr">&lt;<a href="mailto:dmsmith@crosswire.org" target="_blank">dmsmith@crosswire.org</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
            <div bgcolor="#ffffff" text="#000000">
              <div> On 09/30/2010 11:11 AM, David Troidl
                wrote:
                <blockquote type="cite"> Hi Robert,<br>
                  <br>
                  There are many Unicode characters for hyphens and
                  dashes.  Could you substitute, for example, the hyphen
                  from General Punctuation (&amp;#x2010;)?  This would
                  give the proper appearance, without conflicting with
                  the &#39;normal&#39; hyphen separator.<br>
                </blockquote>
              </div>
              I think this is at core a user input problem. Telling
              users that they have to use a special character that is
              not on their keyboard is a problem. I don&#39;t think it will
              do at all.<br>
              <br>
              If we parse the user input to decide whether a hyphen
              is a range specifier or part of a name, and substitute
              something else for it when it is part of a name, then
              that logic belongs in the SWORD reference parser.
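              <br>
              As a rough sketch of that substitution, assuming a hypothetical list of locale book names and using U+2010 as the internal replacement (the names KNOWN_NAMES, protect_name_hyphens and split_range are illustrative, not part of SWORD):<br>

```python
# U+2010 HYPHEN from General Punctuation, the character proposed earlier
# in the thread. It is used only internally; users keep typing "-".
NAME_HYPHEN = "\u2010"

# Hypothetical locale book names; only "Apostle-Works" is from the thread.
KNOWN_NAMES = ["Apostle-Works", "Romans"]


def protect_name_hyphens(text, names=KNOWN_NAMES):
    """Replace hyphens inside known book names with U+2010."""
    result = text
    # Longest names first, so a long name is handled before any shorter
    # name that happens to be contained in it.
    for name in sorted(names, key=len, reverse=True):
        if "-" in name:
            result = result.replace(name, name.replace("-", NAME_HYPHEN))
    return result


def split_range(text, names=KNOWN_NAMES):
    """Split on the remaining hyphens, which can only be range separators."""
    protected = protect_name_hyphens(text, names)
    return [part.strip() for part in protected.split("-")]
```

              The match here is exact and case-sensitive, so a real parser would also have to recognise abbreviated or differently cased names before substituting.<br>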
              <div>
                <div><br>
                  <br>
                  <blockquote type="cite"> <br>
                    Peace,<br>
                    <br>
                    David<br>
                    <br>
                    On 9/29/2010 5:28 PM, Robert Hunt wrote:
                    <blockquote type="cite"> On 30/09/10 10:17, Greg
                      Hellings wrote:
                      <blockquote type="cite">
                        <p>OP was not talking about a transliteration
                          from the sounds of his email, but rather the
                          original language where the hyphen is a
                          letter.</p>
                        <p>You are effectively asking an English
                          speaker not to use the letter s in the Bible
                          names list. It might be comprehensible, but it
                          would be horrible usability, and I probably
                          wouldn&#39;t take such software seriously!</p>
                      </blockquote>
                      Exactly!<br>
                      <blockquote type="cite">
                        <p>Perhaps allowing each locale to define its
                          own numerals and hyphen-like character would
                          be a good solution?</p>
                      </blockquote>
                      Yes, I&#39;m sure there are probably dozens of languages
                      in the world that are likely to have hyphens in
                      book names. Even in English, the hyphen is a valid
                      letter, as you can see in the sentence above. (It&#39;s
                      just fortunate that it doesn&#39;t occur in book
                      names.)<br>
                      <br>
                      Surely this issue has come up many times before???<br>
                      <br>
                      Robert.<br>
                      <br>
                      <blockquote type="cite">
                        <div class="gmail_quote">On Sep 29, 2010 4:08
                          PM, &quot;Daniel Owens&quot; &lt;<a href="mailto:dhowens@pmbx.net" target="_blank">dhowens@pmbx.net</a>&gt;
                          wrote:<br type="attribution">
                          &gt; <br>
                          &gt; On 09/29/2010 03:55 PM, Robert Hunt
                          wrote:<br>
                          &gt;&gt; New Zealand.<br>
                          &gt;&gt;<br>
                          &gt;&gt; Hello all,<br>
                          &gt;&gt;<br>
                          &gt;&gt; I am spending today studying the
                          documentation on the Crosswire <br>
                          &gt;&gt; Sword wiki so I&#39;m likely to have a
                          few questions. Please let me know <br>
                          &gt;&gt; if this is not the right forum to ask
                          questions.<br>
                          &gt;&gt;<br>
                          &gt;&gt; I see in <a href="http://www.crosswire.org/wiki/DevTools:SWORD" target="_blank">http://www.crosswire.org/wiki/DevTools:SWORD</a>
                          that <br>
                          &gt;&gt; localised book names are not allowed
                          hyphens in them (because the <br>
                          &gt;&gt; hyphen is used for verse ranges). In
                          the Philippine language that we <br>
                          &gt;&gt; worked with as Bible translators, the
                          hyphen is a letter in the <br>
                          &gt;&gt; alphabet and appears in several book
                          names!<br>
                          &gt;&gt;<br>
                          &gt;&gt; Is this still a current limitation?
                          If so, what is the suggested <br>
                          &gt;&gt; work-around?<br>
                          &gt;&gt;<br>
                          &gt;&gt; Thanks,<br>
                          &gt;&gt; Robert.<br>
                          &gt;&gt;<br>
                          &gt; This problem came up with Vietnamese, and
                          I was just told to drop the <br>
                          &gt; hyphens. The result was not ideal, but in
                          the end it is still <br>
                          &gt; comprehensible in Vietnamese. I think the
                          hyphen was needed because <br>
                          &gt; Vietnamese is monosyllabic, but more
                          recent &quot;transliterations&quot; of <br>
                          &gt; foreign names have simply dropped the
                          hyphens. Would the names still be <br>
                          &gt; comprehensible without the hyphen?<br>
                          &gt; <br>
                          &gt; Daniel</div>
                      </blockquote>
                      <pre><fieldset></fieldset>
_______________________________________________
sword-devel mailing list: <a href="mailto:sword-devel@crosswire.org" target="_blank">sword-devel@crosswire.org</a>
<a href="http://www.crosswire.org/mailman/listinfo/sword-devel" target="_blank">http://www.crosswire.org/mailman/listinfo/sword-devel</a>
Instructions to unsubscribe/change your settings at above page
</pre>
                    </blockquote>
                  </blockquote>
                  <br>
                </div>
              </div>
            </div>
            <br>
          </blockquote>
        </div>
        <br>
        <br clear="all">
        <br>
        -- <br>
        <div dir="ltr">Weston Ruter<br>
          <a href="http://weston.ruter.net/" target="_blank">http://weston.ruter.net/</a><br>
          <font size="1"><a href="http://twitter.com/westonruter" target="_blank">@westonruter</a>
            - <a href="http://www.google.com/profiles/WestonRuter#about" target="_blank">Google Profile</a></font><br>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
  </div></div></div>

</blockquote></div><br><br clear="all"><br>-- <br><div dir="ltr">Weston Ruter<br><a href="http://weston.ruter.net/" target="_blank">http://weston.ruter.net/</a><br>

<font size="1"><a href="http://twitter.com/westonruter" target="_blank">@westonruter</a> - <a href="http://www.google.com/profiles/WestonRuter#about" target="_blank">Google Profile</a></font><br></div><br>
</div>