[sword-devel] Hyphens in book names

Weston Ruter westonruter at gmail.com
Thu Sep 30 13:24:09 MST 2010


So there would have to be a tokenizer and parser that determines the meaning
of the token based on context.
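
To make that concrete, here is a rough sketch of how a single book token could be matched against full names, listed abbreviations, and prefixes on either side of the hyphen. It is only an illustration, not how SWORD does it: the BOOKS table, the "part_abbrevs" field, and the candidates() and resolve() helpers are all made up for this example, and the Amos entry is there only to show a conflicting prefix.

```python
# Hypothetical locale table: OSIS ID -> localized name, whole-name
# abbreviations, and abbreviations for each hyphen-separated part.
BOOKS = {
    "Acts": {
        "name": "Apostle-Works",
        "abbrevs": ["A-W", "AW", "Wrks", "Wrk", "Wks", "Wk"],
        "part_abbrevs": [[], ["Wrks", "Wrk", "Wks", "Wk"]],
    },
    "Rom": {"name": "Romans", "abbrevs": ["Rom", "Rm"], "part_abbrevs": [[]]},
    "Amos": {"name": "Amos", "abbrevs": ["Am"], "part_abbrevs": [[]]},
}


def _matches(typed, full, abbrevs):
    """True if `typed` is a non-empty prefix of `full` or a listed abbreviation."""
    typed = typed.lower()
    return bool(typed) and (
        full.lower().startswith(typed) or typed in [a.lower() for a in abbrevs]
    )


def candidates(token, books=BOOKS):
    """Return the OSIS IDs of every book that `token` could name."""
    found = []
    for osis, book in books.items():
        name = book["name"]
        # Whole-token match: listed abbreviation or prefix of the full name.
        if _matches(token, name, book["abbrevs"]):
            found.append(osis)
            continue
        # Per-part match: check each side of the hyphen on its own.
        typed_parts = token.split("-")
        name_parts = name.split("-")
        if len(name_parts) > 1 and len(typed_parts) == len(name_parts):
            if all(
                _matches(t, n, a)
                for t, n, a in zip(typed_parts, name_parts, book["part_abbrevs"])
            ):
                found.append(osis)
    return found


def resolve(token, books=BOOKS):
    """Return the single matching OSIS ID, or None if none or ambiguous."""
    found = candidates(token, books)
    return found[0] if len(found) == 1 else None
```

With this table, resolve("Ap-Wo") and resolve("A-Wks") both give "Acts", while resolve("A") gives None because it is also a prefix of Amos, and resolve("Works") gives None because "Works" alone is not a book.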

On Thu, Sep 30, 2010 at 1:16 PM, DM Smith <dmsmith at crosswire.org> wrote:

>  It's not quite as simple as working with the fully spelled out names.
> SWORD allows other alternates as well. For example, perhaps the following
> would work just as well for Apostle-Works:
> A-W
> AW
> Wrks
> Wrk
> Wks
> Wk
> and any proper prefix of Apostle-Works that does not conflict with another
> book's abbreviations:
> Apostle-Work
> Apostle-Wor
> Apostle-Wo
> Apostle-W
> Apostle-
> Apostle
> Apostl
> ...
> Ap
>
> How about prefixes on both sides of the dash?
> Ap-Works
> Apo-Works
> Ap-Wo
>
> How about abbreviations of just one side or the other:
> Apo-Wrks
> Apostle-Wrk
> A-Wks
>
> In Him,
>     DM
>
>
>
> On 09/30/2010 01:24 PM, Weston Ruter wrote:
>
> I think the fundamental problem here is that the SWORD reference parser is
> too simple. Namely, the parser needs to not blindly split on a hyphen
> character but rather tokenize the input stream and contextually determine
> what each token is as it processes the tokens in sequence. For example, if I
> had the following passage span (assuming the language has "Apostle-Works" as
> the book name for "Acts"):
>
> Apostle-Works 4:32 - Romans 3:21
>
> In this case, the parser would come across that first hyphen and could
> contextually determine it's not a passage span separator hyphen since the
> following token "Works" is not recognized as a book, and also that
> "Apostle" is not a full book in itself but "Apostle-Works" is. Otherwise,
> there could be a pre-processor that does a first pass inspecting the token
> stream and replacing localized book name token sequences with their internal
> OSIS names and then just split on the hyphen as usual.
>
> Does that sound right?
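
A minimal sketch of the pre-processor variant, assuming a per-locale table of full book names and assuming the OSIS IDs themselves never contain a hyphen. The table and both function names are hypothetical, and abbreviations and prefixes are ignored here:

```python
import re

# Hypothetical locale table: localized book name -> OSIS ID.
LOCALIZED_TO_OSIS = {"Apostle-Works": "Acts", "Romans": "Rom"}


def normalize_books(reference, table=LOCALIZED_TO_OSIS):
    """Replace every localized book name in `reference` with its OSIS ID."""
    # Longest names first, so a longer name wins over a shorter one it contains.
    names = sorted(table, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(n) for n in names), re.IGNORECASE)
    lookup = {name.lower(): osis for name, osis in table.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], reference)


def split_span(reference, table=LOCALIZED_TO_OSIS):
    """Normalize book names, then split on the hyphen as a simple parser would."""
    normalized = normalize_books(reference, table)
    return [part.strip() for part in normalized.split("-")]
```

For the example above, normalize_books() turns the input into "Acts 4:32 - Rom 3:21", so the plain split on "-" then yields the two ends of the span.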
>
> On Thu, Sep 30, 2010 at 9:52 AM, DM Smith <dmsmith at crosswire.org> wrote:
>
>>  On 09/30/2010 11:11 AM, David Troidl wrote:
>>
>> Hi Robert,
>>
>> There are many Unicode characters for hyphens and dashes.  Could you
>> substitute, for example, the hyphen from General Punctuation (U+2010, &#x2010;)?
>> This would give the proper appearance, without conflicting with the 'normal'
>> hyphen separator.
>>
>>  I think this is at core a user input problem. Telling users that they
>> have to use a special character that is not on their keyboard is a problem.
>> I don't think it will do at all.
>>
>> If we parse the user input to figure out whether a hyphen is a range
>> specifier or part of a name and if part of a name then substitute it with
>> something else, then we should add that to the SWORD reference parser.
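
One way this could look, as a sketch only: tokenize the input, and for each hyphen check whether the tokens on either side join into a known book name. If they do, swap that hyphen for U+2010 so an existing split on "-" is left untouched. KNOWN_BOOKS and mark_name_hyphens() are invented for this example, and it handles only full two-part names:

```python
import re

# Hypothetical set of localized book names, lowercased.
KNOWN_BOOKS = {"apostle-works", "romans"}
NAME_HYPHEN = "\u2010"  # HYPHEN from the General Punctuation block


def mark_name_hyphens(reference, books=KNOWN_BOOKS):
    """Replace hyphens that belong to a book name with U+2010."""
    # Tokens: word runs, single hyphens, or runs of anything else.
    tokens = re.findall(r"\w+|-|[^\w-]+", reference)
    for i, tok in enumerate(tokens):
        if tok != "-" or i == 0 or i + 1 >= len(tokens):
            continue
        # A hyphen is part of a name if its neighbours join into a known book.
        joined = (tokens[i - 1] + "-" + tokens[i + 1]).lower()
        if joined in books:
            tokens[i] = NAME_HYPHEN
    return "".join(tokens)
```

The user still types the ordinary keyboard hyphen; the substitution happens internally before the range split.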
>>
>>
>>
>> Peace,
>>
>> David
>>
>> On 9/29/2010 5:28 PM, Robert Hunt wrote:
>>
>> On 30/09/10 10:17, Greg Hellings wrote:
>>
>> OP was not talking about a transliteration from the sounds of his email,
>> but rather the original language where the hyphen is a letter.
>>
>> You are equivalently proposing an English speaker to not use the letter s
>> in the Bible names list. It might be comprehensible but it would be horrible
>> usability and I probably wouldn't take such software seriously!
>>
>> Exactly!
>>
>> Perhaps allowing each locale to define its own numerals and hyphen-like
>> character would be a good solution?
>>
>> Yes, I'm sure there are probably dozens of languages in the world that are
>> likely to have hyphens in book names. Even in English, hyphen is a valid
>> letter as you can see in the sentence above. (It's just fortunate that it
>> doesn't occur in book names.)
>>
>> Surely this issue has come up many times before???
>>
>> Robert.
>>
>>  On Sep 29, 2010 4:08 PM, "Daniel Owens" <dhowens at pmbx.net> wrote:
>> >
>> > On 09/29/2010 03:55 PM, Robert Hunt wrote:
>> >> New Zealand.
>> >>
>> >> Hello all,
>> >>
>> >> I am spending today studying the documentation on the Crosswire
>> >> Sword wiki so I'm likely to have a few questions. Please let me know
>> >> if this is not the right forum to ask questions.
>> >>
>> >> I see in http://www.crosswire.org/wiki/DevTools:SWORD that
>> >> localised book names are not allowed hyphens in them (because the
>> >> hyphen is used for verse ranges). In the Philippine language that we
>> >> worked with as Bible translators, the hyphen is a letter in the
>> >> alphabet and appears in several book names!
>> >>
>> >> Is this still a current limitation? If so, what is the suggested
>> >> work-around?
>> >>
>> >> Thanks,
>> >> Robert.
>> >>
>> > This problem came up with Vietnamese, and I was just told to drop the
>> > hyphens. The result was not ideal, but in the end it is still
>> > comprehensible in Vietnamese. I think the hyphen was needed because
>> > Vietnamese is monosyllabic, but more recent "transliterations" of
>> > foreign names have simply dropped the hyphens. Would the names still be
>> > comprehensible without the hyphen?
>> >
>> > Daniel
>>
>>
>> _______________________________________________
>> sword-devel mailing list: sword-devel at crosswire.org
>> http://www.crosswire.org/mailman/listinfo/sword-devel
>> Instructions to unsubscribe/change your settings at above page
>>
>
>
>
> --
> Weston Ruter
> http://weston.ruter.net/
> @westonruter <http://twitter.com/westonruter> - Google Profile<http://www.google.com/profiles/WestonRuter#about>
>
>



-- 
Weston Ruter
http://weston.ruter.net/
@westonruter <http://twitter.com/westonruter> - Google Profile <http://www.google.com/profiles/WestonRuter#about>