[sword-devel] Localized parsing symbols [was: C++ volunteer]
    Cyrille 
    lafricain79 at gmail.com
       
    Tue May 28 09:24:39 MST 2019
    
    
  
On 28/05/2019 17:40, Troy A. Griffitts wrote:
>
> So, a little background surrounding why the logic is difficult to work
> out a solution for this problem:
>
> The current verse parser, which works fairly well, always has 3 sets
> of possibilities in view:
>
> OSISRef
> Current Locale
> English
>
> The parser needs to handle any of these three, typically in the
> preference order listed above.  The issue with changing out symbols
> while parsing is that some symbols (notoriously the comma) are used
> for different purposes across these 3 sets.
>
> One might think that localized output might be easier than parsing,
> e.g., once parsed, we could at least output the reference: Jn 3,16. 
> The problem here is that what the engine outputs it also expects to be
> able to parse.
>
> While we would like to solve this problem, it isn't as simple as
> adding to the locale files:
>
> ChapterVerseSeparator=,
>
> RangeSeparator=-
>
> ListSeparator=.
>
> This would be enough to define the locale, but not solve the problem. 
> We would need a fundamental change in how parsing is done, e.g.,
> explicitly telling the parser, "Hey, I'm sending you localized input,
> so don't guess.  You can count on the symbols I'm sending you to be
> localized."  Right now everyone has the convenience of just passing any
> of the 3 sets of parsing text listed above and the parser just figuring
> it out-- with the caveat that chapter, range, and list separators are
> not localizable.
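
A minimal sketch of the ambiguity described above. This is not SWORD's actual parser: the struct RefSeparators, the function parseChapterVerseList, and the two separator sets kEnglish and kLocalized are hypothetical names made up for illustration, and only the three key names come from the locale snippet quoted above. The sketch assumes the caller states explicitly which separator set applies, and shows that the same text is read differently under each set.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical separator set; field names mirror the locale keys
// ChapterVerseSeparator, RangeSeparator and ListSeparator.
struct RefSeparators {
    char chapterVerse;
    char range;  // not interpreted in this sketch
    char list;
};

// Assumed example sets: an English-style one and a comma-style localized one.
const RefSeparators kEnglish{':', '-', ','};
const RefSeparators kLocalized{',', '-', '.'};

// Parses the numeric part of a reference (e.g. "3,16") into
// (chapter, verse) pairs; verse 0 means "whole chapter".
// Non-numeric input makes std::stoi throw; a real parser would handle that.
std::vector<std::pair<int, int>> parseChapterVerseList(
        const std::string &text, const RefSeparators &seps) {
    std::vector<std::pair<int, int>> out;
    std::size_t start = 0;
    while (start <= text.size()) {
        std::size_t end = text.find(seps.list, start);
        if (end == std::string::npos) end = text.size();
        const std::string item = text.substr(start, end - start);
        if (!item.empty()) {
            const std::size_t cv = item.find(seps.chapterVerse);
            if (cv == std::string::npos) {
                out.emplace_back(std::stoi(item), 0);
            } else {
                out.emplace_back(std::stoi(item.substr(0, cv)),
                                 std::stoi(item.substr(cv + 1)));
            }
        }
        start = end + 1;
    }
    return out;
}
```

With kEnglish, "3,16" comes back as two whole chapters (3 and 16); with kLocalized, the same text comes back as chapter 3, verse 16. A parser that is not told which set is in use has no way to choose between the two readings.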
>
> Hope this gives some background,
>
Yes, thank you, but I just don't understand why it is already possible
with two separators (. and :) and yet not with only one? Maybe I can't
understand it because it is too hard (technically) for me ;)
>
> Troy
>
>
> On 5/28/19 6:10 AM, David Haslam wrote:
>> OK - but my observations were not entirely irrelevant. 
>>
>> Some front-ends never need the user to enter a reference in an edit
>> box. Navigation is done entirely via menu selections or clicking
>> search results etc. 
>> AFAICT this is true of PocketSword. 
>>
>> Other front-ends are designed at the opposite extreme. All navigation
>> is done through an edit box. This is true (eg) of STEP Bible. 
>>
>> Best regards,
>>
>> David. 
>>
>> Sent from ProtonMail Mobile
>>
>>
>> On Tue, May 28, 2019 at 13:54, refdoc at gmx.net <refdoc at gmx.net
>> <mailto:refdoc at gmx.net>> wrote:
>>> Sorry, David, that is a complete misunderstanding. Modules need
>>> osisref. There is and will be no need to do anything to the modules.
>>> This is about the engine parser to read references locale
>>> appropriately.
>>>
>>> Sent from my mobile. Please forgive shortness, typos and weird
>>> autocorrects.
>>>
>>>
>>> -------- Original Message --------
>>> Subject: Re: [sword-devel] C++ volunteer
>>> From: David Haslam
>>> To: SWORD Developers' Collaboration Forum
>>> CC:
>>>
>>>
>>>     Parsing native references is not a simple task, as we know from
>>>     the fact that adyeth's orefs.py was kicked into touch indefinitely. 
>>>
>>>     And that’s even when punctuation marks are defined in the
>>>     specified configuration file. 
>>>
>>>     Unless we might consider the possibility of adding keys to
>>>     module .conf files that define the module specific
>>>     native reference punctuation marks and separators. 
>>>
>>>     That could be a huge undertaking, considering the need to
>>>     maintain backwards compatibility. 
>>>
>>>     And it’s not as if it really is module specific entirely. A user
>>>     can be switching between modules with different languages, yet
>>>     would need the current reference to always work, no matter what. 
>>>
>>>     Best regards 
>>>
>>>     David
>>>
>>>     Sent from ProtonMail Mobile
>>>
>>>
>>>     On Tue, May 28, 2019 at 12:10, refdoc at gmx.net <refdoc at gmx.net
>>>     <mailto:refdoc at gmx.net>> wrote:
>>>>     The improvement request for allowing commas in references...
>>>>     adding commas in the suggested form would make millions of
>>>>     currently valid Anglo references invalid. The problem is a much
>>>>     wider one, references should be localised in their punctuation
>>>>     too. I am not sure how difficult this would be, but I guess we
>>>>     could make a start by defining what punctuation is used for
>>>>     which purpose, and then take it from there.
>>>>
>>>>     Cyrille, maybe start a page on the wiki and start thinking there.
>>>>
>>>>     Sent from my mobile. Please forgive shortness, typos and weird
>>>>     autocorrects.
>>>>
>>>>
>>>>     -------- Original Message --------
>>>>     Subject: Re: [sword-devel] C++ volunteer
>>>>     From: Cyrille
>>>>     To: SWORD Developers' Collaboration Forum
>>>>     CC:
>>>>
>>>>
>>>>         Hello Richard,
>>>>         Welcome!
>>>>         May I make a very selfish proposal to Richard who offers
>>>>         his help. There are two issues that I really want to be
>>>>         resolved. One of which particularly handicaps Catholic
>>>>         users (but I discovered today that the issue hadn't been
>>>>         reported!!! I just did it):
>>>>         https://tracker.crosswire.org/browse/API-216
>>>>         And the second:
>>>>         https://tracker.crosswire.org/projects/API/issues/API-180
>>>>
>>>>         There may be more important things that I am not able to
>>>>         judge, not being a developer, but at least I will have tried my luck ;)
>>>>
>>>>         On 28/05/2019 01:38, Troy A. Griffitts wrote:
>>>>>         Richard, sorry, I meant to give you the link to our tracker:
>>>>>
>>>>>         https://tracker.crosswire.org
>>>>>
>>>>>
>>>>>         On 5/27/19 4:32 PM, Troy A. Griffitts wrote:
>>>>>>         Welcome, Richard!
>>>>>>
>>>>>>         I would start at 2 places:
>>>>>>
>>>>>>         First, have a look at our tracker here.  We are not very (very not)
>>>>>>         disciplined at keeping it current.  Skimming through there and
>>>>>>         commenting on anything that looks interesting, or even cleaning a few
>>>>>>         things up in there that you confirm are no longer a problem might be a
>>>>>>         useful exercise to get you poking around at internals and would be a
>>>>>>         blessing for us.  Our modus operandi as of late is to create a new unit
>>>>>>         test in sword/tests/testsuite/ which fails at the bug and then once
>>>>>>         fixed, the test should pass and we leave the test around to be sure we
>>>>>>         don't regress.  We can always use more tests in our test suite.
>>>>>>
>>>>>>         Next, we have the intention to modularize our search engine support and
>>>>>>         search types.  Right now, SWModule (which represents a Bible) implements
>>>>>>         our SWSearchable interface, which is fine, but right now it has a bunch
>>>>>>         of #ifdef logic and switch statements to take different code paths
>>>>>>         depending on which search engine is compiled into SWORD and which search
>>>>>>         type is specified.  This was fine initially, but has grown to the point that
>>>>>>         we now support spaghetti in there.  It should probably simply have a set
>>>>>>         of SWSearchable objects in a map<SEARCH_TYPE, SWSearchable> and proxy
>>>>>>         the search request to the appropriate SWSearchable impl based on what
>>>>>>         types are registered for the module.  This would allow us to implement
>>>>>>         new types and register them with modules which support special search
>>>>>>         types, e.g., advanced Hebrew Morphology searching.  That's the general
>>>>>>         idea anyway.
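
A minimal sketch of the map-and-proxy idea described above, under stated assumptions. The names Searchable, SearchType, Module, registerSearcher, supports and PhraseSearcher are hypothetical stand-ins and do not reproduce the real SWSearchable or SWModule signatures. The sketch only shows a module forwarding a search request to whichever implementation is registered for the requested type, in place of #ifdef and switch logic.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical search types; the real SEARCH_TYPE values are not assumed.
enum class SearchType { Phrase, Regex };

// Hypothetical stand-in for a searchable interface.
struct Searchable {
    virtual ~Searchable() = default;
    // Returns the keys of all entries matching the query.
    virtual std::vector<std::string> search(const std::string &query) const = 0;
};

// Toy module that proxies each search to the registered implementation.
class Module {
public:
    void registerSearcher(SearchType type, std::unique_ptr<Searchable> impl) {
        searchers_[type] = std::move(impl);
    }
    bool supports(SearchType type) const {
        return searchers_.count(type) != 0;
    }
    std::vector<std::string> search(SearchType type,
                                    const std::string &query) const {
        auto it = searchers_.find(type);
        if (it == searchers_.end()) return {};  // type not registered
        return it->second->search(query);
    }
private:
    std::map<SearchType, std::unique_ptr<Searchable>> searchers_;
};

using Entries = std::vector<std::pair<std::string, std::string>>;  // (key, text)

// Toy implementation: plain substring match over in-memory entries.
struct PhraseSearcher : Searchable {
    explicit PhraseSearcher(Entries entries) : entries_(std::move(entries)) {}
    std::vector<std::string> search(const std::string &query) const override {
        std::vector<std::string> hits;
        for (const auto &e : entries_) {
            if (e.second.find(query) != std::string::npos) hits.push_back(e.first);
        }
        return hits;
    }
    Entries entries_;
};
```

A module that supports a special search type would register one more implementation for it; the module's own search code would not need to change.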
>>>>>>
>>>>>>         You should probably become familiar with SWFilter and how we use these
>>>>>>         throughout the engine. These prepare a buffer for particular
>>>>>>         objectives.  We have RenderFilters, EncodingFilters, StripFilters, ... 
>>>>>>         The last prepares an SWModule entry for searching by, typically,
>>>>>>         stripping out all markup and leaving only a plaintext buffer which can
>>>>>>         be searched.  We have some special code in the SWModule::search
>>>>>>         spaghetti which takes Greek and Hebrew modules and turns buffers into a
>>>>>>         series of Strongs#@MorphCode Strong#@MorphCode ... which allows regex
>>>>>>         searches to do some advanced morph searching... like: Find this Strong's
>>>>>>         number, any morphology, followed by any verb within 2 words.  You
>>>>>>         have to be pretty familiar with the Strong#@MorphCode syntax to
>>>>>>         formulate something like that, but the idea is that a frontend could
>>>>>>         have a nice UI to help a user come up with some creative searches. 
>>>>>>         Anyway, these should all probably be modularized out by renaming the
>>>>>>         StripFilter concept to SearchFilter, and then pushing all this special
>>>>>>         code out to SearchFilter impls which do these special things...
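
A rough sketch of the kind of regex search described above, using std::regex. The buffer format is an assumption made for illustration: one token per word of the form G<number>@<MorphCode>, separated by spaces, with verb morphology codes starting with "V-". The function name strongsFollowedByVerb is hypothetical, and the exact syntax the engine produces may differ.

```cpp
#include <regex>
#include <string>

// Returns true if the given Strong's token (e.g. "G2316"), with any
// morphology, is followed by a verb within the next two words.
// Assumed token format: "G<number>@<MorphCode>", space separated.
bool strongsFollowedByVerb(const std::string &buffer,
                           const std::string &strongs) {
    // <strongs>@<any morphology>, at most one word in between, then a
    // token whose morphology code starts with "V-" (assumed verb marker).
    const std::regex pattern(
        "\\b" + strongs + R"(@\S+(?:\s+\S+){0,1}\s+G\d+@V-\S+)");
    return std::regex_search(buffer, pattern);
}
```

A frontend could build such a pattern from UI choices so that the user never has to write the token syntax by hand.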
>>>>>>
>>>>>>         Finally, an objective of all this search modularization is also to break
>>>>>>         out the code required to create search indexes for each of the search
>>>>>>         engines we support.  Ideally, we should be able to support the same
>>>>>>         searches either as an indexed or brute force search.  The same code
>>>>>>         which iterates a module, prepares each entry, and pushes that entry to
>>>>>>         the search engine, building the search index, should also work for a
>>>>>>         brute force search-- iterating the module, preparing each entry for the
>>>>>>         search engine.. and then performing a check on that buffer to see if it
>>>>>>         matches the search expression.
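
A minimal sketch of sharing one iteration path between index building and brute-force search, as outlined above. All names here (Entry, stripMarkup, forEachPrepared, buildIndex, bruteForceSearch) are hypothetical, the module is modelled as an in-memory list of (key, raw text) pairs, and the toy inverted index stands in for a real search engine. Markup stripping is deliberately naive and simply drops anything between '<' and '>'.

```cpp
#include <functional>
#include <map>
#include <set>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using Entry = std::pair<std::string, std::string>;  // (key, raw text)
using Sink = std::function<void(const std::string &key, const std::string &text)>;

// Naive preparation step: drop everything between '<' and '>'.
std::string stripMarkup(const std::string &raw) {
    std::string out;
    bool inTag = false;
    for (char c : raw) {
        if (c == '<') inTag = true;
        else if (c == '>') inTag = false;
        else if (!inTag) out += c;
    }
    return out;
}

// The shared part: iterate the module, prepare each entry, hand it to a sink.
void forEachPrepared(const std::vector<Entry> &module, const Sink &sink) {
    for (const auto &e : module) sink(e.first, stripMarkup(e.second));
}

// Indexed path: the sink pushes each word into a toy inverted index.
std::map<std::string, std::set<std::string>> buildIndex(
        const std::vector<Entry> &module) {
    std::map<std::string, std::set<std::string>> index;
    forEachPrepared(module, [&](const std::string &key, const std::string &text) {
        std::istringstream words(text);
        std::string w;
        while (words >> w) index[w].insert(key);
    });
    return index;
}

// Brute-force path: the sink checks each prepared buffer directly.
std::vector<std::string> bruteForceSearch(const std::vector<Entry> &module,
                                          const std::string &word) {
    std::vector<std::string> hits;
    forEachPrepared(module, [&](const std::string &key, const std::string &text) {
        std::istringstream words(text);
        std::string w;
        while (words >> w) {
            if (w == word) { hits.push_back(key); break; }
        }
    });
    return hits;
}
```

Because both paths go through forEachPrepared, any change to how entries are prepared applies to indexed and brute-force searches alike.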
>>>>>>
>>>>>>         I hope this gives you a few things to think about. It has been good for
>>>>>>         me to refresh thoughts on all of this.  Have a look and let me know what
>>>>>>         you think.
>>>>>>
>>>>>>         Welcome!  Looking forward to sharing in service together,
>>>>>>
>>>>>>         Troy
>>>>>>
>>>>>>          
>>>>>>
>>>>>>         On 5/27/19 1:09 PM, Richard Smith wrote:
>>>>>>>         Hi,
>>>>>>>
>>>>>>>         My name's Richard Smith. I'm a C++ software engineer with 10 years'
>>>>>>>         experience in various industries. I was wondering if there was any
>>>>>>>         space for a volunteer. I've started taking a look at things (building
>>>>>>>         repos on Win/unix), but if there are specific things that are
>>>>>>>         required, within my ability, I'm happy to do that.
>>>>>>>
>>>>>>>         Best Regards
>>>>>>>         Richard Smith
>>>>>>>
>>>>>>>         _______________________________________________
>>>>>>>         sword-devel mailing list: sword-devel at crosswire.org
>>>>>>>         http://www.crosswire.org/mailman/listinfo/sword-devel
>>>>>>>         Instructions to unsubscribe/change your settings at above page
>>>>
>>>
>>>
>>
>>
>>