[sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version

Troy A. Griffitts scribe at crosswire.org
Sun Dec 9 14:04:58 MST 2012


Yeah, so, a few comments on this thread (my apologies for being preoccupied 
over the past month or so):

SWORD has the concept of "filtering" a module's text at different points 
in processing, for different purposes.  One of these filter-points is 
for searching and we call these filters "Strip Filters".

Strip Filters are typically named something like OSISPlain or GBFPlain, 
etc.  These typically take all the markup out of an entry and prepare 
the text to be searched, but anything can be done to the text to prepare 
it for searching.  We typically remove accents and vowel points from 
Greek and Hebrew, respectively.  If diacritics need to be removed from 
Arabic, then we can certainly add a filter for this as well.  I believe 
Peter may have already done this and has referred to a patch submitted 
in November, last year.  Peter, please remind me if I have neglected to 
commit something for you.
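
For illustration, stripping Arabic points before a search could look 
roughly like the sketch below.  This is plain Python, not the engine's 
C++ filter code, and the name strip_arabic_points is made up for the 
example.  It assumes that dropping every nonspacing mark (Unicode 
category Mn) in the basic Arabic block U+0600..U+06FF is the wanted 
behaviour, which may differ from what an actual filter removes.

```python
import unicodedata


def strip_arabic_points(text: str) -> str:
    """Drop Arabic combining marks so pointed text matches unpointed queries.

    Assumption: "points" means any nonspacing mark (category Mn) inside
    the basic Arabic block U+0600..U+06FF. Base letters are left alone.
    """
    return "".join(
        ch for ch in text
        if not ("\u0600" <= ch <= "\u06FF" and unicodedata.category(ch) == "Mn")
    )


# Pointed "kitab" (kaf + kasra, ta + fatha, alef, ba) and its bare form.
pointed = "\u0643\u0650\u062A\u064E\u0627\u0628"
bare = "\u0643\u062A\u0627\u0628"
print(strip_arabic_points(pointed) == bare)
```

Running the same function over the search query as well keeps both 
sides of the comparison in the same form.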

Any Strip Filter can be added to a module by a module author with a line 
in the .conf file, such as:

LocalStripFilter=UTF8ArabicPoints

A list of filters can be found by browsing the source folder here:

http://crosswire.org/svn/sword/trunk/src/modules/filters/

They're pretty concise and don't involve much knowledge from the rest of 
the engine, making them easy to write if we need a new one.

This processing can replace or be complementary to any processing done 
by clucene.

Since we need to strip markup and other things clucene will likely 
never support (see PapyriPlain: annotations like [, ], ?, {, }, and 
underdot), we need this pre-process mechanism to prepare the text before 
searching.  
We also maintain searching functionality apart from "fast indexed 
searching" (currently supplied by clucene, but could be supplied by any 
other fast search framework we decide we might want to integrate).
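
As a rough sketch of that idea (prepare the text first, then search it 
without any index), something like the following could work.  It is 
Python for brevity, not SWORD code.  The names prepare and 
linear_search, the sample entries, and the exact set of characters 
removed are all assumptions made for the example; the real PapyriPlain 
filter may handle these annotations differently.

```python
import re

# Assumption: markup is XML-style tags; editorial marks are [ ] { } ?
# and the combining dot below (U+0323, "underdot").
TAG = re.compile(r"<[^>]+>")
EDITORIAL = re.compile(r"[\[\]{}?\u0323]")


def prepare(text: str) -> str:
    """Strip tags first, then editorial marks, leaving searchable text."""
    return EDITORIAL.sub("", TAG.sub("", text))


def linear_search(entries: dict, query: str) -> list:
    """Unindexed search: prepare each entry and test for a substring."""
    needle = prepare(query)
    return [key for key, raw in entries.items() if needle in prepare(raw)]


# Hypothetical entries, keyed by an invented identifier.
entries = {
    "P1": "<w>lo[go]s</w> k{a}i?",
    "P2": "<w>theos</w>",
}
print(linear_search(entries, "logos"))
```

An indexed backend would be fed the output of the same prepare step, so 
both search paths see identical text.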

Hope this informs this thread a little,

Troy




On 11/27/2012 01:05 AM, Peter von Kaehne wrote:
> Guys, are you sure this is a problem with Clucene and not just with the strip filter?
>
> Has anyone tried out the patch? It was sent in November last year IIRC
>
> Peter
> -------- Original Message --------
>> Date: Mon, 26 Nov 2012 23:19:20 -0600
>> From: Greg Hellings <greg.hellings at gmail.com>
>> To: "SWORD Developers' Collaboration Forum" <sword-devel at crosswire.org>
>> Subject: Re: [sword-devel] Search bug & New Arabic Bible, Not Shaped SVD Version
>> On Mon, Nov 26, 2012 at 11:15 PM, Nic Carter <niccarter at mac.com> wrote:
>>> My understanding is that we are currently locked into a really old
>>> version of the C library
>>
>> False.
>>
>>> & it is no longer being maintained.
>> True
>>
>>> Instead we need to port SWORD to use the current version of the library,
>> Already done.
>>
>>> which is actively being maintained...
>> It isn't. That's the complaint. :)
>>
>> --Greg
>>
>>> I gather some work has been done on this but I'm not sure where it's
>>> currently up to. It's on my todo list, along with about a million other
>>> things that have piled up over the last year... :)
>>> Sent from my phone, hence this email may be short...
>>>
>>> On 27/11/2012, at 15:17, pola ashraf <5001 at hotmail.com> wrote:
>>>
>>>> we depend on a library that gets updates very frequently in Java but no
>>>> updates for its C port
>>> _______________________________________________
>>> sword-devel mailing list: sword-devel at crosswire.org
>>> http://www.crosswire.org/mailman/listinfo/sword-devel
>>> Instructions to unsubscribe/change your settings at above page



