[sword-devel] lemma search (was sword-1.6.0RC1 available)

Troy A. Griffitts scribe at crosswire.org
Sun Apr 26 06:24:27 MST 2009


Matthew,

I'm sorry.  We don't see eye to eye on this.  I don't find your points 
below convincing or honest.

Matthew Talbert wrote:
> Troy,
> 
> From a user's perspective, they simply want to do a search for G140
> and return all of the results for that. Then they want to do a search
> for G1401 and return all of the results for that. If the engine
> doesn't allow this (which it hasn't until this change), then from the
> user's perspective, it's a bug.

This is irrelevant.  I've given you a way to do the search correctly.
Use it and you won't give the user the wrong results.

> Now you're saying that it will be
> slower to do the correct thing in this case. From the user's
> perspective again, making something slower to do the correct thing
> (which used to work just fine and does for other modules) is a bad
> thing.

This again is irrelevant.  There are 2 query strings in question here.

Word//Lemma/G1234/
Word//Lemma./G1234/

The first is faster and IS NOT WHAT YOU WANT.
It is not a search for G1234 in all Lemma[.]*.* attributes.  The second
one is, and is slow because it includes the [.]*.* logic.

The first is a perfectly valid search for what someone else might want: 
a single G1234 lemma on a word.   But this is not what you told me you 
want to offer your end user.  It is irrelevant if it is faster if it 
does not represent the search you wish to perform for your end user.
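
As a rough illustration of the difference between the two expressions, 
here is a toy model in Python.  It is not the engine's code.  The 
dictionary layout, the function names and the exact-value comparison are 
all assumptions made for the sketch; only the two query strings and the 
Lemma / Lemma.1 / Lemma.2 attribute names come from this thread.

```python
def parse_query(query):
    """Split 'Word//Lemma./G1234/' into (attribute name, include_components, value)."""
    parts = query.split("/")          # e.g. ['Word', '', 'Lemma.', 'G1234', '']
    name, value = parts[2], parts[3]
    include_components = name.endswith(".")
    return name.rstrip("."), include_components, value


def word_matches(attrs, query):
    """Return True if one word's attributes (a hypothetical dict) satisfy the query."""
    name, include_components, value = parse_query(query)
    if not include_components:
        # First form: one direct lookup of the exact key 'Lemma'.
        return attrs.get(name) == value
    # Second form: visit every key and accept 'Lemma' or 'Lemma.<n>'.
    return any(
        (key == name or key.startswith(name + ".")) and val == value
        for key, val in attrs.items()
    )


# Hypothetical data: one word with a single lemma, one with two.
single = {"Lemma": "G1234"}
multi = {"Lemma.1": "G1234", "Lemma.2": "G5678"}
```

In this model the first form finds `single` but misses `multi`, while 
the second form finds both, at the cost of walking every key on the word.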


> Now we have to make all of the searches slower just so we can
> be sure of getting correct results in KJV. Again, this indicates to
> the user that there is a design flaw.

What user?  You or your end user?  Your end user never sees any of this 
logic or has any idea about any of this and probably will only notice a 
speed improvement in 1.6.x.  If you mean yourself, then I've explained to
you that the search domain has increased to include new KINDS of data:
.../Lemma.1
.../Lemma.2

Now you must specify a different search string to get correct results 
from this new kind of data.  I've given you the new expression.  Please 
just use it and let's end this conversation.

Brief comments below just to be thorough for you...


> I never intended to attack your
> justifications for doing things this way, nor did I attempt to tell
> you that the technical design was bad.

I think the following goes on to do just that.

> It's just that if a new method
> of doing things causes a regression, and doing it correctly is much
> slower, then from the user's perspective a bad decision was made.

'regression' and 'correctly' are not valid words here.  Both expressions 
are correct for different purposes.  Because we've added new kinds of 
data and your original expression no longer gives the results you desire 
for this new kind of data, you cannot call this a regression.  It is 
progress to have the data for lemmas combined correctly when more than 
one exists for a single original language word.  It is progress that we 
now have a search expression to handle this easily for you.  I am sorry 
that we have not yet made the progress to make
/Lemma/
just as fast as:
/Lemma[.]*.*/
but I doubt we ever will be able to make both of these expressions 
equally fast.  Again, the bottom line is that your end user will likely 
only see an improvement due to progress we've made in other search speed 
improvements.


> My complaint is two-fold. First of all, I have to introduce another
> ifdef for this.

The [.]*.* logic and syntax did not exist before 1.6.x.  If you want to
use it in 1.6.x and still remain backward compatible, you'll need an 
ifdef.  I'm sorry about this, but this is how things usually work.
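
The choice the ifdef makes at compile time can be sketched as a runtime 
check.  This is only an illustration in Python: `lemma_query` and the 
version tuple are hypothetical names, not part of any SWORD API, and a 
real frontend would make this decision with its own build-time test.

```python
def lemma_query(strongs, engine_version):
    """Pick the lemma search expression for a given engine version.

    engine_version is a hypothetical tuple such as (1, 6, 0).
    """
    if engine_version >= (1, 6):
        # 1.6.x and later: the trailing '.' also covers Lemma.1, Lemma.2, ...
        return f"Word//Lemma./{strongs}/"
    # Before 1.6.x the component syntax did not exist.
    return f"Word//Lemma/{strongs}/"
```

For example, `lemma_query("G1234", (1, 6, 0))` gives the second of the 
two expressions discussed above, and `lemma_query("G1234", (1, 5, 11))` 
gives the first.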


> Secondly, the searches will be slower (they are
> already excruciatingly slow in some circumstances on some platforms).
> In addition, "doing the right thing by default" would solve this
> problem for other frontends, eg diatheke. It wasn't my intention to
> complain about "adding a '.'".

Doing the 'right thing' for you when using the old syntax will:
a) not improve the speed because we will still be doing the [.]*.* logic
b) only cause absolutely every other search to do the [.]*.* logic.

These are both no wins for anyone.


> I hope you can see how this is a problem for the average user who just
> wants searches to return reasonable results.

No, I absolutely cannot see any way THIS topic is a problem for an 
average user.  They neither see any of this search syntax, nor have any 
idea even what an 'entryAttribute' is.

> Explaining to a user that
> this is a feature rather than a bug is a difficult thing to do.

None of this conversation is relevant to an end user, and appealing to 
such is just not an honest argument in this debate.

Please end the conversation here.  Please do not respond to this email.
This will not change in this release, nor do I see any valid argument to 
consider changing it for a future release.  Both syntaxes mean different 
useful things and they best represent what they do in their current 
expressions.  Hopefully we will continue to make progress and make 
things even faster in future releases.  Sorry if I sound hard on this, 
but I don't even consider this a hard decision with a closely competing 
alternative.  It is designed and operates as intended.

	-Troy.




