[bt-devel] Strongs search infinite loop fix

jdc jdc.email at gmail.com
Sun Feb 10 22:14:07 MST 2008


Martin Gruner or Eeli Kaikkonen,

I have fixed the infinite loop problem in the strongs search.  The loop 
was in the strongs portion of 
CSearchResultArea::highlightSearchedText().  I have attached the file 
that contains the fix:
csearchdialogareas.cpp.gz  (a single gzipped file, csearchdialogareas.cpp).

In case the attachment does not get through, the single function in the 
above file is at the end of this e-mail.

At one time, I had developer access to the CVS repository.  I tried to 
commit the fix with svn but could not.  It may be that I am not using 
svn properly, but this is the error I get:

Username: jim-campbell

svn: Commit failed (details follow):
svn: MKACTIVITY of '/svnroot/bibletime/!svn/act/db6237d4-d85e-11dc-8779-79e0e7d1eb57': 403 Forbidden (https://bibletime.svn.sourceforge.net)



I don't have a lot of time to help, but I would still like to contribute 
as often as I can.

Thanks
Jim

sourceforge username jim-campbell




const QString CSearchResultArea::highlightSearchedText(
        const QString& content, const QString& searchedText/*, const int searchFlags*/) {
    QString ret = content;

    //const bool cs = (searchFlags & CSwordModuleSearch::caseSensitive);
    const Qt::CaseSensitivity cs = Qt::CaseInsensitive;
   
    //   int index = 0;
    int index = ret.indexOf("<body", 0);
    int matchLen = 0;
    int length = searchedText.length();

    // Highlighting constants -
    // TODO: We need to make the highlight color configurable.
    const QString rep1("<span style=\"background-color:#FFFF66;\">");
    const QString rep2("</span>");
    // total number of characters added around each highlighted word
    const unsigned int repLength = rep1.length() + rep2.length();
    const QString rep3("style=\"background-color:#FFFF66;\" ");
    const unsigned int rep3Length = rep3.length();

   
    bool inQuote;
    QString newSearchText;

    newSearchText = searchedText;
   
    // find the strongs search lemma and highlight it
    // search the searched text for "strong:" until it is not found anymore
    QStringList list;
   
    // split the search string - some possibilities are "\\s|\\|",
    // "\\s|\\+", or "\\s|\\|\\+"
    // TODO: find all possible separators
    QString regExp = "\\s";
    list = searchedText.split(QRegExp(regExp));
    foreach (QString newSearchText, list) {       
        int sstIndex; // strong search text index for finding "strong:"
        int idx1, idx2, sTokenIndex, sTokenIndex2;
        QString sNumber, lemmaText;
       
        sstIndex = newSearchText.indexOf("strong:");
        if (sstIndex == -1)
            continue;
       
        // set the start index to the start of <body>
        int strongIndex = index;
       
        // Get the strongs number from the search text.
        // First, skip past "strong:" (7 characters).
        sstIndex = sstIndex + 7;
        // get the strongs number -> the text following "strong:" to the
        // end of the string.
        sNumber = newSearchText.mid(sstIndex, -1);
        // find all the "lemma=" inside the content
        while ((strongIndex = ret.indexOf("lemma=", strongIndex, cs)) != -1) {
            // get the strongs number after the lemma and compare it with the
            // strongs number we are looking for
            idx1 = ret.indexOf("\"", strongIndex) + 1;
            idx2 = ret.indexOf("\"", idx1 + 1);
            lemmaText = ret.mid(idx1, idx2 - idx1);
           
            // this is interesting because we could have a strongs number
            // like: G3218|G300
            // To handle this we will use some extra cpu cycles and do a
            // partial match against the lemmaText
            if (lemmaText.contains(sNumber)) {
                // strongs number is found, now we need to highlight it
                // I believe the easiest way is to insert rep3 just
                // before "lemma="
                ret = ret.insert(strongIndex, rep3);
                // skip past the inserted text so that this same "lemma="
                // is not found again
                strongIndex += rep3Length;
            }
            strongIndex += 6; // 6 is the length of "lemma="
        }
    }
    //---------------------------------------------------------------------
    // now that the strong: stuff is out of the way continue with
    // other search options
    //---------------------------------------------------------------------
   
    // try to figure out how to use the lucene query parser
   
    //using namespace lucene::queryParser;
    //using namespace lucene::search;
    //using namespace lucene::analysis;
    //using namespace lucene::util;

    //wchar_t *buf;
    //char buf8[1000];
    //standard::WhitespaceAnalyzer analyzer;
    //lucene_utf8towcs(m_wcharBuffer, searchedText.utf8(), MAX_CONV_SIZE);
    //boost::scoped_ptr<Query> q( QueryParser::parse(m_wcharBuffer, _T("content"), &analyzer) );
    //StringReader reader(m_wcharBuffer);
    //TokenStream* tokenStream = analyzer.tokenStream( _T("field"), &reader);
    //Token token;
    //while(tokenStream->next(&token) != 0) {
    //    lucene_wcstoutf8(buf8, token.termText(), 1000);
    //    printf("%s\n", buf8);
    //}

    //===========================================================
    // since I could not figure out the lucene query parser, I
    // made a simple parser.
    //===========================================================
    QStringList words = QueryParser(newSearchText);
    // search for every word in the list
    for ( int wi = 0; (unsigned int)wi < words.count(); ++wi ) {
        QRegExp findExp;
        QString word = words[ wi ];
        if (word.contains("*")) {
            length = word.length() - 1;
            word.replace('*', "\\S*"); //match within a word
            findExp = QRegExp(word);
            findExp.setMinimal(true);
        }
        else {
            length = word.length();
            findExp = QRegExp("\\b" + word + "\\b");
        }

        //       index = 0; //for every word start at the beginning
        index = ret.indexOf("<body", 0);
        findExp.setCaseSensitivity(cs);
        //while ( (index = ret.find(findExp, index)) != -1 ) { //while we found the word
        while ( (index = findExp.indexIn(ret, index)) != -1 ) { //while we found the word
            matchLen = findExp.matchedLength();
            if (!CToolClass::inHTMLTag(index, ret)) {
                length = matchLen;
                ret = ret.insert( index+length, rep2 );
                ret = ret.insert( index, rep1 );
                index += repLength;
            }
            index += length;
        }
    }
    //qWarning("\n\n\n%s", ret.latin1());
    return ret;
}
-------------- next part --------------
A non-text attachment was scrubbed...
Name: csearchdialogareas.cpp.gz
Type: application/x-gzip
Size: 8903 bytes
Desc: not available
Url : http://www.crosswire.org/pipermail/bt-devel/attachments/20080210/2056e6d5/attachment.gz 

