Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 1.4.1b6
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      When I created a search index for a Chinese translation, I found that the search results were poor in PocketSword and SPW, but good in AndBible. How can this problem be solved?

        Activity

        Nic Carter added a comment -

        Is this using a search index that you have created yourself or is it a search index that you downloaded from within PocketSword? And which module is it for?

        DM Smith added a comment -

        AndBible uses JSword, which has much richer indexing and uses a Chinese analyzer for the text. SWORD uses StandardAnalyzer, which is appropriate only for some latinate languages. This issue is effectively a request that SWORD choose its analysis based on the language/script of the module.

        DM Smith added a comment -

        A bit more info: most Lucene analyzers use whitespace to delineate words. This is not appropriate for some languages, such as Chinese. There are a variety of techniques for indexing Chinese, but to simplify: suppose that A, B, and C are three Chinese characters that appear in the text as ABC. A Chinese-aware analyzer would index A, B, AB, C, and BC. A search for AC would then look for A, C, and AC, so the text would match on A and C.

        With the various whitespace analyzers, that search would return no result. The text is indexed as the single term ABC, and only a search for ABC would find it.
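
The difference described above can be sketched in a few lines of Python. This is only an illustration of the simplified scheme in the comment (single characters plus adjacent pairs), not the actual Lucene, JSword, or SWORD implementation. The function names `ngram_terms`, `whitespace_terms`, and `matches` are hypothetical, and the "any term in common" matching rule is an assumption made to keep the example small.

```python
def ngram_terms(text):
    """Index every character and every adjacent pair of characters."""
    chars = [ch for ch in text if not ch.isspace()]
    terms = set()
    for i, ch in enumerate(chars):
        terms.add(ch)                      # single character, e.g. "A"
        if i + 1 < len(chars):
            terms.add(ch + chars[i + 1])   # adjacent pair, e.g. "AB"
    return terms


def whitespace_terms(text):
    """Split on whitespace only, as a whitespace-based analyzer would."""
    return set(text.split())


def matches(document, query, tokenize):
    """Assumed rule: a document matches if it shares any term with the query."""
    return bool(tokenize(document) & tokenize(query))


if __name__ == "__main__":
    # "ABC" stands in for three consecutive Chinese characters.
    print(sorted(ngram_terms("ABC")))
    print(matches("ABC", "AC", ngram_terms))
    print(matches("ABC", "AC", whitespace_terms))
    print(matches("ABC", "ABC", whitespace_terms))
```

With the character-based tokenizer, the query AC shares the terms A and C with the indexed text ABC. With the whitespace tokenizer, the only indexed term is ABC itself, so only the exact query ABC matches.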

        Ko Chiu Shun added a comment -

        The search index was created by PocketSword itself, and it is for a Chinese module.

        Nic Carter added a comment -

        Thanks for the clarification, I will look into it.

        Nic Carter added a comment -

        Ok, I need some more clarification about this. You say that "The search index is created by PocketSword itself". However, PocketSword cannot create a search index – testing revealed that this would take a very long time and so I never added that code. So, how are you creating your search index? Is this in regard to the module that you sent me by email?

        @DM, thanks for that information. PocketSword still lacks support for searching text that is not whitespace-delimited, such as Chinese.

        Ko Chiu Shun added a comment -

        That was my misunderstanding. The search index was created by "The SWORD Project" and I placed it into the module.


          People

          • Assignee:
            Nic Carter
            Reporter:
            Ko Chiu Shun
          • Votes:
            0
            Watchers:
            0
