[jsword-devel] Getting the global key list for a book

Chris Burrell chris at burrell.me.uk
Fri Feb 15 15:21:31 MST 2013


Let me push my changes:
https://github.com/tyndale/jsword/commit/c0d50633b28348185035a59bc9738672a0db5b42


(There's the missing data file stuff in there too - let me know if you want
it separated out).

The reason for the name change is that there was no method initially. it
just used contains() on the backend. The Book interface has not changed,
but tries a call to backend.getFGKL() and if the method is unsupported it
reverts back to the existing functionality. The name change was to
highlight that SwordBook has a good way of getting the FGKL


I did this before I realised we only had 1 implementation of a
AbstractPassageBook. And I moved the default from AbstractPassageBook
because it required access to backend, so I had a choice of exposing the
backend or moving the function. I'm not thinking entirely straight and it
may be have been better to expose backend instead.

I take your comments in both emails and I'll look at implementing some of
them tomorrow.

    /* (non-Javadoc)
     * @see org.crosswire.jsword.book.Book#getGlobalKeyList()
     */
    public final Key getGlobalKeyList() {
        if (global == null) {
            try {
                global = this.backend.getFastGlobalKeyList();
                return global;
            } catch (UnsupportedOperationException ex) {
                // fail silently, operation not supported by the backend
                log.debug(ex.getMessage());
            } catch (BookException ex) {
                // failing silently, as previous behaviour was to attempt to
                // return as much as we can using the slower method
                log.debug(ex.getMessage());
            }

            Versification v11n = super.getVersification();

            global = super.createEmptyKeyList();
            Key all = PassageKeyFactory.instance().getGlobalKeyList(v11n);

            for (Key key : all) {
                if (contains(key)) {
                    global.addAll(key);
                }
            }
        }

        return global;
    }


Chris


On 15 February 2013 22:08, DM Smith <dmsmith at crosswire.org> wrote:

> Contains(xxx) is correct. So don't need to concern about the indirection.
>
> There are three parts to a compressed testament. The one index points to a
> second index which points to the data.
> -- DM
>
> On Feb 15, 2013, at 5:05 PM, Chris Burrell <chris at burrell.me.uk> wrote:
>
> I'm confused about the double indirection in that the contains() method
> doesn't use it which is what I based my bit on.
>
> It simply checked for the size given an ordinal. So I'm doing the
> opposite, getting all the data first, then iterating through them looking
> for non-zero sizes
>
> I'm pretty sure it works (tested a little bit). It's shaved off heaps off
> time (several minutes to a few seconds for all 250 modules I have
> installed), but obviously that's no good if it isn't doing the right thing!
> Chris
>
>
>
> On 15 February 2013 21:59, DM Smith <dmsmith at crosswire.org> wrote:
>
>> I like the direction this is heading. I'm not sure if it properly handles
>> the double indirection of the compressed module. (I didn't look.) So
>> assuming it is....
>>
>> Just directly create and use a bitwise passage. don't get it from the
>> factory.
>>
>> The purpose of the factory was to be able to switch between
>> implementations.
>>
>> I added the two bitwise passage classes and it is pretty much all we use
>> now.
>>
>> I'm wondering whether it'd be cheaper to build a list of those that
>> aren't present. (Just wondering).
>>
>> Also have been musing on whether it'd be good to modify Bitwise passage
>> to hold two testament bit maps. And null if not needed. Or maybe one per
>> book.
>>
>> Anyway, by writing to an api we can swap out the implementation. (We
>> should add the new method as an overload by name to Passage and abstract
>> passage. That way you don't need to know what kind of passage it is.)
>>
>> It might be good to add a method to SwordUtils with the same decode
>> params and determines whether the array has content. (i.e. is zero). This
>> is in a tight loop, so would be fast.
>>
>>
>> On Feb 15, 2013, at 3:26 PM, Chris Burrell <chris at burrell.me.uk> wrote:
>>
>> Hi DM
>>
>> I'm about to test the following method for at least all Z backends:
>>
>> Maybe you can tell me what you think of it:
>>
>> *getFastGlobalKeyList*
>>
>> public Key getFastGlobalKeyList() throws BookException {
>>         ZVerseBackendState rafBook = null;
>>         try {
>>             rafBook = initState();
>>
>>             String v11nName =
>> getBookMetaData().getProperty(ConfigEntryType.VERSIFICATION).toString();
>>             Versification v11n =
>> Versifications.instance().getVersification(v11nName);
>>
>>             Testament[] testaments = new Testament[] {
>>                     Testament.OLD, Testament.NEW
>>             };
>>
>>             Key globalList =
>> PassageKeyFactory.instance().createEmptyKeyList(v11n);
>>             Passage passage = KeyUtil.getPassage(globalList, v11n);
>>             BitwisePassage bitwisePassage = null;
>>             if(passage instanceof BitwisePassage) {
>>                 bitwisePassage = (BitwisePassage) passage;
>>                 bitwisePassage.raiseEventSuppresion();
>>                 bitwisePassage.raiseNormalizeProtection();
>>             }
>>
>>
>>             for (Testament currentTestament : testaments) {
>>                 RandomAccessFile compRaf = currentTestament ==
>> Testament.NEW ? rafBook.getNtCompRaf() : rafBook.getOtCompRaf();
>>
>>                 // If Bible does not contain the desired testament, then
>> false
>>                 if (compRaf == null) {
>>                     // no keys in this testament
>>                     continue;
>>                 }
>>
>>                 int maxIndex = v11n.getCount(currentTestament);
>>
>>                 // Read in the whole index, a few hundred Kb at most.
>>                 byte[] temp = SwordUtil.readRAF(compRaf, 0,
>> COMP_ENTRY_SIZE * maxIndex);
>>
>>                 // for each block of 10 bytes, we consider the last 2
>> bytes.
>>                 for (int ii = 0; ii < temp.length; ii += 10) {
>>                     // can this be simplified to temp[8] == 0 && temp[9]
>> == 0?
>>                     int verseSize = SwordUtil.decodeLittleEndian16(temp,
>> 8);
>>
>>                     //can this be optimized even further - i.e. why
>> decodeOrdinal, when add() go simply pass in and store an ordinal
>>                     if (verseSize > 0) {
>>                         if(bitwisePassage != null) {
>>                             bitwisePassage.addVersifiedOrdinal(ii % 10);
>>                         } else {
>>                             globalList.addAll(v11n.decodeOrdinal(ii %
>> 10));
>>                         }
>>                     }
>>                 }
>>             }
>>
>>             if(bitwisePassage != null) {
>>                 bitwisePassage.lowerNormalizeProtection();
>>                 bitwisePassage.lowerEventSuppressionAndTest();
>>             }
>>
>>             return globalList;
>>         } catch (IOException e) {
>>             throw new BookException(JSMsg.gettext("Unable to read key
>> list from book."));
>>         } finally {
>>             IOUtil.close(rafBook);
>>         }
>>     }
>>
>>
>> *addVersifiedOrdinal (is just a simplification of add, so that we can
>> give the ordinals directly, rather than converting to a verse and back
>> again).*
>> /**
>>      * A shortcut to adding a key, by ordinal. The ordinal needs to be
>> taken from the same versification as the passage being created.
>>      *
>>      * @param ordinal the ordinal
>>      */
>>     public void addVersifiedOrdinal(int ordinal) {
>>         Versification v11n = getVersification();
>>         optimizeWrites();
>>
>>             store.set(ordinal);
>>
>>         // we do an extra check here because the cost of calculating the
>>         // params is non-zero and may be wasted
>>         if (suppressEvents == 0) {
>>             Verse verse = v11n.decodeOrdinal(ordinal);
>>             fireIntervalAdded(this, verse, verse);
>>         }
>>     }
>>
>>
>>
>> On 15 February 2013 20:22, DM Smith <dmsmith at crosswire.org> wrote:
>>
>>> That's what http://www.crosswire.org/tracker/browse/JS-246 is all about.
>>>
>>> There are a good number of places that we get all the keys for a module.
>>> But the construction of the list is dumb.
>>>
>>> The question is what do you want? Do you want all the possible keys in a
>>> versification? Or do you want the keys for the verses that are actually in
>>> a module?
>>>
>>> If you only want chapters that exist, consider working off of
>>> Versification to get the book names and the chapters. For each chapter, get
>>> the number of verses and grab one (maybe verse 1 or one halfway into the
>>> chapter) on the assumption that if one exists then the rest do.
>>>
>>>
>>> Calling book.contains(key) is what you want to do. It goes against the
>>> backend to determine whether the verse is in the module.
>>>
>>> Don't call book.getGlobalKeyList(); We probably should deprecate the
>>> method. It is too easy to call it and it is very expensive for what it does.
>>>
>>> I have a 3 day weekend and hope to finish the av11n work. It impacts
>>> this. Will probably add iterators for a versification.
>>>
>>> In Him,
>>>         DM
>>>
>>>
>>>
>>> On Feb 15, 2013, at 2:14 PM, Chris Burrell <chris at burrell.me.uk> wrote:
>>>
>>> > Hi all
>>> >
>>> > getGlobalKeyList() seems to do a lot of work against the file system.
>>> >
>>> > I'm attempting to create a sitemap, which includes a URL to every
>>> chapter. Presumably getGlobalKeyList is what I want to ensure that I'm only
>>> displaying those chapters that exist. I was intending to do:
>>> >
>>> > globalKeyList = getGlobalKeyList();
>>> > for(each book in versification) {
>>> >     myBookKey = book.getValidKey(book.getShortName())
>>> >     myBookKey.retainAll(globalKeyList);
>>> >
>>> >     if(myBookKey.getCardinality() > 0) {
>>> >         outputSiteMapChapter();
>>> >     }
>>> > }
>>> >
>>> > So for each version I call getGlobalKeyList(), but that in terms seem
>>> to make individual reads (through the contains()) method for every
>>> potential key in the versification.
>>> >
>>> > public final Key getGlobalKeyList() {
>>> >         if (global == null) {
>>> >             Versification v11n =
>>> Versifications.instance().getVersification(versification);
>>> >             global = keyf.createEmptyKeyList(v11n);
>>> >             Key all = keyf.getGlobalKeyList(v11n);
>>> >             for (Key key : all) {
>>> >                 if (contains(key)) {
>>> >                     global.addAll(key);
>>> >                 }
>>> >             }
>>> >         }
>>> >         return global;
>>> > }
>>> >
>>> > The contains() method above makes a backend call every time.
>>> >
>>> > Isn't there a way to obtain the list of keys from the index? I'm not
>>> quite sure where the versification comes in here, or is it there simply to
>>> map the names of the keys to their positions in the index file? If so, can
>>> we simply read all keys in OT and NT, and check
>>> >
>>> > Doing this on GlobalKeyList() would also speed up index creation.
>>> >
>>> > Finally, it seems contains() only works on a single key, or takes the
>>> first key/verse if it's a passage. Presumably that is correct?
>>> >
>>> > Chris
>>> >
>>> > _______________________________________________
>>> > jsword-devel mailing list
>>> > jsword-devel at crosswire.org
>>> > http://www.crosswire.org/mailman/listinfo/jsword-devel
>>>
>>>
>>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.crosswire.org/pipermail/jsword-devel/attachments/20130215/485c9e1b/attachment-0001.html>


More information about the jsword-devel mailing list