[sword-devel] diatheke plain output - line breaks missing?

Troy A. Griffitts scribe at crosswire.org
Mon Jan 22 21:09:14 MST 2007


>> If I can read between the lines: Speed is critical and sufficient
>> accuracy that the result can be tokenized (that is it doesn't have to
>> be pretty)

Well, kind of.  It's a matter of purpose.  The purpose of a strip filter 
is to prepare the buffer for a search, e.g. stristr(StripText(), istr)

Or practically:

http://crosswire.org/svn/sword/trunk/src/modules/swmodule.cpp

^F, stricmp
(line 649)

For example, if one searches for the phrase
"streams of water that yield"

it should hit on Psalm 1:3:

He is like a tree
planted by streams of water
that yields its fruit in its season,
and its leaf does not wither.
In all that he does, he prospers.

Remember we support unindexed searching (note above code).
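
To make the point about purpose concrete, here is a minimal sketch of why
a strip filter has to leave a word boundary where a line-break tag was.
It is not the SWORD code: stripForSearch and containsIgnoreCase are
hypothetical helpers standing in for StripText() and the case-insensitive
compare, and treating every tag as a space is a simplifying assumption.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical stand-in for a strip filter (not the SWORD implementation).
// Drops everything between '<' and '>' and collapses each tag and each run
// of whitespace into a single space, so words on either side stay separate.
std::string stripForSearch(const std::string &raw) {
    std::string out;
    bool inTag = false;
    bool pendingSpace = false;
    for (unsigned char c : raw) {
        if (c == '<') { inTag = true; pendingSpace = true; continue; }
        if (c == '>') { inTag = false; continue; }
        if (inTag) continue;
        if (std::isspace(c)) { pendingSpace = true; continue; }
        if (pendingSpace && !out.empty()) out += ' ';
        pendingSpace = false;
        out += static_cast<char>(c);
    }
    return out;
}

// Hypothetical case-insensitive substring test over the stripped buffer.
bool containsIgnoreCase(const std::string &haystack, const std::string &needle) {
    if (needle.empty()) return true;
    auto it = std::search(haystack.begin(), haystack.end(),
                          needle.begin(), needle.end(),
                          [](unsigned char a, unsigned char b) {
                              return std::tolower(a) == std::tolower(b);
                          });
    return it != haystack.end();
}
```

If the verse were marked up with a line-break tag between "water" and
"that", stripForSearch would yield "...streams of water that yields..."
and the phrase search would match.  A filter that simply deleted the tag
would produce "waterthat" and the search would miss the verse.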

So, in conclusion, filters have different purposes.
From: http://crosswire.org/svn/sword/trunk/include/swmodule.h

  virtual SWModule &AddRenderFilter(SWFilter *newfilter);
  virtual SWModule &AddEncodingFilter(SWFilter *newfilter);
  virtual SWModule &AddStripFilter(SWFilter *newfilter);
  virtual SWModule &AddRawFilter(SWFilter *newfilter);
  virtual SWModule &AddOptionFilter(SWOptionFilter *newfilter);
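
As a rough illustration of "different purposes", the sketch below keeps
two separate filter chains over the same raw entry, one for searching and
one for display.  ToyModule, its method names, the Filter alias and
replaceAll are all hypothetical and much simpler than SWModule and
SWFilter; only the idea of per-purpose chains and the chainable add
methods is taken from the header above.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical filter type: edits the buffer in place.
using Filter = std::function<void(std::string &)>;

// Toy model of a module with per-purpose filter chains (not SWModule).
class ToyModule {
public:
    ToyModule &addStripFilter(Filter f)  { stripFilters.push_back(std::move(f));  return *this; }
    ToyModule &addRenderFilter(Filter f) { renderFilters.push_back(std::move(f)); return *this; }

    void setRawEntry(std::string text) { raw = std::move(text); }

    // Text prepared for searching: runs only the strip chain.
    std::string stripText() const { return apply(stripFilters); }
    // Text prepared for display: runs only the render chain.
    std::string renderText() const { return apply(renderFilters); }

private:
    std::string apply(const std::vector<Filter> &chain) const {
        std::string buf = raw;               // filters never modify the raw entry
        for (const Filter &f : chain) f(buf);
        return buf;
    }

    std::string raw;
    std::vector<Filter> stripFilters;
    std::vector<Filter> renderFilters;
};

// Helper for the example: replace every occurrence of 'from' with 'to'.
void replaceAll(std::string &s, const std::string &from, const std::string &to) {
    if (from.empty()) return;
    for (std::size_t pos = 0; (pos = s.find(from, pos)) != std::string::npos; pos += to.size())
        s.replace(pos, from.size(), to);
}
```

With a strip filter that maps a break tag to a space and a render filter
that maps it to a newline, the same raw entry yields a flat, searchable
line from stripText() and a line-broken text from renderText().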


Hope this helps explain.

	-Troy.




>>
>> I compared the osishtmlhref filter and found several points where
>> whitespace is not being handled properly, potentially mushing words
>> together. I think these need to be fixed. The rest can wait for
>> another filter.
>>
>> DM
>>
>> On Jan 22, 2007, at 5:54 PM, Troy A. Griffitts wrote:
>>
>>> Hey guys,
>>>       Thanks for the patch.  I think there is some information that is
>>> lacking in the discussion:
>>>
>>> The *plain.cpp filters are primarily used in the engine as 'strip'
>>> filters.  These are filters which get called before performing a
>>> search on a verse buffer.  They are intended to prepare verse text
>>> for searching.  If you are looking for a 'render' filter which
>>> outputs end user readable ascii-only markup ("[]{}//", etc), then
>>> these do not exist.  Strip filters are the closest thing and would
>>> be a good starting point if you want to add a new FMT_ render type.
>>>
>>> Hope this helps.
>>>
>>>       -Troy.
>>>
>>>
>>> Greg Hellings wrote:
>>>> Sorry, disregard the previous patch.  It would also change the
>>>> config.h file and other things like that (which are regenerated in
>>>> the autogen.sh script).  I have attached a patch ONLY for the
>>>> osisplain.cpp file, still from the root of the sword directory.
>>>>
>>>> Sorry,
>>>> Greg
>>>>
>>>> On 1/22/07, Greg Hellings <greg.hellings at gmail.com> wrote:
>>>>> I have, for the moment, attached a patch, made against the latest
>>>>> svn, which will take a tag of type <l ... type="x-br"... and change
>>>>> it into a new-line.  It works in the aforementioned Psalm 43:1 of
>>>>> ESV.  I'm working on Mac and don't have any other front-ends
>>>>> installed, so I don't know if it breaks them.  It's very simple and
>>>>> based almost directly off of the code for the tag right above it.
>>>>> Let me know if it works for you.  The patch was made in the root of
>>>>> the sword directory.
>>>>>
>>>>> Cheers,
>>>>> Greg
>>>>>
>>>>> On 1/22/07, benjie <cricketc at gmail.com> wrote:
>>>>>> Thanks for looking at this. I'm pretty busy right now, but if no
>>>>>> one else works on it, I'll probably see what I can do, since it's
>>>>>> an itch I want scratched. :)
>>>>>>
>>>>>> -Benjie
>>>>>>
>>>>>> On Mon, Jan 22, 2007 at 10:29:22AM -0500, DM Smith wrote:
>>>>>>> I took a look at osisplain.cpp and it does not handle what OSIS
>>>>>>> allows. So it is not just the handling of whitespace.
>>>>>>> Some other problems (just a quick glance):
>>>>>>>     Does not handle <q>...</q>. It probably should output quote
>>>>>>>     marks, unless suppressed in the conf.
>>>>>>>     Does not handle <divineName>Lord</divineName>. It should
>>>>>>>     uppercase the content.
>>>>>>>     Does not handle <transChange>...</transChange>. Most systems
>>>>>>>     output this as [...]
>>>>>>>     Does not handle milestoned elements (i.e. elements with sID
>>>>>>>     and eID). Which is the root of the complaint below.
>>>>>>>
>>>>>>> More probably can be found by comparing it with the osis html
>>>>>>> filter.
>>>>>>>
>>>>>>> When I have time, I'll see what I can do. Feel free to help if you
>>>>>>> have the time available.
>>>>>>>
>>>>>>> benjie wrote:
>>>>>>>> Hey,
>>>>>>>>
>>>>>>>> I'm trying to work with plaintext output, but when I try to use
>>>>>>>> diatheke on Psalm 43 (for example), it doesn't display very well.
>>>>>>>> Where there are line breaks & indents in BibleTime, diatheke just
>>>>>>>> outputs words squished together. In verse 1, for example, we get
>>>>>>>> "causeagainst" and "people,from". This is with Sword 1.5.9, and
>>>>>>>> I'm reading the ESV module. It seems that the osisplain filter
>>>>>>>> doesn't handle the <l eID="x4672" type="x-br"/> tag correctly,
>>>>>>>> from what I've been looking at, unless the ESV module just has
>>>>>>>> errors in it. But the passages are fine in BibleTime.
>>>>>>>>
>>>>>>>> Am I just missing something, or is this a bug that can be
>>>>>>>> corrected?
>>>>>>>> Thanks a lot,
>>>>>>>> Benjie
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> sword-devel mailing list: sword-devel at crosswire.org
>>>>>>>> http://www.crosswire.org/mailman/listinfo/sword-devel
>>>>>>>> Instructions to unsubscribe/change your settings at above page