<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix">You might consider using CollateX,
      which does token-level (word or other) collation and does a
      pretty good job of detecting things like transpositions.&nbsp; This
      is how we use it at the INTF:<br>
      <br>
<a class="moz-txt-link-freetext" href="http://ntvmr.uni-muenster.de/web/test/collation?key=Jn.3.16&amp;collate=graph">http://ntvmr.uni-muenster.de/web/test/collation?key=Jn.3.16&amp;collate=graph</a><br>
      <br>
      Our web service for this is here (with example parameters
      following):<br>
      <br>
      <a class="moz-txt-link-freetext" href="http://ntvmr.uni-muenster.de/community/vmr/api/collate/">http://ntvmr.uni-muenster.de/community/vmr/api/collate/</a><br>
<a class="moz-txt-link-freetext" href="http://ntvmr.uni-muenster.de/community/vmr/api/collate/?w1=Hello+world&amp;l1=x&amp;w2=Hello+cruel+world&amp;format=svg">http://ntvmr.uni-muenster.de/community/vmr/api/collate/?w1=Hello+world&amp;l1=x&amp;w2=Hello+cruel+world&amp;format=svg</a><br>
      <br>
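      As a rough sketch (not an official client), the example URL above
      can be put together in Python with the standard library. The
      helper name <code>build_collate_url</code> is hypothetical, and it
      is an assumption that further witnesses follow the same
      w1/l1, w2/l2 numbering; only w1, l1, w2 and format appear in the
      example.<br>
      <pre wrap="">
```python
from urllib.parse import urlencode

# Endpoint copied from the message above.
COLLATE_URL = "http://ntvmr.uni-muenster.de/community/vmr/api/collate/"


def build_collate_url(witnesses, fmt="svg"):
    """Return a collate request URL for a list of (text, label) pairs.

    Assumption: witnesses are numbered w1, w2, ... with optional labels
    l1, l2, ... as in the example URL; a label of None is left out.
    """
    params = []
    for i, (text, label) in enumerate(witnesses, start=1):
        params.append((f"w{i}", text))
        if label is not None:
            params.append((f"l{i}", label))
    params.append(("format", fmt))
    # urlencode quotes with quote_plus by default, so spaces become "+".
    return COLLATE_URL + "?" + urlencode(params)


url = build_collate_url([("Hello world", "x"), ("Hello cruel world", None)])
print(url)
```
</pre>
      Called with the two witnesses shown, this reproduces the example
      URL above; fetching it is left out because the service's
      behaviour is not described here.<br>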
      <br>
      On 08/29/2012 06:50 PM, Chris Burrell wrote:<br>
    </div>
    <blockquote
cite="mid:CACQnaRVwULCn5_mgBsWJ6-hGgw1eE8KccRzugAnRVBJYNsC=og@mail.gmail.com"
      type="cite">Hi all
      <div><br>
      </div>
      <div>The current diffing produces some fairly strange results from
        time to time. I was wondering how much work it would be to make
        it do a word-by-word diff rather than a letter-by-letter one.
        I've had a quick scan through the diffing engine, but it looks
        fairly complicated and I can't figure out how much of it is a
        copy of&nbsp;<a moz-do-not-send="true"
          href="http://code.google.com/p/google-diff-match-patch">http://code.google.com/p/google-diff-match-patch</a>
        and how much has changed.</div>
      <div><br>
      </div>
      <div>In the example below,&nbsp;</div>
      <div>
        <table class="table">
          <tbody>
            <tr class="row">
              <td dir="ltr" class="cell" valign="top"><br>
                &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;"And God saw th<u>at th</u>e light <font
                  class="strike"><b>, that it was good : and God divid</b></font><u>was
                  good. And God separat</u>ed the light from the
                darkness<font class="strike"> </font>&nbsp; &nbsp; &nbsp; &nbsp; &nbsp;"<br>
                <br>
                The new diff would hopefully not chop up "that" and "the"
                in the first occurrence above. It would not chop
                "divid" off either, but would rather keep whole words
                together, which would in turn make things slightly more
                readable.<br>
                <br>
              </td>
            </tr>
          </tbody>
        </table>
      </div>
      <div>(bold indicates strikethrough)</div>
      <div><br>
      </div>
      <div>Chris</div>
      <div><br>
      </div>
      <br>
      <br>
      <pre wrap="">_______________________________________________
jsword-devel mailing list
<a class="moz-txt-link-abbreviated" href="mailto:jsword-devel@crosswire.org">jsword-devel@crosswire.org</a>
<a class="moz-txt-link-freetext" href="http://www.crosswire.org/mailman/listinfo/jsword-devel">http://www.crosswire.org/mailman/listinfo/jsword-devel</a>
</pre>
    </blockquote>
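    The word-by-word diff asked about in the quoted message can be
    sketched with Python's standard <code>difflib</code> by comparing
    lists of words instead of characters. This is only an illustration
    of the idea, not JSword's diff engine and not CollateX; the names
    <code>word_diff</code> and <code>render</code> and the [-...-] and
    {+...+} markers are made up for the example, and the two sentences
    are reconstructed from the quoted diff.<br>
    <pre wrap="">
```python
import difflib


def word_diff(old, new):
    """Return (tag, words) pairs from a word-by-word comparison.

    tag is "equal", "delete" or "insert"; a "replace" opcode is split
    into a delete followed by an insert.
    """
    a, b = old.split(), new.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    result = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            result.append(("equal", a[i1:i2]))
            continue
        if a[i1:i2]:
            result.append(("delete", a[i1:i2]))
        if b[j1:j2]:
            result.append(("insert", b[j1:j2]))
    return result


def render(diff):
    """Mark deletions as [-...-] and insertions as {+...+}."""
    parts = []
    for tag, words in diff:
        text = " ".join(words)
        if tag == "delete":
            text = "[-" + text + "-]"
        elif tag == "insert":
            text = "{+" + text + "+}"
        parts.append(text)
    return " ".join(parts)


old = ("And God saw the light , that it was good : "
       "and God divided the light from the darkness")
new = ("And God saw that the light was good. "
       "And God separated the light from the darkness")
diff = word_diff(old, new)
print(render(diff))
```
</pre>
    Because the unit of comparison is a whole word, "divided" and
    "separated" can only appear as complete words in the output, never
    split around a shared "ed" ending.<br>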
    <br>
  </body>
</html>