[sword-devel] Sapphire, module cipher
dmsmith555 at yahoo.com
Sun Mar 5 15:40:17 MST 2006
I think that your algorithm will work for many cases, but not all. If
the sample is sufficiently large, it should work with a very high degree
of success, though I'm not sure how large "sufficiently large" would
be. The cipher works with a sliding window of 256 bytes at a time. If
the cipher is highly "random" in its creation of the next byte, then a
non-printable should show up in short order. But like tossing a coin,
you can get 10 heads in a row. It's not likely, but it is possible, so
the cipher could produce many "printables" in a row.
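To put a rough number on the coin-toss argument, here is a minimal sketch in Python. It rests on two assumptions that are not from the cipher itself: that a wrong key yields independent, uniformly distributed bytes, and that "non-printable" means an ASCII control byte other than tab, line feed and carriage return. The name false_accept_probability is made up for illustration.

```python
# Assumption: with a wrong key, each output byte is independent and
# uniformly distributed over 0..255. The real cipher may not behave so.
# Assumption: a byte is "non-printable" if it is an ASCII control byte
# (0-31 or 127), except tab (9), line feed (10) and carriage return (13).
NON_PRINTABLE = [
    b for b in range(256)
    if (b < 32 or b == 127) and b not in (9, 10, 13)
]


def false_accept_probability(sample_len: int) -> float:
    """Chance that sample_len random bytes contain no non-printable byte."""
    p_printable = 1 - len(NON_PRINTABLE) / 256
    return p_printable ** sample_len


if __name__ == "__main__":
    for n in (10, 50, 100, 200):
        print(n, false_accept_probability(n))
```

Under these assumptions a 10-byte sample lets a wrong key through more than a quarter of the time, while a 100-byte sample does so only a few times in a million. If the cipher output is less uniform than assumed, the real odds could be worse.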
I'm not a UTF-8 expert so some of what I say might be a bit inaccurate,
but it should be close enough for argument's sake;)
In the case of Chinese, nearly all bytes will be >= 128.
Some UTF-8 encoded characters above 127 are non-printable (e.g. code
points 128 - 159).
Some byte sequences are defined as not in UTF-8 (e.g. reserved regions),
so these would be non-printable as well.
In cp1252 (the version of "latin1" used in Sword modules), some bytes in
the range of 128-159 (129, 141, 143, 144 and 157) are not defined and
are not printable.
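The encoding points above can be sketched as byte-level checks in Python. This is only an illustration under stated assumptions, not what any Sword frontend does: the function names are hypothetical, and strict UTF-8 decoding would also reject a sample that happens to be cut off in the middle of a multi-byte character.

```python
# Bytes that cp1252 leaves undefined (129, 141, 143, 144, 157).
CP1252_UNDEFINED = frozenset({0x81, 0x8D, 0x8F, 0x90, 0x9D})


def has_cp1252_undefined(data: bytes) -> bool:
    """True if the data contains a byte that cp1252 does not define."""
    return any(b in CP1252_UNDEFINED for b in data)


def is_valid_utf8(data: bytes) -> bool:
    """True if the data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def utf8_looks_wrong(data: bytes) -> bool:
    """True if the data is invalid UTF-8 or holds a C1 control (128-159)."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return any(0x80 <= ord(ch) <= 0x9F for ch in text)


if __name__ == "__main__":
    chinese = "中文".encode("utf-8")
    print(all(b >= 128 for b in chinese))   # every byte has the high bit set
    print(is_valid_utf8(chinese), utf8_looks_wrong(chinese))
    print(has_cp1252_undefined(b"abc\x81"))
```

A check like this catches the 128-159 cases, at the cost of needing to know which encoding the module is supposed to be in.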
Martin Gruner wrote:
>> Still, the simpler route is Martin's check for non-printables after
>> deciphering the first 100 or so characters. (I'm assuming that it is fully
>> UTF-8 aware.)
> atm the routine treats the data as a Latin1 byte sequence. This should work
> because all nonprinting characters are <= 127 (high bit 0), and all higher
> unicode UTF-8 encoded characters consist of bytes >= 128 (high bit 1). I
> found this better than parsing the stream as UTF-8, because it might contain
> rubbish without the valid key.
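For reference, the check Martin describes could look roughly like the Python sketch below. It is a guess at the idea, not his actual routine: the name looks_deciphered, the 100-byte default and the decision to allow tab, line feed and carriage return are all assumptions.

```python
# Assumption: tab, line feed and carriage return may appear in real text.
ALLOWED_CONTROLS = frozenset(b"\t\n\r")


def looks_deciphered(data: bytes, sample_len: int = 100) -> bool:
    """Treat the first sample_len bytes as Latin1 and reject the key if
    any of them is an ASCII control byte (0-31 or 127)."""
    sample = data[:sample_len]
    for b in sample:
        if (b < 32 or b == 127) and b not in ALLOWED_CONTROLS:
            return False
    return True


if __name__ == "__main__":
    print(looks_deciphered(b"In the beginning"))
    print(looks_deciphered(b"In\x00the beginning"))
    # Bytes in 128-159 pass, since only bytes <= 127 are inspected.
    print(looks_deciphered(b"\x81\x90\x9d"))
```

Because every byte >= 128 is accepted, this check does not see the undefined cp1252 bytes or the 128-159 range mentioned earlier in this message.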