Coverage Report - org.crosswire.common.crypt.Sapphire
 
Classes in this File Line Coverage Branch Coverage Complexity
Sapphire
0%
0/81
0%
0/26
3
 
 1  
 /**
 2  
  * Distribution License:
 3  
  * JSword is free software; you can redistribute it and/or modify it under
 4  
  * the terms of the GNU Lesser General Public License, version 2.1 or later
 5  
  * as published by the Free Software Foundation. This program is distributed
 6  
  * in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even
 7  
  * the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 8  
  * See the GNU Lesser General Public License for more details.
 9  
  *
 10  
  * The License is available on the internet at:
 11  
  *      http://www.gnu.org/copyleft/lgpl.html
 12  
  * or by writing to:
 13  
  *      Free Software Foundation, Inc.
 14  
  *      59 Temple Place - Suite 330
 15  
  *      Boston, MA 02111-1307, USA
 16  
  *
 17  
  * © CrossWire Bible Society, 2005 - 2016
 18  
  *
 19  
  */
 20  
 package org.crosswire.common.crypt;
 21  
 
 22  
 /**
 23  
  * The Sapphire II Stream Cipher is a port of Sword's C++ implementation of
 24  
  * Michael Paul Johnson's 2 January 1995 public domain cipher. Below is the
 25  
  * documentation that he originally provided for it. It has been converted to
 26  
  * JavaDoc and the C++ fragment has been removed.
 27  
  * 
 28  
  * <h1>THE SAPPHIRE II STREAM CIPHER</h1>
 29  
  * 
 30  
  * <p>
 31  
  * The Sapphire II Stream Cipher is designed to have the following properties:
 32  
  * </p>
 33  
  * <ul>
 34  
  * 
 35  
  * <li>Be useful for generation of cryptographic check values as well as
 36  
  * protecting message privacy.</li>
 37  
  * 
 38  
  * <li>Accept a variable length key.</li>
 39  
  * 
 40  
  * <li>Strong enough to justify <i>at least</i> a 64 bit key for balanced
 41  
  * security.</li>
 42  
  * 
 43  
  * <li>Small enough to be built into other applications with several keys active
 44  
  * at once.</li>
 45  
  * 
 46  
  * <li>Key setup fast enough to support frequent key change operations but slow
 47  
  * enough to discourage brute force attack on the key.</li>
 48  
  * 
 49  
  * <li>Fast enough to not significantly impact file read &amp; write operations on
 50  
  * most current platforms.</li>
 51  
  * 
 52  
  * <li>Portable among common computers and efficient in C, C++, and Pascal.</li>
 53  
  * 
 54  
  * <li>Byte oriented.</li>
 55  
  * 
 56  
  * <li>Include both ciphertext and plain text feedback (for both optimal data
 57  
  * hiding and value in creation of cryptographic check values).</li>
 58  
  * 
 59  
  * <li>Acceptable performance as a pure pseudorandom number generator without
 60  
  * providing a data stream for encryption or decryption.</li>
 61  
  * 
 62  
  * <li>Allow <i>limited</i> key reuse without serious security degradation.</li>
 63  
  * </ul>
 64  
  * 
 65  
  * <h2>HISTORY AND RELATED CIPHERS</h2>
 66  
  * 
 67  
  * <p>
 68  
  * The Sapphire Stream Cipher is very similar to a cipher I started work on in
 69  
  * November 1993. It is also similar in some respects to the alledged RC-4 that
 70  
  * was posted to sci.crypt recently. Both operate on the principle of a mutating
 71  
  * permutation vector. Alledged RC-4 doesn't include any feedback of ciphertext
 72  
  * or plain text, however. This makes it more vulnerable to a known plain text
 73  
  * attack, and useless for creation of cryptographic check values. On the other
 74  
  * hand, alledged RC-4 is faster.
 75  
  * </p>
 76  
  * 
 77  
  * <p>
 78  
  * The Sapphire Stream Cipher is used in the shareware product Quicrypt, which
 79  
  * is available at ftp://ftp.csn.net/mpj/qcrypt10.zip and on the Colorado
 80  
  * Catacombs BBS (303-772-1062). There are two versions of Quicrypt: the
 81  
  * exportable version (with a session key limited to 32 bits but with strong
 82  
  * user keys allowed) and the commercial North American version (with a session
 83  
  * key of 128 bits). A variant of the Sapphire Stream Cipher is also used in the
 84  
  * shareware program Atbash, which has no weakened exportable version.
 85  
  * </p>
 86  
  * 
 87  
  * <p>
 88  
  * The Sapphire II Stream Cipher is a modification of the Sapphire Stream Cipher
 89  
  * designed to be much more resistant to adaptive chosen plaintext attacks (with
 90  
  * reorigination of the cipher allowed). The Sapphire II Stream Cipher is used
 91  
  * in an encryption utility called ATBASH2.
 92  
  * </p>
 93  
  * 
 94  
  * 
 95  
  * <h2>OVERVIEW</h2>
 96  
  * 
 97  
  * <p>
 98  
  * The Sapphire Stream Cipher is based on a state machine. The state consists of
 99  
  * 5 index values and a permutation vector. The permutation vector is simply an
 100  
  * array containing a permutation of the numbers from 0 through 255. Four of the
 101  
  * bytes in the permutation vector are moved to new locations (which may be the
 102  
  * same as the old location) for every byte output. The output byte is a
 103  
  * nonlinear function of all 5 of the index values and 8 of the bytes in the
 104  
  * permutation vector, thus frustrating attempts to solve for the state
 105  
  * variables based on past output. On initialization, the permutation vector
 106  
  * (called the cards array in the source code below) is shuffled based on the
 107  
  * user key. This shuffling is done in a way that is designed to minimize the
 108  
  * bias in the destinations of the bytes in the array. The biggest advantage in
 109  
  * this method is not in the elimination of the bias, per se, but in slowing
 110  
  * down the process slightly to make brute force attack more expensive.
 111  
  * Eliminating the bias (relative to that exhibited by RC-4) is nice, but this
 112  
  * advantage is probably of minimal cryptographic value. The index variables are
 113  
  * set (somewhat arbitrarily) to the permutation vector elements at locations 1,
 114  
  * 3, 5, 7, and a key dependent value (rsum) left over from the shuffling of the
 115  
  * permutation vector (cards array).
 116  
  * </p>
 117  
  * 
 118  
  * 
 119  
  * <h2>KEY SETUP</h2>
 120  
  * 
 121  
  * <p>
 122  
  * Key setup (illustrated by the function initialize(), below) consists of three
 123  
  * parts:
 124  
  * </p>
 125  
  * <ol>
 126  
  * <li>Initialize the index variables.</li>
 127  
  * <li>Set the permutation vector to a known state (a simple counting sequence).
 128  
  * </li>
 129  
  * <li>Starting at the end of the vector, swap each element of the permutation
 130  
  * vector with an element indexed somewhere from 0 to the current index (chosen
 131  
  * by the function keyrand()).</li>
 132  
  * </ol>
 133  
  * <p>
 134  
  * The keyrand() function returns a value between 0 and some maximum number
 135  
  * based on the user's key, the current state of the permutation vector, and an
 136  
  * index running sum called rsum. Note that the length of the key is used in
 137  
  * keyrand(), too, so that a key like "abcd" will not result in the same
 138  
  * permutation as a key like "abcdabcd".
 139  
  * </p>
 140  
  * 
 141  
  * 
 142  
  * <h2>ENCRYPTION</h2>
 143  
  * 
 144  
  * <p>
 145  
  * Each encryption involves updating the index values, moving (up to) 4 bytes
 146  
  * around in the permutation vector, selecting an output byte, and adding the
 147  
  * output byte bitwise modulo-2 (exclusive-or) to the plain text byte to produce
 148  
  * the cipher text byte. The index values are incremented by different rules.
 149  
  * The index called rotor just increases by one (modulo 256) each time. Ratchet
 150  
  * increases by the value in the permutation vector pointed to by rotor.
 151  
  * Avalanche increases by the value in the permutation vector pointed to by
 152  
  * another byte in the permutation vector pointed to by the last cipher text
 153  
  * byte. The last plain text and the last cipher text bytes are also kept as
 154  
  * index variables. See the function called encrypt(), below for details.
 155  
  * </p>
 156  
  * 
 157  
  * 
 158  
  * <h2>PSUEDORANDOM BYTE GENERATION</h2>
 159  
  * 
 160  
  * <p>
 161  
  * If you want to generate random numbers without encrypting any particular
 162  
  * ciphertext, simply encrypt 0. There is still plenty of complexity left in the
 163  
  * system to ensure unpredictability (if the key is not known) of the output
 164  
  * stream when this simplification is made.
 165  
  * </p>
 166  
  * 
 167  
  * 
 168  
  * <h2>DECRYPTION</h2>
 169  
  * 
 170  
  * <p>
 171  
  * Decryption is the same as encryption, except for the obvious swapping of the
 172  
  * assignments to last_plain and last_cipher and the return value. See the
 173  
  * function decrypt(), below.
 174  
  * </p>
 175  
  * 
 176  
  * 
 177  
  * <h2>C++ SOURCE CODE FRAGMENT</h2>
 178  
  * 
 179  
  * <p>
 180  
  * The original implimentation of this cipher was in Object Oriented Pascal, but
 181  
  * C++ is available for more platforms.
 182  
  * </p>
 183  
  * 
 184  
  * <h2>GENERATION OF CRYPTOGRAPHIC CHECK VALUES (HASH VALUES)</h2>
 185  
  * 
 186  
  * <p>
 187  
  * For a fast way to generate a cryptographic check value (also called a hash or
 188  
  * message integrity check value) of a message of arbitrary length:
 189  
  * </p>
 190  
  * <ol>
 191  
  * <li>Initialize either with a key (for a keyed hash value) or call hash_init
 192  
  * with no key (for a public hash value).</li>
 193  
  * 
 194  
  * <li>Encrypt all of the bytes of the message or file to be hashed. The results
 195  
  * of the encryption need not be kept for the hash generation process.
 196  
  * (Optionally decrypt encrypted text, here).</li>
 197  
  * 
 198  
  * <li>Call hash_final, which will further "stir" the permutation vector by
 199  
  * encrypting the values from 255 down to 0, then report the results of
 200  
  * encrypting 20 zeroes.</li>
 201  
  * </ol>
 202  
  * 
 203  
  * <h2>SECURITY ANALYSIS</h2>
 204  
  * 
 205  
  * <p>
 206  
  * There are several security issues to be considered. Some are easier to
 207  
  * analyze than others. The following includes more "hand waving" than
 208  
  * mathematical proofs, and looks more like it was written by an engineer than a
 209  
  * mathematician. The reader is invited to improve upon or refute the following,
 210  
  * as appropriate.
 211  
  * </p>
 212  
  * 
 213  
  * 
 214  
  * <h2>KEY LENGTH</h2>
 215  
  * 
 216  
  * <p>
 217  
  * There are really two kinds of user keys to consider: (1) random binary keys,
 218  
  * and (2) pass phrases. Analysis of random binary keys is fairly straight
 219  
  * forward. Pass phrases tend to have much less entropy per byte, but the
 220  
  * analysis made for random binary keys applies to the entropy in the pass
 221  
  * phrase. The length limit of the key (255 bytes) is adequate to allow a pass
 222  
  * phrase with enough entropy to be considered strong.
 223  
  * </p>
 224  
  * 
 225  
  * <p>
 226  
  * To be real generous to a cryptanalyst, assume dedicated Sapphire Stream
 227  
  * Cipher cracking hardware. The constant portion of the key scheduling can be
 228  
  * done in one cycle. That leaves at least 256 cycles to do the swapping
 229  
  * (probably more, because of the intricacies of keyrand(), but we'll ignore
 230  
  * that, too, for now). Assume a machine clock of about 256 MegaHertz (fairly
 231  
  * generous). That comes to about one key tried per microsecond. On average, you
 232  
  * only have to try half of the keys. Also assume that trying the key to see if
 233  
  * it works can be pipelined, so that it doesn't add time to the estimate. Based
 234  
  * on these assumptions (reasonable for major governments), and rounding to two
 235  
  * significant digits, the following key length versus cracking time estimates
 236  
  * result:
 237  
  * </p>
 238  
  * 
 239  
  * <pre>
 240  
  *     Key length, bits    Time to crack
 241  
  *     ----------------    -------------
 242  
  *                   32    35 minutes (exportable in qcrypt)
 243  
  *                   33    1.2 hours (not exportable in qcrypt)
 244  
  *                   40    6.4 days
 245  
  *                   56    1,100 years (kind of like DES's key)
 246  
  *                   64    290,000 years (good enough for most things)
 247  
  *                   80    19 billion years (kind of like Skipjack's key)
 248  
  *                  128    5.4E24 years (good enough for the clinically paranoid)
 249  
  * </pre>
 250  
  * 
 251  
  * <p>
 252  
  * Naturally, the above estimates can vary by several orders of magnitude based
 253  
  * on what you assume for attacker's hardware, budget, and motivation.
 254  
  * </p>
 255  
  * 
 256  
  * <p>
 257  
  * In the range listed above, the probability of spare keys (two keys resulting
 258  
  * in the same initial permutation vector) is small enough to ignore. The proof
 259  
  * is left to the reader.
 260  
  * </p>
 261  
  * 
 262  
  * 
 263  
  * <h2>INTERNAL STATE SPACE</h2>
 264  
  * 
 265  
  * <p>
 266  
  * For a stream cipher, internal state space should be at least as big as the
 267  
  * number of possible keys to be considered strong. The state associated with
 268  
  * the permutation vector alone (256!) constitutes overkill.
 269  
  * </p>
 270  
  * 
 271  
  * 
 272  
  * <h2>PREDICTABILITY OF THE STATE</h2>
 273  
  * 
 274  
  * <p>
 275  
  * If you have a history of stream output from initialization (or equivalently,
 276  
  * previous known plaintext and ciphertext), then rotor, last_plain, and
 277  
  * last_cipher are known to an attacker. The other two index values, flipper and
 278  
  * avalanche, cannot be solved for without knowing the contents of parts of the
 279  
  * permutation vector that change with each byte encrypted. Solving for the
 280  
  * contents of the permutation vector by keeping track of the possible positions
 281  
  * of the index variables and possible contents of the permutation vector at
 282  
  * each byte position is not possible, since more variables than known values
 283  
  * are generated at each iteration. Indeed, fewer index variables and swaps
 284  
  * could be used to achieve security, here, if it were not for the hash
 285  
  * requirements.
 286  
  * </p>
 287  
  * 
 288  
  * 
 289  
  * <h2>CRYPTOGRAPHIC CHECK VALUE</h2>
 290  
  * 
 291  
  * <p>
 292  
  * The change in state altered with each byte encrypted contributes to an
 293  
  * avalanche of generated check values that is radically different after a
 294  
  * sequence of at least 64 bytes have been encrypted. The suggested way to
 295  
  * create a cryptographic check value is to encrypt all of the bytes of a
 296  
  * message, then encrypt a sequence of bytes counting down from 255 to 0. A
 297  
  * single bit change in a message causes a radical change in the check value
 298  
  * generated (about half of the bits change). This is an essential feature of a
 299  
  * cryptographic check value.
 300  
  * </p>
 301  
  * 
 302  
  * <p>
 303  
  * Another good property of a cryptographic check value is that it is too hard
 304  
  * to compute a message that results in a certain check value. In this case, we
 305  
  * assume the attacker knows the key and the contents of a message that has the
 306  
  * desired check value, and wants to compute a bogus message having the same
 307  
  * check value. There are two obvious ways to do this attack. One is to solve
 308  
  * for a sequence that will restore the state of the permutation vector and
 309  
  * indices back to what it was before the alteration. The other one is the
 310  
  * so-called "birthday" attack that is to cryptographic hash functions what
 311  
  * brute force is to key search.
 312  
  * </p>
 313  
  * 
 314  
  * <p>
 315  
  * To generate a sequence that restores the state of the cipher to what it was
 316  
  * before the alteration probably requires at least 256 bytes, since the index
 317  
  * "rotor" marches steadily on its cycle, one by one. The values to do this
 318  
  * cannot easily be computed, due to the nonlinearity of the feedback, so there
 319  
  * would probably have to be lots of trial and error involved. In practical
 320  
  * applications, this would leave a gaping block of binary garbage in the middle
 321  
  * of a document, and would be quite obvious, so this is not a practical attack,
 322  
  * even if you could figure out how to do it (and I haven't). If anyone has a
 323  
  * method to solve for such a block of data, though, I would be most interested
 324  
  * in finding out what it is. Please email me at
 325  
  * &lt;m.p.johnson&#064;ieee.org&gt; if you find one.
 326  
  * </p>
 327  
  * 
 328  
  * <p>
 329  
  * The "birthday" attack just uses the birthday paradox to find a message that
 330  
  * has the same check value. With a 20 byte check value, you would have to find
 331  
  * at least 80 bits to change in the text such that they wouldn't be noticed (a
 332  
  * plausible situation), then try the combinations until one matches. 2 to the
 333  
  * 80th power is a big number, so this isn't practical either. If this number
 334  
  * isn't big enough, you are free to generate a longer check value with this
 335  
  * algorithm. Someone who likes 16 byte keys might prefer 32 byte check values
 336  
  * for similar stringth.
 337  
  * </p>
 338  
  * 
 339  
  * 
 340  
  * <h2>ADAPTIVE CHOSEN PLAIN TEXT ATTACKS</h2>
 341  
  * 
 342  
  * <p>
 343  
  * Let us give the attacker a keyed black box that accepts any input and
 344  
  * provides the corresponding output. Let us also provide a signal to the black
 345  
  * box that causes it to reoriginate (revert to its initial keyed state) at the
 346  
  * attacker's will. Let us also be really generous and provide a free copy of
 347  
  * the black box, identical in all respects except that the key is not provided
 348  
  * and it is not locked, so the array can be manipulated directly.
 349  
  * </p>
 350  
  * 
 351  
  * <p>
 352  
  * Since each byte encrypted only modifies at most 5 of the 256 bytes in the
 353  
  * permutation vector, and it is possible to find different sequences of two
 354  
  * bytes that leave the five index variables the same, it is possible for the
 355  
  * attacker to find sets of chosen plain texts that differ in two bytes, but
 356  
  * which have cipher texts that are the same for several of the subsequent
 357  
  * bytes. Modeling indicates that as many as ten of the following bytes
 358  
  * (although not necessarily the next ten bytes) might match. This information
 359  
  * would be useful in determining the structure of the Sapphire Stream Cipher
 360  
  * based on a captured, keyed black box. This means that it would not be a good
 361  
  * substitute for the Skipjack algorithm in the EES, but we assume that the
 362  
  * attacker already knows the algorithm, anyway. This departure from the
 363  
  * statistics expected from an ideal stream cipher with feedback doesn't seem to
 364  
  * be useful for determining any key bytes or permutation vector bytes, but it
 365  
  * is the reason why post-conditioning is required when computing a
 366  
  * cryptographic hash with the Sapphire Stream Cipher. Thanks to Bryan G.
 367  
  * Olson's &lt;olson&#064;umbc.edu&gt; continued attacks on the Sapphire Stream
 368  
  * Cipher, I have come up with the Sapphire II Stream Cipher. Thanks again to
 369  
  * Bryan for his valuable help.
 370  
  * </p>
 371  
  * 
 372  
  * <p>
 373  
  * Bryan Olson's "differential" attack of the original Sapphire Stream Cipher
 374  
  * relies on both of these facts:
 375  
  * </p>
 376  
  * 
 377  
  * <ol>
 378  
  * <li>By continual reorigination of a black box containing a keyed version of
 379  
  * the Sapphire Stream Cipher, it is possible to find a set of input strings
 380  
  * that differ only in the first two (or possibly three) bytes that have
 381  
  * identical output after the first three (or possibly four) bytes. The output
 382  
  * suffixes so obtained will not contain the values of the permutation vector
 383  
  * bytes that <i>differ</i> because of the different initial bytes encrypted.</li>
 384  
  * 
 385  
  * <li>Because the five index values are initialized to constants that are known
 386  
  * by the attacker, most of the locations of the "missing" bytes noted in the
 387  
  * above paragraph are known to the attacker (except for those indexed by the
 388  
  * ratchet index variable for encryptions after the first byte).</li>
 389  
  * </ol>
 390  
  * 
 391  
  * <p>
 392  
  * I have not yet figured out if Bryan's attack on the original Sapphire Stream
 393  
  * Cipher had complexity of more or less than the design strength goal of 2^64
 394  
  * encryptions, but some conservative estimations I made showed that it could
 395  
  * possibly come in significantly less than that. (I would probably have to
 396  
  * develop a full practical attack to accurately estimate the complexity more
 397  
  * accurately, and I have limited time for that). Fortunately, there is a way to
 398  
  * frustrate this type of attack without fully developing it.
 399  
  * </p>
 400  
  * 
 401  
  * <p>
 402  
  * Denial of condition 1 above by increased alteration of the state variables is
 403  
  * too costly, at least using the methods I tried. For example, doubling the
 404  
  * number of index variables and the number of permutation vector items
 405  
  * referenced in the output function of the stream cipher provides only doubles
 406  
  * the cost of getting the data in item 1, above. This is bad crypto-economics.
 407  
  * A better way is to change the output function such that the stream cipher
 408  
  * output byte is a combination of two permutation vector bytes instead of one.
 409  
  * That means that all possible output values can occur in the differential
 410  
  * sequences of item 1, above.
 411  
  * </p>
 412  
  * 
 413  
  * <p>
 414  
  * Denial of condition 2 above, is simpler. By making the initial values of the
 415  
  * five index variables dependent on the key, Bryan's differential attack is
 416  
  * defeated, since the attacker has no idea which elements of the permutation
 417  
  * vector were different between data sets, and exhaustive search is too
 418  
  * expensive.
 419  
  * </p>
 420  
  * 
 421  
  * 
 422  
  * <h2>OTHER HOLES</h2>
 423  
  * 
 424  
  * <p>
 425  
  * Are there any? Take you best shot and let me know if you see any. I offer no
 426  
  * challenge text with this algorithm, but you are free to use it without
 427  
  * royalties to me if it is any good.
 428  
  * </p>
 429  
  * 
 430  
  * 
 431  
  * <h2>CURRENT STATUS</h2>
 432  
  * 
 433  
  * <p>
 434  
  * This is a new (to the public) cipher, and an even newer approach to
 435  
  * cryptographic hash generation. Take your best shot at it, and please let me
 436  
  * know if you find any weaknesses (proven or suspected) in it. Use it with
 437  
  * caution, but it still looks like it fills a need for reasonably strong
 438  
  * cryptography with limited resources.
 439  
  * </p>
 440  
  * 
 441  
  * 
 442  
  * <h2>LEGAL STUFF</h2>
 443  
  * 
 444  
  * <p>
 445  
  * The intention of this document is to share some research results on an
 446  
  * informal basis. You may freely use the algorithm and code listed above as far
 447  
  * as I'm concerned, as long as you don't sue me for anything, but there may be
 448  
  * other restrictions that I am not aware of to your using it. The C++ code
 449  
  * fragment above is just intended to illustrate the algorithm being discussed,
 450  
  * and is not a complete application. I understand this document to be
 451  
  * Constitutionally protected publication, and not a munition, but don't blame
 452  
  * me if it explodes or has toxic side effects.
 453  
  * </p>
 454  
  * 
 455  
  * <pre>
 456  
  *                   ___________________________________________________________
 457  
  *                  |                                                           |
 458  
  *  |\  /| |        | Michael Paul Johnson  Colorado Catacombs BBS 303-772-1062 |
 459  
  *  | \/ |o|        | PO Box 1151, Longmont CO 80502-1151 USA      John 3:16-17 |
 460  
  *  |    | | /  _   | mpj&#064;csn.org aka mpj&#064;netcom.com m.p.johnson&#064;ieee.org       |
 461  
  *  |    |||/  /_\  | ftp://ftp.csn.net/mpj/README.MPJ          CIS: 71331,2332 |
 462  
  *  |    |||\  (    | ftp://ftp.netcom.com/pub/mp/mpj/README  -. --- ----- .... |
 463  
  *  |    ||| \ \_/  | PGPprint=F2 5E A1 C1 A6 CF EF 71  12 1F 91 92 6A ED AE A9 |
 464  
  *                  |___________________________________________________________|
 465  
  * </pre>
 466  
  * 
 467  
  * Regarding this port to Java and not the original code, the following license
 468  
  * applies:
 469  
  * 
 470  
  * @see gnu.lgpl.License The GNU Lesser General Public License for details.
 471  
  * @author Michael Paul Johnson [ kahunapule at mpj dot cx] Original code
 472  
  * @author unascribed Sword's C++ implementation
 473  
  * @author DM Smith Java port from Sword's C++ implementation
 474  
  */
 475  
 public class Sapphire {
 476  
 
 477  
     /**
 478  
      * Construct a Sapphire Stream Cipher from a key, possibly null or empty.
 479  
      * 
 480  
      * @param aKey the cipher key
 481  
      */
 482  0
     public Sapphire(byte[] aKey) {
 483  0
         byte[] key = aKey;
 484  0
         if (key == null) {
 485  0
             key = new byte[0];
 486  
         }
 487  0
         cards = new int[256];
 488  0
         if (key.length > 0) {
 489  0
             initialize(key);
 490  
         } else {
 491  0
             hashInit();
 492  
         }
 493  0
     }
 494  
 
 495  
     /**
 496  
      * Decipher a single byte, presumably the next.
 497  
      * 
 498  
      * @param b
 499  
      *            the next byte to decipher
 500  
      * @return the enciphered byte
 501  
      */
 502  
     public byte cipher(byte b) {
 503  
         // Picture a single enigma rotor with 256 positions, rewired
 504  
         // on the fly by card-shuffling.
 505  
 
 506  
         // This cipher is a variant of one invented and written
 507  
         // by Michael Paul Johnson in November, 1993.
 508  
 
 509  
         // Shuffle the deck a little more.
 510  
 
 511  
         // Convert from a byte to an int, but prevent sign extension.
 512  
         // So -16 becomes 240
 513  0
         int bVal = b & 0xFF;
 514  0
         ratchet += cards[rotor++];
 515  
         // Keep ratchet and rotor in the range of 0-255
 516  
         // The C++ code relied upon overflow of an unsigned char
 517  0
         ratchet &= 0xFF;
 518  0
         rotor &= 0xFF;
 519  0
         int swaptemp = cards[lastCipher];
 520  0
         cards[lastCipher] = cards[ratchet];
 521  0
         cards[ratchet] = cards[lastPlain];
 522  0
         cards[lastPlain] = cards[rotor];
 523  0
         cards[rotor] = swaptemp;
 524  0
         avalanche += cards[swaptemp];
 525  
         // Keep avalanche in the range of 0-255
 526  0
         avalanche &= 0xFF;
 527  
 
 528  
         // Output one byte from the state in such a way as to make it
 529  
         // very hard to figure out which one you are looking at.
 530  0
         lastPlain = bVal ^ cards[(cards[ratchet] + cards[rotor]) & 0xFF] ^ cards[cards[(cards[lastPlain] + cards[lastCipher] + cards[avalanche]) & 0xFF]];
 531  
 
 532  0
         lastCipher = bVal;
 533  
 
 534  
         // Convert back to a byte
 535  
         // E.g. 240 becomes -16
 536  0
         return (byte) lastPlain;
 537  
     }
 538  
 
 539  
     /**
 540  
      * Destroy the key and state information in RAM.
 541  
      */
 542  
     public void burn() {
 543  
         // Destroy the key and state information in RAM.
 544  0
         for (int i = 0; i < 256; i++) {
 545  0
             cards[i] = 0;
 546  
         }
 547  0
         rotor = 0;
 548  0
         ratchet = 0;
 549  0
         avalanche = 0;
 550  0
         lastPlain = 0;
 551  0
         lastCipher = 0;
 552  0
     }
 553  
 
 554  
     /**
 555  
      * @param hash the destination
 556  
      */
 557  
     public void hashFinal(byte[] hash) { // Destination
 558  0
         for (int i = 255; i >= 0; i--) {
 559  0
             cipher((byte) i);
 560  
         }
 561  0
         for (int i = 0; i < hash.length; i++) {
 562  0
             hash[i] = cipher((byte) 0);
 563  
         }
 564  0
     }
 565  
 
 566  
     /**
 567  
      * Initializes the cards array to be deterministically random based upon the
 568  
      * key.
 569  
      * <p>
 570  
      * Key size may be up to 256 bytes. Pass phrases may be used directly, with
 571  
      * longer length compensating for the low entropy expected in such keys.
 572  
      * Alternatively, shorter keys hashed from a pass phrase or generated
 573  
      * randomly may be used. For random keys, lengths of from 4 to 16 bytes are
 574  
      * recommended, depending on how secure you want this to be.
 575  
      * </p>
 576  
      * 
 577  
      * @param key
 578  
      *            used to initialize the cipher engine.
 579  
      */
 580  
     private void initialize(byte[] key) {
 581  
 
 582  
         // Start with cards all in order, one of each.
 583  0
         for (int i = 0; i < 256; i++) {
 584  0
             cards[i] = i;
 585  
         }
 586  
 
 587  
         // Swap the card at each position with some other card.
 588  
         int swaptemp;
 589  0
         int toswap = 0;
 590  0
         keypos = 0; // Start with first byte of user key.
 591  0
         rsum = 0;
 592  0
         for (int i = 255; i >= 0; i--) {
 593  0
             toswap = keyrand(i, key);
 594  0
             swaptemp = cards[i];
 595  0
             cards[i] = cards[toswap];
 596  0
             cards[toswap] = swaptemp;
 597  
         }
 598  
 
 599  
         // Initialize the indices and data dependencies.
 600  
         // Indices are set to different values instead of all 0
 601  
         // to reduce what is known about the state of the cards
 602  
         // when the first byte is emitted.
 603  0
         rotor = cards[1];
 604  0
         ratchet = cards[3];
 605  0
         avalanche = cards[5];
 606  0
         lastPlain = cards[7];
 607  0
         lastCipher = cards[rsum];
 608  
 
 609  
         // ensure that these have no useful values to those that snoop
 610  0
         toswap = 0;
 611  0
         swaptemp = toswap;
 612  0
         rsum = swaptemp;
 613  0
         keypos = rsum;
 614  0
     }
 615  
 
 616  
     /**
 617  
      * Initialize non-keyed hash computation.
 618  
      */
 619  
     private void hashInit() {
 620  
 
 621  
         // Initialize the indices and data dependencies.
 622  0
         rotor = 1;
 623  0
         ratchet = 3;
 624  0
         avalanche = 5;
 625  0
         lastPlain = 7;
 626  0
         lastCipher = 11;
 627  
 
 628  
         // Start with cards all in inverse order.
 629  
 
 630  0
         int j = 255;
 631  0
         for (int i = 0; i < 256; i++) {
 632  0
             cards[i] = j--;
 633  
         }
 634  0
     }
 635  
 
 636  
     private int keyrand(int limit, byte[] key) {
 637  
         int u; // Value from 0 to limit to return.
 638  
 
 639  0
         if (limit == 0) {
 640  0
             return 0; // Avoid divide by zero error.
 641  
         }
 642  
 
 643  0
         int retryLimiter = 0; // No infinite loops allowed.
 644  
 
 645  
         // Fill mask with enough bits to cover the desired range.
 646  0
         int mask = 1;
 647  0
         while (mask < limit) {
 648  0
             mask = (mask << 1) + 1;
 649  
         }
 650  
 
 651  
         do {
 652  
             // Convert a byte from the key to an int, but prevent sign
 653  
             // extension.
 654  
             // So -16 becomes 240
 655  
             // Also keep rsum in the range of 0-255
 656  
             // The C++ code relied upon overflow of an unsigned char
 657  0
             rsum = (cards[rsum] + (key[keypos++] & 0xFF)) & 0xFF;
 658  
 
 659  0
             if (keypos >= key.length) {
 660  0
                 keypos = 0; // Recycle the user key.
 661  0
                 rsum += key.length; // key "aaaa" != key "aaaaaaaa"
 662  0
                 rsum &= 0xFF;
 663  
             }
 664  
 
 665  0
             u = mask & rsum;
 666  
 
 667  0
             if (++retryLimiter > 11) {
 668  0
                 u %= limit; // Prevent very rare long loops.
 669  
             }
 670  0
         } while (u > limit);
 671  0
         return u;
 672  
     }
 673  
 
 674  
     private int[] cards;
 675  
     private int rotor;
 676  
     private int ratchet;
 677  
     private int avalanche;
 678  
     private int lastPlain;
 679  
     private int lastCipher;
 680  
     private int keypos;
 681  
     private int rsum;
 682  
 }