Hey guys, I have some questions about the structure of KSA. The way that I find it being explained is as follows:

The first step of this algorithm consist in initializing the S table with the
identity permutation: the values in the array are equal to their index.
Once the S array is initialized, the next step consist in shuffling the array
using the key to make it a permutation array. To do so, we simply iterate 256
times the following actions after initializing i and j to 0:
• compute j = j + S[i] + key[i mod keylength]
• swap S[i] and S[j]
• increment i

So, the way I understand it

Create an array with 256 items, each of the indexes in the array contains value = index , save it as S of i
For i = 0 to 256 (Random number or sequenced numbers?) compute j = j + S[i] + key[i mod keylength] and on for S[j] save the result S of j.

finally swap values of S of i and S of j (All values or just some) - this doesnt make sense in this way, like why would u swap same values on same indexes and that S of i is simple all numbers from 0 to 255

Also, where/how is the key itself generated?

Also, where/how is the key itself generated?

I'm glad you asked. A stream cipher (like arc4) takes in a key of n bits, and will produce a very long bitstream. The bitstream is then xor'd with the plaintext to produce the ciphertext.

The cipher itself has no concept of where the key comes from. For example, it could come from ECDH as a complex example. A simpler example might be a password.

There are a few ways to get from password to key. You can use any function that maps character-strings to b-bit bitstrings. For example, you can use a hash function (like SHA-236) to map the string to a 256-bit password. Or you can make up your own. The problem with this is that it's too easy to bruteforce/dictionary/rainbow table attack weaker passwords with modern GPU's and FPGA's. We need to slow it down. One method you can use is to iterate the hash a bunch of times (my laptop can do 20 million sha-256's in one second for reference). This is called key streching. KDF's are usually used to store passwords, and also have this key-streching property. There's currently a competition going on for memory-hard KDF's as well, which provide very strong defence against powerfull hardware.

I'll warn you about a simple attack that affects all stream cyphers: if you resuse the same key twice, you can xor the two ciphertexts back togeather to get the two plaintexts xored togeather - that's not good! This means that even if you use the same password over and over again, the key needs to be different each time (and you must include enough information in the cyphertext to recreate the key with the password)!

Since you're trying to keep it simple, here's one method you can use to derive a key:

1) Get the password from the user.
2) Hash/KDF the password into an n-bitstring.
3) Generate a n-bit randon number.
4) xor the number with the n-bitstring to get the key.
5) The first n-bits of the ciphertext are the random number.

To decrypt:

1) Get password from the user.
2) Hash/KDF the password into a n-bitstring.
3) Read the first n-bit's of the ciphertext.
4) xor the n-bitstring with the first n-bits of the ciphertext.

Here are two other things that I would keep in mind about this cryptosystem. I'm not saying it's a bad idea to use this for a little personal project, but it might be nice to be aware of:

1) Arc4 is depreiciated. It's been broken for a while, an it seems that a practical attack is around the corner (and it seems likely that some non-public agencies may have a practical attack already).

2) All stream ciphers are also vunerable to a bit-flipping attack. This can be prevented by using some kind of authentification system (hash's would work, as well as something more complicated like poly1305).

Now for your questions about the KSA.

Create an array with 256 items, each of the indexes in the array contains value = index , save it as S of i.

Correct. S[] = {1,...,255}.

(Random number or sequenced numbers?)

The first 256 non-negative integers in lexographic order: (0,1,2,...,255).

For i = 0 to 256 compute j = j + S[i] + key[i mod keylength]

A byte is always a value under 256, but j + S[i] + key[i mod keylength] can potentially generate a bigger number, so I already know there is something is off here. It turns out that the entire thing is modulus 256 (and is correct otherwise).

and on for S[j] save the result S of j.

Not quite. After computing j you swap S[j] and S[i].

Using the original posting as reference and remodeling it you what you've written in your post, it should look like this:

for i = 0 to 255
    S[i] = i
j = 0
for i = 0 to 255
    j = (j + S[i] + key[i % key-length-in-bytes]) % 256
    swap(S[i], S[j])

Hey Hiroshe! Thanks for responding and clarifying!

So, I was way off thinking that there are 2 S arrays, however, it turns out that it is the same array but 2 elements in it exchange their positions?

For example, if i = 10, for j we get some value lets say 50. Then j is pointing and is located at position 50 in the S array, then they swap their locations such as on position 10, now it will be 50 and on position 50, it will be 10, correct? Just wondering here, how do you even manage to keep track of which value is i and which is j in the resulting S array or eventually it is only j values?

And 1 more question about decrypting the key in the example you gave.

1) Get the password from the user.
2) Hash/KDF the password into an n-bitstring.
3) Generate a n-bit randon number.
4) xor the number with the n-bitstring to get the key.
5) The first n-bits of the ciphertext are the random number.

In here, n-bitstring and n-bit random number are with different size correct? So it would be n-bitstring and m-bit random number, where the number is shorter than the hashed string. Then the first m bits of the cipher text are the random number?

To decrypt:

1) Get password from the user.
2) Hash/KDF the password into a n-bitstring.
3) Read the first n-bit's of the ciphertext.
4) xor the n-bitstring with the first n-bits of the ciphertext.

To decrypt the system has to be aware of the size of the random number, m-bit number. Basically, after xoring it on the last step, you get the number and the password in plain text again, correct?

and last but not least ... how long is the key? Does it have a fixed size or it just has a maximum size? maybe?

Thanks again for responding, looking forward to hear from you again!

So, I was way off thinking that there are 2 S arrays, however, it turns out that it is the same array but 2 elements in it exchange their positions?

Yup. You're just permuting the list of numbers: (0,1,...,255). That's all the KSA does. Just swapping numbers.

For example, if i = 10, for j we get some value lets say 50. Then j is pointing and is located at position 50 in the S array, then they swap their locations such as on position 10, now it will be 50 and on position 50, it will be 10, correct?

If I understand you correctly, yes. After you get i = 10 and j = 50, you swap S[10] and S[50] (whatever values might be there). For example, if S[10] = 23 and S[50] = 6, then after the swap it will look like: S[10] = 6 and S[50] = 23.

Just wondering here, how do you even manage to keep track of which value is i and which is j in the resulting S array or eventually it is only j values?

I don't understand what you mean. You don't need to keep track of anything outside of S.

Let's go over a simplified example with 5 values (with made up j values):

start: 0,1,2,3,4

i=0, j=4: 4,1,2,3,0

i=1, j=2: 4,2,1,3,0

i=2, j=5: 4,2,0,3,1

i=3, j=1: 4,3,0,2,1

i=4, j=1: 4,1,0,2,3

And now we have S=4,1,0,2,3 which is used in the rest of the algorithm.

In here, n-bitstring and n-bit random number are with different size correct?

No. n is the same size for the entire thing (you'll probably want 256-bits so it matches with everything else).

Then the first m bits of the cipher text are the random number?

The first n-bits is just the random number, yes.

To decrypt the system has to be aware of the size of the random number, m-bit number.

Yes. The easiest way is to choose an n to use throughout the entire cryptosystem (like 256), that way you don't need to send end every which way. Note that small numbers can also be 256-bit by leading 0's.

how long is the key?

Pick a fized size for the implementation. 256-bit's is a logical choice.

Going through an example (8-bit), say that the hashed password is: 1001 0111. Then you pick a random number 0011 1101. You xor them togeather: 1010 1010. That is the key you use for the KSA. The first 8-bit's of the ciphertext will be 0011 1101.

When you decrypt, you get the hashed password, and you get the first 8-bits of the ciphertext. Xor them togeather: 1001 0111 xor 0011 1101 again gives 1010 1010. That is the key that was used for encryption and will work for decryption.

Note: Make sure that the entire system is endian independant. The largest unit of data yoou should work with is one byte (uint8_t in C/C++). You'll need to use bitwize manipulation if you needed to scale a 256-bit number into a bunch of 8-bit numbers (in the case of generating a random number, you can just assign randon values to 32 uint88_t's).

Thanks Hiroshe, very well explained and detailed! I hope this helps more fellow members with their questions about RC4

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.