You would lose speed, though, due to the function calls, as we saw from the tests on Nezachem's version. And he still had to zip the results together, so the whole thing was loaded into memory, anyway. There might be a way to combine the best of both worlds, though.
kylealanhale 0 Newbie Poster
TrustyTony 888 pyMod Team Colleague Featured Poster
> That ends the debate.
Well, as programmers we must realize that the debate never ends. Besides, neither approach looks pythonic enough to me. Since you already have the performance test set up, could you add the following?
def loop(text):
    def looper(t):
        while True:
            for c in t:
                yield c
    return looper(text)

def crypt(text, passwd):
    crypto = []
    for (t, p) in zip(text, loop(passwd)):
        crypto.append(chr(ord(t) ^ ord(p)))
    return ''.join(crypto)
Timing:
crypt3 took 1272 ms.
File length: 89729
897306 function calls in 2.529 CPU seconds
This code is a bit funny. Here is my version with DIY looping (what cycle does in itertools):
def crypt8(text, passwd):
    def loopord(text):
        while True:
            for c in text:
                yield ord(c)
    return ''.join([chr(ord(t) ^ p) for (t, p) in zip(text, loopord(passwd))])
crypt8 took 1019 ms.
File length: 89729
717846 function calls in 1.993 CPU seconds
The itertools implementation did a little better still:
crypt5 took 986 ms.
File length: 89729
717844 function calls in 1.976 CPU seconds
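The source of crypt5 is not shown in the thread. As a rough sketch only, an itertools-based variant might look like the following, assuming Python 3 and bytes input; the name crypt_cycle is hypothetical and not from the original test file.

```python
from itertools import cycle


def crypt_cycle(data: bytes, password: bytes) -> bytes:
    """XOR each byte of data with the password, repeated as needed."""
    if not password:
        raise ValueError("password must not be empty")
    # zip() stops at the end of data, so the endless cycle() is safe here.
    return bytes(d ^ p for d, p in zip(data, cycle(password)))


if __name__ == "__main__":
    secret = crypt_cycle(b"hello world", b"key")
    # Applying the same password a second time restores the original.
    print(crypt_cycle(secret, b"key"))
```

Because XOR is its own inverse, the same function serves for both crypting and decrypting.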
Edited by TrustyTony because: correct itertools time
TrustyTony 888 pyMod Team Colleague Featured Poster
I have attached the updated testing script if you want to see it.
I got only the announcement print from the first test in the terminal, so I put the messages inside the test and redirected them to sys.stderr.
import sys
import cProfile as profile
from time import clock

__all__ = ['Tester', 'do_tests']

class Tester(object):
    '''Decorator for all tested function'''
    def __init__(self,fn):
        self.name = fn.__name__
        self.fn = fn
    def __call__(self, *args):
        t = clock()
        ret = self.fn(*args)
        print >>sys.stderr,"Testing", self.fn.__name__
        tt = clock()-t
        try:
            assert self.fn(ret,args[1]) == args[0]
        except AssertionError:
            print self.fn.__name__, "failed the decrypt test."
            return
        print self.fn.__name__, "took %i ms." % (tt * 1000)
        print "File length:",len(ret),"\n\n"

def do_tests(file_name, password, tests):
    '''tests must be a list with functions to test in it (list items are function type)'''
    f = open(file_name,"r")
    ftext = f.read()
    f.close()
    for test in tests:
        sys.stdout = open("{0}_{1}.txt".format(test.name,"results"),"w")
        profile.runctx("tests[{funcindex}]({filetext},'{pw}')".format(funcindex=tests.index(test),
                                                                       filetext="'''{0}'''".format(ftext),
                                                                       pw=password),globals(),locals())
        sys.stdout = sys.__stdout__
Edited by TrustyTony because: stderr
jcao219 18 Posting Pro in Training
> I got only announcement print from the first test to terminal
That means you are using IDLE or something, so the regular stdout isn't a console window.
TrustyTony 888 pyMod Team Colleague Featured Poster
You are right: the original works if run directly, and as usual the execution is a little faster that way, though the difference is not as big as on some other occasions.
Here would be a nice thing to test, actually, from today's announcement: the Assembly code module. Are you handy with SSE instructions?
(http://www.tahir007.com/?view=examples)
I know x86 assembly only very superficially (enough to know it is a mess). I knew Z80 and ARM better, though those may be rusty by now.
Of course, in that case it would be better to improve AES functions (whose current state I know nothing about) or something similar.
By the way, for these xor functions the effect of psyco looks minimal.
Edited by TrustyTony because: ASM dialects
jcao219 18 Posting Pro in Training
> You are right the original works if run directly and like normally the speed of execution is little faster, not so much different than some other occasions though.
> Here would be nice piece to test actually the announcement of today: Assembly code module. Are you handy with SSE instructions?
> (http://www.tahir007.com/?view=examples)
> Of course that case better to improve AES functions (of which current state I do not know anything) or something.
> By the way for these xor functions effect of psyco looks minimal.
Eh.. I barely know anything about asm.
kylealanhale 0 Newbie Poster
Just in case any future reader of this thread wants a slightly more readable version of tonyjv's winning XOR crypt function:
def crypt(text, password):
    password_length = len(password)
    password = [ord(character) for character in password]
    text = [ord(character) ^ password[index % password_length] for (index, character) in enumerate(text)]
    return ''.join([chr(character_code) for character_code in text])
And, for fun:
def crypt(t, p):
    l = len(p)
    p = [ord(c) for c in p]
    t = [ord(c) ^ p[i % l] for (i, c) in enumerate(t)]
    return ''.join([chr(c) for c in t])
TrustyTony 888 pyMod Team Colleague Featured Poster
With my modified version of the test function, which also prints the running time to the terminal, I reran the test, renaming the slow function to crypt1. So this last crypt is crypt.
At first I thought your code was a little slower than crypt6 and crypt7 in my file, but after moving your function so that it was not the first one tested, the timing changed. So it looks like this timing function is not correct for the first tested function.
Interestingly, the cryptic version crpt got a worse time!
I got the following results:
Testing crypt1 took 11281 ms.
Testing crypt2 took 716 ms.
Testing crypt3 took 1195 ms.
Testing crypt4 took 1161 ms.
Testing crypt5 took 953 ms.
Testing crypt6 took 488 ms.
Testing crypt7 took 483 ms.
Testing crypt took 477 ms.
Testing crpt took 491 ms.
Testing crypt_loop took 958 ms.
Testing crypt_oneliner took 737 ms.
Enter
Good coding!
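The thread does not establish why the first tested function came out slower. One possible cause (an assumption, not confirmed here) is one-off start-up cost landing in the first measurement. A minimal sketch of a timing helper that does a warm-up call and keeps the best of several runs, assuming Python 3; best_time_ms is a hypothetical name and crypt is a bytes port of the readable version above:

```python
import timeit


def crypt(text: bytes, password: bytes) -> bytes:
    """Python 3 port of the readable version, working on bytes."""
    n = len(password)
    return bytes(c ^ password[i % n] for i, c in enumerate(text))


def best_time_ms(fn, text, password, repeats=5):
    """Return the best of several single runs in milliseconds."""
    fn(text, password)  # warm-up call, deliberately not measured
    timer = timeit.Timer(lambda: fn(text, password))
    return min(timer.repeat(repeat=repeats, number=1)) * 1000.0


if __name__ == "__main__":
    sample = b"some sample text " * 1000
    print("crypt took %.2f ms" % best_time_ms(crypt, sample, b"Cold Roses"))
```

Taking the minimum of repeated runs makes the result less sensitive to the order in which the functions are tested.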
Tahir007 10 Newbie Poster
> You are right the original works if run directly and like normally the speed of execution is little faster, not so much different than some other occasions though.
> Here would be nice piece to test actually the announcement of today: Assembly code module. Are you handy with SSE instructions?
> (http://www.tahir007.com/?view=examples)
> I only know very superficially x86 assembly (to know it is a mess), I knew better Z80 and ARM, now maybe those already rusty.
> Of course that case better to improve AES functions (of which current state I do not know anything) or something.
> By the way for these xor functions effect of psyco looks minimal.
Here is a trivial implementation of crypt in Tdasm. :-)
Try crypting files of different sizes to see the difference in speed. :-)
On my machine:
600 KB pdf file - ~3-4 ms Tdasm implementation
600 KB pdf file - ~240 ms your implementation
Here is the source:
<pre>
from tdasm import runtime
import array
import timeit

CRYPT_ASM = """
#DATA
uint32 len_pass, addr_pass
uint32 len_text, addr_text
#CODE
    xor eax, eax
    xor ebx, ebx ; clear eax and ebx registers
    mov ecx, dword [len_text]
loop1:
    dec ecx
    mov eax, ecx
    mov edx, 0
    div dword [len_pass]
    mov eax, dword [addr_pass]
    mov al, byte [eax + edx] ; load byte from password
    mov ebx, dword [addr_text]
    xor byte [ebx + ecx], al
    cmp ecx, 0
    jnz loop1
#END
"""

r = runtime.Runtime()
r.create("crypt", CRYPT_ASM)

def crypt_asm(text, password):
    ds = r.get_datasection("crypt")
    pass_arr = array.array("c", password)
    text_arr = array.array("c", text)
    address, length = pass_arr.buffer_info()
    ds["len_pass"] = length
    ds["addr_pass"] = address
    address, length = text_arr.buffer_info()
    ds["len_text"] = length
    ds["addr_text"] = address
    r.run("crypt")
    return text_arr.tostring()

def crypt(text, password):
    password_length = len(password)
    password = [ord(character) for character in password]
    text = [ord(character) ^ password[index % password_length] for (index, character) in enumerate(text)]
    return ''.join([chr(character_code) for character_code in text])

if __name__ == "__main__":
    pa = "123456"
    text = "ovo je samo za testiranje"
    fi = open("test1.pdf", "rb")
    text = fi.read()
    t = timeit.Timer(lambda : crypt(text, pa))
    print "time", t.timeit(1)
</pre>
Beat_Slayer commented: Nice code implementation +1
TrustyTony 888 pyMod Team Colleague Featured Poster
> Here is trivial implementation of crypt in Tdasm. :-)
> Try crypt different file sizes to see difference in speed. :-)
> On my machine.
> 600 KB pdf file - ~3-4 ms Tdasm implementation
> 600 KB pdf file - ~240 ms you implementation
> Here is source:
Looks like you have more practice with ASM than with [CODE] tags (maybe assembler is not even there) ;)
Your code with proper tags is a good show for your module, like I said, isn't it?
Edited by mike_2000_17 because: Fixed formatting
Tahir007 10 Newbie Poster
> Looks like you have more practice in ASM than with [CODE] tags (even mayby assembler is not there) ;)
> Your code with proper tags (good show of for your module like I said, isn't it):
You are right, I know ASM better than code tags. It's a good show for my module. :-)
Now maybe I will implement AES and SHA1 for the show. :-)
Edited by Reverend Jim because: Fixed formatting
TrustyTony 888 pyMod Team Colleague Featured Poster
Just to add results for selected versions and ASM version (crypt is the last posted version):
crypt2 took 750 ms.
crypt6 took 514 ms.
crypt7 took 505 ms.
crypt_loop took 1026 ms.
crypt_oneliner took 781 ms.
crypt took 507 ms.
crpt took 497 ms.
crypt_asm took 2 ms.
jcao219 18 Posting Pro in Training
Wow! That's some pretty good asm coding.
All I could understand was the
xor eax, eax
xor ebx, ebx
I'm impressed how fast it is.
I wonder how something in C/C++ would compare.
TrustyTony 888 pyMod Team Colleague Featured Poster
That is quite good, Jcao219, because those xors are a hackish way of clearing a register without needing the constant 0 (x xor x == 0).
>>> x=23423
>>> x ^ x
0
>>> x=192423472389
>>> x ^ x
0L
I believe the ASM is not a very optimized one, as it does not load a full register and xor 4 or 8 bytes at a time, but only one byte at a time (xor byte [ebx + ecx], al). It is a good basic version, though.
I would like to prepare some functions for writing ASM instructions in a little more readable way.
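The idea of XORing more than one byte per step can be illustrated even in pure Python. This is only a sketch of the principle, assuming Python 3 and bytes input, and it is not the SSE approach discussed for Tdasm; crypt_wide is a hypothetical name. It repeats the password to the length of the data and lets a single big-integer XOR do the work:

```python
def crypt_wide(data: bytes, password: bytes) -> bytes:
    """XOR data with the repeated password using one big-integer operation."""
    if not data:
        return b""
    if not password:
        raise ValueError("password must not be empty")
    n = len(data)
    # Repeat the password until it covers the data, then trim to length.
    key = (password * (n // len(password) + 1))[:n]
    # One XOR of two large integers handles many bytes per machine step.
    mixed = int.from_bytes(data, "big") ^ int.from_bytes(key, "big")
    # to_bytes(n, ...) keeps leading zero bytes, so the length is preserved.
    return mixed.to_bytes(n, "big")
```

The trade-off is memory: the repeated key and both integers are as large as the data itself.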
Tahir007 10 Newbie Poster
#DATA
uint32 len_pass   ; length in bytes of password
uint32 addr_pass  ; address of first byte where password begins
uint32 len_text   ; length in bytes of text
uint32 addr_text  ; address of first byte where text begins
#CODE
xor eax, eax ; eax = 0
xor ebx, ebx ; ebx = 0
mov ecx, dword [len_text] ; ecx = number of characters to crypt
; I crypt one character at a time because of the password: it can be 3 or 5 or ...
; characters long, and that is why the loop goes character by character.
; If we used some kind of padding for the password so that
; it is 4, 8, 12, ... bytes long, then it would be very
; easy to implement an MMX or SSE version that would be much faster.
loop1: ; we crypt backwards from last character to first
dec ecx ; ecx = ecx - 1; the array is indexed 0..n-1, that is why we decrement the index first
mov eax, ecx ; we put the current character index in eax
mov edx, 0 ; this is because of div; we could also use xor edx, edx :-)
div dword [len_pass] ; edx = edx:eax % length_of_password
mov eax, dword [addr_pass] ; eax = address_of_first_byte_in_password
mov al, byte [eax + edx] ; al = byte at address eax + edx, edx = index in password array
mov ebx, dword [addr_text] ; ebx = address of first byte of text to crypt
xor byte [ebx + ecx], al ; xor the byte at address ebx + ecx, ecx = current index of byte to crypt
cmp ecx, 0 ; test whether the character index has reached zero, to exit the loop
jnz loop1
#END
Here are a few more comments in the assembly code.
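For readers who do not read assembly, the commented loop can be mirrored almost line by line in Python. This is a sketch for illustration, assuming Python 3 and bytes input; crypt_backwards is a hypothetical name, and the variable names simply echo the registers.

```python
def crypt_backwards(text: bytes, password: bytes) -> bytes:
    """Walk the text from the last byte to the first, as the assembly does."""
    buf = bytearray(text)           # mutable copy, like text_arr in crypt_asm
    len_pass = len(password)
    ecx = len(buf)                  # mov ecx, dword [len_text]
    # Unlike the assembly, this loop checks first, so empty text is safe.
    while ecx != 0:
        ecx -= 1                    # dec ecx
        edx = ecx % len_pass        # div: the remainder ends up in edx
        buf[ecx] ^= password[edx]   # xor byte [ebx + ecx], al
    return bytes(buf)
```

The division on every byte is the costly step here, which is what the later version avoids.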
kylealanhale 0 Newbie Poster
Very impressive!
TrustyTony 888 pyMod Team Colleague Featured Poster
One short manual I found about x86 is http://www.acm.uiuc.edu/sigwin/old/workshops/winasmtut.pdf
Maybe time to refresh memories from around 1986....
jcao219 18 Posting Pro in Training
Very interesting! But I have no use for ASM right now.
I should learn C before learning that stuff.
TrustyTony 888 pyMod Team Colleague Featured Poster
I did a C version; C++ I do not know so well.
Unfortunately I had no energy to restudy the sweetness of C memory allocation, so this is a command-line program that works file to file.
Everything seems to work now that I fixed the obvious thing: the character read in must be declared as int, not char.
D:\test>python xorcryptp.py "Cold Roses" text_100kb.txt textp.txt
Running program took 131 ms
D:\test>xorcrypt "Cold Roses" text_100kb.txt textp.txt
102071 chars.
The total time taken by the system is: 15 ms.
D:\test>
I made a version of main which takes the same parameters, reads the file in and writes it out.
So, because file IO is so slow, I added the psyco module and got:
D:\test>python xorcryptp.py "Cold Roses" text_100kb.txt textp.txt
Running program took 56 ms
jcao219 18 Posting Pro in Training
I see.
I might make a C# version sometime, to test .NET's speed.
jcao219 18 Posting Pro in Training
I've created a C# version.
Results:
100kb file took 3ms,
1mb file took 28ms.
Tahir007 10 Newbie Poster
I see that you are trying to achieve better times. I couldn't resist writing another version of crypt.
Here is a version that is even simpler than before but twice as fast. :-)
CRYPT_ASM2 = """
#DATA
uint32 len_pass, addr_pass
uint32 len_text, addr_text
#CODE
    mov edi, dword [addr_text] ; edi = pointer to first character in text
    mov edx, dword [len_text]  ; edx = length of text
loop2:
    mov ecx, dword [len_pass]  ; ecx = length of password
    mov esi, dword [addr_pass] ; esi = pointer to first character in password
loop1:
    mov al, byte [esi]         ; al = *esi - for C programmers
    inc esi                    ; esi++ - increment pointer to next password char
    xor byte [edi], al         ; *edi ^= al
    inc edi                    ; edi++ - we just increment pointers
    dec edx                    ; check if we have crypted all the text
    cmp edx, 0                 ; edx was the length of text
    jz end1                    ; if all the text is crypted, we finish
    dec ecx                    ; check if we have looped through the whole password
    jne loop1                  ; if not, process the next character
    jmp loop2                  ; if so, start the password from the beginning
end1:
#END
"""
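The structure of this second version, two running pointers and no division, can be sketched in Python as follows. It is an illustration under the assumption of Python 3 and bytes input; crypt_pointers is a hypothetical name. Presumably the speed-up over the first assembly version comes from dropping the div instruction from the inner loop, though the thread does not measure that directly.

```python
def crypt_pointers(text: bytes, password: bytes) -> bytes:
    """Walk text and password with two indices, restarting the password."""
    if not password:
        raise ValueError("password must not be empty")
    buf = bytearray(text)
    edi = 0                       # index of the next text byte
    remaining = len(buf)          # edx = bytes still to crypt
    while remaining:              # loop2: restart the password
        for key_byte in password:     # loop1: one pass over the password
            buf[edi] ^= key_byte      # xor byte [edi], al
            edi += 1                  # inc edi
            remaining -= 1            # dec edx
            if remaining == 0:        # jz end1
                break
    return bytes(buf)
```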
jcao219 18 Posting Pro in Training
How fast, exactly?
By the way, tonyjv, I think you forgot to close the infile and outfile in your C program.
Tahir007 10 Newbie Poster
On my machine I achieve these times.
File ~1 MB - 2.3 ms
File ~150 KB - 0.39 ms
File ~26MB - 75 ms
jcao219 18 Posting Pro in Training
That's very good.
Assembly is definitely the fastest,
and then well-written C/C++,
and then C#,
and probably Java is next,
and finally Python.
TrustyTony 888 pyMod Team Colleague Featured Poster
> How fast, exactly?
> By the way, tonyjv, I think you forgot to close the infile and outfile in your C program.
Thanks, I almost never use them in Python. So I put this at the end of the program:
fclose(infile);
fclose(outfile);
return 0;
The test case was too small to measure the new version, so here are the tests with one 3MB+ file (only one Python version, for obvious reasons):
crypt_asm took 118 ms.
File length: 3919433
crypt_asm2 took 30 ms.
File length: 3919433
crypt took 22352 ms.
File length: 3919433
My C code, file to file (not comparable with the above, as those do not include file IO time):
D:\test\XorCrypting_SpeedTests>mycopy "cold roses" estonian.txt est.txt
3919433 chars.
The total time taken by the system is: 656 ms.
D:\test\XorCrypting_SpeedTests>xorcrypt "cold roses" estonian.txt est.txt
3919433 chars.
The total time taken by the system is: 765 ms.
I made a version that only copies without xor; the difference between them is 109 ms.
jcao219 18 Posting Pro in Training
So we are up to 3 MB files for crypto speed testing?
I'll do some more tomorrow.
vegaseat 1,735 DaniWeb's Hypocrite Team Colleague
This method is great for basic cryptography in Python,
however advanced and secure encryptions such as AES offer the best degree of security.
For those of you interested in that, PyCrypto is for you.
To be honest, I was more interested in this aspect of the project:
The password is looped against the file, but you can get tricky and spell forward then backward, odd/even, or every odd character twice and every even character once. This will make it harder for grandma to decipher your secret files.
It doesn't take much of a genius to recommend a compiled language, if you want to go for speed alone. Actually, the original program was written in C with some inline assembler thrown in. Later a Delphi version gave the C version a run for the money.
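The password tricks mentioned above can be sketched as small functions that expand the password before it is looped against the data. This is only an illustration, assuming Python 3 and bytes input; forward_backward, odd_twice and crypt_with are hypothetical names, and "odd" is taken to mean the 1st, 3rd, 5th, ... character.

```python
from itertools import cycle


def forward_backward(password: bytes) -> bytes:
    """Spell the password forward, then backward."""
    return password + password[::-1]


def odd_twice(password: bytes) -> bytes:
    """Use every odd-positioned character twice and every even one once."""
    out = bytearray()
    for position, byte in enumerate(password, start=1):
        out.extend([byte, byte] if position % 2 else [byte])
    return bytes(out)


def crypt_with(data: bytes, password: bytes, expand) -> bytes:
    """XOR data against the expanded password, looped as often as needed."""
    key = expand(password)
    if not key:
        raise ValueError("password must not be empty")
    return bytes(d ^ k for d, k in zip(data, cycle(key)))
```

For example, crypt_with(data, b"abc", forward_backward) loops the key b"abccba" over the data. The result is still a repeating-key XOR, only with a longer effective key.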
TrustyTony 888 pyMod Team Colleague Featured Poster
> To be honest, I was more interested in this aspect of the project:
> It doesn't take much of a genius to recommend a compiled language, if you want to go for speed alone. Actually, the original program was written in C with some inline assembler thrown in. Later a Delphi version gave the C version a run for the money.
OK, fine. Go ahead and use this nice crypto, as those are easy to m for your files but please crypt this small file attached with same password and function and save as tonyjv.txt.
No need to look at it with a text editor after crypting, just mail it to me :twisted:
jcao219 commented: Clever +1
jcao219 18 Posting Pro in Training
What surprises me is the .NET code seems to be faster.
(Measured from the creation of a stream for the input file,
to the closing of the output file stream after writing)
Size: 3928779
Elapsed milliseconds: 79