Alright, so I recently obtained a reference implementation of MD5. I benchmarked it, and it ran about 30% faster than OpenSSL's MD5. The function is called like this:
fast_MD5(unsigned char *pData, int len, unsigned char *pDigest);
It worked great for a simple MD5 hash cracker, but I had trouble adapting it for the HMAC-MD5 step in my Kerberos 5 cracker. With OpenSSL, the HMAC-MD5 involves the following:
MD5_Init(&ctx);
MD5_Update(&ctx,k_ipad,64);
MD5_Update(&ctx,(uint8_t*)&T,4);
MD5_Final(K1,&ctx);
MD5_Init(&ctx);
MD5_Update(&ctx,k_opad,64);
MD5_Update(&ctx,K1,16);
MD5_Final(K1,&ctx);
in addition to other steps. Note that MD5_Update is called twice so that the two buffers are hashed as one concatenated message. The fast_MD5 function does not give me the flexibility to do it this way, so I tried this:
unsigned char feedbuffer[100];
memcpy(feedbuffer,k_ipad,64);
memcpy(feedbuffer+64,(uint8_t*)&T,4);
fast_MD5(feedbuffer,68,K1);
memcpy(feedbuffer,k_opad,64);
memcpy(feedbuffer+64,K1,16);
fast_MD5(feedbuffer,80,K1);
and it worked (concatenate buffer1 and buffer2 into buffer3 with memcpy, then hash the resulting buffer3), but it ran slower than before, when I was using OpenSSL!
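To make the equivalence concrete, here is a minimal sketch using a stand-in streaming hash (32-bit FNV-1a, not MD5; the `toy_*` and `hash_*` names are made up for illustration). It only shows that two successive update calls give the same result as one call over the concatenated buffer, which is the property both versions above rely on. It says nothing about speed.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Stand-in streaming hash (32-bit FNV-1a). NOT MD5: it only mimics the
   Init/Update/Final shape so the equivalence can be checked. */
typedef struct { uint32_t h; } toy_ctx;

static void toy_init(toy_ctx *c) { c->h = 2166136261u; }

static void toy_update(toy_ctx *c, const unsigned char *p, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        c->h ^= p[i];
        c->h *= 16777619u;
    }
}

static uint32_t toy_final(toy_ctx *c) { return c->h; }

/* Two updates, like MD5_Update(k_ipad) followed by MD5_Update(T). */
static uint32_t hash_streamed(const unsigned char *a, size_t na,
                              const unsigned char *b, size_t nb)
{
    toy_ctx c;
    toy_init(&c);
    toy_update(&c, a, na);
    toy_update(&c, b, nb);
    return toy_final(&c);
}

/* One update over a || b, like the memcpy + one-shot approach. */
static uint32_t hash_concatenated(const unsigned char *a, size_t na,
                                  const unsigned char *b, size_t nb)
{
    unsigned char buf[128];   /* assumes na + nb <= 128 */
    toy_ctx c;
    memcpy(buf, a, na);
    memcpy(buf + na, b, nb);
    toy_init(&c);
    toy_update(&c, buf, na + nb);
    return toy_final(&c);
}
```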
I thought this might have something to do with RAM access speed versus L2 cache access speed, so I tried this, hoping it would fix the problem:
unsigned char feedbuffer[100];
for(int i=0;i<64;i++)
    feedbuffer[i]=k_ipad[i];
memcpy(feedbuffer+64,(uint8_t*)&T,4);//I didn't know how to convert this...
fast_MD5(feedbuffer,68,K1);
for(int i=0;i<64;i++)
    feedbuffer[i]=k_opad[i];
for(int i=0;i<16;i++)
    feedbuffer[i+64]=K1[i];
fast_MD5(feedbuffer,80,K1);
but again, it ran slower than before!
Is there some trick to do this faster? I don't understand, because I looked at the source code for the MD5 implementation in RFC 1321, and its MD5Update function copies the input into the context buffer (via MD5_memcpy) before processing it. So why should it run slower if I am performing the same copy outside the function, rather than inside?
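One variation that may be worth benchmarking is to skip the 64-byte copies entirely: build the padded key directly inside two persistent message buffers, then write only the 4-byte T and the 16-byte inner digest per call. The sketch below is untested for speed and rests on assumptions: `toy_hash16` is a stand-in with the same argument shape as fast_MD5 (it is not MD5), `hmac_bufs` and its helpers are hypothetical names, T is assumed to be a 4-byte integer, and the key is assumed to be at most 64 bytes.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for fast_MD5 with the same (pData, len, pDigest) shape.
   NOT MD5: four FNV-1a lanes just to produce 16 deterministic bytes. */
static void toy_hash16(const unsigned char *pData, int len,
                       unsigned char *pDigest)
{
    for (int lane = 0; lane < 4; lane++) {
        uint32_t h = 2166136261u + (uint32_t)lane;
        for (int i = 0; i < len; i++) {
            h ^= pData[i];
            h *= 16777619u;
        }
        memcpy(pDigest + 4 * lane, &h, 4);
    }
}

/* Hypothetical helper: persistent message buffers, pads built in place. */
typedef struct {
    unsigned char inner[64 + 4];   /* (key ^ ipad) || T  */
    unsigned char outer[64 + 16];  /* (key ^ opad) || K1 */
} hmac_bufs;

/* Run once per key; assumes keylen <= 64. */
static void hmac_bufs_set_key(hmac_bufs *b, const unsigned char *key,
                              int keylen)
{
    memset(b->inner, 0x36, 64);
    memset(b->outer, 0x5c, 64);
    for (int i = 0; i < keylen; i++) {
        b->inner[i] ^= key[i];
        b->outer[i] ^= key[i];
    }
}

/* Per message: only T is copied; the inner digest is written
   straight into the tail of the outer buffer. */
static void hmac_bufs_digest(hmac_bufs *b, uint32_t T, unsigned char *K1)
{
    memcpy(b->inner + 64, &T, 4);
    toy_hash16(b->inner, 68, b->outer + 64);
    toy_hash16(b->outer, 80, K1);
}

/* Copy-based version from the post, kept for comparison. */
static void hmac_copy_digest(const unsigned char *k_ipad,
                             const unsigned char *k_opad,
                             uint32_t T, unsigned char *K1)
{
    unsigned char feedbuffer[100];
    memcpy(feedbuffer, k_ipad, 64);
    memcpy(feedbuffer + 64, &T, 4);
    toy_hash16(feedbuffer, 68, K1);
    memcpy(feedbuffer, k_opad, 64);
    memcpy(feedbuffer + 64, K1, 16);
    toy_hash16(feedbuffer, 80, K1);
}
```

With the real function, `toy_hash16` would be replaced by fast_MD5. Whether this helps depends on where the time actually goes; if the key changes for every candidate, `hmac_bufs_set_key` still runs each time, so the saving is limited to not building k_ipad and k_opad separately and then copying them.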
Please help! Thanks!