I have a collection of (potentially) several hundred thousand image files that I need to generate hash digests for, and I'm unsure of the best algorithm to use. I'll be keying them all to a database based on their hash, so I need the best algorithm for avoiding accidental collisions (where different files end up generating the same hash).
CRC32 is out of the question because, at only 32 bits, its collision threshold is fairly low, so I'm thinking either MD5 or SHA1, but I don't know which one is better for my purposes or whether there's an even better algorithm. Most of what I've found after searching recommends SHA1 as being 'more secure', but in my case I'm only concerned with accidental collisions, not intentional malicious ones. Would SHA1 still be preferable for my purposes?
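
To put rough numbers on the accidental-collision concern, here is a minimal sketch of the standard birthday-bound approximation. It assumes hash outputs are uniformly distributed and uses 500,000 files as a stand-in for "several hundred thousand"; the function name `collision_probability` is my own and purely illustrative.

```python
import math


def collision_probability(num_files: int, hash_bits: int) -> float:
    """Approximate chance that at least two of num_files share a digest.

    Uses the birthday-bound approximation p ~= 1 - exp(-n(n-1) / (2 * 2^bits)),
    which assumes the hash behaves like a uniform random function.
    """
    exponent = num_files * (num_files - 1) / (2 * 2 ** hash_bits)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-exponent)


if __name__ == "__main__":
    files = 500_000  # assumed collection size
    for name, bits in [("CRC32", 32), ("MD5", 128), ("SHA1", 160)]:
        print(f"{name:>5} ({bits:3d} bits): {collision_probability(files, bits):.3e}")
```

Under those assumptions, a 32-bit hash is almost certain to collide at this collection size, while both the 128-bit and 160-bit digests give accidental-collision odds that are vanishingly small.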