I'm doing some research to determine the most efficient way to copy files. I've got 3 candidate functions:
#1
# uses generator to load segments to memory
def copy(src, dst, iteration=1000):
    for x in xrange(iteration):
        def _write(filesrc, filedst):
            filegen = iter(lambda: filesrc.read(16384), "")
            try:
                while True:
                    filedst.write(filegen.next())
            except StopIteration:
                pass
        with open(src, 'rb') as fsrc:
            with open(dst, 'wb') as fdst:
                _write(fsrc, fdst)
#2
# loads entire file to memory
def copy2(src, dst, iteration=1000):
    for x in xrange(iteration):
        with open(src, 'rb') as fsrc:
            with open(dst, 'wb') as fdst:
                fdst.write(fsrc.read())
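#3
# reads 16 KB segments with a sentinel iterator and a plain for loop
def copy3(src, dst, iteration=1000):
    for x in xrange(iteration):
        with open(src, 'rb') as fsrc:
            with open(dst, 'wb') as fdst:
                # iter() stops when read() returns the empty string at EOF
                for chunk in iter(lambda: fsrc.read(16384), ""):
                    fdst.write(chunk)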
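The wrapper that prints the "took ... seconds" lines in the results is not shown. A minimal sketch of one way to get that kind of output, assuming a hypothetical decorator named `timed` (the name and the stand-in workload `busy_wait` are made up for illustration, not taken from the actual test code):

```python
import functools
import time


def timed(func):
    """Hypothetical helper: print how long each call to func takes."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        elapsed = time.time() - start
        # Same shape as the "copy took N seconds" lines in the results
        print("%s took %s seconds" % (func.__name__, elapsed))
        return result
    return wrapper


@timed
def busy_wait(n):
    # Stand-in workload so the sketch runs on its own
    return sum(range(n))
```

Decorating each copy function with `@timed` would time all iterations of a call together, which matches how the results are reported per call.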
System Environment:
Windows 7 64-bit
3 GB RAM
Intel Core 2 Duo @ 2.0 GHz
The results
When the file size is 1 MB, 1000 iterations each:
>>> copy(SRC, DST)
copy took 5.96600008011 seconds
>>> copy2(SRC,DST)
copy2 took 3.85299992561 seconds
>>> copy3(SRC, DST)
copy3 took 5.35699987411 seconds
At 1 MB, the fastest function is copy2, the one that loads the entire file into memory.
When the file size is 107 MB, 5 iterations each:
>>> copy(SRC, DST, 5)
copy took 3.04099988937 seconds
>>> copy2(SRC,DST, 5)
copy2 took 17.0360000134 seconds
>>> copy3(SRC, DST, 5)
copy3 took 2.2429997921 seconds
At 107 MB, loading the entire file into memory (copy2) is now the slowest by far: about 5.6 times slower than copy and about 7.6 times slower than copy3.
I thought the results were interesting. If anyone has a more efficient function, feel free to contribute.