I am still reading the O'Reilly book Learning Python and am not sure of the best way to approach my problem.

Given c:\dir1\dir2\dir3.

I want to zip all files in dir3 that are older than 30 days into a single zip file (e.g. dir_3_files.zip). If all files in dir3 are older than 30 days, I want to zip the whole directory instead (e.g. dir3.zip). I want to keep doing this recursively until I reach the top.

This is what I have so far, but I don't know where to interrupt os.walk to work with files in a directory. I hope this makes sense.

import os, os.path, stat, time
from datetime import date, timedelta

dirsNotUsed = []

def getdirs(basedir, age):
    for root, dirs, files in os.walk(basedir):
        basedate, lastused = datecheck(root, age)
        if lastused < basedate:             #Gets files older than (age) days
            dirsNotUsed.append(root)

def datecheck(root, age):
    basedate = date.today() - timedelta(days=age)
    used = os.stat(root).st_mtime    # st_mtime=modified, st_atime=accessed
    year, month, day = time.localtime(used)[:3]    # localtime() yields (year, month, day, ...)
    lastused = date(year, month, day)
    return basedate, lastused

def archive():
    pass

def main():
    basedir = raw_input('Choose directory to scan: ')
    age = raw_input('Only scan files older than... (days): ')
    getdirs(basedir, int(age))

if __name__ == '__main__':
    main()

This will traverse all of the subdirectories and do what I think you want it to do. I have added some print statements, which are probably enough by themselves to answer your question, as they show what root, dirs, and files contain on each pass. If you want to limit it to 3 levels, you will want to store root and the first 2 dirs, pass them as basedir, and then use just the files for that particular directory. If there are a lot of files and dirs, redirect the output to a file (on a Linux system, for example); it will be easier to read. There are other ways of doing this, such as an os.path.walk() callback, but I assume you want to continue down this road as you are following the book.

import os, os.path, stat, time
from datetime import date, timedelta

dirsNotUsed = []
def getdirs(basedir, age):
    for root, dirs, files in os.walk(basedir):
        print "root =", root
        print "dirs =", dirs
        print "files =", files

        found = True                        # Assume every file is old enough
        for file in files:
            found_file = datecheck(root, file, age)
            if not found_file:              # At least one file is not old enough
                found = False

            # Or back up each file that is old enough:
            # if found_file:
            #     backup_list.append(os.path.join(root, file))

        if found:
            archive(root, files)

def datecheck(root, file, age):
    basedate = date.today() - timedelta(days=age)
    fname = os.path.join(root, file)
    used = os.stat(fname).st_mtime    # st_mtime=modified, st_atime=accessed
    year, month, day = time.localtime(used)[:3]    # localtime() yields (year, month, day, ...)
    lastused = date(year, month, day)
    return lastused < basedate        # True if the file is older than (age) days

def archive(root, files):
    for file in files:
        fname = os.path.join(root, file)
        print "archiving", fname

if __name__ == '__main__':
    basedir = raw_input('Choose directory to scan: ')
    age = raw_input('Only scan files older than... (days): ')
    getdirs(basedir, int(age))
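
To fill in the archive() stub, here is a minimal sketch using the standard zipfile module. It is written for Python 3 (the code above is Python 2), and the names is_older_than, archive_old_files and the default archive name old_files.zip are made up for illustration. It compares modification times against a cutoff in seconds instead of converting to dates, and it only copies files into the archive; it does not delete the originals.

```python
import os
import time
import zipfile


def is_older_than(path, age_days, now=None):
    """Return True if path was last modified more than age_days days ago."""
    now = time.time() if now is None else now
    return os.stat(path).st_mtime < now - age_days * 86400


def archive_old_files(root, files, age_days, zip_name="old_files.zip"):
    """Zip the files in root that are older than age_days.

    Returns the list of file names that were archived (hypothetical helper).
    """
    # Skip a previous archive so it is never zipped into itself.
    old = [f for f in files
           if f != zip_name and is_older_than(os.path.join(root, f), age_days)]
    if not old:
        return []
    zip_path = os.path.join(root, zip_name)
    # Mode "w" overwrites any existing archive with the same name.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in old:
            # arcname stores only the file name, not the full directory path.
            zf.write(os.path.join(root, name), arcname=name)
    return old
```

Inside the os.walk loop this could be called as archive_old_files(root, files, age) in place of the print-only archive().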

For my first script, I thought I didn't do too badly. This makes much more sense to me now. Simply adding the extra print statements, the found flag, and the return values from the datecheck function really helped!

Thanks!

"By simply adding the extra print statements"

That solves a lot of problems. You can also use os.path.walk with a callback.

import os

def processDirectory(args, dirname, filenames):
    print 'Directory', dirname
    for filename in filenames:
        print '     File', filename

top_level_dir = "/usr/local"
os.path.walk(top_level_dir, processDirectory, None)

## os.path.walk() works with a callback: processDirectory() will be
## called for each directory encountered. (Python 2 only; os.path.walk()
## was removed in Python 3.)
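
Since os.path.walk() exists only in Python 2, here is a rough Python 3 sketch of the same kind of traversal with os.walk(), this time going bottom-up, which is one way to handle the "keep doing this until I reach the top" part of the original question. The function name fully_old_dirs is made up for illustration. It only looks at file modification times, treats an empty directory as old, and has no error handling (a broken symlink would make os.stat raise).

```python
import os
import time


def fully_old_dirs(basedir, age_days, now=None):
    """Return directories under basedir whose whole subtree is older than age_days."""
    now = time.time() if now is None else now
    cutoff = now - age_days * 86400
    all_old = {}  # directory path -> True if every file below it is old
    # topdown=False yields children before parents, so each subdirectory's
    # result is already recorded by the time its parent is visited.
    for root, dirs, files in os.walk(basedir, topdown=False):
        files_old = all(
            os.stat(os.path.join(root, f)).st_mtime < cutoff for f in files
        )
        # Subdirectories that were not walked (e.g. symlinks) count as not old.
        subdirs_old = all(
            all_old.get(os.path.join(root, d), False) for d in dirs
        )
        all_old[root] = files_old and subdirs_old
    return sorted(d for d, old in all_old.items() if old)
```

A directory returned here could be zipped as a whole; for any other directory you would fall back to zipping only its old files.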