never mind

Sorry about the earlier post. I realized what the problem was and I wanted to think about it for a while. Here is the question: how do I "negate" the effect of a "[" or "]" in a file name when I am globbing? For example, suppose my code has the following:

pattern = sys.argv[1]

for filename in glob.glob(pattern):
    print filename

This works fine for patterns like "*.avi". But what if I don't want to specify a file pattern and instead want to specify a single file? If I want to do something to all avi files I can call

scriptname *.avi

but sometimes I just want to do

scriptname myfile.avi

That works fine unless my filename contains certain characters such as "[" and "]", for example

scriptname "The Jackal [1997].avi"

What do I have to do to the string to get glob to match the real file name?
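
One option that sidesteps escaping altogether is to check whether the argument already names an existing file and only fall back to globbing when it does not. This is a minimal sketch in Python 3 syntax, not something proposed in the thread; the helper name expand_argument is made up for illustration.

```python
import glob
import os


def expand_argument(arg):
    """Return [arg] if it names an existing path, otherwise glob-expand it."""
    if os.path.exists(arg):
        # The argument is a real file name, so use it literally
        # even if it contains '[' or ']'.
        return [arg]
    # Otherwise treat the argument as a shell-style pattern.
    return glob.glob(arg)
```

With this, a call such as expand_argument("The Jackal [1997].avi") would return the name unchanged when that file exists, while "*.avi" would still be expanded as a pattern. The trade-off is that a pattern which happens to also be an existing file name is never expanded.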

I found a way: replace [ with [[] and ] with []]. A bracket placed inside its own one-character set is matched literally. Here is the code:

import glob
import sys
import re
s = sys.argv[1]
# wrap every "[" or "]" in its own character set so glob treats it literally
s = re.sub(r"[\[\]]", lambda m: "[%s]" % m.group(0), s)
print glob.glob(s)

""" example on the command line
$ foo.py "The Jackal [1997].avi"
['The Jackal [1997].avi']
"""

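For readers on Python 3.4 or later, the standard library has glob.escape(), which does much the same job. The sketch below (Python 3 syntax) compares it with a hand-written helper that follows the replacement described above; escape_brackets is a name made up for this example.

```python
import glob


def escape_brackets(name):
    """Wrap each '[' and ']' in its own one-character set."""
    return ''.join('[' + c + ']' if c in '[]' else c for c in name)


name = "The Jackal [1997].avi"
print(escape_brackets(name))  # The Jackal [[]1997[]].avi
print(glob.escape(name))      # The Jackal [[]1997].avi
```

glob.escape() also escapes "*" and "?", and it leaves "]" alone because a "]" without an opening "[" is already matched literally. Either result can be passed to glob.glob() to match the real file name.
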
I did much the same thing slightly differently, without re. You lose the special meaning of [] that way, though, so I added an -e flag that turns on the exact (literal) treatment of []:

import glob
import sys

# use -e flag to replace []
pattern = ''.join(c if c not in '[]' else '[' + c + ']' for c in sys.argv[2]) if sys.argv[1]=='-e' else sys.argv[1]

print '\n'.join(glob.glob(pattern))

""" Example:
ke 14.09.2011  9:33:59,78 K:\test
>python foo.py "[x-z]*.py"
xfirstsort.py
xlsx_test.py
xltest.py
xl_test.py
xor_asm.py
yieldex.py
yieldwords.py
z_ex.py

ke 14.09.2011  9:34:08,40 K:\test
>python foo.py -e "The Jackal [1997].avi"
The Jackal [1997].avi

ke 14.09.2011  9:34:19,23 K:\test
>
"""

Either way it's not pretty, is it? At first I thought that escaping the "[" and "]" as "\[2011\]" would work but it didn't.
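
The behaviour can be checked without touching the file system by using fnmatch, the module glob relies on for matching names. This is a small sketch in Python 3 syntax; the file names are only examples.

```python
import fnmatch

name = "The Jackal [1997].avi"

# Unescaped, "[1997]" is a set that matches ONE character out of 1, 9, 7.
print(fnmatch.fnmatchcase(name, "The Jackal [1997].avi"))                # False
print(fnmatch.fnmatchcase("The Jackal 9.avi", "The Jackal [1997].avi"))  # True

# A backslash is not an escape character in these patterns.
print(fnmatch.fnmatchcase(name, r"The Jackal \[1997\].avi"))             # False

# Wrapping each bracket in its own one-character set does work.
print(fnmatch.fnmatchcase(name, "The Jackal [[]1997[]].avi"))            # True
```

So the backslash form fails because the pattern syntax has no quoting mechanism at all, which leaves the bracket trick as the way to force a literal match.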

The glob module uses the fnmatch module to convert a shell pattern to a regular expression (the function fnmatch.translate). You could modify this function to treat the characters [ and ] differently. In Python 2.6.5, the function is:

def translate(pat):
    """Translate a shell PATTERN to a regular expression.

    There is no way to quote meta-characters.
    """

    i, n = 0, len(pat)
    res = ''
    while i < n:
        c = pat[i]
        i = i+1
        if c == '*':
            res = res + '.*'
        elif c == '?':
            res = res + '.'
        elif c == '[':
            j = i
            if j < n and pat[j] == '!':
                j = j+1
            if j < n and pat[j] == ']':
                j = j+1
            while j < n and pat[j] != ']':
                j = j+1
            if j >= n:
                res = res + '\\['
            else:
                stuff = pat[i:j].replace('\\','\\\\')
                i = j+1
                if stuff[0] == '!':
                    stuff = '^' + stuff[1:]
                elif stuff[0] == '^':
                    stuff = '\\' + stuff
                res = '%s[%s]' % (res, stuff)
        else:
            res = res + re.escape(c)
    return res + '\Z(?ms)'

It shouldn't be too difficult to create your own simpler translate() function.
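
As a rough idea of what such a simpler function might look like, here is a sketch in Python 3 syntax that keeps only "*" and "?" as wildcards and treats every other character, brackets included, literally. The names simple_translate and simple_glob are invented for this example and are not part of fnmatch or glob.

```python
import os
import re


def simple_translate(pat):
    """Translate a pattern where only '*' and '?' are special."""
    parts = []
    for c in pat:
        if c == '*':
            parts.append('.*')          # any run of characters
        elif c == '?':
            parts.append('.')           # exactly one character
        else:
            parts.append(re.escape(c))  # everything else is literal
    return '(?s)' + ''.join(parts) + r'\Z'


def simple_glob(pattern, directory='.'):
    """List names in one directory that match the simplified pattern."""
    regex = re.compile(simple_translate(pattern))
    return [n for n in os.listdir(directory) if regex.match(n)]
```

Unlike glob.glob(), this sketch only looks at a single directory and does not skip names starting with a dot, so it is a starting point rather than a drop-in replacement.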

I think I'll play with that. Thanks.

I just commented out the elif branch for '[' in my copy of fnmatch. The result was as expected: brackets no longer need escaping, but character sets such as [x-z] no longer work either.

I took the easy (coward's) way out. I just replaced "[" and "]" with "?" before the globbing. I can't imagine that it will really matter for what I am doing even though I could probably come up with a really contrived example where it wouldn't do exactly what I want.
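
For anyone curious what that shortcut looks like and where it can go wrong, here is a small sketch in Python 3 syntax. It uses fnmatch so it runs without any files on disk; brackets_to_wildcards is a made-up helper name.

```python
import fnmatch


def brackets_to_wildcards(name):
    """Replace '[' and ']' with '?', which matches any single character."""
    return name.replace('[', '?').replace(']', '?')


pattern = brackets_to_wildcards("The Jackal [1997].avi")
print(pattern)  # The Jackal ?1997?.avi

# The real file name matches...
print(fnmatch.fnmatchcase("The Jackal [1997].avi", pattern))  # True
# ...but so does a different file with other characters in those positions.
print(fnmatch.fnmatchcase("The Jackal (1997).avi", pattern))  # True
```

The second match is the contrived case: it only matters if two files differ solely in the characters where the brackets sit.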
