I have a problem guys. It's due to duck typing. Now I expected to run into something like this sooner or later, but I can't help but feel there's a better solution.

import re
def patternMatching(pattern, string):  
    matchList = re.findall(pattern, string)
    print '\n'.join(['%s' % v for v in matchList])

If you run that function as patternMatching(r'The (fox)', 'The fox'), it gives you 'fox', right? Because matchList is ['fox'].

But what if you run it as patternMatching(r'(The) (fox)', 'The fox')? You get an error. Because matchList is [('The','fox')].

Duck typing is messing with my types, so the print function won't work.

Traceback (most recent call last):
  File "<pyshell#52>", line 1, in <module>
    print '\n'.join(['%s' % v for v in matchList])
TypeError: not all arguments converted during string formatting
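
For what it's worth, the traceback comes from the % operator rather than from findall itself: when the right-hand side of % is a tuple, it is taken as the argument list, so a 2-tuple hands two values to a single %s. A minimal sketch of that behaviour, written so it should run on both Python 2 and 3 (the variable names are only for illustration):

```python
pair = ('The', 'fox')

# A tuple on the right of % is unpacked as the argument list,
# so one %s given two values raises TypeError.
try:
    '%s' % pair
    failed = False
except TypeError:
    failed = True

# Wrapping the tuple in a 1-tuple makes it a single argument.
wrapped = '%s' % (pair,)

# Joining the groups avoids % formatting altogether.
joined = ' '.join(pair)

print(failed)   # True
print(wrapped)  # ('The', 'fox')
print(joined)   # The fox
```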

What's a programmer to do? My patch solution is to check the type of the first element and then do two cases. But that won't hold if I get something like [('hello','bye'),'hello']... which I'm not sure I ever will, but the point remains.

It is only a duck if it acts, looks and sounds like one. Processed duck is not a duck.

Actually, it is looking, acting and sounding like a duck. It's a 1-tuple, so it's still a tuple. Python seems to think a 1-tuple is better expressed simply as the element within the 1-tuple, however, which causes this problem.

I think you have to write it this way:
patternMatching(r"The|fox", 'The little brown fox')

Help on function findall in module re:

findall(pattern, string, flags=0)
Return a list of all non-overlapping matches in the string.

If one or more groups are present in the pattern, return a
list of groups; this will be a list of tuples if the pattern
has more than one group.

Empty matches are included in the result.
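
To make the help text concrete, here is a small sketch of the three shapes findall can return. The shape depends only on how many groups the pattern has, so a single result list should never mix strings and tuples. The sample text and variable names are just assumptions for the demonstration:

```python
import re

text = 'The fox'

# No groups: a list of whole matches.
no_groups = re.findall(r'The fox', text)

# One group: a list of that group's text (plain strings, not 1-tuples).
one_group = re.findall(r'The (fox)', text)

# Two or more groups: a list of tuples, one tuple per match.
two_groups = re.findall(r'(The) (fox)', text)

print(no_groups)   # ['The fox']
print(one_group)   # ['fox']
print(two_groups)  # [('The', 'fox')]
```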

Thanks! That would normally work. Unfortunately, that's not possible. This programme is designed to accept a user's own regular expression, so I have to accept the full range of regular expression syntax. I've got this solution so far, but I figured surely something more elegant would exist.

def patternMatching(self, event):
    matchList = re.findall(self.patternArea.GetValue(), self.inputArea.GetValue())
    outputLines = []
    for match in matchList:
        # findall gives plain strings for zero or one group and tuples for
        # two or more; wrap the strings so every match is handled as a tuple.
        if not isinstance(match, tuple):
            match = (match,)
        outputLines.extend(match)
    # Join once at the end so groups from different matches are also separated.
    self.outputArea.SetValue('\n'.join(outputLines))

Slate: Yes, but that's counter-intuitive. If there's one group you'll be iterating over a string, but if there's more than one it will be a tuple. Kind of un-Pythonic, no? It would make much more sense for the case of one group to be a 1-tuple.

The title of the forum is about duck typing. The problem at hand has nothing to do with it IMHO.

I agree. The re.findall function is not well suited to taking a user-supplied regular expression, because the type of its output changes depending on the regexp.

This design decision can be argued against. An argument for it would be that most of the time the regular expression is written by hand, and most of the time it has no more than one group. I don't know whether that is true, however. An argument against it is your use case.

Try re.finditer as in:

def patternMatching(pattern, string):
    # group(0) is always the whole match as a string, however many groups there are
    print '\n'.join(matchobj.group(0)
                    for matchobj in re.finditer(pattern, string))

- Paddy.
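
One thing to note about the finditer version: it prints the whole match rather than the captured groups, so the output differs from findall when the pattern has groups. If the group output is wanted, a possible variant is to ask each match object for groups(), which is always a tuple, and fall back to the whole match when the pattern has no groups. This is only a sketch; match_lines is a hypothetical helper name, and it returns the lines instead of printing them so it is easy to check:

```python
import re

def match_lines(pattern, string):
    """Return one output line per captured group, or per whole match
    when the pattern contains no groups."""
    lines = []
    for m in re.finditer(pattern, string):
        # groups() is a tuple for any number of groups (empty for none);
        # default='' mirrors findall, which reports unmatched groups as ''.
        groups = m.groups(default='')
        lines.extend(groups if groups else (m.group(0),))
    return lines

print('\n'.join(match_lines(r'(The) (fox)', 'The fox')))
```

With this, match_lines(r'The (fox)', 'The fox') and match_lines(r'(The) (fox)', 'The fox') both return flat lists of strings, so no type check is needed before joining.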
