Question

I'm working on extracting the pattern def ([^\s]+)\([^\.]*\) in Python. However, when I have multiline input, only the first occurrence is found. I have specified the re.MULTILINE option on my Python regular expression, but to no avail. Let's say I have the following input:

def a():
    pass
b()
def b():
    pass

My regular expression only extracts the 'a' and doesn't continue and extract 'b'. The code I'm using is:

self.function_re = re.compile(r'def (\S+)\([^\.]*\)', re.MULTILINE)
print(self.function_re.findall(self.code))

Which outputs ['a'].


Solution

I'm guessing your pattern for the parameter list is too greedy and matches all the way up to the last closing parenthesis in the string. Try using def (\S+)\([^\.]*?\) (note the ? after the "zero or more" quantifier for your parameter list, which makes it non-greedy).
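To check this guess, here is a minimal, self-contained sketch that runs both patterns against the input from the question. The variable names (code, greedy, lazy, no_flag) are only illustrative. It also compiles the greedy pattern without re.MULTILINE, since that flag only changes how ^ and $ behave and this pattern uses neither anchor.

```python
import re

# Sample input from the question.
code = "def a():\n    pass\nb()\ndef b():\n    pass\n"

# Original pattern: [^\.]* is greedy and can span newlines,
# so it runs on to the last ')' in the whole string.
greedy = re.compile(r'def (\S+)\([^\.]*\)', re.MULTILINE)

# Same pattern with a lazy quantifier: stops at the first ')'.
lazy = re.compile(r'def (\S+)\([^\.]*?\)', re.MULTILINE)

# Without the flag the greedy result is unchanged, because
# re.MULTILINE only affects the ^ and $ anchors.
no_flag = re.compile(r'def (\S+)\([^\.]*\)')

greedy_names = greedy.findall(code)
lazy_names = lazy.findall(code)
no_flag_names = no_flag.findall(code)

print(greedy_names)   # ['a']
print(lazy_names)     # ['a', 'b']
print(no_flag_names)  # ['a']
```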

Other tips

It's because the \([^\.]*\) part is greedy, i.e. it matches everything from the first opening parenthesis down to the very last closing one:

>>> r = re.compile(r'def ([^\s]+)(\([^\.]*\))')
>>> r.findall(test)
[('a', '():\n        pass\nb()\ndef b()')]

If you make it non-greedy by appending ? to the star, it matches each definition separately:

>>> r = re.compile(r'def ([^\s]+)\([^\.]*?\)')
>>> r.findall(test)
['a', 'b']
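One caveat with the non-greedy version: [^\.] still refuses to cross a dot, so a signature containing one (for example a float default) would not match at all. A possible alternative, offered here as a sketch and not as part of the original answers, is to exclude the closing parenthesis itself with [^)]*. The sample input and the names function_re and lazy_re are made up for illustration.

```python
import re

# Hypothetical input: the second signature contains a dot (1.5).
code = """def a():
    pass
b()
def b(x, y=1.5):
    pass
"""

# Non-greedy pattern from the answer: cannot get past the '.' in 1.5.
lazy_re = re.compile(r'def (\S+)\([^\.]*?\)')

# Alternative: match anything except ')' inside the parentheses.
# No lazy quantifier is needed, since the class cannot consume ')'.
function_re = re.compile(r'def\s+(\w+)\s*\([^)]*\)')

lazy_names = lazy_re.findall(code)
names = function_re.findall(code)

print(lazy_names)  # ['a']
print(names)       # ['a', 'b']
```

Note that [^)]* stops at the first closing parenthesis, so a default value that itself contains parentheses ends the match early. The function name is still captured in that case. For anything beyond simple signatures, parsing the source with Python's ast module is likely more robust than a regular expression.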
License: CC-BY-SA with attribution
Not affiliated with StackOverflow