Python Multiprocessing Module and Closures
Published: 2013-01-16

At work, I wrote a Python script which uses the multiprocessing module to process many servers in parallel. The code looks something like:

import multiprocessing

def processServer(server):
    # Do work...

numParallelTasks = ...
servers = [...]
pool = multiprocessing.Pool(processes=numParallelTasks)
results = pool.map(processServer, servers)

I wanted to pass some extra state to processServer without using a global variable. My first attempt was to use a closure, so I wrote the following:

def processServer(extraState):
    def processServerWorker(server):
        # Do work, using extraState as needed
    return processServerWorker

numParallelTasks = ...
servers = [...]
extraState = ...
pool = multiprocessing.Pool(processes=numParallelTasks)
results = pool.map(processServer(extraState), servers)

This failed with the following error:

Exception in thread Thread-2:
Traceback (most recent call last):
  File "C:\Program Files\Python 2.7.1\lib\threading.py", line 530, in __bootstrap_inner
    self.run()
  File "C:\Program Files\Python 2.7.1\lib\threading.py", line 483, in run
    self.__target(*self.__args, **self.__kwargs)
  File "C:\Program Files\Python 2.7.1\lib\multiprocessing\pool.py", line 285, in _handle_tasks
    put(task)
PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed
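
The failure comes from pickle rather than from the pool itself, so it can be reproduced without starting any processes. The following is a minimal sketch, not the original script: `make_worker` and `can_pickle` are hypothetical names, and the exact exception type raised for a nested function varies between Python versions, which is why several are caught.

```python
import pickle

def make_worker(extra_state):
    # Hypothetical stand-in for the closure-returning processServer above.
    def worker(server):
        return (server, extra_state)
    return worker

def can_pickle(obj):
    """Return True if pickle.dumps accepts obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

# A nested function cannot be looked up by name, so pickling it fails.
closure_ok = can_pickle(make_worker("state"))
# A function reachable by module and name (here the builtin len) pickles fine.
by_name_ok = can_pickle(len)
```

Since pool.map has to pickle the callable to send it to the worker processes, anything that fails this check will fail in the pool as well.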

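The cause is that pool.map pickles the callable in order to send it to the worker processes, and pickle serializes a function by its module and name rather than by value. The nested processServerWorker cannot be looked up by name at module level, so pickling it fails. An instance of a module-level class, by contrast, pickles as a reference to its class plus its attributes. The solution was to change processServer from a closure into a class with a __call__ method, as follows: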

class processServer:
    def __init__(self, extraState):
        self.extraState = extraState

    def __call__(self, server):
        # Do work, using self.extraState as needed
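
For completeness, here is how the class might be wired into the pool. This is a sketch under assumptions: the body of `__call__` is placeholder work that only tags the server name with the extra state, and `processAll` is a hypothetical helper that does not appear in the original script.

```python
import multiprocessing

class processServer(object):
    """Callable carrying extra state; instances pickle as class reference plus attributes."""

    def __init__(self, extraState):
        self.extraState = extraState

    def __call__(self, server):
        # Placeholder work (an assumption for illustration): tag the server
        # name with the extra state instead of contacting a real machine.
        return "%s:%s" % (self.extraState, server)

def processAll(servers, extraState, numParallelTasks):
    # Hypothetical helper: map the callable instance over all servers.
    pool = multiprocessing.Pool(processes=numParallelTasks)
    try:
        return pool.map(processServer(extraState), servers)
    finally:
        # Release the worker processes whether or not map succeeded.
        pool.close()
        pool.join()
```

On Windows, where the traceback above was produced, the call to processAll would need to sit under an `if __name__ == "__main__":` guard, because each child process re-imports the main module.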