Parallelism in Python Generators

Yesterday I stumbled upon a StackOverflow question that asked about implementing a Rosetta Code problem in parallel to speed it up. One easy way to do it is one, which is a modification of the python example on

from multiprocessing import Pool, cpu_count
from itertools import islice, count
def is_special(n, d):
tofind = str(d) * d
return tofind in str(d * n ** d)
def superd(d, N=10000):
if d != int(d) or not 2 <= d <= 9:
raise ValueError("argument must be integer from 2 to 9 inclusive")
with Pool(cpu_count() 2) as workers:
for offset in count(0, N):
worker_fn_args = zip(range(offset, offset + N), [d] * N)
is_superd_batch = workers.starmap(is_special, worker_fn_args)
yield from [n+offset for n in range(N) if is_superd_batch[n]]
if __name__ == '__main__':
for d in range(2, 10):
print(f"{d}:", ', '.join(str(n) for n in islice(superd(d), 10)))
view raw hosted with ❤ by GitHub
Source code for parallelism in generators

When I posted the question the OP commented that he/she was looking for using non-terminal sub-processes that yield these super-d values. I thought about it, but that version does not seem very practical. If the main process is interested in the results of the computation, then temporary concurrency will be the cleaner solution (like in this example). If the main thread isn’t, e.g., if you are running an old-school threaded web-server, that hands off incoming connections to a worker thread, then solutions with non-terminal sub-processes can make sense. In the latter case you are essentially “starting a program N times in parallel and shutting them down together”. This certainly makes sense, but only if those programs don’t need to communicate. Remember to KISS.

Thank you for reading and Happy Coding!


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s