Python Concurrency Decorators (github.com/alex-sherman)
201 points by orangepenguin on May 12, 2016 | hide | past | favorite | 46 comments


Another page you might be interested in bookmarking:

https://wiki.python.org/moin/PythonDecoratorLibrary

A similar decorator to thread function calls for concurrency:

https://wiki.python.org/moin/PythonDecoratorLibrary#Lazy_Thu...


Doesn't this second decorator only work with functions doing I/O or releasing the GIL in some other way (like the time.sleep() they use in the example)? If all the function does is actual computation (i.e. using the CPU), no other code can run in parallel with it because of the GIL.


Yes, that's right, it is best used for I/O-bound applications. You could use it for:

* running multiple database queries

* SSH-ing into multiple devices to run a command

* loading multiple web pages

* calling multiple APIs
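A minimal sketch of such a thread-based decorator (a hypothetical @threaded, not the wiki's exact code); because these calls block on I/O, the threads overlap despite the GIL:

```python
import threading

def threaded(fn):
    """Hypothetical decorator: run fn in a background thread, return a handle."""
    def wrapper(*args, **kwargs):
        box = {}
        def target():
            box["value"] = fn(*args, **kwargs)
        t = threading.Thread(target=target)
        t.start()
        class Handle:
            def result(self):
                t.join()  # block until the worker thread finishes
                return box["value"]
        return Handle()
    return wrapper

@threaded
def fetch(url):
    # stand-in for an I/O-bound call such as urllib.request.urlopen(url)
    return "response from " + url

# all three "requests" start immediately; .result() joins each thread
handles = [fetch(u) for u in ["a", "b", "c"]]
results = [h.result() for h in handles]
```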


A lot of these need to be updated. For one, they don't follow PEP 8.

Secondly, unfortunately, the signal example doesn't work on Windows, and the threading one is a bad example: it simulates work with multiplication, but if you actually had a CPU-intensive task, there would be no performance benefit.



I've never heard this name for this pattern (thunk). Isn't it the same as a Future/Promise object?


"Thunk" is a term from Haskell.


So, for me, replacing imap with pool.imap is the easy part. The hard part is dealing with things like handling exceptions, catching keyboard interrupts, and so on. Does this module do anything to address these issues?


I have a hack for dealing with KeyboardInterrupts on a ProcessPoolExecutor: https://github.com/tgbugs/desc/blob/master/util/process_fixe.... I used it in concert with asyncio's run_in_executor, which helps with some of the exception handling.
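For context, the usual trick behind hacks like this is to make pool workers ignore SIGINT, so that Ctrl-C reaches only the parent, which can then shut the pool down cleanly. A generic sketch with multiprocessing.Pool (not the linked code):

```python
import signal
from multiprocessing import Pool

def init_worker():
    # Workers ignore SIGINT; only the parent sees Ctrl-C and handles cleanup.
    signal.signal(signal.SIGINT, signal.SIG_IGN)

def work(x):
    return x * x

if __name__ == "__main__":
    pool = Pool(2, initializer=init_worker)
    try:
        print(pool.map(work, range(5)))
    except KeyboardInterrupt:
        pool.terminate()  # kill workers immediately on Ctrl-C
    else:
        pool.close()
    pool.join()
```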


That's a nice hack when you have only a few elements in the iterable, but Futures are pretty heavyweight, so this is unfortunately a lot slower than multiprocessing.Pool when you have many elements.

The other thing is that this won't work on Windows. Efficient multiprocessing is always a pain in Python, in my experience.


A couple of things offhand:

- It has Python 2 and 3 support

- It's a wrapper for the Python built-in "multiprocessing" library

- It spreads out work over all cores (so the abstraction hides the ability to control the pool)

Seems like a great way to get your feet wet with multiprocessing in Python, but it likely has limited use in production... although certain infrastructures, like resource-limited containers, might be able to accommodate it.


It's not Py3 compatible yet, but from the commit log it seems like they're working on it (we could all help?).


I would love some help on this project! I have managed to get Python 3 support working in the 0.3 release for at least a few examples, and any further help with bug reporting would be very much appreciated.


I wonder what is keeping them from Python 3 support. I'll take a look, although I'm stuck on Python 2.7 for my current project.



Does anyone know how this library addresses the Global Interpreter Lock (GIL)? Multiprocessing really isn't that great, and in many versions of Python it's worse than running on a single thread.


> the Global Interpreter Lock (GIL) issue

This applies to programs that run Python bytecode in multiple threads in the same process. Multiprocessing forks multiple processes, so there is no GIL issue.
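A quick illustration of the distinction: the same pure-Python function that threads would serialize on the GIL runs genuinely in parallel under a process pool, because each worker process has its own interpreter and its own GIL.

```python
from multiprocessing import Pool

def cpu_bound(n):
    # Pure-Python computation: threads in one process would serialize
    # on the GIL here, but separate processes do not share a GIL.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    with Pool() as pool:
        # Four CPU-bound tasks, spread across worker processes in parallel
        totals = pool.map(cpu_bound, [10_000] * 4)
    print(totals)
```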


Is there no way to specify the number of cores to use?



Interesting, so when they claim it automatically scales out to all cores what they mean is it defaults to 3 unless overridden.


Should really be multiprocessing.cpu_count() / 2 or something like that.


I found out recently that some tasks do better if there are slightly more processes than cores.
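One way to express that sizing rule (a hypothetical helper; the factor is workload-dependent, with values above 1 helping when some workers block):

```python
import multiprocessing

def pool_size(oversubscribe=1.5):
    # Hypothetical helper: size the pool relative to the core count.
    # A factor > 1 can keep cores busy while some workers are blocked.
    return max(1, int(multiprocessing.cpu_count() * oversubscribe))
```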


If you check out the test file[1] in the repository for deco, it looks like you can specify the number of cores by setting the "processes" attribute of the concurrent-decorated function [object].

1. https://github.com/alex-sherman/deco/blob/master/conc_test.p...

https://github.com/alex-sherman/deco/blob/cee63391bf4c6d66ee...


You could certainly limit the process pool in the multiprocessing library itself: https://docs.python.org/2/library/multiprocessing.html ... but it doesn't seem like this decorator abstraction accounts for it.

> That's it, two lines of changes is all we need in order to parallelize this program. Now this program will make use of all the cores on the machine it's running on, allowing it to run significantly faster.

> As an overview, DECO is mainly just a smart wrapper for Python's multiprocessing.pool. When @concurrent is applied to a function it replaces it with calls to pool.apply_async. Additionally when arguments are passed to pool.apply_async, DECO replaces any index mutable objects with proxies, allowing it to detect and synchronize mutations of these objects. The results of these calls can then be obtained by calling wait() on the concurrent function, invoking a synchronization event.

I haven't really dug around the source code, but it sounds like not really.
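For reference, the apply_async/wait() pattern the quoted README describes can be sketched with plain multiprocessing (no deco, and none of its proxy-based mutation tracking):

```python
from multiprocessing import Pool

def process(item):
    return item * 2

if __name__ == "__main__":
    pool = Pool()
    # @concurrent-style: each call is dispatched via apply_async
    pending = [pool.apply_async(process, (i,)) for i in range(4)]
    # wait()-style barrier: collect every result before moving on
    results = [p.get() for p in pending]
    pool.close()
    pool.join()
    print(results)  # [0, 2, 4, 6]
```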


There is in the latest version: using it like @concurrent(...) passes all arguments directly through to Pool(...).



With Python 3.5 there is native support for concurrency via the async and await keywords.

For simple usage if you are familiar with Go there is this library: https://github.com/pothos/awaitchannel
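For comparison, the bare async/await version of concurrent I/O looks like this (asyncio.run shown here is 3.7+; on 3.5 you would use loop.run_until_complete):

```python
import asyncio

async def fetch(name):
    # Stand-in for an I/O-bound call; await yields control to the event loop.
    await asyncio.sleep(0.01)
    return "done: " + name

async def main():
    # Run both awaitables concurrently on a single thread.
    return await asyncio.gather(fetch("a"), fetch("b"))

results = asyncio.run(main())
```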


Async and concurrency are very different. Python 3.5 allows first-class asynchronous calls, but concurrency is still "hard".


Of note is that the version of deco on PyPI[1] (0.2) is incompatible with Python 3.

There have been a few commits to fix compatibility, but it's not there yet.

1. https://pypi.python.org/pypi/deco


"Synchronized" is perhaps a strange choice of word: coming from Java, it typically implies a critical section. Here it seems to initialize a multiprocessing pool for use in the function labelled concurrent (perhaps)?


I think it's like sync in Cilk, meaning that all concurrent jobs must have finished before the part annotated synchronized is left. So it's not without precedent.


"We have proposed DECO, a simplification of concurrent programming techniques targeted at programmers with little understanding of concurrent programming." (from paper)

gave me the chills



this is such a good idea! the @synchronized decorator to collect the parallelized task at the end of a parent call is very very smart & simple.


It also has the potential to slow down your code if used willy-nilly, of course.

Multiprocessing gets pretty useless for anything outside of independent CPU-bound tasks with little IPC and simple data types that can be stuck into shared memory.

If you're using multiprocessing pools so often that you think you need a decorator to clean up your code, then wow, I'd like to see what you're up to. ;)


> If you're using multiprocessing pools so often that you think you need a decorator to clean up your code, then wow, I'd like to see what you're up to. ;)

... and I think once you're doing that in Python you should probably use NumPy.

I've been thinking that there should be a way to just program kernels in modern Fortran, because it's the easiest way to interact with NumPy data structures (much of the numeric heavy lifting under the NumPy/SciPy stack is very efficient Fortran code). f2py [1] basically does that, but I've never had the chance to set up a project like this.

[1] http://docs.scipy.org/doc/numpy-1.10.1/user/c-info.python-as...
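To illustrate the NumPy point above: a vectorized operation pushes the loop into compiled code, which often beats parallelizing the Python-level loop outright. A toy comparison:

```python
import numpy as np

def slow_norms(rows):
    # Python-level loop: interpreter overhead on every element
    return [sum(x * x for x in row) ** 0.5 for row in rows]

def fast_norms(arr):
    # Vectorized: the whole loop runs in compiled code, no processes needed
    return np.sqrt((arr ** 2).sum(axis=1))

data = np.array([[3.0, 4.0], [6.0, 8.0]])
```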


you guessed right that this isn't directly useful to my life. all my parallelism is already taken care of by a framework.

that said I really like the idea of using inner and outer function calls as a hook for spawning and collecting promises. it doesn't only have to be a join on a CPU-bound worker pool; this feels like a cleaner way to abstract IO waits than the yield statements I saw in an early prototype of tulip.

In general, this feels like a clean way to compose any library logic that involves an event loop or execution plan rather than just a function call.


The source looks surprising; for instance, the decorators parse the decorated function's code and build an AST (I still have to find out why).


My best guess is that this is to circumvent issues with naming encountered by using decorated functions in conjunction with multiprocessing pools. Typically you would run into serialization errors.


It turns out that the decorator rewrites the function's code; that's at least how the synchronized decorator does its magic. Once I saw that in the source code (and a few other hacks) I was sure I was never going to use this lib :)


This reminds me of AppEngine's ndb tasklets.


I've been looking for a way to replace functools.partial and pool.map with something that could cause me to make bad architectural decisions. This could be the ticket.


Haha, having mucked around with similar mechanisms for dealing with concurrency via decorators I have to agree with you that this is likely to cause hard to debug behavior (especially since direct modification of the AST is going on behind the scenes here). That being said, it's an interesting thought experiment and likely an excellent class project.


Let me sum up my feelings towards this: neat!


Then I can easily parallelize existing code without touching anything?


I love developing in Flask because of the way its decorators are written.



