EDIT

I created Asynckit.py over 5 years ago. There are plenty of other ways to do this by now. If you use Python 3, take a look at the asyncio module instead.


A few weeks ago I needed a Python script to do a bunch of similar requests in parallel and return the results. I turned to PyPI for a simple little library to help me out. To my surprise I couldn't find what I was looking for (maybe due to PyPI's search being horrible, or my search terms, who knows).

Two days later I published the first version of asynckit to PyPI.


§So, what is asynckit?

Asynckit is a tiny library that enables you to run your existing functions in parallel and retrieve the return values when the work completes.

§What to use it for?

You could use asynckit to download a bunch of websites in parallel, like this:

from asynckit import Pool
from urllib2 import urlopen

pool = Pool(worker_count=3)

urls = (
    'http://henriktudborg.dk',
    'http://github.com',
    'http://lzy.dk',
)

results = [pool.do(lambda x: urlopen(x).read(), url) for url in urls]

print [len(result.get(True)) for result in results]

(the .get(True) call on the result blocks until the result is ready, then returns the stored value)

If one of the calls raised an exception, a call to .get() will re-raise it. You can check if a call raised an exception by calling .is_error() on the result object.
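
As a minimal sketch of that behaviour (the failing request is just for illustration; urllib2 raises the error, asynckit only stores and re-raises it):

from asynckit import Pool
from urllib2 import urlopen

pool = Pool(worker_count=2)

# this download fails inside the worker (the host does not exist)
result = pool.do(lambda x: urlopen(x).read(), 'http://no-such-host.invalid')

# .get(True) blocks until the work is done and re-raises the exception here
try:
    print len(result.get(True))
except Exception as e:
    print "request failed:", e

# once the work has completed, you can also ask the result object directly
print result.is_error()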

§When should you use it?

For those tiny python utilities in your ~/src/misc or wherever you place your tiny hacks.

Asynckit has no dependencies outside the standard library, so it is great if you are trying to keep dependencies to a minimum.

It also installs in a few seconds (pip install asynckit) and requires no configuration.

Personally I keep it installed globally and use it in most of the single-script Python tools I have built over the last month.

I find it to be a really great companion for anything involving urllib2 work (like scraping a bunch of websites).

§When shouldn’t you use it?

When you could use Celery instead. Seriously, it is super awesome!

Celery is better in almost every way, but requires an external message queue, and quite a bit of configuration.

In most larger projects you will want to look at Celery or equivalent instead.


Head over to github.com/tudborg/asynckit.py to see installation instructions and usage, or hang around for a few more examples.

§Some Asynckit examples

§Download websites in parallel, wait for all downloads to complete, then print total bytes downloaded:

from asynckit import Pool, AsyncList
from urllib2 import urlopen

pool = Pool(worker_count=4)

urls = (
    'http://henriktudborg.dk',
    'http://lzy.dk',
    'http://etilbudsavis.dk',
    'http://github.com',
    # more urls here
)

result = AsyncList([pool.do(lambda x: len(urlopen(x).read()), url) for url in urls])

print sum(result.get(True))

AsyncList accepts a list of AsyncValue objects as its first argument and returns a list of "real" values when you call .get().
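
To make that concrete, here is a minimal sketch with a trivial worker function instead of urllib2:

from asynckit import Pool, AsyncList

pool = Pool(worker_count=2)

def square(x):
    return x * x

# each pool.do() call returns an AsyncValue
async_values = [pool.do(square, n) for n in range(5)]

# wrapping them in an AsyncList gives one .get() that blocks until every value is ready
print AsyncList(async_values).get(True)  # [0, 1, 4, 9, 16], assuming results keep input order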

§Nested and chained results

Using an AsyncValue as an argument to the .do() or .chain() methods will wait for that value to be ready before running the work that requires it.

Chained work is a way of waiting for a result without using it as an argument, like a cleanup job that runs after some big piece of work.

(note that there is currently no way of chaining something to an AsyncList)

from asynckit import Pool, AsyncList
from urllib2 import urlopen

pool = Pool(worker_count=1)

def heavy_work(a, b):
    return a + b

def proudly_display(result):
    print result.get(True)

def say_bye():
    print "bye"

# nested example: use an AsyncValue object as argument to .do()
# (.do() itself returns an AsyncValue)
nested_call = pool.do(heavy_work, 1,
                pool.do(heavy_work, 2,
                    pool.do(heavy_work, 1, 3)))

# chain example: say bye after proudly displaying the result, and wait for it all to happen
pool.do(proudly_display, nested_call).chain(say_bye).wait()

See more examples and report any issues on GitHub.