cliche.celery — Celery-backed task queue worker

Sometimes a web app has to provide time-consuming features that cannot respond to the user immediately (and we define “immediately” as “within a second or two” here). Such work should be queued and then processed by background workers. Celery does that in a natural way.

We use this in several places, such as resampling images to make thumbnails or crawling ontology data from other services. Such tasks definitely cannot respond “immediately”.

See also

What kinds of things should I use Celery for? — Celery FAQ
Answers what kinds of benefits Celery provides.
Queue everything and delight everyone
This article describes why you should use a queue in a web application.

How to define tasks

In order to defer some types of work, you have to make the corresponding function a task. It’s not a big deal; just attach a decorator to it:

from cliche.celery import celery

@celery.task(ignore_result=True)
def do_heavy_work(some, inputs):
    '''Do some heavy work.'''
    ...

How to defer tasks

It’s similar to an ordinary function call except that it uses the delay() method (or the apply_async() method) instead of the call operator:

do_heavy_work.delay('some', inputs='...')

That call will be queued and sent to one of the distributed workers, which means the argument values are serialized using JSON. If any argument value isn’t serializable, it will raise an error. Simple objects like numbers, strings, tuples, lists, and dictionaries are safe to serialize. On the other hand, entity objects (instances of cliche.orm.Base and its subclasses) mostly fail to serialize, so use primary key values like the entity id instead of the object itself.
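For example, given a hypothetical Work entity loaded through a hypothetical session (neither name is part of the actual API), pass its primary key rather than the entity itself:

work = session.query(Work).get(1234)   # session and Work are illustrative

do_heavy_work.delay(work)      # wrong: the ORM entity isn't JSON-serializable
do_heavy_work.delay(work.id)   # right: pass the primary key value instead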

What things are ready for a task?

Every deferred call of a task starts from an equivalent initial state. On the other hand, there are several things that are not ready:

  • Flask’s request context isn’t set up for each task. You have to deal with it explicitly, using the request_context() method, if you want to use context locals like flask.request. See also The Request Context.
  • The physical machine may differ from the web environment. Total memory, CPU capacity, the number of processors, IP address, operating system, Python VM (PyPy or CPython), and many other environmental factors can also vary. Assume nothing about these variables.
  • Hence global state (e.g. module-level global variables) is completely isolated from the web environment that queued the task. Don’t depend on such global state; see the sketch after this list.
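As a rough sketch of this constraint (the task, its arguments, and the _settings_cache dictionary below are made up for illustration; get_session() is documented in the references at the end of this page), build every resource inside the task body and derive everything else from the task arguments:

from cliche.celery import celery, get_session

# Anti-pattern: a module-level cache filled by the web process.  The worker
# runs in a separate process, possibly on another machine, so a task would
# always see this dictionary in its pristine, empty state.
_settings_cache = {}

@celery.task(ignore_result=True)
def crawl_ontology(entity_id, source_url):
    '''Hypothetical task that rebuilds its own state from its arguments.'''
    session = get_session()      # a fresh session for this worker process
    ...                          # fetch source_url and update the entity row
    session.commit()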

How to run Celery worker

The celery worker program (formerly celeryd) takes a Celery app object as its entry point, and Cliche’s app object is cliche.celery.celery. You can omit the variable name and the submodule name, leaving only cliche. Execute the following command in the shell:

$ celery worker -A cliche --config dev.cfg.yml
 -------------- celery@localhost v3.1.13 (Cipater)
---- **** -----
--- * ***  * -- Darwin-13.3.0-x86_64-i386-64bit
-- * - **** ---
- ** ---------- [config]
- ** ---------- .> app:         cliche.celery:0x1... (cliche.celery.Loader)
- ** ---------- .> transport:   redis://localhost:6379/5
- ** ---------- .> results:     disabled
- *** --- * --- .> concurrency: 4 (prefork)
-- ******* ----
--- ***** ----- [queues]
 -------------- .> celery           exchange=celery(direct) key=celery


[2014-09-12 00:31:25,150: WARNING/MainProcess] celery@localhost ready.

Note that you should pass the same configuration file (the --config option) as the one passed to the WSGI application. It should contain DATABASE_URL and so on.

References

class cliche.celery.Loader(app, **kwargs)

The loader used by the Cliche app.

cliche.celery.get_database_engine() → sqlalchemy.engine.base.Engine

Get a database engine.

Returns: a database engine
Return type: sqlalchemy.engine.base.Engine

cliche.celery.get_session() → sessionmaker(class_='Session', bind=None, expire_on_commit=True, autoflush=True, autocommit=True)

Get a database session.

Returns: a database session
Return type: Session

cliche.celery.get_raven_client() → raven.base.Client

Get a raven client.

Returns: a raven client
Return type: raven.Client
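As a rough sketch of how these helpers might fit together inside a task (the task name, its argument, and the error-handling policy are illustrative, not prescribed by the API):

from cliche.celery import celery, get_raven_client, get_session

@celery.task(ignore_result=True)
def sync_ontology(service_url):
    '''Hypothetical task combining the helpers documented above.'''
    session = get_session()
    try:
        ...  # crawl service_url and update entities through the session
        session.commit()
    except Exception:
        # Report the failure to Sentry via the raven client, then re-raise
        # so Celery still marks the task as failed.
        get_raven_client().captureException()
        raise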