Speed tests for SQLAlchemy vs Django vs PyDO

At work, we're preparing to move away from a 6-year-old homebrew web framework to Django. In the process, I figured I'd run some speed tests pitting the old ORM (PyDO, version one, last updated in 2004, with an author's contact email at geocities.com) against the Django ORM, and against SQLAlchemy for S&G.

For the speed test, I'm loading just under 5000 objects from the db, each with 18 fields, and each 'row' averaging 2.5KB in size according to Sequel Pro. The non-sequential ids of the objects I'm loading are stored in a list, rids.

To reduce the effect of random changes in db performance, I'm running each query five times and discarding outliers to come up with a ballpark number.
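
The aggregation itself is nothing fancy. A minimal sketch of the idea, with run_query standing in for any of the three loading calls below:

import time

def ballpark(run_query, runs=5):
    # Run the query several times, drop the fastest and slowest
    # run, and average what's left.
    timings = []
    for _ in range(runs):
        start = time.time()
        run_query()
        timings.append(time.time() - start)
    timings.sort()
    trimmed = timings[1:-1]  # discard the outliers at both ends
    return sum(trimmed) / len(trimmed)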

I'm also profiling with %prun, IPython's magic wrapper around the built-in profile module.

PyDO: 6.2 seconds

PyDO predates Python 2.4, so instead of using the @classmethod decorator, it resorts to some convoluted uses of a proxy object and bizarre naming conventions for class methods (which it calls static methods). It's unsurprising that it performs poorly; there's a lot of overhead in its type inference and __getattr__ calls.
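
I haven't traced PyDO's internals exhaustively, but the general shape of the cost is easy to illustrate. This is not PyDO's actual code, just a sketch of the pattern: every field read that falls through to __getattr__ pays an extra Python-level call, multiplied across ~5000 rows of 18 fields each.

class FieldProxy(object):
    # Illustrative only; not PyDO's actual implementation.
    def __init__(self, row):
        self._row = row  # dict of column name -> value

    def __getattr__(self, name):
        # Only invoked when normal attribute lookup fails, so every
        # proxied field access costs an extra Python call.
        try:
            return self._row[name]
        except KeyError:
            raise AttributeError(name)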

Loading the objects looks like this:

%prun foo = [Release.getSQLWhere('release_id in (%s)' % utils.list_to_csv(rids)) for i in range(5)]
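
A side note: utils.list_to_csv belongs to the homebrew stack. I'd guess it does nothing more exotic than joining the ids into a comma-separated string for the IN clause; a hypothetical reconstruction:

def list_to_csv(ids):
    # e.g. [3, 17, 42] -> "3,17,42"
    return ','.join(str(i) for i in ids)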

Django: 5.3 seconds

To force the lazy Django queryset to actually execute, I'm wrapping it in list(). There may be a more efficient way to retrieve all of the objects, but I can't think of one. I'm also using a schema definition generated by the 'inspectdb' management command, without tweaks.

In Django's queryset notation, this looks like:

%prun foo = [list(Releases.objects.filter(release_id__in=rids)) for i in range(5)]
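
For reference, the inspectdb output is an ordinary models.py, roughly like the following. The fields here are placeholders, since I'm not reproducing all 18 columns, and I'm assuming release_id is the integer primary key:

from django.db import models

class Releases(models.Model):
    release_id = models.IntegerField(primary_key=True)
    # ...the remaining 17 fields, as inferred from the table...

    class Meta:
        db_table = 'releases'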

SQLAlchemy: 4.5 seconds

This was a great deal more complicated to set up, largely because I was trying to avoid writing the model class by hand. After much digging, I discovered SQLSoup, which provides a dynamic "model" autogenerated by inspecting a table. I'm unsure whether writing the model class by hand would improve this result or make it worse - there may be some extra overhead in converting these generic models into proper Release models. Also, I'm calling all() to resolve the query into objects.
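
The eventual setup is short, at least. A sketch of it, assuming a SQLAlchemy version where SQLSoup ships as sqlalchemy.ext.sqlsoup, and with a made-up DSN:

from sqlalchemy.ext.sqlsoup import SqlSoup

# Reflects tables on demand; db.releases becomes a mapped class
# with queryable attributes like db.releases.release_id.
db = SqlSoup('mysql://user:password@localhost/mydb')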

Anyway, in the SQLSoup notation, this looks like:

%prun foo = [db.releases.filter(db.releases.release_id.in_(rids)).all() for i in range(5)]

Conclusion

Interestingly, each contender is roughly 15% faster than the one before it. Django, however, was a breeze to set up the models for, whereas SQLAlchemy took a lot more hoop-jumping. I won't be suggesting SQLAlchemy for this particular project, as many of the problems we're facing now with the homebrew stack stem directly from its DIY nature. I'm hoping Django's full-stack approach will keep the crazy reined in for longer.