I’d been running my web site for comedians, LaffQ.com, on Google App Engine for nearly two years. Google App Engine seemed pretty neat when it first came out: it was the only free hosting service I knew of where you could deploy dynamic Python apps (using Django no less, a framework I was already familiar with), with the promise of Google managing the backups, scaling, and provisioning.
But as time went on, the sunny optimism began to fade. Although Google supported Django, App Engine used a proprietary BigTable-backed datastore that was not compatible with Django’s object-relational mapper (ORM). The unbelievably high free limits of the beta period were reduced drastically, in some cases absurdly, when the product came out of beta. Visiting my internal operations page just twice could blow through nearly an entire day’s quota, because the page produced counts over many tables, and each counted item went towards a daily quota of 50,000 “small operations”.
Developing for Google App Engine was always a pain. It was non-standard enough that a lot of libraries and tools wouldn’t work, or needed annoying changes. There was a single point of documentation, written in a sort of corporate Google-ese — not horrible, but not the nicest documentation I’ve ever read. There were a lot of layers of abstraction. It was proprietary. It was slow to fetch data. The pdb debugger didn’t work very well. Getting data out of production was an ordeal. It was even hard to launch an instance on the command line, which meant that on those rare occasions when I decided to do development on my Linux netbook, I spent most of my time getting the newest version of Google App Engine to work again.
I don’t exactly remember how or why, but late one Sunday evening in November I suddenly came up with the bright idea to port the Google App Engine infrastructure to “sqlite” (as I referred to it in my head), i.e. to use the standard Django database back-end, with the idea of deploying it to some as-yet-unknown host using sqlite3. I started almost immediately, figuring that at the very worst, it would just be an abandoned experiment.
It turned out to be… well, if not “surprisingly easy”, remarkably painless. By Monday afternoon, I had most of the core public functionality working just fine (insert show, edit show, delete show, list shows). The data model ported straight across, with one exception: a GeoPt latitude/longitude structure, which I simply reformatted into a string containing a comma-separated pair (which is probably how Google stored it anyway).
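The flattening really is that simple. Here’s a sketch; these helper names are mine, not the actual LaffQ code:

```python
# A rough sketch of the GeoPt flattening; illustrative, not the LaffQ code.
def geopt_to_string(geopt):
    """Turn an ndb.GeoPt into a plain "lat,lon" string for a CharField."""
    return "%s,%s" % (geopt.lat, geopt.lon)

def string_to_latlon(value):
    """Parse the stored "lat,lon" string back into a (lat, lon) float pair."""
    lat, lon = value.split(",")
    return float(lat), float(lon)
```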
By 9 pm on Monday, my entire test suite was passing. The internal pages hadn’t been ported yet, nor the “post on Facebook” functionality.
I decided to work on the non-critical internal pages first, or else I wouldn’t have any excuse to not deploy (and that’s scary!). Porting these components gave the code base a chance to “set” (like Jello™) and gave the unease that comes with massive changes some time to dissipate.
Some changes that occurred more than once:
– adding “objects” everywhere, so Model.get becomes Model.objects.get
– changing Property to Field (ndb.StringProperty => models.CharField, for example)
– ndb.KeyProperty => models.ForeignKey
– put => save
– key.get. => key.
– .query() => .objects
– required=False => null=True
– obj.key.delete() => obj.delete()
– query.order => query.order_by
– query.filter works very differently (no more fancy Google App Engine custom types based on operator overloading and deferred evaluation… fancy, but pretty opaque)
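To make those translations concrete, here is a before-and-after sketch of the kind of model and query that had to change. The fields are illustrative, not LaffQ’s actual schema.

```python
# Before: an ndb model and query on Google App Engine (illustrative fields).
from google.appengine.ext import ndb

class Show(ndb.Model):
    title = ndb.StringProperty(required=True)
    venue = ndb.KeyProperty(kind='Venue', required=False)
    starts_at = ndb.DateTimeProperty()

upcoming = Show.query().order(Show.starts_at).fetch(20)
```

```python
# After: the equivalent standard Django ORM model and query.
from django.db import models

class Show(models.Model):
    title = models.CharField(max_length=200)
    venue = models.ForeignKey('Venue', null=True, on_delete=models.CASCADE)
    starts_at = models.DateTimeField(null=True)

upcoming = Show.objects.order_by('starts_at')[:20]
```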
On Tuesday, I started working on the automated posts to Facebook. Not much needed to change here, but it was a bit nasty because there were only limited automated tests for this, so I had to be very cautious. Somewhere along the line, I decided I would deploy on my unlimited Dreamhost account (promo code: “RICHARD_SHARES”), which costs about $9/month and already hosts a bunch of domains. There were a few gotchas in getting the WSGI configuration working with Django (it was tremendously difficult to debug until I hit on the idea of marshaling requests to a file, then invoking the server by hand using the marshaled request), but overall it was reasonably straightforward, and since I had built a local installation of Python 2.7 in the past, I reused that.
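The marshal-and-replay trick looks roughly like this. This is a sketch of the idea, not my actual tooling; the pickle path and the way the request body is preserved are assumptions.

```python
# A sketch of the "marshal the request, replay it by hand" debugging trick.
# Not the actual LaffQ tooling; the pickle path is arbitrary.
import io
import pickle

def capture_requests(app, path='/tmp/last_request.pickle'):
    """Wrap a WSGI app so each incoming request's environ is saved to disk."""
    def wrapper(environ, start_response):
        body = environ['wsgi.input'].read()
        # Keep only the plain, picklable parts of the environ, plus the body.
        snapshot = {k: v for k, v in environ.items()
                    if isinstance(v, (str, bytes, int, float, bool))}
        snapshot['wsgi.body'] = body
        with open(path, 'wb') as f:
            pickle.dump(snapshot, f)
        environ['wsgi.input'] = io.BytesIO(body)  # let the real app still read the body
        return app(environ, start_response)
    return wrapper

def replay(app, path='/tmp/last_request.pickle'):
    """Re-run the saved request against the app directly, e.g. under pdb."""
    with open(path, 'rb') as f:
        environ = pickle.load(f)
    environ['wsgi.input'] = io.BytesIO(environ.pop('wsgi.body'))
    environ['wsgi.errors'] = io.StringIO()
    def start_response(status, headers, exc_info=None):
        print(status)
    return b''.join(app(environ, start_response))
```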
I worked on the code to import the data. Google Data Export is another thing that’s way too complex and slow, but I had done a trial run of the export on Sunday, so I used that as a testbed. I found some sample code to read the exported sqlite database (which has a very simple structure) into Google Entity objects, and from there it was fairly simple to read properties out of the Google objects and put them into Django objects. I ended up doing it in two passes: the first pass handled the root objects that other objects have foreign keys to, which ensured the second pass could refer to those already-created objects without dangling foreign keys.
I waited until midnight, so the daily stats would be generated on Google with Google data, then immediately put the site into read-only mode and began the dump of data from Google. It was infuriatingly slow. It finally finished around 12:42 am, about 40 minutes in all, producing a 21 MB sqlite database file. Finally, one little file with ALL my LaffQ data in it!
Here’s the script that converted the exported Google App Engine data to native Django objects in sqlite3: https://gist.github.com/richardkiss/4576523
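The linked gist is the real thing; the two-pass idea in it looks roughly like this. A simplified sketch with illustrative model names and a hypothetical read_entities() helper, not the gist’s actual API:

```python
# A simplified sketch of the two-pass import; the real code is in the gist above.
# Model names and the read_entities() helper are illustrative.

# Pass 1: create the root objects nothing depends on (e.g. venues),
# preserving the original datastore ids as primary keys.
for entity in read_entities('Venue'):
    Venue.objects.create(pk=entity.key.id(), name=entity['name'])

# Pass 2: create objects whose foreign keys point at pass-1 rows, so every
# reference resolves to a row that already exists.
for entity in read_entities('Show'):
    Show.objects.create(
        pk=entity.key.id(),
        title=entity['title'],
        venue_id=entity['venue'].id(),  # the GAE key becomes a plain integer FK
    )
```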
I had already brought up the new site on Dreamhost using two-day-old data, so it was just a matter of running the conversion tool, which read the Google sqlite database (essentially a BigTable dump of entities) and wrote out the Django sqlite database (which much more closely resembled the actual structure of my data). The new sqlite file was 8.7 MB. Compressed with bzip2, it was under 2 MB. That was the entirety of my web site data that took Google 40 minutes to export!!!
I copied it to Dreamhost in about five seconds, deployed it, restarted the Django app, and poked around a bit. Everything seemed to be in order and up to date, so I updated DNS to point to the Dreamhost version of the web site and waited for the change to propagate around the world (it’s funny to think about how LaffQ came up at different times for different people).
One thing I had forgotten was that the Facebook integration required SSL. I didn’t have an SSL certificate (or a unique IP!), so I was flustered for a bit. I thought about how this used to work: it went through the laffq.appspot.com domain, the Google domain that supports SSL for free (signed with Google’s appspot.com SSL certificate). Then I realized I could just write a tiny Google App Engine app that served requests to https://laffq.appspot.com/ by fetching the corresponding content from laffq.com. These pages are very low traffic (since they don’t really do anything), and it worked! (I had problems serving CSS to Chrome, which were resolved by making sure the content-type header was set correctly.)
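The proxy really is tiny. A sketch of the idea using App Engine’s urlfetch API; the handler details are illustrative, not the actual LaffQ proxy:

```python
# A sketch of the tiny SSL-proxy App Engine app; illustrative, not the actual code.
import webapp2
from google.appengine.api import urlfetch

class Proxy(webapp2.RequestHandler):
    def get(self):
        # Fetch the same path (and query string) from the real site over plain HTTP.
        result = urlfetch.fetch('http://laffq.com' + self.request.path_qs)
        # Pass the content type through; Chrome refuses CSS served with the
        # wrong content-type header.
        self.response.headers['Content-Type'] = result.headers.get(
            'Content-Type', 'text/html')
        self.response.status_int = result.status_code
        self.response.write(result.content)

app = webapp2.WSGIApplication([('/.*', Proxy)])
```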
I’m kind of glad I didn’t remember this until after it was deployed: I didn’t see the answer right away, and given my tendency to want every move planned out in advance, it might have made me not go through with the port.
All in all, the port went unbelievably well. The sqlite version is MUCH faster than Google App Engine, even though memcache is no longer in the mix (I do use built-in Django caching in the same memory space as the Django app, although I’m pretty sure I could turn it off and it would still be very fast). It uses very little CPU on Dreamhost, and there are no annoyingly arbitrary quotas.
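For reference, the in-process cache described here is Django’s local-memory backend. A sketch of the relevant settings; the post doesn’t show LaffQ’s actual configuration:

```python
# settings.py: a sketch of the local-memory cache setup described above.
# LaffQ's actual settings aren't shown in the post.
CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
        'TIMEOUT': 300,  # seconds; illustrative value
    }
}
```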
Since then, my work on LaffQ has accelerated beyond my wildest dreams. I didn’t realize just how much the Google App Engine restrictions were holding me back. Now I can back up my entire production database in just a few seconds. I can run a copy of the production database locally, which gives me a much better feel for how changes will behave against production data, so I develop against production data all the time, and I’m far more confident in new features. Debugging is much easier. Deploying is as simple as a git push/git pull. I can look at logs. I can make tweaks in production. I can ssh to the production machine. I can diagnose problems in production. Pretty much everything about it is better. I’ve refactored, tweaked, optimized, added tests. It’s code I’m almost proud of now. (Almost.)
I saw on Hacker News the other day that Google App Engine was down. I was happy it didn’t affect me.