What a n00b!

MySQL Memory Usage Growth After Lots of Table Creates/Drops

I should clarify up front that "Lots" in the title means something on the order of half a million table creations and drops per week. That's pretty far from our standard usage of MySQL, and I would assert far from most people's.

We've been having trouble for a while now with a cluster of MySQL nodes that consumes an ever-growing amount of memory until the box eventually runs out, forcing us to restart MySQL if we catch it in time, or reboot the box if we don't. After doing tons of searches (and looking through countless bug reports), it seemed like it could have been one of a few memory leaks fixed in late 5.1 or early 5.5 land. As this cluster was slightly older and not updated regularly, we pushed things over to new nodes and up to a modern version of 5.5. However, this didn't seem to solve the problem. Every week or so, we'd fail over to a secondary node and restart things. The biggest hint that it was caused by all the creates is that the "leak" flowed through replication, and the secondary nodes would run out within several hours of the primary.

As it turns out, the answer was quite simple. According to the Percona MySQL Performance Blog, memory for the InnoDB data dictionary cache is allocated each time a table is opened and never given back, with no limit. This isn't so bad, I suppose, unless you're opening something like half a million tables per week and you just so happen to have a machine with a finite amount of memory. Then you might have a bit of trouble.

The fix is bad news if you're a MySQL community-build user: there's no way to limit it without switching to Percona Server (which I highly recommend over the stock MySQL builds anyway). With Percona Server installed, you can set innodb_dict_size_limit to cap the growth. It defaults to '0', which mimics stock MySQL's unlimited behavior. Note that this is a soft limit, so if you genuinely have that many tables open at once you'll just need moar memories.

I should also note that while this setting claims to be dynamic, it's a danger zone. I set it to a not-terribly-aggressive-yet-still-lower-than-the-current-dictionary-size value on the non-active node, and everything came to a screeching halt with this in the error log:

--Thread 1264216384 has waited at dict0dict.c line 744 for 688.00 seconds the semaphore:
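Given that experience, if you do set it, the safer route is probably to put it in my.cnf and let it take effect at the next planned restart rather than flipping it live. A minimal sketch, assuming Percona Server; the value is in bytes, and the number here is just an example:

[mysqld]
# Percona Server only: soft cap on the InnoDB data dictionary cache, in bytes (~256MB here)
innodb_dict_size_limit = 268435456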

Enabling InnoDB Compression on MySQL 5.5

I got a quick little lesson this morning in a few MySQL internals while trying to enable InnoDB compression. On the surface, it seems pretty easy. The MySQL documentation just says you can do:

ALTER TABLE table ROW_FORMAT=COMPRESSED;

However, for me this did nothing; the size of the table didn't change at all. After a bit of digging, I figured out that you have to set the default file format for InnoDB to Barracuda, which is the format that supports compression. It's not the default, according to the documentation:

Although Oracle recommends using the Barracuda format for new tables where practical, in MySQL 5.5 the default file format is still Antelope, for maximum compatibility with replication configurations containing different MySQL releases.

Pretty sneaky, but easy enough to fix up:

SET GLOBAL innodb_file_format = Barracuda;

Then add it to your my.cnf so it sticks across restarts (/etc/my.cnf on most distros):

... innodb_file_format = Barracuda ...

After that, the ALTER TABLE converted the table and cut the size on disk in half. Very nice.
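If you want a quick sanity check that the conversion actually took (remember, mine silently did nothing the first time), information_schema will tell you the row format; the table name below is just a placeholder. A compressed table should show up as 'Compressed':

SELECT table_name, row_format FROM information_schema.tables WHERE table_name = 'your_table';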

Since this table is rarely read, and even more rarely written to, I didn't end up doing much by way of performance testing or tuning. However, I should also note a blog entry I found that seems to indicate that performance degrades as the size of the buffer pool increases when compression is enabled. The workaround is to use multiple buffer pools. Fortunately, we had already done this as a workaround for slow DROP TABLE performance on older versions of 5.5 when innodb_file_per_table is enabled.
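If you need to do the same, it's a single my.cnf setting (it's not dynamic, so it needs a restart); the instance count below is only an example and should be sized against your buffer pool:

[mysqld]
# split the buffer pool into multiple instances to reduce mutex contention
innodb_buffer_pool_instances = 8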

Virtualbox (and subsequently vagrant) machines fail to boot

Today, when I went to boot one of the development environments on my laptop, I was greeted by a fun message:

the-cloud:dev wwalter$ vagrant up
[default] VM already created. Booting if it's not already running...
[default] Clearing any previously set forwarded ports...
[default] Forwarding ports...
[default] -- 22 => 2222 (adapter 1)
[default] Creating shared folders metadata...
[default] Clearing any previously set network interfaces...
[default] Booting VM...
[default] Waiting for VM to boot. This can take a few minutes.
The VM failed to remain in the "running" state while attempting to boot.
This is normally caused by a misconfiguration or host system incompatibilities.
Please open the VirtualBox GUI and attempt to boot the virtual machine manually to get a more informative error message.

I popped open the VirtualBox GUI as the error message suggested, and ended up with a slightly less helpful version:

Failed to open a session for the virtual machine dev_1340310725.
Failed to load VMMR0.r0 (VERR_SUPLIB_WORLD_WRITABLE).
Result Code: NS_ERROR_FAILURE (0x80004005)
Component: Console
Interface: IConsole {1968b7d3-e3bf-4ceb-99e0-cb7c913317bb}

As it turns out, /Applications on my MacBook had somehow been set to be world-writable. I'm not sure what application I installed that caused this, but it's definitely not cool. Anyway, the easy fix is to run Disk Utility and Repair Permissions.

Permissions differ on “Applications”; should be drwxrwxr-x; they are drwxrwxrwx.
Repaired “Applications”
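If you'd rather not run a full permissions repair just for this, the same fix from a terminal is a one-liner; per the message above, the expected mode is 775 (this assumes the ownership on /Applications is still correct):

sudo chmod 775 /Applications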

Know Your Library, Yo

You can definitely file this one under "what a n00b!".

First, a little backstory. When we started out creating the Flickr integration for ShutterGuides, we used what seemed to be the most mature and stable Python Flickr API binding out there, flickrapi. The only drawback with this library is that it doesn't support oauth. It was the most solid and well-documented option we found, so we went ahead anyway. In the meantime, Flickr announced that they were end-of-life-ing their old auth mechanism in favor of oauth at the end of July this year. Great. (I found a blog post at one point promising oauth, but can't seem to find it now.) Anyway, instead of adding oauth support to the library ourselves, we decided to just migrate to a new library; we chose this one.

All right, with that out of the way, let's get to the point.

One of the swanky things about this new library is that it transparently retrieves information about Photo objects that aren't necessarily returned by Flickr by default. One of the bad things about this new library is that it transparently retrieves information about Photo objects that aren't necessarily returned by Flickr by default.

When I first developed the integration, I noticed it was a bit slower than the old one. I thought maybe I could chalk this up to extra latency from my home network to Flickr vs. our cloud provider to Flickr. However, once we got things into production and got a real user with a good amount of photos using it, the thing slowed to a crawl. Like, 20+ seconds for a web request. Doh. As it turns out, our call to flickr.Photosets.getPhotos (retrieve the photos in a Flickr album) was not returning some of the data we expected to be in the Photo object, and the library was graciously making an API call to Flickr per photo during our web request! (Sorry about that, Flickr.) The fix was trivial: we just needed to add the fields we use to the "extras" option on the original call. I just needed to notice that it was happening.
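For illustration, the change looked roughly like this; the surrounding names (set_id, the field list) are just examples from our code, so treat it as a sketch rather than a drop-in fix:

# Before: the response only includes minimal photo data, and the library
# transparently makes one extra API call per photo the moment we touch a
# missing field (title, URL, etc.) -- all during the web request.
photos = flickr.Photosets.getPhotos(photoset_id=set_id)

# After: pass the fields we actually read via "extras" so they come back
# in the original response and no per-photo calls are needed.
photos = flickr.Photosets.getPhotos(
    photoset_id=set_id,
    extras='title,date_taken,url_m',  # example field list; match what you use
)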

Like I said, put this one under "what a n00b!". I knew better and yet it happened anyway. Let this be a reminder to everyone to check out that new code you just integrated into your app. I will leave you with a warning that you should take just as seriously.

Celery jobs unable to access objects when queued by post_save signal in Django

I ran into a fun issue this week, working on job queueing in ShutterGuides. We use Django for the app server and django-celery for job queueing in the background. In our case, we added a post_save signal to our Guide model to trigger an import of the photos in the background.

The code was pretty typical and worked great in development, triggering after a save and throwing the job onto the queue. The workers consumed the jobs and pulled in photos exactly as expected.

However, I noticed that when using MySQL, this task didn't seem to do the right thing: it found nothing to import. As it turns out, this was because the post_save signal in Django fires before the MySQL transaction has been committed. So, if a worker picks the job up off the queue immediately, it may not be able to see the object in the database yet; or, in the case of updates, it won't see the newest version of the object.

The answer in the StackOverflow question referenced above seems to suggest that something like Celery would solve this, which of course isn't true, since the job can still fire before the transaction finishes. My solution was fairly simple, though perhaps a bit of a workaround:

- ImportPhotosTask.delay(instance)
+ ImportPhotosTask.apply_async([instance], countdown=2)
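For context, the whole hookup looks roughly like this; Guide and ImportPhotosTask are our own names, the import paths are made up for the sketch, and the two-second countdown is just a number that comfortably outlives our transactions rather than anything scientific:

from django.db.models.signals import post_save
from django.dispatch import receiver

from guides.models import Guide             # our model (hypothetical import path)
from guides.tasks import ImportPhotosTask   # our django-celery task (hypothetical import path)

@receiver(post_save, sender=Guide)
def queue_photo_import(sender, instance, **kwargs):
    # post_save fires before the MySQL transaction commits, so delay the job
    # long enough for the commit to land before a worker tries to read the row.
    ImportPhotosTask.apply_async([instance], countdown=2)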

There's also a small patch on GitHub that adds a post_commit signal to Django, but I try to stay away from anything that calls itself a "monkey-patch" right in the description. It might be good code; I just got stopped at that statement.

Percona Live MySQL Conference 2012 - day 1

This week I am fortunate enough to be sent to the Percona Live Conference by my company. I've been sitting here tonight, pondering and recalling sessions from the day, trying to think of all the things I learned I was doing wrong, or better tools I could be using. And, since I don't blog enough lately, here we go!

Nothing earth-shattering so far, but some takeaways from sessions I hit up today:

Building a Multi-Master, Multi-Region Database Infrastructure in Amazon EC2

When I first realized this was going to be a demo of some commercial software (Tungsten Enterprise) rather than a "lessons-learned" type of session, I was pretty disappointed (I didn't read the program carefully enough!). As the speaker got deeper into the talk, though, I became more and more impressed. Continuent has built some interesting technology, letting you easily fail over (and fail back!) not only individual nodes but entire clusters. They have some pretty hot backup/restore/replication functionality as well.

One to Many: The Story of Sharding at Box

Definitely the most engaging session of the day for me, personally. The session was a pretty open story about Box's migration from a single-database architecture to a sharded architecture (which seems pretty new for them). There were quite a few gotchas along the way, especially on the application side. One of the more relevant pieces for me was their clever use of the Tungsten Replicator to move databases between clusters in a fairly ad-hoc fashion.

The 5 Minute DBA: MySQL DBA 101 for Non-DBA's

While I wouldn't exactly call myself a "Non-DBA".. I know enough to break things worse than they already are, and I am certainly not a full-time DBA. This was a great session: a refresher on the basics (needed occasionally), and a solid overview of some of the tools available in the Percona Toolkit that I know I should be using but currently am not. For shame.

Backing up Facebook

I have to be brutally honest and say that attending this session was easily the biggest letdown of the day. I was hoping for more from Facebook on this one. We spent about 25 minutes listening to them talk about how they use mysqldump every day on every database server.. in a way that doesn't ensure consistency. And when a pointed question was asked about what would happen if they needed to recover a server to a point in time, it was basically brushed off with the answer that they'd use some other mechanism to restore (presumably Xtrabackup or something similar).. I thought that's what we were here for? Whatever. I did have some interesting hallway conversation with a couple of fine folks about how perhaps there's something interesting happening on the database side that they don't want to talk about, and that solves this problem for them. That very well could be (and probably is) the right answer, but then why are they at a conference giving a session on doing backups? Ugh. Again.. whatever.

Scaling MySQL Databases for the Web

This was another title that deceived me ('fool me twice', or something like that..). It was less about scaling MySQL and more about a tool developed in-house at YouTube that currently front-ends all MySQL traffic there, doing some intelligent caching and optimization.

One more thing

I usually don't enjoy or look forward to walking around the exhibit halls at conferences like this, but I did talk with one very interesting vendor that I hadn't heard of before. NuoDB seems to be working on a pretty interesting product that's just about ready to release. I don't really have the time to dive into details about how it supposedly works, but they seem to understand and handle the scaling-back part of elasticity better than most other database vendors I've talked with. They also have a replication strategy similar to the way BitTorrent works, which is a pretty awesome idea. There seems to be nothing open (as in source) about the product, except that they want others to write various language bindings for them, so we'll see how well this thing takes off. Definitely an interesting product to watch, though.

Time Machine Backups from Airport Extreme

Hopefully Apple doesn't come up with some way to disable this in the future, but it seems that, despite claims elsewhere on the web, the latest Airport Extreme can be used with a USB hard drive to perform Time Machine backups. I've always heard great things about the Airport devices and really wanted to do backups over one, but I already had a larger USB hard drive, so a Time Capsule seemed like a bit of a waste. This worked with an Airport Extreme 802.11n 5th generation. My Macs are running Lion and Snow Leopard.

All you have to do is first plug the hard drive directly into your Mac, partition it as desired, and format the filesystem as HFS+ (Mac OS Extended, Journaled). Then plug the drive into the Airport and enable disk sharing. After that's done, the Airport should show up in Finder. Once you mount it, you can select it from the Time Machine settings (it will appear as a Time Capsule).
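If you'd rather do the formatting step from a terminal instead of Disk Utility, something like this should work; the disk identifier and volume name below are only examples, so check the output of diskutil list first to find the right device:

diskutil list
diskutil eraseDisk JHFS+ TimeMachine /dev/disk2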

That's it! Glad I tried it and wasn't deterred by what Apple says and what's on their forums. The Airport has been amazing; video streaming is far smoother than it was with the aging wireless router it replaced. Definitely a recommended buy!

Vagrant Fails To Start VM with Bad Default Machine Folder

I'm here at PuppetConf in Portland, and one of the speakers got me all excited again about Vagrant. I've played around with Vagrant before, but he was showing off some awesome features that I didn't realize existed, which made me want to play with the tool again (more on that later!). Anyway, I updated all the Vagrant-related bits to current versions and tried to use it again. However, when I downloaded one of the default images from the getting started page, I was greeted with an awesomely descriptive error:

[default] Importing base box 'lucid32'...
The VM import failed! Try running `VBoxManage import` on the box file manually for more verbose error output.

Running it with VBoxManage as instructed gave a slightly-better-yet-not-so-useful error:

VBoxManage: error: Appliance file must have .ovf extension
VBoxManage: error: Details: code VBOX_E_FILE_ERROR (0x80bb0004), component Appliance, interface IAppliance, callee nsISupports
Context: "Read(Bstr(pszAbsFilePath).raw(), progressRead.asOutParam())" at line 302 of file VBoxManageAppliance.cpp

A few Google searches for this error got me almost nowhere, except that it was likely a VirtualBox configuration issue. Great. After digging around for far too long, I figured out that it was because, back when I had played with Vagrant/VirtualBox in the past, I had configured VirtualBox's default machine folder to point at an external drive that I didn't have plugged in! Doh. Anyway, it was a pretty easy fix, but I thought I'd share in case it saves someone some time. The default machine folder is set on the first screen of the VirtualBox preferences window (at least for me on my Mac).
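If you'd rather fix it without opening the GUI, VBoxManage can set the same property from a terminal; the path below is just the usual default location, so adjust to taste:

VBoxManage setproperty machinefolder "$HOME/VirtualBox VMs"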

Indexing the Cloud

I think from now on I'll give all my machines names based on buzzwords. It makes for some awesome error / status messages. Today's message from Spotlight after upgrading to OS X Lion definitely didn't let me down. I had to share.

The cloud is a big place..

As an aside.. be sure to give Spotlight plenty of time to re-index things after the upgrade is finished. While it doesn't make my laptop unusable, it does slow things down quite a bit.