Athemeus.com

Postgres, ActiveRecord and Tomfoolery

by Jonathan.

So today, I went to work on doing some Multi Table Inheritance in Rails. I've had experience with this from my days with Hibernate on Java. The 'standard' Rails Way seems to usually revolve around Single Table Inheritance, wherein you have multiple models sharing one table space and the model gets differentiated by some 'type' column. 

For a lot of situations, this makes sense. Let's say that you were writing a web based version of the classic XCOM UFO Defense. There are maybe, what, 10-12? Different types of aliens. Each behaves differently, so having a different model for each makes sense. But generally speaking, the kinds of information they have will be the same.

But for some situations, this doesn't make sense. Sometimes there are important semantic differences between the two things. Sometimes they simply differ in a great deal of attributes. The latter is something that might be considered meta-data though, and there are different solutions for that, like attribute tables. But perhaps the best example is legacy systems, and this is where Postgres comes into play. 

Sometimes, you need to make changes, but you can't. You can't because there are a bajillion other things that are touching a given set of tables. Strange weavings from Shell Script Alchemists from a far-away land, running via crontab on maligned and forgotten boxen. Perhaps you need to have some new data that doesn't adhere to previous constraints. Who wants to comb through old code to make sure it doesn't thread dump when it finds a null column value? So you have OldThing and NewThing. Since Thing is really sort of an abstraction, at this point it starts to feel like a code smell in Ruby since all that Thing commonality can just get mixed in. Except that it would be really nice to use 

Thing.where(:foo => 'bar')

...and get back a mix of NewThings and OldThings. That would be swell, actually. Except there really *isn't* Thing. Well, you could use a View, couldn't you? A couple of weeks ago, I noticed the following litter in my logs.


SELECT a.attname, format_type(a.atttypid, a.atttypmod), d.adsrc, a.attnotnull
 FROM pg_attribute a LEFT JOIN pg_attrdef d
 ON a.attrelid = d.adrelid AND a.attnum = d.adnum
 WHERE a.attrelid = '"oldthings"'::regclass
 AND a.attnum > 0 AND NOT a.attisdropped
 ORDER BY a.attnum

For pretty much every query on all the pages. "Ah, it's getting attributes", I thought. I discovered someone had created a gem to squash this output, and I went on my merry way. But let's think about it; if someone takes the time to write a gem to suppress your output, *is that a good thing*? The real rub here though is that because of this the table_exists? method will always return false. Strictly speaking, I suppose this is true, as this isn't really a table. But in the context of read-only data, the difference between running a select from a view and from a table is nil. You can get into an argument of semantics about tolerant configuration, but ultimately I think it's more about ActiveRecord trusting the configuration I've given it. I see a couple of other things about this strategy that irritates me:

  1. This method of metadata gathering is definitely Postgres specific
  2. It assumes that the database user has access to these special system tables. 
  3. It's not really accurate.

As for 1, this is kind of nitpicky, but I like to think about my underlying database as little as I can. It's a reality to deal with in crunch time and optimization, for sure, but it just leaves things less portable. I concede that perhaps scenario 2 is unlikely or even impossible; I just don't know enough about the underworkings of Postgres, but it seems feasible enough to me especially if your database limits the trust it gives you. But ultimately, it's just not accurate because if your schema search path is set to something,otherthing, then you can put your view in schema something, and create a table with exactly the same structure in schema otherthing and now the aforementioned spam-query will return all the information you want and ActiveRecord will gladly treat your view as a table, allowing you to where() it up all you like. Again, I could be completely wrong, but I don't see why the adapter couldn't fall back when the pg_attribute query fails by doing a select * from whatever limit 0(1?) and using the metadata from the result set to fill in the blanks. 

Of course the hack isn't really an answer. Maybe the answer is doing a faux materialized view like in Enterprise Rails, but who wants to go write a bunch of triggers and such?