to make it do something right - the more I’m tempted to rewrite it from scratch.
That whole “I never took (or never understood) a database theory course at university, so I’m just going to pretend it doesn’t matter, and that a relational database may be treated as nothing more than a glorified filestore for my Objects” attitude just doesn’t cut any ice with me, but seems sadly prevalent within the Rails community. Yes, good object-oriented design is really important - but you need a sophisticated relational approach in order to get a handle on the data model behind any kind of non-trivial inheritance & mixin hierarchy, and to persist it in a logically sound and efficiently-indexable fashion.
What’s more, I contend that your ORM tool needs to understand something of the relational algebra in order to represent what is going on in a sufficiently elegant, flexible way - otherwise you’ll always be piling hack on top of hack whenever you want to map the results of a moderately complex query over to the OO side. Joining SQL strings together is not the way forward - these things are syntax trees with structure!
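To make the "syntax trees, not strings" point concrete, here is a minimal sketch in Python. The node classes (`Table`, `Select`, `Project`) and the `compile_sql` function are hypothetical names invented for this illustration, not part of any real ORM. It only handles equality restriction and projection, and it trusts table and column names as given.

```python
from dataclasses import dataclass

# Hypothetical relational-algebra nodes, invented for this sketch.
@dataclass(frozen=True)
class Table:
    name: str

@dataclass(frozen=True)
class Select:          # restriction: keep rows where column = value
    source: object
    column: str
    value: object

@dataclass(frozen=True)
class Project:         # projection: keep only the named columns
    source: object
    columns: tuple

def compile_sql(node):
    """Walk the tree and return (sql, params). Values become bind
    parameters; identifiers are assumed to be trusted."""
    if isinstance(node, Table):
        return f"SELECT * FROM {node.name}", []
    if isinstance(node, Select):
        sql, params = compile_sql(node.source)
        return (f"SELECT * FROM ({sql}) AS t WHERE {node.column} = ?",
                params + [node.value])
    if isinstance(node, Project):
        sql, params = compile_sql(node.source)
        return f"SELECT {', '.join(node.columns)} FROM ({sql}) AS t", params
    raise TypeError(f"unknown node: {node!r}")

# Queries compose as values: any node can be wrapped by another.
adults = Select(Table("users"), "age", 30)
query = Project(adults, ("name",))
sql, params = compile_sql(query)
```

Because `adults` is an ordinary value, it can be reused inside other queries without any string surgery, which is the property string concatenation loses.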
Ahem. Sorry if I sound exasperated. SQLAlchemy on the Python side gets this kind of thing ABSOLUTELY SPOT ON, and dare I say it, so do some of the Java ORM frameworks (shame about the XML config files and Java’s tendency towards boilerplate code and bloated syntax, but don’t throw the baby out with the bathwater, Rails-ers).
The problem with Rails’ ActiveRecord is that it’s neither here nor there - neither the kind of lightweight, simple ‘map objects to database rows and nothing much else’ approach originally implied by Fowler’s Active Record design pattern - nor the kind of powerful ORM tool which is capable of turning the kind of tricks that are increasingly demanded of it in anything like an elegant fashion.
It seems the Rails team’s solution to some of the endemic problems with ActiveRecord’s messy guts is to wrap them up in a huge plastic bag known as caching - an acceptable pragmatic approach, I admit, in many situations, but one which would not be nearly so necessary had a different approach been taken to ActiveRecord’s architecture.
I feel that superior approach needn’t have come at the cost of ActiveRecord’s ‘convention over configuration’ and ‘easy to get started with’ benefits either - it just would have required a little more forethought and a little humility in learning about the Relational Model before attempting a tool which maps complex data models to a Relational Database.
Crap, I’m starting to sound like Fabian Pascal now, aren’t I?
I sympathise with you
I also got frustrated by the Rails community’s lack of respect for data. I wrote the original patch to let you store BigDecimal objects as SQL decimal/numeric types. I got people asking why I wasn’t happy with floats (for financial data!), and it made me want to cry. Some people are baffled by the need for constraints of any kind, and some of the discussions on the rails list would make Codd turn in his grave.
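For anyone still wondering what is wrong with floats for money, a small sketch. It uses Python's standard `decimal` module as a stand-in for Ruby's BigDecimal; the amounts are made up.

```python
from decimal import Decimal

# Three items at 0.10 each, summed as binary floats and as decimals.
float_total = sum([0.10] * 3)
decimal_total = sum([Decimal("0.10")] * 3, Decimal("0"))

float_matches = (float_total == 0.30)               # False: binary rounding error
decimal_matches = (decimal_total == Decimal("0.30"))  # True: exact decimal arithmetic
```

The float version is off by a tiny amount that never shows up in casual testing but does show up when a ledger has to balance, which is why a SQL decimal/numeric column needs an exact type on the object side too.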
If I am right, DHH has said that he believes foreign keys are unnecessary - and if he disputes something THAT basic, there’s really no hope for the project.
I am going to be using Nitro for my next site; hopefully Og will prove better. But it is still an implementation of ActiveRecord so may be subject to some of the same limitations. You probably know more than me already. There are plenty of Ruby ORM libraries around, most of them basic by the look of it.
Also you may be interested in a project I want to start soon, as a replacement for AR Migrations. It gets frustrating using AR Migrations with even two people working on the database, and although it could be hacked to alleviate some of the glaring problems (eg adding migration folders with multiple files per DB revision), it’s still tied to ActiveRecord, still has poor support for pretty fundamental DDL operations, and is still a rake task in a Rails app. Oh, and DHH is at the helm, so I’m not hoping any of that will be fixed soon.
I want to build a migration system based on a patch system like darcs (except I won’t try to be as clever). At work we need something that will allow several developers to work on independent branches, require no effort to merge changes together, but still flag conflicts for changes in things like stored procs that may not be caught by our CI. I also want full support for ALL SQL DDL, so I don’t ever have to embed “execute” in a script again.
I want to start something this weekend, but hopefully I will get to spend time on it at work, as it is something they sorely need. (You really don’t want to see what I wrote in NAnt 3 years ago.)
Hmmm…
~/Desktop/sqlalchemy-trunk ashleymoran$ wc -l `find . -name '*.py'`
…
70898 total
If I get cracking, might have it ported by xmas 2009…
However 2566 lines of that is for MySQL, which can be removed straight away ;o)
Ooh, an SQLAlchemy port would be very nice. heh.
I did look into Og briefly - does seem a bit better than ActiveRecord but still the same approach.
The DB migration tool sounds like quite a challenge - can’t say it’s ever been enough of an issue for me to really feel the need, although I’ve not had to deal with significant concurrent database work with other developers - just the occasional ‘two different migrations get created and committed with the same number’ issue.
What would be nice would be a SQL-specific diff tool which could generate migrations from a before and after schema - perhaps integrated into the version control system. But that would only handle changes to the schema, and schema migrations often need to be done alongside (or interleaved with) data migrations (insert/delete/update etc). So I suspect it’s fundamentally quite a hard problem.
I don’t think the DB migration thing would be as hard as it sounds in the simple case, it only has to be aware of certain conflicting operations, and be able to draw up a dependency tree (which, to simplify matters further, can be largely manually specified). Then the migration tool only has to record unapplied patches (determined by querying a list in the database, as opposed to a version number).
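A rough sketch of that idea in Python, using only the standard library (`sqlite3`, and `graphlib` from Python 3.9+). The patch names, the `applied_patches` table and the `apply_unapplied` function are all hypothetical, and conflict detection is left out entirely. Dependencies are specified by hand, and the set of applied patches is read from a table rather than inferred from a version number.

```python
import sqlite3
from graphlib import TopologicalSorter

# Hypothetical patches: name -> (patches it depends on, SQL to run).
PATCHES = {
    "create_users": ((), "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"),
    "create_posts": (("create_users",),
                     "CREATE TABLE posts (id INTEGER PRIMARY KEY, "
                     "user_id INTEGER REFERENCES users(id))"),
    "index_posts_user": (("create_posts",),
                         "CREATE INDEX posts_user_idx ON posts (user_id)"),
}

def apply_unapplied(conn, patches):
    """Apply every patch not yet recorded in the database, dependencies first.
    Returns the names of the patches applied by this call."""
    conn.execute("CREATE TABLE IF NOT EXISTS applied_patches (name TEXT PRIMARY KEY)")
    applied = {row[0] for row in conn.execute("SELECT name FROM applied_patches")}
    graph = {name: set(deps) for name, (deps, _sql) in patches.items()}
    newly_applied = []
    for name in TopologicalSorter(graph).static_order():
        if name in applied:
            continue
        with conn:  # commit after each patch; roll back the pending insert on error
            conn.execute(patches[name][1])
            conn.execute("INSERT INTO applied_patches (name) VALUES (?)", (name,))
        newly_applied.append(name)
    return newly_applied

conn = sqlite3.connect(":memory:")
first_run = apply_unapplied(conn, PATCHES)   # applies all three patches
second_run = apply_unapplied(conn, PATCHES)  # nothing left to apply
```

Two developers adding independent patches on separate branches would simply end up with two new entries in the patch set after a merge; only patches that declare a dependency on each other constrain the order.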
Aqua Data Studio has schema and data diffing tools, but personally I’d advise against going down that route. It was how development used to be done at my current company before I started, and Database Diff Day was often a marathon event. I think the best way is to have a lightweight repeatable migration system, so you can easily snapshot, migrate and restore your dev DB (easier than fixing broken down steps). One thing that is missing from the process is migration-BDD - I’d love to have a way of writing specs for database state post-migration, but in a multi-developer environment that would just be intractable.
I’d only given SQLAlchemy a brief look over before, but I’ve just gone through most of the tutorial and I’m blown away! I am so tempted to start porting it - I have a feeling it may be quite easy to get some of the basics working, as the code looks very loosely coupled. I may investigate it over the weekend, and sketch out the architecture. If I think I can have something useful working in under two months, I may well make a start. It’s about time the Ruby world had a decent ORM, damnit!
Yeah it’s nice isn’t it. I’ve used it on a Python project, and while the session-based approach takes a little getting used to, the sheer power of the thing really astounds.
I was thinking of attempting something which aims to feel at least superficially familiar to ActiveRecord users - but which has a proper relational layer under the hood and allows for various different ways of mapping class inheritance hierarchies and mixins to relations. Ideally allowing for better approaches to polymorphic associations too, and for a nice ruby-like query-building syntax.
I’d probably want it to use the SQLAlchemy database session approach too - in a Rails app one session per request would make a lot of sense. The way it identifies the database changes needed to ‘flush’ a set of changes to the object graph, topologically sorts them by foreign key dependency and executes them in a nice transaction… lovely.
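The ordering step can be sketched in a few lines of Python with the standard `graphlib` module (Python 3.9+). This is a table-level simplification with a made-up schema and a hypothetical `flush_order` function, not a description of SQLAlchemy's actual unit-of-work code, which is considerably more involved.

```python
from graphlib import TopologicalSorter

# Hypothetical schema: table -> tables its foreign keys point at.
FK_DEPENDENCIES = {
    "users": set(),
    "posts": {"users"},
    "comments": {"posts", "users"},
}

def flush_order(pending_inserts, fk_deps):
    """Reorder (table, row) pairs so that rows in referenced tables
    are inserted before the rows that point at them."""
    table_order = list(TopologicalSorter(fk_deps).static_order())
    rank = {table: i for i, table in enumerate(table_order)}
    # sorted() is stable, so rows for the same table keep their original order.
    return sorted(pending_inserts, key=lambda item: rank[item[0]])

pending = [
    ("comments", {"id": 1, "post_id": 1, "user_id": 1}),
    ("users", {"id": 1, "name": "ann"}),
    ("posts", {"id": 1, "user_id": 1}),
]
flushed = flush_order(pending, FK_DEPENDENCIES)
flushed_tables = [table for table, _row in flushed]
```

The sorted list can then be executed inside a single transaction without tripping any foreign key constraint along the way.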
Also the way there is only ever one Object present in memory per row of a relation.
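That "one object per row" guarantee is usually called an identity map. A minimal sketch in Python, assuming a hypothetical `loader` callback that fetches a row by table and primary key; the class and names here are illustrative only.

```python
class IdentityMap:
    """Hands out at most one in-memory object per (table, primary key)."""

    def __init__(self, loader):
        self._loader = loader   # hypothetical callback: (table, pk) -> object
        self._objects = {}

    def get(self, table, pk):
        key = (table, pk)
        if key not in self._objects:
            # First request for this row: load it once and remember it.
            self._objects[key] = self._loader(table, pk)
        return self._objects[key]

# A fake loader that counts how often the "database" is hit.
load_calls = []
def fake_loader(table, pk):
    load_calls.append((table, pk))
    return {"table": table, "id": pk}

session_map = IdentityMap(fake_loader)
first = session_map.get("users", 1)
second = session_map.get("users", 1)
same_object = first is second   # True: both names refer to one object
```

With this in place, a change made through one reference is visible through every other reference to the same row within the session.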
Maybe we should start a rubyforge project!
If you’re interested, this could be used as a starting-point - something I put together a long time ago for ruby:
http://rubyforge.org/projects/active-relation/
the only source code is in the SVN repository on rubyforge.
Actually looking back at it, ActiveRelation isn’t really what I want any more. But something like it could form the relational layer for a more fully-featured ORM.
I imagine the session-based system makes it easier to spec too, so you can mock the session instead of stubbing MyModel.find… I shudder every time I do that.
I don’t see why, if you had a full mapping system like SQLAlchemy, you couldn’t write a facade layer that makes it look like ActiveRecord by generating default classtable mappers at some point in the app lifecycle. Although TBH, I am not immediately very concerned with making it a drop-in replacement for ActiveRecord, as I think most Rails developers want to use everything out of the box. (And I don’t know how loosely-coupled AR is to Rails anyway.)
I have actually already registered a project (not accepted yet) called Celestial - there’s a prize for the first person to figure out the pun.
Couldn’t find the ActiveRelation source you mention on RubyForge, all I can find is the home page.
ActiveRecord is fairly loosely coupled to the rest of Rails actually. The scaffolding and some of the form creation helpers are a bit coupled to it, although they will work with something similarly duck-typed too.
ActiveRelation - just checked and, ouch, it’s still using CVS, which was the only option at the time :/
cvs -d :pserver:anonymous@rubyforge.org:/var/cvs/active-relation checkout active-relation
should do the trick. I have actually used it in a rails app in place of activerecord - but for maintainability reasons it would be nice to have something which is reasonably accessible to all the Rails folk out there with activerecord experience…
I was wondering if you caught Martin Fowler’s keynote at RailsConf 2006? (It’s still available as a podcast on Odeo if you didn’t.) My interpretation of that speech is that Rails’ ActiveRecord does exactly what it’s supposed to do, provided you want to write an application similar to Basecamp - the project from which it was extracted. However, if that’s not how you model your world then you need something else.
Hi very interesting blog - thanks.
Are there still plans for a new ORM project?
By doing ORM right did you have in mind something like what is described here (implemented in Ruby of course):
http://rubyurl.com/0DB
Some of the initial heavy lifting for which would probably be made easier by gratr:
http://gratr.rubyforge.org/
Thanks again.
Anthony - nope I didn’t catch the keynote, although saying “x does what it was designed to do, and if you don’t like it use something else” doesn’t have a ton of useful content for me.
My issue with ActiveRecord is that I feel it’s being pushed too far beyond what it’s capable of and suited for, both by the ActiveRecord developers, and by a lot of the Rails projects using it. Perhaps making people more aware that other more elegant approaches exist to the data layer, and what the trade-offs are, would help.
Mark - my thoughts on graph-based models (RDF is one) - they are just a restricted form of a highly-normalised relational model, where you’re restricted to binary predicates/relations corresponding to different kinds of labels for graph edges. Often you want ternary and higher predicates, even in a highly normalised relational model - forcing them to be binary is only possible if you introduce arbitrary surrogate objects into your graph to represent the rows of greater-than-binary relations. I guess that’s not always an issue, but I prefer the more relational model myself.
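A small Python sketch of what that surrogate trick looks like. The `supplies` relation, its column labels and the `to_binary_edges` function are invented for the example; the surrogate identifiers are arbitrary, which is rather the point.

```python
# A ternary relation: (supplier, part, project). Each row is one fact.
supplies = {
    ("acme", "bolt", "bridge"),
    ("acme", "nut", "tower"),
}

def to_binary_edges(relation, labels, prefix="row"):
    """Encode an n-ary relation as labelled graph edges by inventing
    one surrogate node per row and hanging each column off it."""
    edges = set()
    for i, row in enumerate(sorted(relation)):
        surrogate = f"{prefix}_{i}"   # arbitrary identifier with no meaning of its own
        for label, value in zip(labels, row):
            edges.add((surrogate, label, value))
    return edges

edges = to_binary_edges(supplies, ("supplier", "part", "project"), prefix="supply")
edge_count = len(edges)   # 2 rows x 3 columns = 6 edges
```

Two facts in the relational form become six edges plus two invented nodes in the graph form, and a uniqueness rule over (supplier, part, project) is no longer something a single edge can express.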
It seems that with the relational model you have a lot more scope to index things efficiently for particular queries (using multi-column indices, indexed views etc), and to impose structure in the form of (potentially multi-column) uniqueness and foreign key constraints. While I expect similar ends could be achieved by a sophisticated graph-based DBMS, I’m not sure I’ve seen anything yet capable of it.
I hear good things about this project: http://rubyforge.org/projects/activefacts/
and will be following that quite closely