Self-referential many-to-many relations in Ruby on Rails
…or is my friend’s friend my friend too?
Recently I tried to implement a feature that is commonly used in social sites like Xing or Linked-in: a friends list, also known as buddy list or contacts list. Basically this means that a user can link to any other user to indicate that they are friends or know each other. Normally, the other party has to confirm the link proposition for the link to be established. Having a contact network, the site can implement various cool features, like finding the shortest path between two random users or telling you that there are ‘820 contacts of your contacts’.
Implementing this proved not to be that easy in Ruby on Rails, partly because of lack of documentation and mostly because of my poor knowledge of Rails, so here’s a solution that works for me. It should be noted that this is rather a story of trial and error (although I will spare you visiting all the blind alleys with me) than an emanation of guru-like knowledge, so if anyone knows a better way, I’ll be glad to hear about it.
Let’s begin with some basic assumptions: we have a model class called User and we want the User to know three things:
- a list of users directly linked to (me → someone)
- a list of users directly linked from (someone → me)
- a list of users directly linked, no matter to or from (e.g. if
A→BandB→C, thenAandCshould be onB’s linked list)
As we can see, the relationship between two users is directed: one user initiated it and the other responded. But when looking for a path between two ‘distant’ users, we don’t care who initiated an individual relationship that could be part of the path, so we also need an undirected view on relationships.
I left out the confirmation feature, as this is easy and not particularly interesting to implement.
Database
For starters, we need two tables in the database, users and relationships. Here are excerpts from my migration files, stripped of anything unnecessary.
create_table :users do |t| t.column :login, :string, :null => false end create_table :relationships, :id => false do |t| t.column "user_id", :integer, :null => false t.column "buddy_id", :integer, :null => false end
There is nothing special in these tables, I think that they would look like that even if we used anything other than Rails. The users table should of course contain additional user attributes, and relationships is a typical join table and could store information whether the link is confirmed, creation and confirmation dates and so on. Most of you would also probably use foreign keys constraints to have the database enforce the data integrity (omited here for simplicity).
Specifications
Next, let’s describe how the relationships on User should work. I use rspec, which combined with autotest, turns BDD into a pleasure rather than work. Here’s user_spec.rb:
describe User do
before :each do
%w(u v w).each do |x|
eval "@#{x} = User.new; @#{x}.login = '#{x}'; @#{x}.save"
end
end
it "can be linked with another user" do
@u.linked_to << @v
@u.linked_to.should include(@v)
u2 = User.find_by_id @u.id
u2.linked_to.should include(@v)
end
it "can find users that linked to it" do
@u.linked_to << @v
@v.linked_from.should include(@u)
end
it "can see friends of its friends" do
@u.linked_to << @v
@v.linked_to << @w
@u.linked_to[0].linked_to.should include(@w)
@w.linked_from[0].linked_from.should include(@u)
end
it "has a list of all users linked to and from it" do
@u.linked_to << @v
@v.linked_to << @w
@v.linked.should include(@u)
@v.linked.should include(@w)
end
end
The before method defines three instance variables @u, @v and @w. Rails requires the objects in many-to-many associations to be already saved to the database, so we give them appropriate login names and save them. I use the each loop combined with eval so I don’t have to repeat the almost-the-same code three times.
All the other ‘methods’ are pretty self-explanatory, as always when using rspec. I assumed that User has thee collections: linked_to, linked_from and linked, which implement the above mentioned lists. Then I test some basic relations between those collections. For example, if user @u links to @v, then @v should have @u on its linked_from list.
Models
Having specified User's behavior, we can try to implement it along with helping class, Relationship:
class Relationship < ActiveRecord::Base
belongs_to :user,
:class_name => 'User', :foreign_key => 'user_id'
belongs_to :buddy,
:class_name => 'User', :foreign_key => 'buddy_id'
end
class User < ActiveRecord::Base
has_many :relations_to,
:foreign_key => 'user_id', :class_name => 'Relationship'
has_many :relations_from,
:foreign_key => 'buddy_id', :class_name => 'Relationship'
has_many :linked_to,
:through => :relations_to, :source => :buddy
has_many :linked_from,
:through => :relations_from, :source => :user
end
Class Relationship has no particular business purpose (for now), but defines the roles: the initiator of the link is called user, and buddy is the responding party.
Then comes the User class, full of ActiveRecord magic that I, to be frank, don’t fully understand. It looks nice and symmetrical, but took many tries and some googling to get it to work. Probably this could be simplified, but I’m happy that it works.
The linked list
OK, we have linked_to and linked_from, but where is linked list?! — I hear you exclaim. Well, currently I don’t know how to implement it using pure Rails magic, so I resorted to a little bit more of SQL. I create a view called buddies:
create or replace view buddies as
select user_id, buddy_id
from relationships
union
select buddy_id, user_id
from relationships
Here I use a little trick: the view lists the contents of relationships table two times — first normally, and then reversed. So, if relationships contains a tuple (1, 2), then the view has both (1, 2) and (2, 1).
Having this view in place, we can add following method to User class:
def linked
User.find_by_sql "
select *
from users
where id in (
select buddy_id
from buddies
where user_id = #{id})"
end
The inner select finds id’s of all the users that are either linked to or linked from the current one, and the outer select just reads the user data. This is probably far from being optimal and should be done without resorting to find_by_sql, but it works.
Obviously, the collection returned by linked method doesn’t support adding and removing buddies, and that’s a good thing, because we need to know who initiated the link. To have the relations changed, we should use operations provided by linked_to and linked_from.
Acknowledgements and sources
My solution is based on Self-referential has_many :through associations blog entry by Josh Susser and most notably on the solutions in comments to this entry. You may also find Rails: ActiveRecord goes :through article by Scott Matthewman helpful. And, of course, see Rails documentation on has_many and belongs_to.
I think that the above code could be easily changed to implement other self-referential relations, like parent-child, boss-employee or virtually any problem that can be turned into some kind of graph algorithm.
In the future I’m going to try to implement the cool features I mentioned in the beginning: counting contacts of various degrees and finding shortest path between two users. If I succeed I will share my experiences in an upcoming entry.
Add to:
digg
reddit
del.icio.us




I was just wondering…
aren’t the lists all the same?:
1. a list of users directly linked to (me → someone)
2. a list of users directly linked from (someone → me)
3. a list of users directly linked, no matter to or from (e.g. if A → B and B → C, then A and C should be on B’s linked list)
If this is so, you only need a find_friends method in your user model:
def self.find_friends(user_id)
find_by_sql(“SELECT users.*
FROM (
SELECT user_id, friend_id
FROM friendships
WHERE user_id = #{user_id}
UNION
SELECT friend_id, user_id
FROM friendships
WHERE friend_id = #{user_id}
) AS friends
INNER JOIN users
ON friends.friend_id = users.id
“)
end
please correct me if I’m wrong :)
art said this on 2007-07-02 at 18:28:06
art: yes and no :-) on the lower level we want to have directed relations in case we need to know who started the relation etc. but there are other situations when we want to treat the relationships as undirected (looking for a path between two users). that’s why I defined two versions of the relations.
as of your version of finding friends, I think it’s equivalent to my version. the difference is that you don’t use a view and instead do the union-trick inline in your query.
PS. thanks for first comment on my blog :)
szeryf said this on 2007-07-02 at 20:04:31
[...] data makes me crazy I went on searching for any helpful information and found szeryf’s article. He uses some custom SQL to avoid [...]
“Self-referential many-to-many relations in Ruby on Rails” - Art at Code said this on 2007-07-13 at 14:34:35
Hey,
Is it possible :
– to store an other field ?
For example :
create_table :relationships, :id => false do |t|
t.column “user_id”, :integer, :null => false
t.column “buddy_id”, :integer, :null => false
t.column “feelling”, : string
end
– and to give the new paramter in :
@v.linked_to(feeling) << @w
Thx
Suryc said this on 2007-10-26 at 17:41:11
Yes, you can add additional fields to the relations table, but you cannot access them like this. The linked_to collection is a collection of User object and they don’t have feeling attribute. You have to use relation_to collection. Something like this:
@v.relation_to << @w, :feeling => ‘good’
szeryf said this on 2007-10-27 at 09:59:25
Hey, you’r inner join doesn’t work for the ‘linked’ case, it just selects the ‘linked_from’.
art’s solution should work.
provides 'brain' said this on 2007-11-23 at 17:55:05
actually, it should be
select * from users where id in (select user_id from relationships where buddy_id = #{n} union select buddy_id from relationships where user_id = #{4});
provides 'brain' said this on 2007-11-23 at 19:25:35
great article, BTW, it really helped me a lot to understand rails relationships and self relationships and stuff. It worked great for a project in university.
thanks !
provides 'brain' said this on 2007-11-23 at 19:26:49
i didn’t use join, so I’m not sure which query you are referring to. your query returns the same as my query on view. nonetheless, I’m glad this helped you :)
szeryf said this on 2007-11-23 at 22:35:46
Thanks a lot for your post
guillaume said this on 2008-08-26 at 22:56:42
Thanks a lot for your post, it helped me a lot !
guillaume said this on 2008-08-26 at 23:01:34