Self-referential many-to-many relations in Ruby on Rails

…or is my friend’s friend my friend too?

Recently I tried to implement a feature that is commonly used in social sites like Xing or Linked-in: a friends list, also known as buddy list or contacts list. Basically this means that a user can link to any other user to indicate that they are friends or know each other. Normally, the other party has to confirm the link proposition for the link to be established. Having a contact network, the site can implement various cool features, like finding the shortest path between two random users or telling you that there are ‘820 contacts of your contacts’.

Implementing this proved not to be that easy in Ruby on Rails, partly because of lack of documentation and mostly because of my poor knowledge of Rails, so here’s a solution that works for me. It should be noted that this is rather a story of trial and error (although I will spare you visiting all the blind alleys with me) than an emanation of guru-like knowledge, so if anyone knows a better way, I’ll be glad to hear about it.

Let’s begin with some basic assumptions: we have a model class called User and we want the User to know three things:

  1. a list of users directly linked to (me → someone)
  2. a list of users directly linked from (someone → me)
  3. a list of users directly linked, no matter to or from (e.g. if A → B and B → C, then A and C should be on B‘s linked list)

As we can see, the relationship between two users is directed: one user initiated it and the other responded. But when looking for a path between two ‘distant’ users, we don’t care who initiated an individual relationship that could be part of the path, so we also need an undirected view on relationships.

I left out the confirmation feature, as this is easy and not particularly interesting to implement.

Database

For starters, we need two tables in the database, users and relationships. Here are excerpts from my migration files, stripped of anything unnecessary.

create_table :users do |t| 
  t.column :login, :string, :null => false 
end 
create_table :relationships, :id => false do |t| 
  t.column "user_id",  :integer, :null => false 
  t.column "buddy_id", :integer, :null => false 
end

There is nothing special in these tables, I think that they would look like that even if we used anything other than Rails. The users table should of course contain additional user attributes, and relationships is a typical join table and could store information whether the link is confirmed, creation and confirmation dates and so on. Most of you would also probably use foreign keys constraints to have the database enforce the data integrity (omited here for simplicity).

Specifications

Next, let’s describe how the relationships on User should work. I use rspec, which combined with autotest, turns BDD into a pleasure rather than work. Here’s user_spec.rb:

describe User do 
  before :each do 
    %w(u v w).each do |x| 
      eval "@#{x} = User.new; @#{x}.login = '#{x}'; @#{x}.save" 
    end 
  end 
  it "can be linked with another user" do 
    @u.linked_to << @v 
    @u.linked_to.should include(@v) 
    u2 = User.find_by_id @u.id 
    u2.linked_to.should include(@v) 
  end 
  it "can find users that linked to it" do 
    @u.linked_to << @v 
    @v.linked_from.should include(@u) 
  end 
  it "can see friends of its friends" do 
    @u.linked_to << @v 
    @v.linked_to << @w 
    @u.linked_to[0].linked_to.should include(@w) 
    @w.linked_from[0].linked_from.should include(@u) 
  end 
  it "has a list of all users linked to and from it" do 
    @u.linked_to << @v 
    @v.linked_to << @w 
    @v.linked.should include(@u) 
    @v.linked.should include(@w) 
  end 
end

The before method defines three instance variables @u, @v and @w. Rails requires the objects in many-to-many associations to be already saved to the database, so we give them appropriate login names and save them. I use the each loop combined with eval so I don’t have to repeat the almost-the-same code three times.

All the other ‘methods’ are pretty self-explanatory, as always when using rspec. I assumed that User has thee collections: linked_to, linked_from and linked, which implement the above mentioned lists. Then I test some basic relations between those collections. For example, if user @u links to @v, then @v should have @u on its linked_from list.

Models

Having specified User's behavior, we can try to implement it along with helping class, Relationship:

class Relationship < ActiveRecord::Base 
  belongs_to :user, 
    :class_name => 'User', :foreign_key => 'user_id' 
  belongs_to :buddy, 
    :class_name => 'User', :foreign_key => 'buddy_id' 
end 
class User < ActiveRecord::Base 
  has_many :relations_to, 
    :foreign_key => 'user_id',  :class_name => 'Relationship' 
  has_many :relations_from, 
    :foreign_key => 'buddy_id', :class_name => 'Relationship'                             

  has_many :linked_to, 
    :through => :relations_to,   :source => :buddy 
  has_many :linked_from, 
    :through => :relations_from, :source => :user 
end

Class Relationship has no particular business purpose (for now), but defines the roles: the initiator of the link is called user, and buddy is the responding party.

Then comes the User class, full of ActiveRecord magic that I, to be frank, don’t fully understand. It looks nice and symmetrical, but took many tries and some googling to get it to work. Probably this could be simplified, but I’m happy that it works.

The linked list

OK, we have linked_to and linked_from, but where is linked list?! — I hear you exclaim. Well, currently I don’t know how to implement it using pure Rails magic, so I resorted to a little bit more of SQL. I create a view called buddies:

create or replace view buddies as 
  select user_id, buddy_id 
    from relationships 
  union 
  select buddy_id, user_id 
     from relationships

Here I use a little trick: the view lists the contents of relationships table two times — first normally, and then reversed. So, if relationships contains a tuple (1, 2), then the view has both (1, 2) and (2, 1).

Having this view in place, we can add following method to User class:

def linked 
  User.find_by_sql " 
    select * 
    from users 
    where id in ( 
      select buddy_id 
      from buddies 
      where user_id = #{id})" 
end

The inner select finds id’s of all the users that are either linked to or linked from the current one, and the outer select just reads the user data. This is probably far from being optimal and should be done without resorting to find_by_sql, but it works.

Obviously, the collection returned by linked method doesn’t support adding and removing buddies, and that’s a good thing, because we need to know who initiated the link. To have the relations changed, we should use operations provided by linked_to and linked_from.

Acknowledgements and sources

My solution is based on Self-referential has_many :through associations blog entry by Josh Susser and most notably on the solutions in comments to this entry. You may also find Rails: ActiveRecord goes :through article by Scott Matthewman helpful. And, of course, see Rails documentation on has_many and belongs_to.

I think that the above code could be easily changed to implement other self-referential relations, like parent-child, boss-employee or virtually any problem that can be turned into some kind of graph algorithm.

In the future I’m going to try to implement the cool features I mentioned in the beginning: counting contacts of various degrees and finding shortest path between two users. If I succeed I will share my experiences in an upcoming entry.

Add to:
Digg! digg blog_head.png reddit delicioussmall.gif del.icio.us


13 responses to “Self-referential many-to-many relations in Ruby on Rails

  • art

    I was just wondering…
    aren’t the lists all the same?:
    1. a list of users directly linked to (me → someone)
    2. a list of users directly linked from (someone → me)
    3. a list of users directly linked, no matter to or from (e.g. if A → B and B → C, then A and C should be on B’s linked list)

    If this is so, you only need a find_friends method in your user model:
    def self.find_friends(user_id)
    find_by_sql(“SELECT users.*
    FROM (
    SELECT user_id, friend_id
    FROM friendships
    WHERE user_id = #{user_id}
    UNION
    SELECT friend_id, user_id
    FROM friendships
    WHERE friend_id = #{user_id}
    ) AS friends
    INNER JOIN users
    ON friends.friend_id = users.id
    “)
    end

    please correct me if I’m wrong :)

  • szeryf

    art: yes and no :-) on the lower level we want to have directed relations in case we need to know who started the relation etc. but there are other situations when we want to treat the relationships as undirected (looking for a path between two users). that’s why I defined two versions of the relations.

    as of your version of finding friends, I think it’s equivalent to my version. the difference is that you don’t use a view and instead do the union-trick inline in your query.

    PS. thanks for first comment on my blog :)

  • “Self-referential many-to-many relations in Ruby on Rails” - Art at Code

    […] data makes me crazy I went on searching for any helpful information and found szeryf’s article. He uses some custom SQL to avoid […]

  • Suryc

    Hey,
    Is it possible :
    – to store an other field ?

    For example :
    create_table :relationships, :id => false do |t|
    t.column “user_id”, :integer, :null => false
    t.column “buddy_id”, :integer, :null => false
    t.column “feelling”, : string
    end

    – and to give the new paramter in :
    @v.linked_to(feeling) << @w

    Thx

  • szeryf

    Yes, you can add additional fields to the relations table, but you cannot access them like this. The linked_to collection is a collection of User object and they don’t have feeling attribute. You have to use relation_to collection. Something like this:

    @v.relation_to << @w, :feeling => ‘good’

  • provides 'brain'

    Hey, you’r inner join doesn’t work for the ‘linked’ case, it just selects the ‘linked_from’.
    art’s solution should work.

  • provides 'brain'

    actually, it should be

    select * from users where id in (select user_id from relationships where buddy_id = #{n} union select buddy_id from relationships where user_id = #{4});

  • provides 'brain'

    great article, BTW, it really helped me a lot to understand rails relationships and self relationships and stuff. It worked great for a project in university.

    thanks !

  • szeryf

    i didn’t use join, so I’m not sure which query you are referring to. your query returns the same as my query on view. nonetheless, I’m glad this helped you :)

  • guillaume

    Thanks a lot for your post

  • guillaume

    Thanks a lot for your post, it helped me a lot !

  • christian

    still very usefull. having mine half way done, i’ve found yours and it helped a lot to get mine fully done. i’ve solved(hopefully?) the bidirectional thing by using not rails but ruby “magic”:

    (roughly translated to your names in the user model)

    def linked
    self.linked_to | self.linked_from
    end

  • Guillaume

    Many many thanks!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: