Self-referential many-to-many relations in Ruby on Rails

…or is my friend’s friend my friend too?

Recently I tried to implement a feature that is commonly used in social sites like Xing or Linked-in: a friends list, also known as buddy list or contacts list. Basically this means that a user can link to any other user to indicate that they are friends or know each other. Normally, the other party has to confirm the link proposition for the link to be established. Having a contact network, the site can implement various cool features, like finding the shortest path between two random users or telling you that there are ‘820 contacts of your contacts’.

Implementing this proved not to be that easy in Ruby on Rails, partly because of lack of documentation and mostly because of my poor knowledge of Rails, so here’s a solution that works for me. It should be noted that this is rather a story of trial and error (although I will spare you visiting all the blind alleys with me) than an emanation of guru-like knowledge, so if anyone knows a better way, I’ll be glad to hear about it.

Let’s begin with some basic assumptions: we have a model class called User and we want the User to know three things:

  1. a list of users directly linked to (me → someone)
  2. a list of users directly linked from (someone → me)
  3. a list of users directly linked, no matter to or from (e.g. if A → B and B → C, then A and C should be on B’s linked list)

As we can see, the relationship between two users is directed: one user initiated it and the other responded. But when looking for a path between two ‘distant’ users, we don’t care who initiated an individual relationship that could be part of the path, so we also need an undirected view on relationships.

I left out the confirmation feature, as this is easy and not particularly interesting to implement.

Database

For starters, we need two tables in the database, users and relationships. Here are excerpts from my migration files, stripped of anything unnecessary.

create_table :users do |t|
  t.column :login, :string, :null => false
end
create_table :relationships, :id => false do |t|
  t.column "user_id",  :integer, :null => false
  t.column "buddy_id", :integer, :null => false
end

There is nothing special in these tables, I think that they would look like that even if we used anything other than Rails. The users table should of course contain additional user attributes, and relationships is a typical join table and could store information whether the link is confirmed, creation and confirmation dates and so on. Most of you would also probably use foreign keys constraints to have the database enforce the data integrity (omited here for simplicity).

Specifications

Next, let’s describe how the relationships on User should work. I use rspec, which combined with autotest, turns BDD into a pleasure rather than work. Here’s user_spec.rb:

describe User do
  before :each do
    %w(u v w).each do |x|
      eval "@#{x} = User.new; @#{x}.login = '#{x}'; @#{x}.save"
    end
  end
  it "can be linked with another user" do
    @u.linked_to << @v
    @u.linked_to.should include(@v)
    u2 = User.find_by_id @u.id
    u2.linked_to.should include(@v)
  end
  it "can find users that linked to it" do
    @u.linked_to << @v
    @v.linked_from.should include(@u)
  end
  it "can see friends of its friends" do
    @u.linked_to << @v
    @v.linked_to << @w
    @u.linked_to[0].linked_to.should include(@w)
    @w.linked_from[0].linked_from.should include(@u)
  end
  it "has a list of all users linked to and from it" do
    @u.linked_to << @v
    @v.linked_to << @w
    @v.linked.should include(@u)
    @v.linked.should include(@w)
  end
end

The before method defines three instance variables @u, @v and @w. Rails requires the objects in many-to-many associations to be already saved to the database, so we give them appropriate login names and save them. I use the each loop combined with eval so I don’t have to repeat the almost-the-same code three times.

All the other ‘methods’ are pretty self-explanatory, as always when using rspec. I assumed that User has thee collections: linked_to, linked_from and linked, which implement the above mentioned lists. Then I test some basic relations between those collections. For example, if user @u links to @v, then @v should have @u on its linked_from list.

Models

Having specified User's behavior, we can try to implement it along with helping class, Relationship:

class Relationship < ActiveRecord::Base
  belongs_to :user,
    :class_name => 'User', :foreign_key => 'user_id'
  belongs_to :buddy,
    :class_name => 'User', :foreign_key => 'buddy_id'
end
class User < ActiveRecord::Base
  has_many :relations_to,
    :foreign_key => 'user_id',  :class_name => 'Relationship'
  has_many :relations_from,
    :foreign_key => 'buddy_id', :class_name => 'Relationship'                             

  has_many :linked_to,
    :through => :relations_to,   :source => :buddy
  has_many :linked_from,
    :through => :relations_from, :source => :user
end

Class Relationship has no particular business purpose (for now), but defines the roles: the initiator of the link is called user, and buddy is the responding party.

Then comes the User class, full of ActiveRecord magic that I, to be frank, don’t fully understand. It looks nice and symmetrical, but took many tries and some googling to get it to work. Probably this could be simplified, but I’m happy that it works.

The linked list

OK, we have linked_to and linked_from, but where is linked list?! — I hear you exclaim. Well, currently I don’t know how to implement it using pure Rails magic, so I resorted to a little bit more of SQL. I create a view called buddies:

create or replace view buddies as
  select user_id, buddy_id
    from relationships
  union
  select buddy_id, user_id
     from relationships

Here I use a little trick: the view lists the contents of relationships table two times — first normally, and then reversed. So, if relationships contains a tuple (1, 2), then the view has both (1, 2) and (2, 1).

Having this view in place, we can add following method to User class:

def linked
  User.find_by_sql "
    select *
    from users
    where id in (
      select buddy_id
      from buddies
      where user_id = #{id})"
end

The inner select finds id’s of all the users that are either linked to or linked from the current one, and the outer select just reads the user data. This is probably far from being optimal and should be done without resorting to find_by_sql, but it works.

Obviously, the collection returned by linked method doesn’t support adding and removing buddies, and that’s a good thing, because we need to know who initiated the link. To have the relations changed, we should use operations provided by linked_to and linked_from.

Acknowledgements and sources

My solution is based on Self-referential has_many :through associations blog entry by Josh Susser and most notably on the solutions in comments to this entry. You may also find Rails: ActiveRecord goes :through article by Scott Matthewman helpful. And, of course, see Rails documentation on has_many and belongs_to.

I think that the above code could be easily changed to implement other self-referential relations, like parent-child, boss-employee or virtually any problem that can be turned into some kind of graph algorithm.

In the future I’m going to try to implement the cool features I mentioned in the beginning: counting contacts of various degrees and finding shortest path between two users. If I succeed I will share my experiences in an upcoming entry.

Add to:
Digg! digg blog_head.png reddit delicioussmall.gif del.icio.us

~ by szeryf on 2007-06-27.

11 Responses to “Self-referential many-to-many relations in Ruby on Rails”

  1. I was just wondering…
    aren’t the lists all the same?:
    1. a list of users directly linked to (me → someone)
    2. a list of users directly linked from (someone → me)
    3. a list of users directly linked, no matter to or from (e.g. if A → B and B → C, then A and C should be on B’s linked list)

    If this is so, you only need a find_friends method in your user model:
    def self.find_friends(user_id)
    find_by_sql(“SELECT users.*
    FROM (
    SELECT user_id, friend_id
    FROM friendships
    WHERE user_id = #{user_id}
    UNION
    SELECT friend_id, user_id
    FROM friendships
    WHERE friend_id = #{user_id}
    ) AS friends
    INNER JOIN users
    ON friends.friend_id = users.id
    “)
    end

    please correct me if I’m wrong :)

  2. art: yes and no :-) on the lower level we want to have directed relations in case we need to know who started the relation etc. but there are other situations when we want to treat the relationships as undirected (looking for a path between two users). that’s why I defined two versions of the relations.

    as of your version of finding friends, I think it’s equivalent to my version. the difference is that you don’t use a view and instead do the union-trick inline in your query.

    PS. thanks for first comment on my blog :)

  3. [...] data makes me crazy I went on searching for any helpful information and found szeryf’s article. He uses some custom SQL to avoid [...]

  4. Hey,
    Is it possible :
    – to store an other field ?

    For example :
    create_table :relationships, :id => false do |t|
    t.column “user_id”, :integer, :null => false
    t.column “buddy_id”, :integer, :null => false
    t.column “feelling”, : string
    end

    – and to give the new paramter in :
    @v.linked_to(feeling) << @w

    Thx

  5. Yes, you can add additional fields to the relations table, but you cannot access them like this. The linked_to collection is a collection of User object and they don’t have feeling attribute. You have to use relation_to collection. Something like this:

    @v.relation_to << @w, :feeling => ‘good’

  6. Hey, you’r inner join doesn’t work for the ‘linked’ case, it just selects the ‘linked_from’.
    art’s solution should work.

  7. actually, it should be

    select * from users where id in (select user_id from relationships where buddy_id = #{n} union select buddy_id from relationships where user_id = #{4});

  8. great article, BTW, it really helped me a lot to understand rails relationships and self relationships and stuff. It worked great for a project in university.

    thanks !

  9. i didn’t use join, so I’m not sure which query you are referring to. your query returns the same as my query on view. nonetheless, I’m glad this helped you :)

  10. Thanks a lot for your post

  11. Thanks a lot for your post, it helped me a lot !

Leave a Reply