administration mode
Pssst...Ferdy is the creator of JungleDragon, an awesome wildlife community. Visit JungleDragon

 

Friends are expensive »

FERDY CHRISTANT - NOV 13, 2009 (02:08:29 PM)

Friends are expensive, in a web application. Allow me to explain. Like many other web applications, JungleDragon allows users to make friends with other users. It is trivial to model this relationship in a database:

The UserFriend table simply maps the relationship between a user and his friend. This makes it very easy to get a list of friends for a specific user. For one particular user, it is also easy to check if it concerns a friend of the current user.

However, there is a case where this model becomes highly ineffective and you are left with a scalability problem. The case concerns the situation where a page shows a large number of users who may or may not be a friend of the user, and you want to indicate whether this is a friend or not. An example of such a case is a large comment thread. 

The problem with a scenario like that is that either you have to trigger a "friend" query for each user you render on the screen, or you could do this all in a single, very expensive join query. The cost of this query is N x M and will put a serious load on your database. Since the database is the hardest thing to scale, I decided to drop this feature, since it is a nice-to-have. Note that I'm not dropping the friends feature, only the friend indicator in lists of users. You can still see who your friends are from your profile.

The scalability issue presented above is quite common, this article explains how Digg has the same issue.Of course their user base is huge, but the issue is the same. They are even looking into exotic non-relational systems just in order to be able to mark a user, story or comment as green, meaning a friend of the current user.

Case two

Whilst on the subject of optimizing the database, I decided to denormalize a small part of the database too. Everywhere you go in JungleDragon, a user's karma, level and class are shown. In the database, this was modeled like so:

Based on a user's karma, the level is calculated. The level is determined by the minimum and maximum karma. For example, if you have 550 karma, it could mean that you are level 3, because level 3 concerns the range of 200 - 600 karma. Based on your level, you are assigned a class. Again this works based on a range. For example, level 3-5 could mean you have the class "Grizzly Bear".

Functionally, this model is nice. I can rearrange the level and class ranges and all users will automatically get assigned the correct level and class. However, displaying a user's level and class occurs multiple times on each page. You always see your own level and class, but also those of your friends, commenters, image uploaders, etc. For each user to display, I need to do a double inner join to calculate the level and class, again putting a large load on the database. Therefore, I decided to denormalize this. The new model looks like this:

 

As you can see, the user table now holds the level and class. The Level and Class tables are not gone, they are used to calculate the latest level and class of the user each time they get assigned new karma. This way, I can directly query the user table for these values, without the need of an expensive query. In the rare event that I change the Level and Class definitions, I run an update script that stamps the corrected level and class on the profile of each user. 

Premature optimization is considered the root of all evil, yet I consider these cases to be best practices. 

Share |

Comments: 4

COMMENT: GARTH THOMAS email

NOV 17, 2009 - 08:18:08 PM

comment » okay so why not cache the user's friend data in their session? if you built a cacheFriend() method and called it on logon and then perhaps every time the user leaves their Update Friends page (I know I'm making this up but stick with me :) Then you don't need to hit the backend every time you hit a page, you simply have your client side code dynamically build the friend status code from the users session. «

COMMENT: FERDY

NOV 18, 2009 - 08:09:57 AM

comment » Garth,

It's a nice suggestion, but some people may have a lot of friends. I guess it depends on how you store your session. Cookies only allow for 3KB of data. Still, it's an interesting suggestion. Perhaps memcached can help too. «

COMMENT: GARTH THOMAS email

DEC 9, 2009 - 06:46:13 PM

comment » yeah, if you are worried about space in the users session you could use caching on the server side (sorry I don't know what you are using as a server so I am making some assumptions). caching the friends keys would allow you to call a javascript object onLoad that would collect a list of the friends user's ids from the page (say stored in an attribute in the form Ferdy) and make an AJAX call off to a servlet that would return a list of friend matches. The callback code could then loop through and change their style to "friend" or whatever to provide the functionality you want.

If you want to you can update the cache on the server side as people change their friends list and the javascript can rerun the AJAX routine say every 10 mins so people can see updates in near realtime without ever hitting the backend. «

COMMENT: GARTH THOMAS email

DEC 9, 2009 - 06:53:19 PM

comment » hahaha, I should read the content rules before submitting comments! "stored in an attribute in the form Ferdy" should read (angled brackets replaced with square brackets): [span style="user" userid="1234"]Ferdy[/span]

i guess you could go further here and then add javascript events for creating online chat and presence awareness also «

RATE THIS CONTENT (OPTIONAL)
Was this document useful to you?
 
rating Awesome
rating Good
rating Average
rating Poor
rating Useless
CREATE A NEW COMMENT
required field
required field HTML is not allowed. Hyperlinks will automatically be converted.