A recommendation system analyzes your preferences and automatically suggests similar items or products that you may be interested in.
For example, Netflix recommends movies, Amazon recommends products, and so on.
In this tutorial, I will talk about a food item recommendation system built with the collaborative approach, using an example.
The collaborative approach uses the idea of collaboration between users' preferences: it measures the similarity between the preferences of different users, and these measurements are used as the recommendation criteria.
This can be illustrated with the following figure:
Let's consider 5 items and 4 users with the following ratings:
| Items        | Ram | Shyam | Hari | Gopal |
| MOMO         | 5   | 0     | 0    | 0     |
| PIZZA        | 5   | 0     | ?    | 0     |
| BIRIYANI     | ?   | 2     | 5    | ?     |
| NOODLES      | 0   | 4     | 0    | 4     |
| CHICKEN ROLL | 0   | 5     | 0    | ?     |
Intuitively, we can see that users "Ram" and "Shyam" are more dissimilar than the other pairs of users. So we need an algorithm that measures the similarity between users, so that this similarity can be used as a criterion for recommendation. Let's say ram = (5, 5, ?, 0, 0) is the rating vector for user Ram, shyam = (0, 0, 2, 4, 5) is the one for Shyam, and so on.
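Similarity measure by Jaccard similarity: the Jaccard similarity between vector A and vector B, each treated as the set of items it rates, is defined as:
sim(A, B) = |A intersection B| / |A union B|
For example, Ram has rated 4 items and Shyam has rated all 5, so they have 4 rated items in common out of 5 items rated by at least one of them. The Jaccard similarity between Ram and Shyam is therefore 4/5, i.e. common items divided by total items rated, with each item counted once. (The Jaccard distance is 1 - sim(A, B).)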
The problem with this approach is that it ignores the value of a rating and only considers whether the rating is present or not.
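To make the set-based measure concrete, here is a minimal Python sketch. The names ITEMS, RATINGS, rated_items and jaccard are my own for illustration, and I am assuming that "?" (stored as None) means "not rated" while 0 counts as a real rating. Each item is counted once in the union.

```python
# Ratings from the table; None stands for an unknown rating ("?").
# Assumption: a 0 is a real rating, only "?" means "not rated".
ITEMS = ["Momo", "Pizza", "Biriyani", "Noodles", "Chicken Roll"]
RATINGS = {
    "Ram":   [5, 5, None, 0, 0],
    "Shyam": [0, 0, 2, 4, 5],
    "Hari":  [0, None, 5, 0, 0],
    "Gopal": [0, 0, None, 4, None],
}


def rated_items(user):
    """Return the set of item names this user has rated."""
    return {item for item, r in zip(ITEMS, RATINGS[user]) if r is not None}


def jaccard(user_a, user_b):
    """Jaccard similarity between the sets of items two users have rated."""
    a, b = rated_items(user_a), rated_items(user_b)
    union = a | b
    if not union:
        return 0.0  # neither user has rated anything
    return len(a & b) / len(union)


# Ram and Shyam share 4 rated items out of 5 rated by either of them.
print(jaccard("Ram", "Shyam"))
```

Notice that the rating values never enter the computation: only the presence of a rating matters, which is exactly the weakness described above.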
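The next option is cosine similarity. It is defined as
cos(A, B) = (A . B) / (|A| x |B|)
where A . B is the dot product of the two rating vectors and |A| is the length (Euclidean norm) of A.
Now the cosine similarity between Ram and Shyam is:
Ram = (5, 5, ?, 0, 0) = (5, 5, 0, 0, 0), where we fill the unknown rating with zero.
Shyam = (0, 0, 2, 4, 5)
Now sim(Ram, Shyam) = (5x0 + 5x0 + 0x2 + 0x4 + 0x5) / (sqrt(5x5 + 5x5) x sqrt(2x2 + 4x4 + 5x5)) = 0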
i.e. their rating vectors are orthogonal: the two users have no liked item in common. Because it uses the rating values, cosine similarity gives a better estimate of similarity than Jaccard. There are many improvements on cosine similarity, such as centered cosine or Pearson correlation, and so on.
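The cosine computation for Ram and Shyam can be sketched in Python as follows. The helper names fill_unknown, center and cosine are my own. The centered variant is only one common way to do it, assuming we subtract each user's mean over the items they actually rated and leave unknown ratings at zero.

```python
import math


def fill_unknown(ratings):
    """Replace unknown ratings (None) with zero, as in the worked example."""
    return [0.0 if r is None else float(r) for r in ratings]


def center(ratings):
    """Subtract the user's mean rating from every known rating.

    Unknown ratings become 0, i.e. "average" for that user (an assumption).
    """
    known = [r for r in ratings if r is not None]
    mean = sum(known) / len(known) if known else 0.0
    return [0.0 if r is None else r - mean for r in ratings]


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity is undefined for an all-zero vector
    return dot / (norm_a * norm_b)


ram = [5, 5, None, 0, 0]
shyam = [0, 0, 2, 4, 5]

print(cosine(fill_unknown(ram), fill_unknown(shyam)))  # 0.0
print(cosine(center(ram), center(shyam)))              # negative, close to -1
```

With plain zero filling the similarity is 0, while the centered version comes out strongly negative, which matches the intuition that Ram and Shyam have opposite tastes.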
Rating Prediction
Suppose we want to predict the rating of user x for item i. We select the N users most similar to x who have also rated item i, and take the average of their ratings of item i as the predicted rating of item i by user x. This is a very simple approach. The other approach is to take a weighted average, in which each of the N ratings is weighted by that user's similarity to x.
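Here is a small Python sketch of this user-based prediction on the ratings table. The function name predict and its parameters n and weighted are hypothetical, and I am assuming cosine similarity with unknown ratings filled by zero as the similarity measure.

```python
import math

ITEMS = ["Momo", "Pizza", "Biriyani", "Noodles", "Chicken Roll"]
# One rating vector per user; None stands for an unknown rating ("?").
RATINGS = {
    "Ram":   [5, 5, None, 0, 0],
    "Shyam": [0, 0, 2, 4, 5],
    "Hari":  [0, None, 5, 0, 0],
    "Gopal": [0, 0, None, 4, None],
}


def cosine(a, b):
    """Cosine similarity, with unknown ratings (None) filled by zero."""
    a = [0.0 if x is None else float(x) for x in a]
    b = [0.0 if y is None else float(y) for y in b]
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def predict(user, item, n=2, weighted=False):
    """Predict user's rating of item from the n most similar users who rated it."""
    i = ITEMS.index(item)
    # (similarity to user, that neighbour's rating of the item)
    neighbours = [
        (cosine(RATINGS[user], RATINGS[other]), RATINGS[other][i])
        for other in RATINGS
        if other != user and RATINGS[other][i] is not None
    ]
    top = sorted(neighbours, key=lambda pair: pair[0], reverse=True)[:n]
    if not top:
        return None  # nobody else has rated this item
    if not weighted:
        return sum(r for _, r in top) / len(top)
    total = sum(s for s, _ in top)
    if total == 0:
        return None  # every neighbour has zero similarity
    return sum(s * r for s, r in top) / total


print(predict("Gopal", "Biriyani"))                 # plain average of 2 and 5
print(predict("Gopal", "Biriyani", weighted=True))  # dominated by Shyam's rating
```

For Gopal and Biriyani, only Shyam and Hari have rated the item. The plain average treats them equally, while the weighted average follows Shyam, because Hari's similarity to Gopal is zero.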
The approach explained above is user-based collaborative filtering. Another version of collaborative filtering is the item-based approach, which is very similar to the user-based one: we find the similarity between items, and then predict the rating of item i by user x from the ratings x has given to the items most similar to i.
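The item-based version can be sketched the same way by working with the rows of the table instead of the columns. Again this is only an illustration: predict_item_based and its parameter n are hypothetical names, and I am assuming zero-filled cosine similarity and a similarity-weighted average.

```python
import math

USERS = ["Ram", "Shyam", "Hari", "Gopal"]
# One rating vector per item (a row of the table); None stands for "?".
ITEM_RATINGS = {
    "Momo":         [5, 0, 0, 0],
    "Pizza":        [5, 0, None, 0],
    "Biriyani":     [None, 2, 5, None],
    "Noodles":      [0, 4, 0, 4],
    "Chicken Roll": [0, 5, 0, None],
}


def cosine(a, b):
    """Cosine similarity, with unknown ratings (None) filled by zero."""
    a = [0.0 if x is None else float(x) for x in a]
    b = [0.0 if y is None else float(y) for y in b]
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def predict_item_based(user, item, n=2):
    """Predict user's rating of item from the n most similar items they rated."""
    u = USERS.index(user)
    # (similarity to the target item, the user's rating of the other item)
    candidates = [
        (cosine(ITEM_RATINGS[item], ratings), ratings[u])
        for other, ratings in ITEM_RATINGS.items()
        if other != item and ratings[u] is not None
    ]
    top = sorted(candidates, key=lambda pair: pair[0], reverse=True)[:n]
    total = sum(s for s, _ in top)
    if not top or total == 0:
        return None  # no usable similar item
    return sum(s * r for s, r in top) / total


print(predict_item_based("Gopal", "Chicken Roll"))  # 4.0
```

Here Noodles is the only item Gopal has rated that is similar to Chicken Roll, so the prediction follows his rating of Noodles.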
