A Regression Modeling View of Neighborhood-Based Methods
$$\hat{r}_{ut}=\dfrac{\sum_{j\in Q_t(u)}AdjustedCosine(j,t)\cdot r_{uj}}{\sum_{j \in Q_t(u)}|AdjustedCosine(j,t)|}$$

User-Based Nearest Neighbor Regression
The predicted rating is a weighted linear combination of other ratings of the same item. If $P_u(j)$ contains all ratings of item $j$, this combination becomes similar to a linear regression. The difference is that linear regression finds its coefficients by solving an optimization problem, whereas the recommender system sets them heuristically from the user-user similarities.
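The heuristic weighting described above can be sketched in a few lines. This is only a minimal illustration: the function name, the tuple layout, and the sample numbers are assumptions made for this example. It uses the usual mean-centered form, where each peer's deviation from their own mean is weighted by the similarity.

```python
def predict_user_based(target_mean, peers):
    """Similarity-weighted prediction for one (user, item) pair.

    peers: list of (similarity, peer_rating, peer_mean) tuples, one per
    user in P_u(j). The layout is a hypothetical choice for this sketch.
    """
    numerator = sum(sim * (rating - mean) for sim, rating, mean in peers)
    denominator = sum(abs(sim) for sim, _, _ in peers)
    if denominator == 0:
        # No usable peers: fall back to the target user's mean rating.
        return target_mean
    return target_mean + numerator / denominator


# Two made-up peers: one rated the item above their mean, one below.
peers = [(0.9, 5.0, 4.0), (0.5, 2.0, 3.0)]
prediction = predict_user_based(3.5, peers)
```

The coefficients here are fixed by the similarities alone; nothing is fitted to the observed ratings of the target user.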
$$\hat{r}_{uj}=\mu_u+\dfrac{\sum_{v\in P_u(j)}Sim(u,v)\cdot s_{vj}}{\sum_{v \in P_u(j)}|Sim(u,v)|}, \;\; s_{vj}=r_{vj}-\mu_v$$

Replacing the heuristic similarity weights with unknown coefficients $w^{user}_{vu}$ turns this expression into the following one.
$$\hat{r}_{uj}=\mu_u+\sum_{v \in P_u(j)} w^{user}_{vu} \cdot (r_{vj}-\mu_v)$$

For each user $u$, the coefficients are found by minimizing the squared error over the set $I_u$ of items rated by $u$:

$$\min J_u=\sum_{j\in I_u}(r_{uj}-\hat{r}_{uj})^2=\sum_{j\in I_u}\left(r_{uj}-\left[\mu_u+\sum_{v \in P_u(j)} w^{user}_{vu}\cdot (r_{vj}-\mu_v)\right]\right)^2$$

Summing over all $m$ users gives the overall objective:

$$\min\sum^m_{u=1}J_u=\sum^m_{u=1}\sum_{j\in I_u}\left(r_{uj}-\left[\mu_u+\sum_{v \in P_u(j)} w^{user}_{vu}\cdot (r_{vj}-\mu_v)\right]\right)^2$$

To reduce model complexity, a regularization term such as $\lambda \sum_{j \in I_u} \sum_{v \in P_u(j)} (w^{user}_{vu})^2$ can be added, as in ordinary regression. Because rating matrices are extremely sparse, $P_u(j)$ can differ vastly across items $j$ for the same user $u$. Consider a target user $u$ for whom only one similar user has rated the movie Nero, whereas four similar users have rated Gladiator. The regression coefficients $w^{user}_{vu}$ are then heavily influenced by the ratings for Gladiator, because more peer ratings are available for it. This leads to overfitting, so the prediction needs to be scaled according to the size of $P_u(j)$.
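To make the optimization view concrete, here is a minimal sketch that fits the coefficients of a single target user by plain gradient descent on the regularized squared error. The helper names, the dictionary layout, the step size, and the tiny rating table are all assumptions made for illustration; the text does not prescribe a particular solver.

```python
def fit_user_weights(ratings, target, peers, lam=0.1, lr=0.01, epochs=500):
    """Learn w[v] for one target user by gradient descent (illustrative).

    ratings: {user: {item: rating}}; peers: candidate neighbours of target.
    """
    mean = {u: sum(r.values()) / len(r) for u, r in ratings.items()}
    w = {v: 0.0 for v in peers}
    for _ in range(epochs):
        grad = {v: 0.0 for v in peers}
        for j, r_uj in ratings[target].items():           # j ranges over I_u
            p_uj = [v for v in peers if j in ratings[v]]  # P_u(j)
            pred = mean[target] + sum(
                w[v] * (ratings[v][j] - mean[v]) for v in p_uj)
            err = r_uj - pred
            for v in p_uj:
                # Gradient of err^2 + lam * w_v^2 for this (j, v) pair.
                grad[v] += (-2.0 * err * (ratings[v][j] - mean[v])
                            + 2.0 * lam * w[v])
        for v in peers:
            w[v] -= lr * grad[v]
    return w, mean


def squared_error(ratings, target, peers, w, mean):
    """J_u: squared prediction error summed over the target's rated items."""
    total = 0.0
    for j, r_uj in ratings[target].items():
        p_uj = [v for v in peers if j in ratings[v]]
        pred = mean[target] + sum(
            w[v] * (ratings[v][j] - mean[v]) for v in p_uj)
        total += (r_uj - pred) ** 2
    return total


# Made-up ratings: v2 has not rated item "c", so P_u("c") contains only v1.
ratings = {
    "u": {"a": 5, "b": 1, "c": 4},
    "v1": {"a": 4, "b": 2, "c": 5},
    "v2": {"a": 1, "b": 5},
}
peers = ["v1", "v2"]
w, mean = fit_user_weights(ratings, "u", peers)
loss_before = squared_error(ratings, "u", peers, {v: 0.0 for v in peers}, mean)
loss_after = squared_error(ratings, "u", peers, w, mean)
```

Comparing `loss_before` (all coefficients zero) with `loss_after` shows how much of the target user's deviation from the mean the fitted coefficients explain. Note that a peer who rated more of the target's items contributes to more terms of the sum, which is the imbalance the scaling step addresses.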
$$\hat{r}_{uj}\cdot \dfrac{|P_u(j)|}{k}=\mu_u+\sum_{v \in P_u(j)} w^{user}_{vu} \cdot (r_{vj}-\mu_v)$$

Here $k$ denotes the neighborhood size, so the right-hand side predicts only a fraction $\frac{|P_u(j)|}{k}$ of the rating of target user $u$ for item $j$.
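As a worked illustration of this scaling, take the Nero and Gladiator scenario and assume a neighborhood size of $k=4$ (a value chosen here purely for illustration; the peer label $v_1$ is likewise hypothetical):

$$
\begin{aligned}
\text{Nero: } |P_u(j)|=1 &\;\Rightarrow\; \hat{r}_{uj}\cdot\tfrac{1}{4}=\mu_u+w^{user}_{v_1u}\cdot(r_{v_1j}-\mu_{v_1})\\
\text{Gladiator: } |P_u(j)|=4 &\;\Rightarrow\; \hat{r}_{uj}\cdot\tfrac{4}{4}=\mu_u+\sum_{v\in P_u(j)}w^{user}_{vu}\cdot(r_{vj}-\mu_v)
\end{aligned}
$$

With a single peer rating, the right-hand side is asked to account for only a quarter of the rating; with all four peers present, it accounts for the full rating.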
$$\hat{r}_{uj}=b^{user}_u+\dfrac{\sum_{v \in P_u(j)} w^{user}_{vu} \cdot (r_{vj} - b^{user}_v)}{\sqrt{|P_u(j)|}}$$

In this form the scaling is applied as a division by $\sqrt{|P_u(j)|}$, and the mean ratings $\mu_u$ and $\mu_v$ are replaced by the bias variables $b^{user}_u$ and $b^{user}_v$.
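A minimal sketch of the bias-based prediction with square-root scaling follows. The function name, the dictionary layout, and the numbers are assumptions for illustration; in practice the biases and coefficients would come from the optimization rather than being typed in by hand.

```python
import math


def predict_with_user_bias(b_user, weights, ratings_j, target):
    """Prediction with user biases and sqrt(|P_u(j)|) scaling.

    b_user: {user: bias}; weights: {peer: w_vu};
    ratings_j: {peer: r_vj} for the peers in P_u(j) (hypothetical layout).
    """
    if not ratings_j:
        # No peer rated the item: only the target user's bias remains.
        return b_user[target]
    total = sum(weights[v] * (r - b_user[v]) for v, r in ratings_j.items())
    return b_user[target] + total / math.sqrt(len(ratings_j))


# Made-up values: four peers rated the item, every coefficient is 0.5.
b_user = {"u": 3.5, "v1": 4.0, "v2": 3.0, "v3": 3.0, "v4": 2.0}
weights = {"v1": 0.5, "v2": 0.5, "v3": 0.5, "v4": 0.5}
ratings_j = {"v1": 5.0, "v2": 4.0, "v3": 2.0, "v4": 3.0}
prediction = predict_with_user_bias(b_user, weights, ratings_j, "u")
```

The bias-adjusted deviations are 1, 1, -1 and 1, so the weighted sum is 1.0; dividing by $\sqrt{4}=2$ adds 0.5 to the target user's bias of 3.5.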
Item biases $b^{item}_j$ can be incorporated in the same way:

$$\hat{r}_{uj}=b^{user}_u+b^{item}_j+\dfrac{\sum_{v \in P_u(j)} w^{user}_{vu} \cdot (r_{vj} - b^{user}_v - b^{item}_j)}{\sqrt{|P_u(j)|}}$$

Item-Based Nearest Neighbor Regression
$$\hat{r}_{ut}=\sum_{j \in Q_t(u)}w^{item}_{jt} \cdot r_{uj}$$

$$\min J_t=\sum_{u \in U_t} (r_{ut}-\hat{r}_{ut})^2=\sum_{u \in U_t} \left(r_{ut}-\sum_{j \in Q_t(u)} w^{item}_{jt} \cdot r_{uj}\right)^2$$

$$\min\sum_{t=1}^n\sum_{u \in U_t}\left(r_{ut} -\sum_{j \in Q_t(u)} w^{item}_{jt}\cdot r_{uj}\right)^2$$

$$\hat{r}_{ut}=b^{user}_u+b^{item}_t+\dfrac{\sum_{j \in Q_t(u)}w^{item}_{jt} \cdot (r_{uj}-b^{user}_u-b^{item}_j)}{\sqrt{|Q_t(u)|}}$$

Combined Method
$$\hat{r}_{ut}=b^{user}_{u}+b^{item}_t+\dfrac{\sum_{v \in P_u(t)} w^{user}_{vu} \cdot (r_{vt}-B_{vt})}{\sqrt{|P_u(t)|}}+\dfrac{\sum_{j\in Q_t(u)} w^{item}_{jt} \cdot (r_{uj}-B_{uj})}{\sqrt{|Q_t(u)|}}$$

Here $t$ is the target item in both terms, and $j$ serves only as the summation index over $Q_t(u)$.
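The combined prediction can be sketched as the sum of two scaled neighborhood terms on top of the two biases. This is a minimal illustration under stated assumptions: all names and numbers are hypothetical, and the $B$ values are treated as precomputed constants supplied by the caller, since how they are obtained is not specified here.

```python
import math


def scaled_term(weights, ratings, baseline):
    """Weighted sum of (rating - baseline), divided by sqrt(neighbourhood size)."""
    if not ratings:
        return 0.0
    total = sum(weights[key] * (r - baseline[key]) for key, r in ratings.items())
    return total / math.sqrt(len(ratings))


def predict_combined(b_user_u, b_item_target,
                     w_user, peer_ratings, peer_baseline,
                     w_item, item_ratings, item_baseline):
    """Biases plus the user-based term plus the item-based term.

    peer_ratings: {peer: rating of the target item} for the user neighbourhood.
    item_ratings: {item: target user's rating} for the item neighbourhood.
    """
    user_term = scaled_term(w_user, peer_ratings, peer_baseline)
    item_term = scaled_term(w_item, item_ratings, item_baseline)
    return b_user_u + b_item_target + user_term + item_term


# Made-up inputs: two peers rated the target item, and the target user
# rated one neighbouring item.
prediction = predict_combined(
    b_user_u=3.0, b_item_target=0.5,
    w_user={"v1": 0.4, "v2": 0.2},
    peer_ratings={"v1": 5.0, "v2": 3.0},
    peer_baseline={"v1": 4.0, "v2": 3.5},
    w_item={"j1": 0.5},
    item_ratings={"j1": 4.0},
    item_baseline={"j1": 3.0},
)
```

Setting either weight dictionary and its ratings to empty dictionaries reduces the sketch to the purely item-based or purely user-based form, with the same baseline constants subtracted.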