

Hi,
According to the paper on which MLlib's ALS is based, the model should take all user-item preferences
as an input, including those which are not related to any input observation (zero preference).
My question is:
With all positive observations in hand (similar to an explicit feedback data set), should I generate all negative observations in order to make implicit ALS work with the complete data set (pos union neg)?
Actually, we test on a data set like:
 (user, item, nbPurchase)
nbPurchase is non-zero, so we have no negative observations. What we did was generate all possible user-item pairs with zero nbPurchase so as to have every possible user-item pair, but this operation takes significant time and storage.
I just want to make sure: do we have to do that with MLlib's ALS, or does it already do that? In the latter case, I could simply pass only the positive observations, as with the explicit ALS.
Hao.


The paper definitely does not suggest that you should include every
user-item pair in the input. The input is by nature extremely sparse,
so literally filling in all the 0s would create an
overwhelmingly large input. No, there is no need to do it, and it would
be terrible for performance.
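As a rough back-of-envelope sketch (plain Python, with hypothetical numbers, not taken from your data set):

```python
# Back-of-envelope: size of the full user-item cross product versus a
# typically sparse observation set. All numbers here are hypothetical.
n_users = 1_000_000
n_items = 20_000
n_observed = 10_000_000  # e.g. ~10 interactions per user

all_pairs = n_users * n_items
print(all_pairs)                # 20000000000 -- 20 billion rows
print(all_pairs // n_observed)  # 2000 -- the input would grow 2000x
```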
As far as I can see, the implementation would correctly handle an
input of 0, and the result would be as if it had not been included at
all. So no, that is to say, you do not include implicit 0 input.
That's not quite negative input, either figuratively or literally. Are
you trying to figure out how to include actual negative feedback (i.e.
a signal that a user actively does not like an item)? That you do
include if you like, and the implementation is extended from the
original paper to meaningfully handle negative values.
On Thu, Jun 5, 2014 at 4:46 PM, redocpot < [hidden email]> wrote:
> Hi,
>
> According to the paper on which MLlib's ALS is based, the model should take
> all user-item preferences
> as an input, including those which are not related to any input observation
> (zero preference).
>
> My question is:
>
> With all positive observations in hand (similar to explicit feedback data
> set), should I generate all negative observations in order to make implicit
> ALS work with the complete data set (pos union neg) ?
>
> Actually, we test on a data set like:
>
>  (user, item, nbPurchase)
>
> nbPurchase is non-zero, so we have no negative observations. What we did was
> generate all possible user-item pairs with zero nbPurchase so as to have every
> possible user-item pair, but this operation takes significant time and storage.
>
> I just want to make sure: do we have to do that with MLlib's ALS, or does
> it already do that? In the latter case, I could simply pass only the positive
> observations, as with the explicit ALS.
>
> Hao.
>
>
>
>
> 
> View this message in context: http://apachesparkuserlist.1001560.n3.nabble.com/implicitALSdataSettp7067.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.


Thank you for your quick reply.
As far as I know, the update does not require negative observations, because the update rule

  x_u = (Y^T C^u Y + λI)^(-1) Y^T C^u p(u)

can be simplified by taking advantage of its algebraic structure, so negative observations are not needed. This is what I thought the first time I read the paper.
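For what it's worth, the trick can be sketched in plain NumPy (this is only an illustration of the algebra, not the actual MLlib code; all sizes and values are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, rank, alpha, lam = 100, 5, 1.0, 0.1
Y = rng.normal(size=(n_items, rank))   # item factor matrix
r = np.zeros(n_items)                  # one user's raw feedback, mostly zero
r[[3, 17, 42]] = [2.0, 1.0, 5.0]       # only three observed items

c = 1.0 + alpha * r                    # confidence weights C^u (diagonal)
p = (r > 0).astype(float)              # binary preference vector p(u)

# Naive update: materializes the full n_items x n_items diagonal C^u.
Cu = np.diag(c)
x_naive = np.linalg.solve(Y.T @ Cu @ Y + lam * np.eye(rank), Y.T @ Cu @ p)

# Trick: Y^T C^u Y = Y^T Y + Y^T (C^u - I) Y, and C^u - I is nonzero
# only at the observed items, so the per-user cost scales with the
# number of nonzeros, not with n_items.
obs = np.flatnonzero(r)
YtY = Y.T @ Y                          # shared across all users
Yo = Y[obs]
A = YtY + Yo.T @ np.diag(c[obs] - 1.0) @ Yo + lam * np.eye(rank)
b = Yo.T @ (c[obs] * p[obs])
x_trick = np.linalg.solve(A, b)

print(np.allclose(x_naive, x_trick))   # True -- same solution, no dense C^u
```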
What makes me confused is, after that, the paper (in Discussion section) says
"Unlike explicit datasets, here the model should take all user-item preferences as an input, including those which are not related to any input observation (thus hinting to a zero preference). This is crucial, as the given observations are inherently biased towards a positive preference, and thus do not reflect well the user profile.
However, taking all user-item values as an input to the model raises serious scalability issues – the number of all those pairs tends to significantly exceed the input size since a typical user would provide feedback only on a small fraction of the available items. We address this by exploiting the algebraic structure of the model, leading to an algorithm that scales linearly with the input size while addressing the full scope of user-item pairs without resorting to any subsampling."
If my understanding is right, it seems that we need the negative observations as input, but we don't use them during the update. That seems strange to me, because it would generate far too many user-item pairs, which is not feasible.
Thanks for the confirmation. I will read the ALS implementation for more details.
Hao


On Thu, Jun 5, 2014 at 10:38 PM, redocpot < [hidden email]> wrote:
> can be simplified by taking advantage of its algebraic structure, so
> negative observations are not needed. This is what I think at the first time
> I read the paper.
Correct, a big part of the reason that is efficient is because of
sparsity of the input.
> What makes me confused is, after that, the paper (in Discussion section)
> says
>
> "Unlike explicit datasets, here *the model should take all user-item
> preferences as an input, including those which are not related to any input
It is not saying that these non-observations (I would not call them
negative) should explicitly appear in the input. But their implicit
existence can and should be used in the math.
In particular, the loss function being minimized penalizes
error in the implicit "0" cells of the input too, just with much less
weight.
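As a small illustration of that objective (a NumPy sketch of the weighted loss from the paper, not MLlib code; sizes and values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n_users, n_items, rank, alpha = 4, 6, 2, 40.0

R = np.zeros((n_users, n_items))      # raw feedback matrix, mostly zero
R[0, 1], R[1, 3], R[2, 0] = 3, 1, 7   # three observed user-item pairs

C = 1.0 + alpha * R                   # confidence: unobserved cells still weigh 1
P = (R > 0).astype(float)             # binary preference

X = rng.normal(size=(n_users, rank))  # user factors
Y = rng.normal(size=(n_items, rank))  # item factors

# The objective sums over *every* cell, observed or not; unobserved
# cells contribute too, just with the minimal confidence c = 1.
loss = np.sum(C * (P - X @ Y.T) ** 2)
print(int((R == 0).sum()), "of", R.size, "cells are unobserved yet still in the loss")
```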


Hi,
Recently, I launched an implicit ALS test on a real-world data set.
Initially, we have two data sets: the purchase records of the past 3 years (training set) and the records of the 6 months immediately following those 3 years (test set).
It's a database with 1060080 users and 23880 items.
According to the paper on which MLlib's ALS is based, we use expected percentile rank (EPR) to evaluate the recommendation performance. It shows an EPR of about 8% - 9%, which is considered a good result in the paper.
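For reference, my understanding of the metric, sketched in NumPy (lower is better; about 0.5 is what a random ranking would give):

```python
import numpy as np

def expected_percentile_rank(r_test, scores):
    """EPR = sum(r_ui * rank_ui) / sum(r_ui), where rank_ui is the
    percentile position (0.0 = top of the list, 1.0 = bottom) of item i
    in user u's predicted ranking. Lower is better; ~0.5 is random."""
    n_users, n_items = scores.shape
    order = np.argsort(-scores, axis=1)       # best-scored item first
    rank = np.empty_like(scores)
    pct = np.arange(n_items) / (n_items - 1)  # 0.0 ... 1.0
    for u in range(n_users):
        rank[u, order[u]] = pct
    return np.sum(r_test * rank) / np.sum(r_test)

# Toy check: held-out items are exactly the top-scored ones -> EPR = 0.
scores = np.array([[0.9, 0.1, 0.5],
                   [0.2, 0.8, 0.3]])
r_test = np.array([[1.0, 0.0, 0.0],
                   [0.0, 2.0, 0.0]])
print(expected_percentile_rank(r_test, scores))  # 0.0
```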
We did some sanity checks. For example, each user has his own item list sorted by preference, and we pick the top 10 items for each user. As a result, we found that there were only 169 distinct items among the (1060080 x 10) items picked; most of them are repeated. That means that, given 2 users, the items recommended may well be the same. Nothing is personalized.
It seems that the system is focusing on the bestsellers, something like that. What we want is to recommend as many different items as possible; that would make the recommender system more reasonable.
I am not sure: is this a common case for ALS?
Thanks,
Hao


On Thu, Jun 19, 2014 at 3:03 PM, redocpot < [hidden email]> wrote:
> We did some sanity check. For example, each user has his own item list which
> is sorted by preference, then we just pick the top 10 items for each user.
> As a result, we found that there were only 169 different items among the
> (1060080 x 10) items picked, most of them are repeated. That means, given 2
> users, the items recommended might be the same. Nothing is personalized.
This sounds like severe underfitting: lambda is too high, or the
number of features is too small.


One thing that needs to be mentioned: in fact, the schema is (userId, itemId, nbPurchase), where nbPurchase is equivalent to ratings. I found that there are many one-timers, i.e. pairs with nbPurchase = 1. These pairs account for about 85% of all positive observations.
As the paper says, low ratings get a low confidence weight, so if I understand correctly, these dominant one-timers will be less likely to be recommended than other items whose nbPurchase is bigger.
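That is, with the paper's linear confidence c_ui = 1 + alpha * r_ui and our alpha = 1, a one-timer barely outweighs an unobserved cell:

```python
alpha = 1.0

def confidence(nb_purchase):
    # linear confidence from the implicit-feedback paper: c = 1 + alpha * r
    return 1.0 + alpha * nb_purchase

print(confidence(0))     # 1.0    -- an unobserved pair
print(confidence(1))     # 2.0    -- a one-timer, barely above it
print(confidence(1396))  # 1397.0 -- our heaviest buyer dominates
```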
In fact, lambda was also considered a potential problem; in our case lambda is set to 300, which is confirmed by the test set. Here are the test results:

lambda   EPR_in                 EPR_out
65       0.06518592593142056    0.14789338884259276
100      0.06619274171311466    0.13494609978226865
300      0.08814703345418627    0.09522125434156471

where EPR_in is computed on the training set and EPR_out on the test set. It seems 300 is the right lambda, since it overfits less.
Some other parameters are shown in the following code:
val model = new ALS()
.setImplicitPrefs(implicitPrefs = true)
.setAlpha(1)
.setLambda(300)
.setRank(50)
.setIterations(40)
.setBlocks(8)
.setSeed(42)
.run(ratings_train)
We set alpha to 1, since the max nbPurchase is 1396. Not sure if alpha is already too big.


On Thu, Jun 19, 2014 at 3:44 PM, redocpot < [hidden email]> wrote:
> As the paper says, low ratings get a low confidence weight, so if I
> understand correctly, these dominant one-timers will be *less likely* to
> be recommended than other items whose nbPurchase is bigger.
Correct, yes.
> In fact, lambda is also considered as a potential problem, as in our case,
> the lambda is set to 300, which is confirmed by the test set. Here is test
> result :
Although people use lambda to mean different things in different
places, in every interpretation I've seen, 300 is extremely high :)
Even 1 is very high.
(alpha = 1 is the lowest value I'd try; it also depends on the data,
but sometimes higher values work well. For the data set in the
original paper, they used alpha = 40.)
> where EPR_in is given by training set and EPR_out is given by test set. It
> seems 300 is the right lambda, since less overfitting.
I take your point about your results, though. Can you at least try a
much lower lambda? I'd have to think and speculate about why you might
be observing this effect, but a few more data points could help. It may
be that you've forced the model into basically recommending globally
top items, and that does OK as a local minimum, but personalized
recommendations would do better still, with a very different lambda.
Also, you might consider holding out the most-favored data as the test
data. It biases the test a bit, but at least you are asking whether
the model ranks highly the things that are known to rank highly, rather
than any old thing the user interacted with.


Hi,
The real-world data set is rather large, so I tested on the MovieLens data set instead, and found the same results:
alpha  lambda  rank  top1  top5  EPR_in   EPR_out
40     0.001   50    297   559   0.05855  0.17299
40     0.01    50    295   559   0.05854  0.17298
40     0.1     50    296   560   0.05846  0.17287
40     1       50    309   564   0.05819  0.17227
40     25      50    287   537   0.05699  0.14855
40     50      50    267   496   0.05795  0.13389
40     100     50    247   444   0.06504  0.11920
40     200     50    145   306   0.09558  0.11388
40     300     50    77    178   0.11340  0.12264
To be clear, there are 1650 items in this MovieLens data set. Top1 and Top5 in the table are the number of distinct items appearing at rank 1, respectively within the top 5, of the per-user preference lists after ALS has done its work. Top1, Top5 and EPR_in are based on the training set; only EPR_out is on the test set. In Top1 and Top5, all items are taken into account, no matter whether they were purchased or not.
The table shows that a small lambda (< 1) always leads to overfitting, while a big lambda like 300 removes the overfitting, but then the number of distinct items in the top 1 and top 5 of the preference lists is very small (not personalized).

