Hi,
According to the paper on which MLlib's ALS is based, the model should take all user-item preferences as an input, including those which are not related to any input observation (zero preference).

My question is: with all positive observations in hand (similar to an explicit feedback data set), should I generate all negative observations in order to make implicit ALS work with the complete data set (pos union neg)?

Actually, we test on a data set like:

| user | item | nbPurchase |

nbPurchase is non-zero, so we have no negative observations. What we did was generate all possible user-item pairs with zero nbPurchase, but this operation takes time and storage.

I just want to make sure: do we have to do that with MLlib's ALS, or does it already do that? In that case, I could simply pass only the positive observations, as with explicit ALS.

Hao
The paper definitely does not suggest that you should include every user-item pair in the input. The input is by nature extremely sparse, so literally filling in all the 0s would create an overwhelmingly large input. No, there is no need to do it, and it would be terrible for performance.

As far as I can see, the implementation would correctly handle an input of 0, and the result would be as if it had not been included at all. But no, you do not include implicit 0 input.

That's not quite negative input, either, figuratively or literally. Are you trying to figure out how to include actual negative feedback (i.e. a signal that a user actively does not like an item)? That you can include if you like, and the implementation is extended from the original paper to meaningfully handle negative values.

On Thu, Jun 5, 2014 at 4:46 PM, redocpot <[hidden email]> wrote:
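In other words, only the observed (positive) pairs need to be passed in. A minimal sketch of that, assuming an input RDD named `purchases` of (user, item, nbPurchase) triples as in the schema described above (the hyperparameter values are placeholders to tune, not recommendations):

```scala
import org.apache.spark.mllib.recommendation.{ALS, Rating}

// Only the observed, non-zero purchase counts go in; the implicit-feedback
// formulation accounts for the unobserved zero-preference pairs internally.
val ratings = purchases.map { case (user, item, nbPurchase) =>
  Rating(user, item, nbPurchase.toDouble)
}

// trainImplicit(ratings, rank, iterations, lambda, alpha)
val model = ALS.trainImplicit(ratings, 50, 10, 0.01, 1.0)
```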
Thank you for your quick reply.
As far as I know, the update does not require negative observations, because the update rule

  Xu = (Yt Cu Y + lambda*I)^-1 Yt Cu p(u)

can be simplified by taking advantage of its algebraic structure, so negative observations are not needed. This is what I thought the first time I read the paper.

What confuses me is that, after that, the paper (in the Discussion section) says:

"Unlike explicit datasets, here the model should take all user-item preferences as an input, including those which are not related to any input observation (thus hinting to a zero preference). This is crucial, as the given observations are inherently biased towards a positive preference, and thus do not reflect well the user profile. However, taking all user-item values as an input to the model raises serious scalability issues – the number of all those pairs tends to significantly exceed the input size since a typical user would provide feedback only on a small fraction of the available items. We address this by exploiting the algebraic structure of the model, leading to an algorithm that scales linearly with the input size while addressing the full scope of user-item pairs without resorting to any sub-sampling."

If my understanding is right, it seems that we need the negative observations as input, but we don't use them during the update. That is strange to me, because it would generate far too many user-item pairs, which is not feasible.

Thanks for the confirmation. I will read the ALS implementation for more details.

Hao
On Thu, Jun 5, 2014 at 10:38 PM, redocpot <[hidden email]> wrote:
> can be simplified by taking advantage of its algebraic structure, so
> negative observations are not needed. This is what I think at the first time
> I read the paper.

Correct, and a big part of the reason that is efficient is the sparsity of the input.

> What makes me confused is, after that, the paper (in Discussion section)
> says
>
> "Unlike explicit datasets, here the model should take all user-item
> preferences as an input, including those which are not related to any input

It is not saying that these non-observations (I would not call them negative) should explicitly appear in the input, but their implicit existence can and should be used in the math. In particular, the loss function being minimized also penalizes error in the implicit "0" cells of the input, just with much less weight.
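Concretely, the cost function and the factorization trick behind this point, in the notation of the Hu, Koren & Volinsky paper ("Collaborative Filtering for Implicit Feedback Datasets"), are:

```latex
% Loss over ALL user-item pairs, with confidence c_{ui} = 1 + \alpha r_{ui}
% and binary preference p_{ui}:
\min_{x_*, y_*} \sum_{u,i} c_{ui} \left( p_{ui} - x_u^T y_i \right)^2
  + \lambda \Big( \sum_u \|x_u\|^2 + \sum_i \|y_i\|^2 \Big)

% The user-factor update appears to need the full diagonal C^u:
x_u = \left( Y^T C^u Y + \lambda I \right)^{-1} Y^T C^u \, p(u)

% But C^u - I is zero for every unobserved pair (where c_{ui} = 1), so only
% the observed entries contribute to the second term:
Y^T C^u Y = Y^T Y + Y^T (C^u - I) Y
```

This is why the algorithm "addresses the full scope of user-item pairs" while its cost scales only with the number of observed entries.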
Hi,
Recently, I launched an implicit ALS test on a real-world data set. Initially, we have 2 data sets: one is the purchase record over the past 3 years (training set), and the other covers the 6 months just after those 3 years (test set). It's a database with 1060080 users and 23880 items.

According to the paper on which MLlib's ALS is based, we use expected percentile rank (EPR) to evaluate the recommendation performance. It shows an EPR of about 8% - 9%, which is considered a good result in the paper.

We did some sanity checks. For example, each user has his own item list sorted by preference, and we just pick the top 10 items for each user. As a result, we found that there were only 169 different items among the (1060080 x 10) items picked; most of them are repeated. That means, given 2 users, the items recommended might be the same. Nothing is personalized. It seems that the system is focusing on best-sellers, something like that.

What we want is to recommend as many different items as possible, which makes the recommender more reasonable. I am not sure if this is a common case for ALS?

Thanks,

Hao
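For reference, the expected percentile rank from the paper is (lower is better; around 50% corresponds to random recommendations):

```latex
% rank_{ui} is the percentile rank of item i in user u's recommendation list
% (0% = top of the list); r^t_{ui} is the observed value in the test period.
\overline{\mathrm{rank}} =
  \frac{\sum_{u,i} r^t_{ui} \, \mathrm{rank}_{ui}}{\sum_{u,i} r^t_{ui}}
```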
On Thu, Jun 19, 2014 at 3:03 PM, redocpot <[hidden email]> wrote:
> We did some sanity check. For example, each user has his own item list which
> is sorted by preference, then we just pick the top 10 items for each user.
> As a result, we found that there were only 169 different items among the
> (1060080 x 10) items picked, most of them are repeated. That means, given 2
> users, the items recommended might be the same. Nothing is personalized.

This sounds like severe underfitting -- lambda is too high or the number of features is too small.
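The distinct-items sanity check described above can be sketched directly against the learned factor matrices. This is a hypothetical helper, not part of MLlib; `model` is a trained `MatrixFactorizationModel`, and the brute-force scoring of every item per user is expensive at this scale:

```scala
import org.apache.spark.mllib.recommendation.MatrixFactorizationModel

def distinctTopItems(model: MatrixFactorizationModel, n: Int = 10): Long = {
  // Item factors fit on the driver here (~24k items x rank doubles).
  val itemFactors = model.productFeatures.collect()
  model.userFeatures.flatMap { case (_, userVec) =>
    itemFactors
      .map { case (item, itemVec) =>
        // Predicted preference is the dot product of the factor vectors.
        (item, (userVec, itemVec).zipped.map(_ * _).sum)
      }
      .sortBy(-_._2)  // highest predicted preference first
      .take(n)
      .map(_._1)
  }.distinct().count()  // a tiny count (e.g. 169) suggests underfitting
}
```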
One thing that needs to be mentioned is that the schema is in fact (userId, itemId, nbPurchase), where nbPurchase is equivalent to a rating. I found that there are many one-timers, i.e. pairs with nbPurchase = 1. These pairs are about 85% of all positive observations.
As the paper says, low ratings get a low confidence weight, so if I understand correctly, these dominant one-timers will be less likely to be recommended compared to other items whose nbPurchase is bigger.

In fact, lambda is also considered a potential problem: in our case, lambda is set to 300, which is confirmed by the test set. Here are the test results:

lambda = 65    EPR_in = 0.06518592593142056    EPR_out = 0.14789338884259276
lambda = 100   EPR_in = 0.06619274171311466    EPR_out = 0.13494609978226865
lambda = 300   EPR_in = 0.08814703345418627    EPR_out = 0.09522125434156471

where EPR_in is computed on the training set and EPR_out on the test set. It seems 300 is the right lambda, since there is less overfitting.

The other parameters are shown in the following code:

val model = new ALS()
  .setImplicitPrefs(implicitPrefs = true)
  .setAlpha(1)
  .setLambda(300)
  .setRank(50)
  .setIterations(40)
  .setBlocks(8)
  .setSeed(42)
  .run(ratings_train)

We set alpha to 1, since the max nbPurchase is 1396. Not sure if alpha is already too big.
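With a long-tailed count like nbPurchase (max 1396), it is worth noting that the paper itself proposes, alongside the linear confidence, a log-scaled variant that damps extreme counts:

```latex
% Linear confidence from the paper:
c_{ui} = 1 + \alpha \, r_{ui}

% Log-scaled alternative, also from the paper (\epsilon is a tuning
% constant); this keeps one heavy buyer from dominating the loss:
c_{ui} = 1 + \alpha \log\!\left( 1 + r_{ui} / \epsilon \right)
```

MLlib's implicit ALS applies the linear form, so the log-scaled variant would have to be approximated by transforming nbPurchase before training.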
On Thu, Jun 19, 2014 at 3:44 PM, redocpot <[hidden email]> wrote:
> As the paper said, the low ratings will get a low confidence weight, so if I
> understand correctly, these dominant one-timers will be more *unlikely* to
> be recommended comparing to other items whose nbPurchase is bigger.

Correct, yes.

> In fact, lambda is also considered as a potential problem, as in our case,
> the lambda is set to 300, which is confirmed by the test set.

Although people use lambda to mean different things in different places, in every interpretation I've seen, 300 is extremely high :) 1 is very high, even. (alpha = 1 is the lowest value I'd try; it also depends on the data, but sometimes higher values work well. For the data set in the original paper, they used alpha = 40.)

> where EPR_in is given by training set and EPR_out is given by test set. It
> seems 300 is the right lambda, since less overfitting.

I take your point about your results, though, hm. Can you at least try a much lower lambda? I'd have to think and speculate about why you might be observing this effect, but a few more data points could help. It may be that you've forced the model into basically recommending globally top items, and that does OK as a local minimum, but personalized recommendations would be better still with a very different lambda.

Also, you might consider holding out the most-favored data as the test data. It biases the test a bit, but at least you are asking whether the model ranks highly the things that are known to rank highly, rather than any old thing the user interacted with.
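A sweep over much lower lambda values, as suggested, might look like the following sketch. `ratings_train` comes from the setup above, while `ratings_test` and `evaluateEPR` (a helper computing expected percentile rank) are hypothetical names, not MLlib API:

```scala
import org.apache.spark.mllib.recommendation.ALS

// Try lambdas spanning several orders of magnitude below 300.
for (lambda <- Seq(0.01, 0.1, 1.0, 10.0, 65.0, 300.0)) {
  val model = new ALS()
    .setImplicitPrefs(true)
    .setAlpha(1)
    .setLambda(lambda)
    .setRank(50)
    .setIterations(40)
    .setSeed(42)
    .run(ratings_train)
  // evaluateEPR is an assumed helper, defined elsewhere.
  println(s"lambda=$lambda EPR_in=${evaluateEPR(model, ratings_train)} " +
          s"EPR_out=${evaluateEPR(model, ratings_test)}")
}
```

Comparing EPR_in, EPR_out, and top-N diversity across this range would show whether the high-lambda model is just a "global top items" local minimum.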
Hi,
The real-world data set is a bit large, so I tested on the MovieLens data set and found the same results:
To be clear, there are 1650 items in this MovieLens data set. Top 1 and Top 5 in the table mean the number of distinct items at rank 1 and within the top 5 of each user's preference list after ALS does its work. Top 1, Top 5, and EPR_in are based on the training set; only EPR_out is on the test set. For Top 1 and Top 5, all items are taken into account, whether purchased or not.

The table shows that a small lambda (< 1) always leads to overfitting, while a big lambda like 300 removes the overfitting, but the number of distinct items in the top 1 and top 5 of the preference lists is very small (not personalized).