ALS block settings




I am trying to figure out how to find optimal values for the ALS userBlocks and itemBlocks parameters.
I am running into out-of-memory issues fitting the ALS model to a matrix with about 100 million users and 300 items. It sounds like these block parameters should help, but I am unable to find documentation on how the values should be adjusted.
In my case, should we be using more blocks? How many more? Is there a recommended ratio of blocks to users/items?
Does anyone have recommendations on best practices for these two settings?
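
To put rough numbers on the question, here is a back-of-the-envelope sketch of how the block count divides up the user factors. It is plain arithmetic, not Spark code and not a tuning recommendation. The rank of 10 and the 4 bytes per factor value are assumptions (the post states neither), and `per_block_estimate` is a hypothetical helper written for this illustration. In PySpark's DataFrame-based API, the corresponding settings are the `numUserBlocks` and `numItemBlocks` parameters of `pyspark.ml.recommendation.ALS`.

```python
import math


def per_block_estimate(num_ids, num_blocks, rank, bytes_per_value=4):
    """Rough size of one block of latent factors.

    Assumes IDs are spread evenly across blocks and that each factor
    value takes `bytes_per_value` bytes. Ignores JVM object overhead,
    shuffle buffers, and the rating data itself, so real memory use
    will be higher than this figure.
    """
    ids_per_block = math.ceil(num_ids / num_blocks)
    factor_bytes = ids_per_block * rank * bytes_per_value
    return ids_per_block, factor_bytes


NUM_USERS = 100_000_000  # from the post
RANK = 10                # assumed; not stated in the post

# Compare a few candidate user-block counts.
for blocks in (10, 100, 1000):
    users, nbytes = per_block_estimate(NUM_USERS, blocks, RANK)
    print(f"{blocks:>5} user blocks: {users:,} users per block, "
          f"~{nbytes / 1e6:.0f} MB of factors per block")
```

With only 300 items, the item side is tiny by comparison, so under these assumptions the user block count is the setting that changes how much factor data each task has to hold at once.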

Re: ALS block settings

I have the same question. I'm trying to figure out how to get ALS to complete
with a larger dataset. It seems to get stuck on "Count" from what I can tell.
I'm running 8 r4.4xlarge instances on Amazon EMR. The dataset is 80 GB (just
to give some idea of size). I assumed Spark could handle this, but maybe I
need to try some different settings like userBlock or itemBlock. Any help
