ALS block settings

hagow
Hi,

I am trying to figure out how to find optimal values for the ALS userBlocks and itemBlocks parameters.
 
I am running into out-of-memory errors when fitting the ALS model to a matrix with about 100 million users and 300 items. It sounds like these block parameters should help, but I am unable to find documentation on how their values should be adjusted.
 
For example, in my case should we be using more blocks? How many more? Is there a recommended ratio of blocks to users/items?
 
Does anyone have recommendations on best practices for these two settings?
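
To make the question concrete, here is a rough sketch of the arithmetic involved, not a tuning recommendation. It assumes the spark.ml ALS estimator, where these settings are called numUserBlocks and numItemBlocks and both default to 10; the helper name avg_ids_per_block and the candidate block counts 100 and 1000 are made up for illustration. It only shows how many user or item ids land in each block on average, which is the quantity the block settings control.

```python
import math


def avg_ids_per_block(num_ids: int, num_blocks: int) -> int:
    """Average number of user (or item) ids per block, rounded up.

    Hypothetical helper for illustration only; it is not part of Spark.
    """
    if num_blocks <= 0:
        raise ValueError("num_blocks must be positive")
    return math.ceil(num_ids / num_blocks)


NUM_USERS = 100_000_000  # figure quoted in the post
NUM_ITEMS = 300          # figure quoted in the post

# 10 is the spark.ml default; 100 and 1000 are arbitrary candidates.
for blocks in (10, 100, 1000):
    print(
        f"blocks={blocks:>5}  "
        f"users/block={avg_ids_per_block(NUM_USERS, blocks):>10,}  "
        f"items/block={avg_ids_per_block(NUM_ITEMS, blocks):>3}"
    )

# With PySpark installed, the chosen values would be passed like this
# (column names are assumptions about the input DataFrame):
#   from pyspark.ml.recommendation import ALS
#   als = ALS(userCol="user", itemCol="item", ratingCol="rating",
#             numUserBlocks=1000, numItemBlocks=10)
```

With only 300 items, going much beyond a few hundred item blocks leaves most of them empty, so the user side is the one where a larger block count changes the per-block size meaningfully. Whether that resolves the out-of-memory errors would still need to be checked on the actual cluster.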
 
Re: ALS block settings

evanzamir
I have the same question. I am trying to figure out how to get ALS to complete
with a larger dataset. From what I can tell, it seems to get stuck on "Count".
I'm running 8 r4.4xlarge instances on Amazon EMR. The dataset is 80 GB (just
to give some idea of size). I assumed Spark could handle this, but maybe I
need to try some different settings like userBlock or itemBlock. Any help
is appreciated!



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
