Implementing this in Spark could mean a significant speedup in ALS solving as the order of growth is smaller than the default solver (Cholesky). This has the potential to improve the training phase of collaborative filtering significantly.
I've opened a JIRA ticket but thought I'd reach out here as well since I've implemented the algorithm (for implicit feedback) and demonstrated it's correctness but I am having trouble actually seeing a performance speedup, likely due to incorrect handling of RDD persistence/checkpointing. I wasn't sure the best way to reach out to see if there were dev cycles available to collaborate on completing this solution but I figure it has the potential to have a big impact within Spark and MLLib. If there is interest, I can open a pull request with the functionally correct code I have as of now.
Note, we are seeing collaborative filtering training times of over 3 hours within Spark (4 large node instances) compared to ~8 minutes on a single machine running the Implicit library cited above. It would be great to get this kind of speedup within Spark and potentially benefit from the added parallelism.