Re: How to use parallelize feature with newAPIHadoopRDD?
I tried mapping each of them to a pair whose key is the field I want to join on, then joining the results and using foreach to get the query results. (Here is the gist: https://gist.github.com/buremba/9919584)
However, since I had to use map before joining the column families, I'm not sure whether this is an efficient way to do this operation. Do you have any suggestions?
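To make the map-then-join approach concrete, here is a minimal plain-Python sketch of what it does, with no Spark involved. The rows, the field names (`user_id`, `name`, `event`) and the helpers `key_by` and `inner_join` are all made up for illustration and are not taken from the gist; they only mimic what mapping each record to a (key, record) pair and then joining pair RDDs would do.

```python
from collections import defaultdict

# Hypothetical rows standing in for two column families (illustrative only).
users = [
    {"user_id": 1, "name": "alice"},
    {"user_id": 2, "name": "bob"},
]
events = [
    {"user_id": 1, "event": "login"},
    {"user_id": 1, "event": "click"},
    {"user_id": 3, "event": "login"},
]

def key_by(rows, key_field):
    """Map each row to a (join_key, row) pair, as the map step does."""
    return [(row[key_field], row) for row in rows]

def inner_join(left_pairs, right_pairs):
    """Inner-join two lists of (key, value) pairs on their keys."""
    right_by_key = defaultdict(list)
    for key, value in right_pairs:
        right_by_key[key].append(value)
    # Emit one (key, (left, right)) tuple per matching pair of values.
    return [
        (key, (left_value, right_value))
        for key, left_value in left_pairs
        for right_value in right_by_key.get(key, [])
    ]

joined = inner_join(key_by(users, "user_id"), key_by(events, "user_id"))

# Equivalent of the final foreach: walk over the joined results.
for key, (user, event) in joined:
    print(key, user["name"], event["event"])
```

As far as I understand, in Spark the map step itself is cheap because it runs per partition without moving data; the join is the part that shuffles records by key, so that is probably where most of the cost would be.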