Is there a way to control where RDD partitions physically go?

Yishu Lin
Let’s say I have an RDD that represents users’ behavior data. I can shard the RDD into several partitions by user id using HashPartitioner. Is there any way to control which machine each partition goes to? Or, how does Spark distribute partitions across machines? Thanks!
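
To make the setup concrete, here is a minimal sketch in plain Python (not Spark API code) of what hash partitioning on user id amounts to, assuming the partition index is the key's hash taken modulo the number of partitions and kept non-negative. The function names, user ids, and events are hypothetical and only for illustration.

```python
# Plain-Python sketch of hash partitioning by user id (not Spark API code).
# Assumption: partition index = hash of the key modulo the number of
# partitions, kept non-negative. User ids and events below are made up.

def partition_for(user_id: int, num_partitions: int) -> int:
    """Return the partition index for a user id."""
    # Python's % already yields a non-negative result for a positive modulus.
    return hash(user_id) % num_partitions


def shard(records, num_partitions):
    """Group (user_id, event) records into num_partitions buckets."""
    partitions = [[] for _ in range(num_partitions)]
    for user_id, event in records:
        index = partition_for(user_id, num_partitions)
        partitions[index].append((user_id, event))
    return partitions


records = [(101, "login"), (102, "click"), (101, "logout"), (205, "click")]
partitions = shard(records, 4)
for index, bucket in enumerate(partitions):
    print(index, bucket)
```

All events for one user end up under the same partition index, but nothing in this step says which machine holds that partition, and that is the part I would like to control.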

Yishu

Re: Is there a way to control where RDD partitions physically go?

Longzhen Lin
This post has NOT been accepted by the mailing list yet.
On Mesos, Spark fills each node with tasks in a round-robin manner so that tasks are balanced across the cluster. I have the same problem as you, and I would also like to find a way to control where partitions are placed.