Resilient nature of RDD

Resilient nature of RDD

David Thomas
Can someone explain how an RDD is resilient? If one of the partitions is lost, who is responsible for recreating that partition - is it the driver program?

Re: Resilient nature of RDD

Patrick Wendell
The driver stores the metadata associated with the partition, but the re-computation will occur on an executor. So if several partitions are lost, e.g. due to a few machines failing, the re-computation can be striped across the cluster, making it fast.
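Put differently, the resilience comes from lineage: each RDD records the transformation that produced it from its parent, so a lost partition can be rebuilt from the original input on whichever executors are alive. A minimal sketch of such a lineage chain (local mode, with a hypothetical log path, purely for illustration):

import org.apache.spark.{SparkConf, SparkContext}

object LineageSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("lineage-sketch").setMaster("local[2]"))

    // Lineage: textFile -> filter. Each RDD records *how* to derive its
    // partitions from its parent, not the partition data itself.
    val lines  = sc.textFile("/tmp/app.log")               // hypothetical input path
    val errors = lines.filter(_.contains("ERROR")).cache()

    errors.count()

    // If a cached partition of `errors` is lost later (say its executor dies),
    // the driver uses this lineage to schedule a task that re-reads the matching
    // block of the input and re-applies the filter, on any available executor.
    errors.count()

    sc.stop()
  }
}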


Re: Resilient nature of RDD

David Thomas
I'm trying to understand the Spark source code. Could you please point me to the code where the compute() function of an RDD is called? Is it called by the workers?


Re: Resilient nature of RDD

Andrew Or-2
It all begins with calling rdd.iterator, which calls rdd.computeOrReadCheckpoint(). This materializes the RDD if it hasn't been checkpointed yet, or reads the previously checkpointed version if it has. See https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/rdd/RDD.scala#L216
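One way to see the whole path is to write a tiny custom RDD: a task running on an executor calls iterator() for its partition, and iterator() falls through to compute() whenever the partition is neither cached nor checkpointed. A rough sketch against the public developer API (class names like RangeRDD and RangePartition are made up here):

import org.apache.spark.{Partition, SparkConf, SparkContext, TaskContext}
import org.apache.spark.rdd.RDD

// A made-up partition type: each partition covers a range of integers.
class RangePartition(val index: Int, val start: Int, val end: Int) extends Partition

// A toy RDD whose compute() generates its partition's data on the fly.
class RangeRDD(sc: SparkContext, total: Int, slices: Int) extends RDD[Int](sc, Nil) {

  override protected def getPartitions: Array[Partition] = {
    val step = total / slices
    (0 until slices).map { i =>
      val end = if (i == slices - 1) total else (i + 1) * step
      new RangePartition(i, i * step, end): Partition
    }.toArray
  }

  // Invoked on an executor, via rdd.iterator() -> computeOrReadCheckpoint(),
  // whenever a task needs this partition and it isn't cached or checkpointed.
  override def compute(split: Partition, context: TaskContext): Iterator[Int] = {
    val p = split.asInstanceOf[RangePartition]
    (p.start until p.end).iterator
  }
}

object ComputeSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("compute-sketch").setMaster("local[2]"))
    val rdd = new RangeRDD(sc, total = 100, slices = 4)
    println(rdd.reduce(_ + _))  // each task calls compute() for its own partition
    sc.stop()
  }
}

So yes, compute() runs inside tasks on the executors (the task implementations call rdd.iterator on their partition), not on the driver; the driver only holds the lineage and partition metadata.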

