Checkpoint Vs Cache

Checkpoint Vs Cache

David Thomas
What is the difference between checkpointing and caching an RDD?

Re: Checkpoint Vs Cache

Mayur Rustagi

For starters, caching may or may not persist data to disk, but checkpointing always does.
Also, caching is generic, whereas checkpointing is mainly used with streaming.

On Apr 14, 2014 7:51 AM, "David Thomas" <[hidden email]> wrote:
What is the difference between checkpointing and caching an RDD?

Re: Checkpoint Vs Cache

Cheng Lian
In reply to this post by David Thomas
Checkpointed RDDs are materialized on disk, while cached RDDs are materialized in memory. When memory is insufficient, cached RDD blocks (1 block per partition) will be evicted in an LRU manner. An evicted RDD block will be spilled to disk if the storage level of the RDD allows; otherwise, the block vanishes entirely and must be recomputed from the lineage DAG if it's referenced later.
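
To make the eviction behaviour concrete, here is a rough toy model in plain Python. It is not Spark code and not how Spark's block manager is implemented; the class `ToyBlockStore`, its `capacity` and `spill_to_disk` parameters, and the `compute` function are all made up for illustration. `spill_to_disk=False` loosely stands in for a storage level like MEMORY_ONLY, and `spill_to_disk=True` for one like MEMORY_AND_DISK.

```python
from collections import OrderedDict


class ToyBlockStore:
    """Hypothetical toy model of LRU block eviction (not Spark's implementation)."""

    def __init__(self, capacity, spill_to_disk):
        self.capacity = capacity            # max number of blocks kept in memory
        self.spill_to_disk = spill_to_disk  # whether evicted blocks go to "disk"
        self.memory = OrderedDict()         # block id -> data, least recently used first
        self.disk = {}                      # stand-in for on-disk blocks
        self.recomputed = []                # block ids rebuilt from "lineage"

    def put(self, block_id, data):
        self.memory[block_id] = data
        self.memory.move_to_end(block_id)   # mark as most recently used
        while len(self.memory) > self.capacity:
            victim, victim_data = self.memory.popitem(last=False)  # evict LRU block
            if self.spill_to_disk:
                self.disk[victim] = victim_data
            # otherwise the evicted block is simply lost

    def get(self, block_id, compute):
        if block_id in self.memory:
            self.memory.move_to_end(block_id)
            return self.memory[block_id]
        if block_id in self.disk:
            return self.disk[block_id]
        # Not stored anywhere: rebuild it, as a lineage recomputation would.
        self.recomputed.append(block_id)
        data = compute(block_id)
        self.put(block_id, data)
        return data


def compute(block_id):
    # Stand-in for recomputing one partition from its lineage.
    return [block_id * 10 + i for i in range(3)]


memory_only = ToyBlockStore(capacity=2, spill_to_disk=False)
memory_and_disk = ToyBlockStore(capacity=2, spill_to_disk=True)

for store in (memory_only, memory_and_disk):
    for block_id in (0, 1, 2):
        store.get(block_id, compute)  # block 0 is evicted when block 2 arrives
    store.recomputed.clear()          # ignore the initial computations
    store.get(0, compute)             # reference the evicted block again

print(memory_only.recomputed)      # [0]  (block 0 had to be recomputed)
print(memory_and_disk.recomputed)  # []   (block 0 was read back from "disk")
```

With room for only two blocks, block 0 is the least recently used when block 2 is stored. In the memory-only store it is lost and has to be recomputed on the next access, whereas the store that allows spilling serves it from its disk map.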


On Mon, Apr 14, 2014 at 10:20 AM, David Thomas <[hidden email]> wrote:
What is the difference between checkpointing and caching an RDD?

Re: Checkpoint Vs Cache

cfregly


On Mon, Apr 14, 2014 at 2:43 AM, Cheng Lian <[hidden email]> wrote:
Checkpointed RDDs are materialized on disk, while cached RDDs are materialized in memory. When memory is insufficient, cached RDD blocks (1 block per partition) will be evicted in an LRU manner. An evicted RDD block will be spilled to disk if the storage level of the RDD allows; otherwise, the block vanishes entirely and must be recomputed from the lineage DAG if it's referenced later.


On Mon, Apr 14, 2014 at 10:20 AM, David Thomas <[hidden email]> wrote:
What is the difference between checkpointing and caching an RDD?