Actually it is the same for map and mapPartitions if you do transformations like this:
a.map(r => r * 2)
a.mapPartitions(iter => iter.map(r => r *2))
these are iterator to iterator transformations.
But mapPartitions are more flexible than map, you can do transformation like: Iterator[A] => Iterator[B], where Iterator[B] can be anything iterable, there's no one to one mapping constraint. In short words, mapPartitions is quite like superset of map. You can check MappedRDD and MapPartitionsRDD to see the details.
From: Huan Dao [mailto:[hidden email]]
Sent: Tuesday, December 24, 2013 1:15 PM
To: [hidden email] Subject: mapPartitions versus map overhead?
Hi all, is there any overhead of mapPartitions versus overhead, if I implement an algorithm using map -> reduce versus mapPartitions -> reduce.