Mean/Averaging list of lists in an RDD


This may be a simple question, but I cannot figure out the most efficient way to do it.

I have an RDD whose records each contain a list, like this:

RDD1 : ((key1, key2),List(1,2,3,4,5,6,7,8,9,10))

I transformed this with grouped(4) (equivalently sliding(4, 4); plain sliding(4) would produce overlapping windows instead) to:

((key1, key2),List(List(1,2,3,4), List(5,6,7,8), List(9,10)))

by using:

RDD2 = RDD1.map { x => ((x._1._1, x._1._2), x._2.grouped(4).toList) }
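
For reference, here is a small sketch on a plain Scala list (no Spark needed) of how grouped and sliding differ. The object and value names are only illustrative and are not part of my actual job.

```scala
object WindowDemo {
  val xs: List[Int] = (1 to 10).toList

  // Non-overlapping chunks of up to 4 elements; the last chunk may be shorter.
  val chunks: List[List[Int]] = xs.grouped(4).toList

  // Overlapping windows of 4 that advance by one element at a time.
  val windows: List[List[Int]] = xs.sliding(4).toList

  // Windows of 4 that advance by 4: the same result as grouped(4).
  val stepped: List[List[Int]] = xs.sliding(4, 4).toList

  def main(args: Array[String]): Unit = {
    println(chunks)       // List(List(1, 2, 3, 4), List(5, 6, 7, 8), List(9, 10))
    println(windows.size) // 7
    println(stepped == chunks) // true
  }
}
```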

I want to transform this to:

((key1, key2),List(2.5, 6.5, 9.5))

where:
 2.5 is mean(1,2,3,4)
 6.5 is mean(5,6,7,8)
 9.5 is mean(9,10)
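
To make the goal concrete, here is a minimal sketch of the per-group averaging on a single record, using plain Scala collections. The helper name groupMeans is hypothetical, and I am not claiming this is the most efficient approach. For the sample list it yields the group means 2.5, 6.5 and 9.5.

```scala
object GroupMeans {
  // Hypothetical helper: split xs into consecutive chunks of `size`
  // elements and return the arithmetic mean of each chunk.
  def groupMeans(xs: List[Int], size: Int): List[Double] =
    xs.grouped(size).map(g => g.sum.toDouble / g.size).toList

  def main(args: Array[String]): Unit = {
    // One record shaped like the RDD elements described above.
    val record = (("key1", "key2"), (1 to 10).toList)
    val result = (record._1, groupMeans(record._2, 4))
    println(result) // ((key1,key2),List(2.5, 6.5, 9.5))
  }
}
```

On an RDD of such pairs, assuming it has type RDD[((String, String), List[Int])], the same function could presumably be applied with rdd.mapValues(xs => GroupMeans.groupMeans(xs, 4)), which leaves the keys untouched.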

Best regards