Hi

I am attempting to aggregate a large file. It looks something like this:

[['a', [1, 2]], ['b', [3, 0]] , ['a', [3,2]] ]

I have to aggregate the file in a variety of ways. Below are some that I have completed so far, but I cannot figure out the last one (the commented-out line). Can anybody suggest a way?

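# sc is the SparkContext (predefined in the pyspark shell)
fc_rdd = sc.parallelize([['a', [1, 2]], ['b', [3, 0]], ['a', [3, 2]]])
print(fc_rdd.collect())
# Element-wise sum of all value lists, ignoring the keys
value_result = fc_rdd.map(lambda x: x[1]).reduce(lambda x, y: [x[0] + y[0], x[1] + y[1]])
# Distinct keys
unique_key_result = fc_rdd.map(lambda x: x[0]).distinct()
print(unique_key_result.collect())
# My attempt at the per-key aggregation (does not work):
# aggregated_key_value_result = fc_rdd.map(lambda x,y: x[0] [ y[0] ])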

How can I end up with an RDD result such as this?

[ ['a', [4, 4]], ['b', [3, 0]] ]
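To make the target concrete, here is a minimal sketch of the per-key aggregation in plain Python, with no Spark involved. The names add_pairs and reduce_by_key are hypothetical helpers invented for this illustration. On the RDD itself, I would guess that reduceByKey with the same combining function is the counterpart, but I have not confirmed that it accepts two-element lists as key/value pairs.

```python
def add_pairs(x, y):
    """Element-wise sum of two 2-element lists."""
    return [x[0] + y[0], x[1] + y[1]]


def reduce_by_key(records, func):
    """Local stand-in (hypothetical helper) for a per-key reduce."""
    merged = {}
    for key, value in records:
        # First occurrence of a key stores its value; later ones are combined.
        merged[key] = func(merged[key], value) if key in merged else value
    return [[key, value] for key, value in merged.items()]


data = [['a', [1, 2]], ['b', [3, 0]], ['a', [3, 2]]]
result = reduce_by_key(data, add_pairs)
print(result)

# Assumed Spark analogue (untested here); reduceByKey yields tuples,
# so map(list) would turn each pair back into a list:
#   fc_rdd.reduceByKey(add_pairs).map(list)
```

The combining function takes two values that share a key and returns one value of the same shape, which is the contract a per-key reduce relies on.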

Thanks