Crosstab/ApproxQuantile Performance on Spark Cluster

Aakash Basu-2
Hi all,

Does the Event Timeline show a reasonable shape? At one point in the job, to calculate WoE (Weight of Evidence) columns for the categorical variables, I have to run crosstab on each column separately. On a 4-node cluster this is slow, as I have 230+ columns and 60,000 rows. How can I make it more performant?
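
To make the per-column step concrete, here is a minimal sketch in plain Python (not Spark) of what one crosstab plus WoE calculation might compute for a single categorical column. The function name, the sample data, the binary 0/1 target, the 0.5 smoothing, and the WoE convention used (log of the share of non-events over the share of events) are all assumptions for illustration and are not taken from the actual job.

```python
import math
from collections import Counter


def woe_from_crosstab(categories, targets, smoothing=0.5):
    """Return {category: WoE} for one categorical column against a 0/1 target.

    Hypothetical helper: WoE is taken here as
    ln(share of non-events / share of events), with additive smoothing
    so that empty cells do not produce log(0) or division by zero.
    """
    # Crosstab: counts of each (category, target) pair.
    counts = Counter(zip(categories, targets))
    levels = sorted({cat for cat, _ in counts})

    # Smoothed column totals (one smoothing term per category level).
    total_non_events = sum(counts[(cat, 0)] for cat in levels) + smoothing * len(levels)
    total_events = sum(counts[(cat, 1)] for cat in levels) + smoothing * len(levels)

    woe = {}
    for cat in levels:
        non_event_share = (counts[(cat, 0)] + smoothing) / total_non_events
        event_share = (counts[(cat, 1)] + smoothing) / total_events
        woe[cat] = math.log(non_event_share / event_share)
    return woe


# Toy column with three levels and a binary target (made-up data).
categories = ["a", "a", "b", "b", "b", "c"]
targets = [0, 1, 0, 0, 1, 1]
woe = woe_from_crosstab(categories, targets)
for level, value in woe.items():
    print(level, round(value, 4))
```

With 230+ columns, a calculation of this shape runs once per column, which is the one-crosstab-per-column pattern described above.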

Thanks,
Aakash.