On Sat, Feb 1, 2014 at 2:34 AM, Uri Laserson <[hidden email]> wrote:
I implemented a version of distributed streaming quantiles for PySpark. It uses a count-min sketch approach. You can find the code here:
Thought it might be of interest...
Uri

--
Uri Laserson, PhD
Data Scientist, Cloudera
+1 617 910 0447
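
For readers unfamiliar with the idea, here is a minimal plain-Python sketch of how count-min sketches can give approximate quantiles over partitioned data. It is not the code referred to in the message above. It assumes small integer values in a known range, and the names `CountMinSketch`, `sketch_partition` and `approx_quantile` are made up for illustration. The key property is that sketches built with the same shape and hash functions merge by adding their counters, so each partition can be summarized independently and the results combined.

```python
import hashlib
from functools import reduce


class CountMinSketch:
    """Fixed-size counter table; estimates may overcount but never undercount."""

    def __init__(self, width=4096, depth=5):
        self.width = width
        self.depth = depth
        self.total = 0
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # Deterministic hash, so sketches built on different workers agree.
        digest = hashlib.blake2b(f"{row}:{key}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, key, count=1):
        self.total += count
        for row in range(self.depth):
            self.table[row][self._index(row, key)] += count

    def estimate(self, key):
        # The smallest counter across rows is the least polluted by collisions.
        return min(self.table[row][self._index(row, key)]
                   for row in range(self.depth))

    def merge(self, other):
        # Same shape and hashes: combining is element-wise addition.
        merged = CountMinSketch(self.width, self.depth)
        merged.total = self.total + other.total
        merged.table = [[a + b for a, b in zip(r1, r2)]
                        for r1, r2 in zip(self.table, other.table)]
        return merged


def sketch_partition(values):
    """Summarize one partition of the data as a single sketch."""
    sketch = CountMinSketch()
    for v in values:
        sketch.add(v)
    return sketch


def approx_quantile(sketch, q, lo, hi):
    """Walk integers lo..hi in order, accumulating estimated counts."""
    target = q * sketch.total
    running = 0
    for value in range(lo, hi + 1):
        running += sketch.estimate(value)
        if running >= target:
            return value
    return hi


# Two "partitions" standing in for the partitions of an RDD.
partitions = [range(0, 500), range(500, 1000)]
merged = reduce(CountMinSketch.merge, map(sketch_partition, partitions))
median = approx_quantile(merged, 0.5, 0, 999)
```

In PySpark the same pattern would presumably be `rdd.mapPartitions(lambda it: [sketch_partition(it)]).reduce(CountMinSketch.merge)`, with the quantile lookup done on the driver. Walking every value in the range is only practical for small domains; a dyadic-range variant would avoid that at the cost of one sketch per level.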