sc.makeRDD bug with NumericRange

Aureliano Buendia
Hi,

I just noticed that sc.makeRDD() does not emit all the values when given a NumericRange as input. Try this in the Spark shell:


$ MASTER=local[4] bin/spark-shell

scala> sc.makeRDD(0.0 to 1 by 0.1).collect().length

8


The expected length is 11. This works correctly when launching Spark with only one core:


$ MASTER=local[1] bin/spark-shell

scala> sc.makeRDD(0.0 to 1 by 0.1).collect().length

11


This also works correctly when using toArray():

$ MASTER=local[4] bin/spark-shell

scala> sc.makeRDD((0.0 to 1 by 0.1).toArray).collect().length

11
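The local[4] result makes sense once you see how makeRDD splits its input: the driver cuts the Seq into numSlices pieces using the collection's own drop/take-style slicing, which is exactly where NumericRange misbehaves. The sketch below is a simplified stand-in for that slicing (the real logic lives in Spark's ParallelCollectionRDD and is more involved; SliceDemo is just an illustrative name). Newer Scala versions no longer allow `0.0 to 1.0 by 0.1`, so the 11 values are built from an integer range here:

```scala
// Simplified model of how makeRDD splits a Seq into numSlices
// partitions. Assumption: the real Spark code special-cases ranges
// and is more involved, but it likewise relies on the collection's
// own drop/take, which is where NumericRange's bug bites.
object SliceDemo {
  def slice[T](seq: Seq[T], numSlices: Int): Seq[Seq[T]] =
    (0 until numSlices).map { i =>
      val start = i * seq.length / numSlices
      val end   = (i + 1) * seq.length / numSlices
      seq.drop(start).take(end - start)
    }

  def main(args: Array[String]): Unit = {
    // The 11 values 0.0, 0.1, ..., 1.0, materialised as a plain
    // Vector so the slicing above uses Vector's exact drop/take.
    val xs: Seq[Double] = (0 to 10).map(_ / 10.0).toVector
    println(slice(xs, 4).map(_.length).sum) // 11: no values lost
  }
}
```

With a buggy NumericRange in place of xs, the same style of slicing can come up short, which plausibly matches the length of 8 seen with four partitions.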

Re: sc.makeRDD bug with NumericRange

Mark Hamstra
Please file an issue on the Spark Project JIRA.



Re: sc.makeRDD bug with NumericRange

Daniel Darabos
Looks like NumericRange in Scala is just a joke.

scala> val x = 0.0 to 1.0 by 0.1
x: scala.collection.immutable.NumericRange[Double] = NumericRange(0.0, 0.1, 0.2, 0.30000000000000004, 0.4, 0.5, 0.6, 0.7, 0.7999999999999999, 0.8999999999999999, 0.9999999999999999)

scala> x.take(3)
res1: scala.collection.immutable.NumericRange[Double] = NumericRange(0.0, 0.1, 0.2)

scala> x.drop(3)
res2: scala.collection.immutable.NumericRange[Double] = NumericRange(0.30000000000000004, 0.4, 0.5, 0.6, 0.7, 0.7999999999999999, 0.8999999999999999, 0.9999999999999999)

So far so good.

scala> x.drop(3).take(3)
res3: scala.collection.immutable.NumericRange[Double] = NumericRange(0.30000000000000004, 0.4)

Why only two values? Where's 0.5?

scala> x.drop(6)
res4: scala.collection.immutable.NumericRange[Double] = NumericRange(0.6000000000000001, 0.7000000000000001, 0.8, 0.9)

And where did the last value disappear now?

You have to approach Scala with a healthy amount of distrust. You're on the right track with toArray.
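Whether or not the Scala bug gets fixed, a robust way to build such a grid is to derive the Doubles from an integer Range, whose drop and take are exact, instead of stepping by 0.1. A minimal sketch (SafeRange is just an illustrative name, not from the thread):

```scala
object SafeRange {
  // Build the grid from an integer Range and divide, rather than
  // stepping Doubles by 0.1. Range's drop/take are exact, so the
  // resulting Seq slices safely, e.g. when handed to sc.makeRDD.
  def grid: Seq[Double] = (0 to 10).map(_ / 10.0)

  def main(args: Array[String]): Unit = {
    println(grid.length) // 11
    println(grid.last)   // 1.0, an exact endpoint
  }
}
```

As a bonus, the endpoints come out exact (0.0 and 1.0) instead of accumulating floating-point error the way repeated addition of 0.1 does.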



Re: sc.makeRDD bug with NumericRange

Daniel Darabos
To make up for mocking Scala, I've filed a bug (https://issues.scala-lang.org/browse/SI-8518) and will try to patch this.



Re: sc.makeRDD bug with NumericRange

Aureliano Buendia
Good catch, Daniel. It looks like this is a Scala bug, not a Spark one. Still, Spark users should be careful about using NumericRange.

