Error in SparkSQL Example

Error in SparkSQL Example

Manoj Samel
Hi,

I am trying to run the code under "Writing Language-Integrated Relational Queries" from http://people.apache.org/~pwendell/catalyst-docs/sql-programming-guide.html (I am on a 1.0.0 snapshot).

I am running into an error on this line:

val people: RDD[Person] // An RDD of case class objects, from the first example.

scala> val people: RDD[Person]
<console>:19: error: not found: type RDD
       val people: RDD[Person]
                   ^

scala> val people: org.apache.spark.rdd.RDD[Person]
<console>:18: error: class $iwC needs to be abstract, since value people is not defined
class $iwC extends Serializable {
      ^

Any idea what the issue is?

Also, it's not clear what the RDD[Person] declaration brings. I can run the DSL without it:

val people = sc.textFile("examples/src/main/resources/people.txt").map(_.split(",")).map(p => Person(p(0), p(1).trim.toInt))

val teenagers = people.where('age >= 13).where('age <= 19)

Thanks,



Re: Error in SparkSQL Example

Michael Armbrust
"val people: RDD[Person] // An RDD of case class objects, from the first example." is just a placeholder to avoid cluttering up each example with the same code for creating an RDD. The ": RDD[Person]" is only there to tell you the expected type of the variable 'people'. Perhaps there is a clearer way to indicate this.

As you have realized, using the full line from the first example will allow you to run the rest of them.



(In reply to Manoj Samel's message of Sun, Mar 30, 2014 at 7:31 AM.)




Re: Error in SparkSQL Example

Manoj Samel
Hi Michael,

Thanks for the clarification. My question is about the error above, "error: class $iwC needs to be abstract", and about what the RDD[Person] annotation brings, since I can use the DSL without the "people: org.apache.spark.rdd.RDD[Person]" declaration.

Thanks,


(In reply to Michael Armbrust's message of Mon, Mar 31, 2014 at 9:13 AM.)





Re: Error in SparkSQL Example

Michael Armbrust

> Thanks for the clarification. My question is about the error above "error: class $iwC needs to be abstract"

This is a fairly confusing Scala REPL (interpreter) error. Under the covers, to run the line you entered into the interpreter, Scala creates a wrapper called $iwC (a class, as the error message shows) with your code inserted into it. A val declared without a value is an abstract member, and only an abstract class or a trait may contain one. So this error is telling you that you cannot create a val in a concrete class or object (or in a line of the REPL) without giving it a value.
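
The same rule can be seen outside the Spark shell. The following is a minimal plain-Scala sketch, meant to be pasted into a Scala REPL; the names HasPeople, Concrete, and Broken and the sample values are hypothetical and do not come from the programming guide.

```scala
// Plain Scala sketch; HasPeople, Concrete, Broken and the sample names are hypothetical.

// A val with no right-hand side is an abstract declaration,
// which is only legal inside a trait or an abstract class.
trait HasPeople {
  val people: Seq[String]
}

// A concrete class has to supply the value.
class Concrete extends HasPeople {
  val people: Seq[String] = Seq("Alice", "Bob")
}

// Uncommenting the next line should reproduce the same kind of error in Scala 2:
//   "class Broken needs to be abstract, since value people is not defined"
// class Broken { val people: Seq[String] }

println(new Concrete().people)
```

In the REPL case, the generated wrapper plays the role of Broken: it is a concrete class, so a bare "val people: RDD[Person]" inside it is rejected.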

> and what does the RDD bring, since I can use the DSL without the "people: org.apache.spark.rdd.RDD[Person]"

This just declares the type of the variable people (it is really only there for people reading the code in this particular example). In the line from the first example, where we leave the type out, the Scala compiler adds it for us using type inference.
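
To illustrate the point about type inference without needing Spark, here is a small sketch using an ordinary Seq in place of an RDD. The Person case class mirrors the one in the guide; the sample values and the variable names annotated, inferred, and check are made up for illustration.

```scala
// Plain Scala sketch; Seq stands in for RDD, and the sample data is hypothetical.
case class Person(name: String, age: Int)

// Explicit annotation: documents the expected type for readers of the code.
val annotated: Seq[Person] = Seq(Person("Ann", 15), Person("Bob", 42))

// No annotation: the compiler infers Seq[Person] from the right-hand side.
val inferred = Seq(Person("Ann", 15), Person("Bob", 42))

// This line only compiles because the inferred type is Seq[Person].
val check: Seq[Person] = inferred

println(annotated == inferred)
```

Either form gives the variable the same static type; the annotation changes what a reader sees, not what the compiler does.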