Run/install tensorframes on zeppelin pyspark

Run/install tensorframes on zeppelin pyspark

spicoflorin
Hi!

I would like to use tensorframes in my pyspark notebook.

I have performed the following:

1. In the Spark interpreter settings, added a new repository: http://dl.bintray.com/spark-packages/maven
2. In the Spark interpreter settings, added the dependency databricks:tensorframes:0.2.9-s_2.11
3. Ran pip install tensorframes
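As a quick sanity check for step 3 (a sketch of mine, not from the original setup; it assumes the module name is `tensorframes`, as in the pip install above), one can verify from plain Python whether the package landed in the interpreter being used:

```python
# Sanity check: does the current python see the pip-installed package?
# 'tensorframes' is the module name assumed from the pip install above.
import sys
import importlib

print(sys.executable)  # which python binary is running this check

try:
    importlib.import_module("tensorframes")
    print("tensorframes: importable")
except ImportError:
    print("tensorframes: NOT importable from this python")
```

If this prints "NOT importable", the pip install went into a different Python than the one running the check.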


In both Zeppelin 0.7.3 and 0.8.0:
1. the following code resulted in the error "ImportError: No module named tensorframes":

%pyspark
import tensorframes as tfs

2. the following code succeeded:
%spark
import org.tensorframes.{dsl => tf}
import org.tensorframes.dsl.Implicits._
val df = spark.createDataFrame(Seq(1.0->1.1, 2.0->2.2)).toDF("a", "b")

// As in Python, scoping is recommended to prevent name collisions.
val df2 = tf.withGraph {
    val a = df.block("a")
    // Unlike python, the scala syntax is more flexible:
    val out = a + 3.0 named "out"
    // The 'mapBlocks' method is added using implicits to dataframes.
    df.mapBlocks(out).select("a", "out")
}

// The transform is all lazy at this point, let's execute it with collect:
df2.collect()

I ran the code above directly in the Spark interpreter with the default configuration (master set to local[*], so not via the spark-submit command).

Also, I installed Spark locally and ran the command
$SPARK_HOME/bin/pyspark --packages databricks:tensorframes:0.2.9-s_2.11
and the code below worked as expected:
import tensorframes as tfs
Can you please help me solve this?
Thanks,
 Florin

Re: Run/install tensorframes on zeppelin pyspark

zjffdu

Make sure you use the correct Python, i.e. the one that has tensorframes installed. Use PYSPARK_PYTHON to configure which Python is used.
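For example (a minimal diagnostic sketch, not from the original reply), pasting this into a %pyspark paragraph shows which Python binary Zeppelin's driver is actually using:

```python
# Paste into a %pyspark paragraph: prints the python binary and version
# the driver runs on -- this is where 'import tensorframes' must succeed.
import sys
print(sys.executable)
print(sys.version)
```

If the printed path is not the Python that received the pip install, PYSPARK_PYTHON is not taking effect.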



Spico Florin <[hidden email]> wrote on Wed, Aug 8, 2018, 9:59 PM:

Re: Run/install tensorframes on zeppelin pyspark

spicoflorin
Hello!
Thank you very much for your response.
As I understand it, in order to use tensorframes in a Zeppelin pyspark notebook with a local Spark master, I should:
1. run the command pip install tensorframes
2. set PYSPARK_PYTHON in conf/zeppelin-env.sh

I have performed the above steps like this:

python2.7 -m pip install tensorframes==0.2.7
export PYSPARK_PYTHON=python2.7 in conf/zeppelin-env.sh
"zeppelin.pyspark.python": "python2.7" in conf/interpreter.json

As you can see, the installation and the configuration refer to the same python2.7 version.
After performing all of these steps, I am still getting the same error: "ImportError: No module named tensorframes".

I am still puzzled that this import works fine in the pyspark shell that ships with Spark, yet results in errors in Zeppelin with the same python2.7.
Also, I have observed that the pyspark shell from spark/bin does not even need the tensorframes Python package installed, which is even more confusing.
Does the Zeppelin pyspark interpreter not use the same approach as the Spark pyspark shell?
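A plausible explanation for the confusing part (my assumption, not confirmed in this thread): the --packages jar bundles the tensorframes Python sources, and the pyspark launcher puts that jar on the driver's sys.path, so the import is satisfied from the jar rather than from site-packages. Python can import straight from a zip/jar on sys.path, as this self-contained sketch shows:

```python
# Demonstrate python's zipimport: build a tiny jar-like zip containing a
# package, put it on sys.path, and import from it -- the same mechanism
# that can satisfy an import from a --packages jar without pip.
import os
import sys
import tempfile
import zipfile

tmpdir = tempfile.mkdtemp()
jar_path = os.path.join(tmpdir, "demo.jar")  # hypothetical jar name

with zipfile.ZipFile(jar_path, "w") as z:
    z.writestr("demopkg/__init__.py", "VALUE = 42\n")

sys.path.insert(0, jar_path)
import demopkg

print(demopkg.VALUE)  # prints 42
```

Under that assumption, the Zeppelin interpreter would fail simply because it does not add the jar to the Python path the way the pyspark launcher does.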

Has anyone succeeded in importing/using tensorframes correctly in Zeppelin with the default Spark master setup (local[*])? If yes, how?

I look forward to your answers.

Regards,
 Florin

On Thu, Aug 9, 2018 at 3:52 AM, Jeff Zhang <[hidden email]> wrote:

Make sure you use the correct Python, i.e. the one that has tensorframes installed. Use PYSPARK_PYTHON to configure which Python is used.