Run Spark on Java 10


Ben_W
*my use case*: We run a Spark cluster on Mesos. Since our Mesos cluster also
hosts other frameworks such as Storm and Cassandra, we have had incidents
where a Spark job over-utilized CPU and caused resource contention with the
other frameworks.

*objective*: run an un-modularized Spark application (the jar is compiled
with a Java 8-compatible sbt toolchain) on Java 10, to leverage Java 10's
container support. Related link: https://bugs.openjdk.java.net/browse/JDK-8146115
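
As a sanity check on the container support itself (my own test idea, not
something from the Spark docs): run a snippet like this in a CPU-limited
container on Java 8 and then on Java 10. On Java 10 it should report the
container's limits rather than the host's.

    // ContainerCheck.scala -- a minimal sketch; the object name is mine
    object ContainerCheck {
      def main(args: Array[String]): Unit = {
        // With JDK 10's container support (JDK-8146115), both values respect
        // cgroup limits inside a container; on Java 8 they reflect the host.
        println(s"availableProcessors = ${Runtime.getRuntime.availableProcessors()}")
        println(s"maxMemory (MiB)     = ${Runtime.getRuntime.maxMemory() / (1024 * 1024)}")
      }
    }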

After doing some reading about Java 9, this is my imagined *happy path*:

1) Point JAVA_HOME to Java 10.
2) Run my Spark job; when a ClassNotFoundException comes up, look up the
missing module in the Oracle documentation (for example, the java.sql module:
https://docs.oracle.com/javase/9/docs/api/java.sql-summary.html), then add
"spark.executor.extraJavaOptions --add-modules java.se.ee" to
conf/spark-defaults.conf (see the snippet after this list).
3) Repeat step 2 until no more exceptions are thrown.
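
Concretely, step 2 amounts to a line like this in conf/spark-defaults.conf
(the same key/value pair shows up in the verbose output further down;
whitespace between key and value is the usual spark-defaults.conf separator):

    spark.executor.extraJavaOptions   --add-modules java.se.ee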

However, this is what I found:

Warning: Local jar ***/java.se.ee does not exist, skipping.

java.lang.ClassNotFoundException: com.**.spark.Main
        at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:466)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:566)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:499)
        at java.base/java.lang.Class.forName0(Native Method)
        at java.base/java.lang.Class.forName(Class.java:374)
        at org.apache.spark.util.Utils$.classForName(Utils.scala:233)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:732)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)


This is what I observed after turning on verbose mode:

...

Using properties file: /tmp/spark-2.2.2-bin-hadoop2.6/conf/spark-defaults.conf

Adding default property: spark.eventLog.enabled=true
Adding default property: spark.eventLog.dir=hdfs://***
Adding default property: spark.executor.extraJavaOptions=--add-modules java.se.ee

Parsed arguments:

master                 mesos://localhost:10017
deployMode             cluster
executorMemory         16G
executorCores           2
totalExecutorCores     50
propertiesFile         /tmp/spark-2.2.2-bin-hadoop2.6/conf/spark-defaults.conf
driverMemory           4G
driverCores             1
driverExtraClassPath   null
driverExtraLibraryPath null
driverExtraJavaOptions null
supervise               false
queue                   null
numExecutors           null
files                   null
pyFiles                 null
archives               null
mainClass               com.**.spark.Main
primaryResource         ***.jar
name                   ***
childArgs               [***]
jars                   null
packages               null
packagesExclusions     null
repositories           null
verbose                true

Spark properties used, including those specified through
--conf and those from the properties file
/tmp/spark-2.2.2-bin-hadoop2.6/conf/spark-defaults.conf:
(spark.mesos.uris,hdfs:///***/tmpqIx6x2)
(spark.driver.memory,4G)
(spark.eventLog.enabled,true)
(spark.executor.extraJavaOptions,--add-modules java.se.ee)
(spark.executor.uri,***/spark-2.2.2-bin-hadoop2.6.tgz)
(spark.eventLog.dir,hdfs://***)

*The warning was printed here:*

https://github.com/apache/spark/blob/5264164a67df498b73facae207eda12ee133be7d/core/src/main/scala/org/apache/spark/deploy/worker/DriverWrapper.scala#L101

https://github.com/apache/spark/blob/5264164a67df498b73facae207eda12ee133be7d/core/src/main/scala/org/apache/spark/deploy/DependencyUtils.scala#L76
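
My rough paraphrase of the check behind that warning (a sketch of the logic
as I read the linked code, not a verbatim copy): each entry in the jar list
is resolved as a local path and skipped with that warning when the file does
not exist. A bare module name landing there becomes a relative path:

    import java.io.File

    object LocalJarCheck {
      def main(args: Array[String]): Unit = {
        // "java.se.ee" arrives where a jar path was expected; with no URI
        // scheme it resolves relative to the working directory, so exists()
        // fails and the "Local jar ... does not exist" warning is printed
        val token = "java.se.ee"
        val file = new File(token).getAbsoluteFile
        if (!file.exists()) {
          println(s"Local jar $file does not exist, skipping.")
        }
      }
    }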

After reading the source code, it seems to me that spark-submit does not
understand the --add-modules option, so it treats java.se.ee as a jar file
rather than a module. And I couldn't find a way to make it translate
--add-modules when launching the executor JVM.
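
One idea I have not verified yet (an assumption, not a confirmed fix): the
JDK 9+ long options also accept the --add-modules=<module> form, which makes
the whole flag a single token with no embedded space, so it might survive
whatever splitting turns java.se.ee into a separate "jar" argument:

    spark.executor.extraJavaOptions   --add-modules=java.se.ee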

Has anyone done similar experiments running Spark on Java 9/10?

Thanks in advance


