I plan to implement functionality similar to what spark-shell provides.
When spark-shell is run on YARN, the application appears to be submitted to YARN in yarn-client mode, so the driver (and the REPL itself) runs locally while the executors run in the cluster.
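The submission can be made explicit on the command line; a minimal sketch, assuming `HADOOP_CONF_DIR` points at the cluster configuration (the resource numbers are placeholders):

```shell
# Submit spark-shell to YARN. spark-shell only supports client deploy mode:
# the driver and the REPL run on the local machine, executors in the cluster.
spark-shell \
  --master yarn \
  --deploy-mode client \
  --num-executors 2 \
  --executor-memory 2g
```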
I was curious how the Scala code typed into spark-shell is compiled dynamically, and how the compiled classes get loaded into the classloaders of the executors distributed across the YARN cluster.
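The dynamic-compilation side can be sketched with the Scala compiler's `IMain` interpreter, which the Spark REPL is built on: each interpreted line is compiled into class files under an output directory, which a class server can then serve to remote executors. This is only a sketch, assuming `scala-compiler` is on the classpath; the directory name is a placeholder:

```scala
import java.io.File
import scala.tools.nsc.Settings
import scala.tools.nsc.interpreter.IMain

object InterpreterSketch {
  def main(args: Array[String]): Unit = {
    val outDir = new File("repl-classes") // compiled class files land here
    outDir.mkdirs()

    val settings = new Settings
    // Spark's own REPL main sets -Yrepl-outdir the same way, so that the
    // generated class files end up on disk instead of in a virtual directory.
    settings.processArguments(List("-Yrepl-outdir", outDir.getPath), processAll = true)
    settings.usejavacp.value = true // reuse the JVM classpath for compilation

    val interpreter = new IMain(settings)
    // Each interpreted line is wrapped, compiled, and written under outDir.
    interpreter.interpret("""val greeting = "hello from the interpreter"""")
    interpreter.close()
  }
}
```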
If you run the Spark REPL, you will find a Spark configuration entry called "spark.repl.class.uri".
A REPL class fetch server is started at this URI inside the Spark REPL driver to serve the classes compiled by the REPL interpreter.
The distributed executors fetch the classes from that server using the URI in "spark.repl.class.uri" and load them through ExecutorClassLoader.
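The executor side can be pictured as a classloader that, on a cache miss, fetches the class bytes from the driver's class server and defines the class locally. The following is only an illustrative sketch of the idea behind ExecutorClassLoader, assuming a plain HTTP class server (the real one also understands spark:// URIs and respects parent-first delegation); the class name and base URI are hypothetical:

```scala
import java.io.{ByteArrayOutputStream, InputStream}
import java.net.URL

// Illustration of the fetch-and-define pattern used for REPL classes.
class RemoteClassLoader(baseUri: String, parent: ClassLoader)
    extends ClassLoader(parent) {

  // Called only when the parent loader cannot resolve the class.
  override def findClass(name: String): Class[_] = {
    val path  = name.replace('.', '/') + ".class"
    val bytes = readAll(new URL(s"$baseUri/$path").openStream())
    defineClass(name, bytes, 0, bytes.length)
  }

  private def readAll(in: InputStream): Array[Byte] = {
    val out = new ByteArrayOutputStream()
    val buf = new Array[Byte](4096)
    var n = in.read(buf)
    while (n != -1) { out.write(buf, 0, n); n = in.read(buf) }
    in.close()
    out.toByteArray
  }
}
```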
I have also studied the Spark and Zeppelin source code to see how to use only the Spark interpreter, rather than the whole REPL.
I picked up some code from Zeppelin and Spark to run a Spark interpreter inside my application.
In my application, an embedded HTTP server receives Spark code from users; the submitted code is interpreted dynamically and executed on the distributed executors, just as the Spark REPL does. It works for now!
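The embedded endpoint can be sketched with the JDK's built-in HTTP server; this is a minimal illustration of the plumbing only, with the context path and handler names being hypothetical, and the call into the interpreter left out:

```scala
import java.net.InetSocketAddress
import java.nio.charset.StandardCharsets
import com.sun.net.httpserver.{HttpExchange, HttpHandler, HttpServer}

// Minimal sketch: POST bodies carry Scala code, which a real implementation
// would hand to the embedded interpreter (omitted here).
object CodeEndpoint {
  def start(port: Int): HttpServer = {
    val server = HttpServer.create(new InetSocketAddress(port), 0)
    server.createContext("/interpret", new HttpHandler {
      def handle(ex: HttpExchange): Unit = {
        val code = new String(ex.getRequestBody.readAllBytes(), StandardCharsets.UTF_8)
        // interpreter.interpret(code) would go here
        val reply = s"received ${code.length} chars".getBytes(StandardCharsets.UTF_8)
        ex.sendResponseHeaders(200, reply.length)
        ex.getResponseBody.write(reply)
        ex.close()
      }
    })
    server.start()
    server
  }
}
```

Passing port 0 lets the OS pick a free port, which is convenient for testing.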
There is still more research to do for my application, for instance how to handle multiple users, each with an individual Spark session.
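One possible direction for the multi-user question, sketched under the assumption that session-level isolation is enough: `SparkSession.newSession()` shares the underlying SparkContext but gives each caller independent SQL configuration, UDF registrations, and temporary views. The registry class and its keying by user id are hypothetical:

```scala
import org.apache.spark.sql.SparkSession
import scala.collection.concurrent.TrieMap

// Hypothetical per-user session registry. newSession() shares the single
// SparkContext but isolates SQL state between users.
class SessionRegistry(root: SparkSession) {
  private val sessions = TrieMap.empty[String, SparkSession]

  def sessionFor(userId: String): SparkSession =
    sessions.getOrElseUpdate(userId, root.newSession())
}
```

Whether session-level isolation is sufficient (as opposed to separate SparkContexts or separate driver processes per user) would need further investigation.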