I want to pull data from about 1500 remote Oracle tables with Spark, using a multi-threaded driver application that picks up one table per thread (or maybe 10 tables per thread) and launches a Spark job to read from its respective tables.
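Here is a minimal sketch of the driver-side pattern I have in mind, using a plain thread pool. The table names, JDBC options, and output paths are placeholders, and the actual `spark.read` JDBC call is stubbed out in a comment so the skeleton runs standalone; the point is just that each driver thread triggers its own Spark job and the scheduler shares the executors among them.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

# Placeholder list of remote Oracle tables; in the real run this
# would be the ~1500 qualified table names.
TABLES = [f"SCHEMA.TABLE_{i}" for i in range(20)]

def read_table(table):
    # In the real driver this body would be something like:
    #   df = (spark.read.format("jdbc")
    #         .option("url", jdbc_url)          # placeholder connection URL
    #         .option("dbtable", table)
    #         .option("fetchsize", 10000)
    #         .load())
    #   df.write.parquet(f"/some/output/{table}")  # placeholder path
    # Spark's scheduler is thread-safe, so actions fired from
    # separate driver threads become concurrently running jobs.
    return f"loaded {table}"

def run_all(tables, pool_size=10):
    # One task per table; pool_size caps how many Spark jobs are
    # in flight at once (and how many Oracle connections we open).
    results = []
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        futures = {pool.submit(read_table, t): t for t in tables}
        for fut in as_completed(futures):
            results.append(fut.result())
    return results

if __name__ == "__main__":
    done = run_all(TABLES)
    print(f"{len(done)} tables processed")
```

With this shape, capping `pool_size` is also how I'd keep from hammering the Oracle server with 1500 simultaneous connections.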
Also, you may have noticed in this SO post https://stackoverflow.com/questions/30862956/concurrent-job-execution-in-spark that there was no accepted answer to this similar question, and the most upvoted answer starts with "This is not really in the spirit of Spark." To which I'd say: (a) everyone knows it's not in the spirit of Spark, and (b) who cares what the "spirit of Spark" is? That doesn't actually mean anything.
Has anyone gotten something like this to work before? Did you have to do anything special? I was thinking of sending a message to the dev group too, since maybe someone who actually works on the project can give a little more color to the above statement.