does "Deep Learning Pipelines" scale out linearly?


does "Deep Learning Pipelines" scale out linearly?

Andy Davidson
I am starting a new deep learning project. Currently we do all of our work on a single machine, using a combination of Keras and TensorFlow. https://databricks.github.io/spark-deep-learning/site/index.html looks very promising. Any idea how performance is likely to improve as I add machines to my cluster?

Kind regards

Andy


P.S. Is [hidden email] the best place to ask questions about this package?



Re: does "Deep Learning Pipelines" scale out linearly?

MLnick
For that package specifically, it's best to check whether it has its own mailing list and, if not, to ask via GitHub issues.

Having said that, perhaps the folks involved in that package will reply here too.




Re: does "Deep Learning Pipelines" scale out linearly?

Tim Hunter
In reply to this post by Andy Davidson
Hello Andy,
Regarding your question, this depends a lot on the specific task:
 - Tasks that are "easy" to distribute, such as inference
(scoring), hyper-parameter tuning, or cross-validation, will
take full advantage of the cluster, and performance should
improve more or less linearly.
 - For training the same model on multiple machines with a
distributed dataset, you are currently better off with a
dedicated solution such as TensorFlowOnSpark or dist-keras. We are
working on addressing this issue in a future release.
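
To illustrate why the first kind of task scales so easily, here is a minimal
sketch, not the Deep Learning Pipelines API. It assumes a hypothetical
train_and_score function standing in for fitting one Keras model, and it uses
a local thread pool as a stand-in for cluster workers. The point is only that
each hyper-parameter trial is independent of the others, so the trials can be
spread over however many workers are available.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def train_and_score(params):
    """Hypothetical stand-in for training one model and returning a validation score."""
    learning_rate, batch_size = params
    # Toy deterministic score so the sketch runs without Keras or TensorFlow.
    return 1.0 - abs(learning_rate - 0.01) - abs(batch_size - 64) / 1000.0


def grid_search(grid, num_workers):
    """Score every parameter combination independently and return the best one."""
    # No trial reads or writes another trial's state, so the work can be
    # divided among any number of workers without coordination.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        scores = list(pool.map(train_and_score, grid))
    best_score, best_params = max(zip(scores, grid))
    return best_params, best_score


# Assumed example grid: 3 learning rates x 3 batch sizes = 9 independent trials.
grid = list(product([0.001, 0.01, 0.1], [32, 64, 128]))
best_params, best_score = grid_search(grid, num_workers=4)
print(best_params, best_score)
```

The result does not depend on num_workers; only the wall-clock time does.
Training a single model on a distributed dataset is different, because the
workers have to exchange gradients or parameters at every step.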

Also, we opened a mailing list dedicated to Deep Learning Pipelines,
to which I will copy this answer. Feel free to answer there:

https://groups.google.com/forum/#!forum/dl-pipelines-users/


Tim


On November 22, 2017 at 10:02:59 AM, Andy Davidson
([hidden email]) wrote:

> I am starting a new deep learning project currently we do all of our work on
> a single machine using a combination of Keras and Tensor flow.
> https://databricks.github.io/spark-deep-learning/site/index.html looks very
> promising. Any idea how performance is likely to improve as I add machines
> to my my cluster?
>
> Kind regards
>
> Andy
>
>
> P.s. Is [hidden email] the best place to ask questions about this
> package?
