breeze DGEMM slow in spark

breeze DGEMM slow in spark

wxhsdp
Dear all,
  I am testing double-precision matrix multiplication in Spark on EC2 m1.large machines.
  I use the Breeze linalg library, which internally calls a native library (OpenBLAS, Nehalem build, single-threaded).

m1.large:
model name : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
cpu MHz : 1795.672
model name : Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
cpu MHz : 1795.672

os:
Linux ip-172-31-24-33 3.4.37-40.44.amzn1.x86_64 #1 SMP Thu Mar 21 01:17:08 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

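  Here's my test code:
  import breeze.linalg.DenseMatrix
  import scala.util.Random

  // The object name is illustrative; the original post showed only main.
  object DgemmTest {
    def main(args: Array[String]): Unit = {

      val n = args(0).toInt     // matrix dimension (n x n)
      val loop = args(1).toInt  // number of multiplications to run

      val ranGen = new Random

      // one flat array of n*n random doubles per iteration
      val arr = Array.ofDim[Double](loop, n*n)

      for(i <- 0 until loop)
        for(j <- 0 until n*n) {
          arr(i)(j) = ranGen.nextDouble()
        }

      val time0 = System.currentTimeMillis()
      println("init time = "+time0)

      // accumulator for the products, zero-initialized
      val c = new DenseMatrix[Double](n,n)

      val time1 = System.currentTimeMillis()
      println("start time = "+time1)

      for(i <- 0 until loop) {
        // both operands wrap the same backing array, so this computes a * a
        val a = new DenseMatrix[Double](n,n,arr(i))
        val b = new DenseMatrix[Double](n,n,arr(i))

        // in-place elementwise add of the product (Breeze 0.7 syntax)
        c :+= (a * b)
      }

      val time2 = System.currentTimeMillis()
      println("stop time = "+time2)
      println("init time = "+(time1-time0))
      println("used time = "+(time2-time1))
    }
  }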

  Multiplying two n=3584 matrices takes about 14 s with the test code above. But when I put the matrix
  multiplication inside a Spark mapPartitions function:

  val b = a.mapPartitions{ itr =>
    val arr = itr.toArray

    //timestamp here
    val a = new DenseMatrix[Double](n,n,arr)
    val b = new DenseMatrix[Double](n,n,arr)

    // keep the result local to the closure
    val c = a*b

    //timestamp here
    // emit the product's values in column-major order
    c.data.iterator
  }
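The two "timestamp here" markers could be filled in with a small helper like the one sketched below. `Timing.timed` is a hypothetical name, not part of Spark or Breeze, and the workload in `main` is only a stand-in for `a * b`. Inside mapPartitions the closure runs on an executor, so anything printed there ends up in the executor's stdout log rather than on the driver console.

```scala
object Timing {
  /** Runs `body` once and returns its result with the elapsed wall-clock milliseconds. */
  def timed[A](body: => A): (A, Long) = {
    val start = System.nanoTime()
    val result = body
    val elapsedMs = (System.nanoTime() - start) / 1000000L
    (result, elapsedMs)
  }

  def main(args: Array[String]): Unit = {
    // Stand-in workload; in the thread's code this would be the matrix product.
    val (sum, ms) = timed { (1 to 1000000).map(_.toDouble).sum }
    println("result = " + sum + ", used time = " + ms + " ms")
  }
}
```

Returning the elapsed time together with the result keeps the measurement scoped to exactly the expression of interest, so array conversion and matrix construction can be timed separately if needed.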

  Multiplying two n=3584 matrices this way takes about 50 s (jblas takes 90 s, so that is less than a 2x speedup)!
  There is a shuffle operation before the matrix multiplication in Spark. During the shuffle phase the aggregated data are
  kept in memory on the reduce side, with no spill to disk. So both cases are in-memory
  matrix multiplications, both have enough memory, and GC time is really small.

  So why is case 2 about 3.5x slower than case 1? Has anyone seen this before, and what DGEMM performance
  do you get in Spark? Thanks for any advice.
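To put the quoted timings on a common scale, they can be converted to an achieved floating-point rate. The sketch below assumes each timing covers a single n x n product and uses the usual estimate of about 2n³ floating-point operations for a dense multiply. `DgemmRate` and its method names are made up for illustration.

```scala
object DgemmRate {
  /** Approximate floating-point operations in an n x n by n x n product: 2 * n^3. */
  def flops(n: Long): Double = 2.0 * n * n * n

  /** Achieved rate in GFLOP/s for one n x n product that took `seconds`. */
  def gflops(n: Long, seconds: Double): Double = flops(n) / seconds / 1e9

  def main(args: Array[String]): Unit = {
    val n = 3584L
    // Timings quoted in the post, in seconds.
    val cases = Seq("standalone" -> 14.0, "mapPartitions" -> 50.0, "jblas" -> 90.0)
    for ((label, secs) <- cases)
      println(f"$label%-14s ${gflops(n, secs)}%.2f GFLOP/s")
  }
}
```

Comparing rates rather than raw seconds makes it easier to judge whether a run is in the range expected of an optimized native BLAS or closer to a pure-JVM fallback.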
 

Re: breeze DGEMM slow in spark

wxhsdp
I think maybe it's related to m1.large, because I also tested on my laptop, and there the two cases take nearly
the same amount of time.

my laptop:
model name : Intel(R) Core(TM) i5-3380M CPU @ 2.90GHz
cpu MHz : 2893.549

os:
Linux ubuntu 3.11.0-12-generic #19-Ubuntu SMP Wed Oct 9 16:20:46 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

Re: breeze DGEMM slow in spark

Xiangrui Meng
You need to include breeze-natives or netlib:all to load the native
libraries. Check the log messages to ensure native libraries are used,
especially on the worker nodes. The easiest way to use OpenBLAS is
copying the shared library to /usr/lib/libblas.so.3 and
/usr/lib/liblapack.so.3. -Xiangrui
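Besides reading the logs, the loaded backend can be queried directly. The sketch below is one way to do that, assuming netlib-java is on the runtime classpath. It calls `com.github.fommil.netlib.BLAS.getInstance` through reflection only so that the snippet compiles without the library; `BlasCheck` is a hypothetical name. A result ending in `NativeSystemBLAS` would indicate the system library was picked up, while `F2jBLAS` would indicate the pure-JVM fallback.

```scala
import scala.util.Try

object BlasCheck {
  /** Class name of the BLAS backend netlib-java selected in this JVM, or a short failure note. */
  def loadedBlas(): String =
    Try {
      // Reflection keeps this sketch compilable without netlib-java on the compile classpath.
      val blasClass = Class.forName("com.github.fommil.netlib.BLAS")
      val instance = blasClass.getMethod("getInstance").invoke(null)
      instance.getClass.getName
    }.getOrElse("netlib-java BLAS could not be loaded in this JVM")

  def main(args: Array[String]): Unit =
    println("BLAS implementation: " + loadedBlas())
}
```

Run this way it reports only on the local JVM. To check the worker nodes, the same call could be made inside a `map` or `mapPartitions` closure and the distinct results collected on the driver.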

On Sat, May 17, 2014 at 8:02 PM, wxhsdp <[hidden email]> wrote:

> i think maybe it's related to m1.large, because i also tested on my laptop,
> the two case cost nearly
> the same amount of time.
>
> my laptop:
> model name      : Intel(R) Core(TM) i5-3380M CPU @ 2.90GHz
> cpu MHz         : 2893.549
>
> os:
> Linux ubuntu 3.11.0-12-generic #19-Ubuntu SMP Wed Oct 9 16:20:46 UTC 2013
> x86_64 x86_64 x86_64 GNU/Linux
>
>
>
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/breeze-DGEMM-slow-in-spark-tp5950p5971.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: breeze DGEMM slow in spark

wxhsdp
Hi Xiangrui,
  I checked the stderr of the worker node, and yes, it failed to load the implementation from:
  com.github.fommil.netlib.NativeSystemBLAS...

  What do you mean by "include breeze-natives or netlib:all"?

  Things I've already done:
  1. added the breeze and breeze-natives dependencies to the sbt build file
  2. downloaded all breeze jars to the slaves
  3. added the jars to the classpath on the slaves
  4. ln -s libopenblas_nehalemp-r0.2.9.rc2.so libblas.so.3, and added it to LD_LIBRARY_PATH on the slaves

  Thank you for your help.

Re: breeze DGEMM slow in spark

wxhsdp
In reply to this post by Xiangrui Meng
In case 1, the breeze dependency in the sbt build file automatically downloads the jars and adds them
to the classpath.

In the Spark case, I manually downloaded all the jars and added them to the Spark classpath.

Why did case 1 succeed and case 2 fail? Am I missing something?

Re: breeze DGEMM slow in spark

wxhsdp
In reply to this post by Xiangrui Meng
Hi Xiangrui,

  You said "It doesn't work if you put the netlib-native jar inside an assembly
  jar. Try to mark it "provided" in the dependencies, and use --jars to
  include them with spark-submit. -Xiangrui"

  I am not using an assembly jar that contains everything. I also marked the breeze dependencies
  as provided, manually downloaded the jars, and added them to the slave classpath, but it doesn't work :(

Re: breeze DGEMM slow in spark

Xiangrui Meng
Can you attach the slave classpath? -Xiangrui

On Sun, May 18, 2014 at 2:02 AM, wxhsdp <[hidden email]> wrote:

> Hi, xiangrui
>
>   you said "It doesn't work if you put the netlib-native jar inside an
> assembly
>   jar. Try to mark it "provided" in the dependencies, and use --jars to
>   include them with spark-submit. -Xiangrui"
>
>   i'am not use an assembly jar which contains every thing, i also mark
> breeze dependencies
>   provided, and manually download the jars and add them to slave classpath.
> but doesn't work:(
>
>
>

Re: breeze DGEMM slow in spark

wxhsdp
ok

Spark Executor Command: "java" "-cp" ":/root/ephemeral-hdfs/conf:/root/.ivy2/cache/org.scala-lang/scala-library/jars/scala-library-2.10.4.jar:/root/.ivy2/cache/org.scalanlp/breeze_2.10/jars/breeze_2.10-0.7.jar:/root/.ivy2/cache/org.scalanlp/breeze-macros_2.10/jars/breeze-macros_2.10-0.3.jar:/root/.sbt/boot/scala-2.10.3/lib/scala-reflect.jar:/root/.ivy2/cache/com.thoughtworks.paranamer/paranamer/jars/paranamer-2.2.jar:/root/.ivy2/cache/com.github.fommil.netlib/core/jars/core-1.1.2.jar:/root/.ivy2/cache/net.sourceforge.f2j/arpack_combined_all/jars/arpack_combined_all-0.1.jar:/root/.ivy2/cache/net.sourceforge.f2j/arpack_combined_all/jars/arpack_combined_all-0.1-javadoc.jar:/root/.ivy2/cache/net.sf.opencsv/opencsv/jars/opencsv-2.3.jar:/root/.ivy2/cache/com.github.rwl/jtransforms/jars/jtransforms-2.4.0.jar:/root/.ivy2/cache/junit/junit/jars/junit-4.8.2.jar:/root/.ivy2/cache/org.apache.commons/commons-math3/jars/commons-math3-3.2.jar:/root/.ivy2/cache/org.spire-math/spire_2.10/jars/spire_2.10-0.7.1.jar:/root/.ivy2/cache/org.spire-math/spire-macros_2.10/jars/spire-macros_2.10-0.7.1.jar:/root/.ivy2/cache/com.typesafe/scalalogging-slf4j_2.10/jars/scalalogging-slf4j_2.10-1.0.1.jar:/root/.ivy2/cache/org.slf4j/slf4j-api/jars/slf4j-api-1.7.2.jar:/root/.ivy2/cache/org.scalanlp/breeze-natives_2.10/jars/breeze-natives_2.10-0.7.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_ref-osx-x86_64/jars/netlib-native_ref-osx-x86_64-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/native_ref-java/jars/native_ref-java-1.1.jar:/root/.ivy2/cache/com.github.fommil/jniloader/jars/jniloader-1.1.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_ref-linux-x86_64/jars/netlib-native_ref-linux-x86_64-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_ref-linux-i686/jars/netlib-native_ref-linux-i686-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_ref-win-x86_64/jars/netlib-native_ref-win-x86_64-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_ref-win-i686/jars/netlib-native_ref-win-i686-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_ref-linux-armhf/jars/netlib-native_ref-linux-armhf-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_system-osx-x86_64/jars/netlib-native_system-osx-x86_64-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/native_system-java/jars/native_system-java-1.1.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_system-linux-x86_64/jars/netlib-native_system-linux-x86_64-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_system-linux-i686/jars/netlib-native_system-linux-i686-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_system-linux-armhf/jars/netlib-native_system-linux-armhf-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_system-win-x86_64/jars/netlib-native_system-win-x86_64-1.1-natives.jar:/root/.ivy2/cache/com.github.fommil.netlib/netlib-native_system-win-i686/jars/netlib-native_system-win-i686-1.1-natives.jar ::/root/spark/conf:/root/spark/assembly/target/scala-2.10/spark-assembly-1.0.0-hadoop1.0.4.jar" "-Xms4096M" "-Xmx4096M"

Re: breeze DGEMM slow in spark

Xiangrui Meng
The classpath seems to be correct. Where did you link libopenblas*.so
to? The safest approach is to rename it to /usr/lib/libblas.so.3 and
/usr/lib/liblapack.so.3 . This is the way I made it work. -Xiangrui

On Sun, May 18, 2014 at 4:49 PM, wxhsdp <[hidden email]> wrote:

> ok
>

Re: breeze DGEMM slow in spark

wxhsdp
Thank you Xiangrui, I also think the link may be the problem.

I tried several ways:
1. export LD_LIBRARY_PATH=mypath
2. create the link file in /usr/lib
    lrwxrwxrwx 1 root root   34 May 19 00:38 libblas.so.3 -> libopenblas_nehalemp-r0.2.9.rc2.so
3. add mypath to /etc/ld.so.conf, then run ldconfig

  1 and 2 do not work. As for 3, it seems that ldconfig doesn't work on Amazon Linux: I checked with
  ldconfig -p, but could not find my .so file.
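One thing worth ruling out with the link in step 2 is a dangling symlink. The listing shows a relative target, and a relative target resolves against the directory containing the link, so it only works if the OpenBLAS file sits in /usr/lib as well. The post does not say where the .so file lives. The sketch below reports what each path actually points at; `LinkCheck` is a hypothetical helper, and the default paths are the two named earlier in the thread.

```scala
import java.nio.file.{Files, Paths}

object LinkCheck {
  /** Describes a path as a live symlink, a dangling symlink, an existing non-link, or missing. */
  def describe(path: String): String = {
    val p = Paths.get(path)
    if (Files.isSymbolicLink(p)) {
      val target = Files.readSymbolicLink(p)
      // Files.exists follows the link, so false here means the target is missing.
      if (Files.exists(p)) path + " -> " + target + " (resolves to " + p.toRealPath() + ")"
      else path + " -> " + target + " (dangling symlink)"
    } else if (Files.exists(p)) path + " (exists, not a symlink)"
    else path + " (missing)"
  }

  def main(args: Array[String]): Unit = {
    // Paths named in the thread; pass others on the command line to override.
    val defaults = Seq("/usr/lib/libblas.so.3", "/usr/lib/liblapack.so.3")
    val paths = if (args.nonEmpty) args.toSeq else defaults
    paths.foreach(p => println(describe(p)))
  }
}
```

Running it on each worker node, not just the driver, would show whether the libraries are in place where the executors actually load them.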


Re: breeze DGEMM slow in spark

wxhsdp
Correction to what I said above:

ldconfig does work; it automatically creates a link:
libopenblas.so.0 -> libopenblas_nehalemp-r0.2.9.rc2.so

But what I need is libblas.so.3, so I tried two more things:
1. create a file called libblas.so.3, then run ldconfig
2. create a file called libblas.so.3.0, then run ldconfig

I hoped ldconfig would generate a link called libblas.so.3, but it seems libblas.so.3 is ignored
by ldconfig.