DFS Pregel performance vs simple Java DFS implementation

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view

DFS Pregel performance vs simple Java DFS implementation


Considering a directed graph with 15,000 vertices and 14,000 edges, I wonder
why GraphX (Pregel) takes much more time than the java implementation of a
graph to get all the vertices from a vertex to the leaf?
By the nature of the graph, we can almost consider it as a tree.

The java implementation: A few seconds
The Graphx implementation: Several minutes

Is Pregel really suitable for this kind of treatment?

Additionnal informations:
My system: 16 cores, 35GB RAM

The pregel's algorythm:
val sourceId: VertexId = 42 // The ultimate source
  // Initialize the graph such that all vertices except the root have
canReach = false.
  val initialGraph: Graph[Boolean, Double]  = graph.mapVertices((id, _) =>
id == sourceId)
  val dfs = initialGraph.pregel(false)(
    (id, canReach, newCanReach) => canReach || newCanReach, // Vertex
    triplet => {  // Send Message
      if (triplet.srcAttr && !triplet.dstAttr) {
        Iterator((triplet.dstId, true))
      } else {
    (a, b) => a || b // Merge Message


Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

To unsubscribe e-mail: [hidden email]