Incrementally add/remove vertices in GraphX

Incrementally add/remove vertices in GraphX

Deepak Nulu
Hi,

Is there a way to incrementally add/remove vertices in GraphX? I have read the documentation and looked at the API, but I don't see a way to incrementally add/remove vertices in GraphX.

Thanks.

-deepak

Re: Incrementally add/remove vertices in GraphX

Matei Zaharia
Administrator
Right now there isn’t. It’s meant for analysis once you have a graph. If you just need a few vertices at the beginning you could add them to the vertex and edge RDDs using RDD.union() before creating a Graph.

Matei
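
A minimal sketch of the RDD.union() suggestion, assuming a SparkContext is available and using made-up data. The object name UnionBeforeGraph, the method name build, and all vertex IDs and attributes are hypothetical and do not come from the thread:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Hypothetical example object; all IDs and attributes below are made up.
object UnionBeforeGraph {
  def build(sc: SparkContext): Graph[String, Int] = {
    // The vertices and edges you already have.
    val baseVertices: RDD[(VertexId, String)] =
      sc.parallelize(Seq((1L, "a"), (2L, "b")))
    val baseEdges: RDD[Edge[Int]] =
      sc.parallelize(Seq(Edge(1L, 2L, 1)))

    // The few extra vertices and edges to add up front.
    val extraVertices: RDD[(VertexId, String)] =
      sc.parallelize(Seq((3L, "c")))
    val extraEdges: RDD[Edge[Int]] =
      sc.parallelize(Seq(Edge(2L, 3L, 1)))

    // Union the plain RDDs first, then build the graph once.
    Graph(baseVertices.union(extraVertices), baseEdges.union(extraEdges))
  }
}
```

The point of the ordering is that the graph is constructed only once, from RDDs that already contain everything.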

On Mar 2, 2014, at 2:38 PM, Deepak Nulu <[hidden email]> wrote:

> Hi,
>
> Is there a way to incrementally add/remove vertices in GraphX? I have read
> the documentation and looked at the API, but I don't see a way to
> incrementally add/remove vertices in GraphX.
>
> Thanks.
>
> -deepak

Re: Incrementally add/remove vertices in GraphX

Deepak Nulu
Hi Matei,

Thanks for the quick response. Is there a plan to support this? Any ticket I can follow? I don't see a GraphX component at https://spark-project.atlassian.net; is there a different bug database for GraphX?

Thanks.

-deepak

Re: Incrementally add/remove vertices in GraphX

Matei Zaharia
Administrator
You can create a ticket, but note that real-time updates to the graph are outside the scope of GraphX right now. It’s meant to be a graph analysis system, not a graph storage system. I’ve added it as a component on https://spark-project.atlassian.net/browse/SPARK.

Matei

On Mar 2, 2014, at 3:32 PM, Deepak Nulu <[hidden email]> wrote:

> Hi Matei,
>
> Thanks for the quick response. Is there a plan to support this? Any ticket I
> can follow? I don't see a GraphX component at
> https://spark-project.atlassian.net; is there a different bug database for
> GraphX?
>
> Thanks.
>
> -deepak

Re: Incrementally add/remove vertices in GraphX

Nick Chammas
Quick side-note on that page, Matei: Several versions up to and including 0.9.0 are still marked as "unreleased" in JIRA. Dunno if that's intentional (or if it matters any).


On Sun, Mar 2, 2014 at 7:52 PM, Matei Zaharia <[hidden email]> wrote:
You can create a ticket, but note that real-time updates to the graph are outside the scope of GraphX right now. It’s meant to be a graph analysis system, not a graph storage system. I’ve added it as a component on https://spark-project.atlassian.net/browse/SPARK.

Matei


Re: Incrementally add/remove vertices in GraphX

Matei Zaharia
Administrator
Good catch, I’ve fixed those.

On Mar 2, 2014, at 5:25 PM, Nicholas Chammas <[hidden email]> wrote:

Quick side-note on that page, Matei: Several versions up to and including 0.9.0 are still marked as "unreleased" in JIRA. Dunno if that's intentional (or if it matters any).



Re: Incrementally add/remove vertices in GraphX

psnively@icloud.com
In reply to this post by Deepak Nulu
Does this suggest value in an integration of GraphX and neo4j?


Re: Incrementally add/remove vertices in GraphX

Aditya Varun Chadha

Or Titan on HBase, so you could try reading graphs directly via custom I/O formats.


Re: Incrementally add/remove vertices in GraphX

alelulli
In reply to this post by Matei Zaharia
Hi Matei,

Could you please clarify why I must call union before creating the graph?

What is the behavior if I call union / subtract after the graph has been created?
Are the added / removed vertices processed?

For example, if I'm implementing an iterative algorithm and at the 5th step I need to add some vertices / edges, can I call union / subtract on the VertexRDD, EdgeRDD, and triplets?

Thanks
Alessandro

Re: Incrementally add/remove vertices in GraphX

alelulli
Hi All,

Is somebody looking into this?
I think this is related to the discussion "Are there any plans to develop Graphx Streaming?".

Using union / subtract on a VertexRDD or EdgeRDD creates a new RDD but does NOT modify the RDD in the graph.
Is creating a new graph the only way to add / remove a vertex or edge?

Thanks
Alessandro



Re: Incrementally add/remove vertices in GraphX

Adam Novak
I would assume that, regardless of the efficiency of such an operation, any method of adding or removing vertices would need to result in a new graph, since graphs in GraphX are supposed to be immutable.

It sounds like what you probably want is an efficient union/subtract/whatever that operates on graphs, returning a graph modified to include your new vertices or edges, or to remove the ones you wanted to throw out. Because of the fancy indexing going on inside VertexRDD, I am not sure how easy it would be to do this on vertices in an efficient way, without having to rebuild the whole index.

Is the index used in a VertexRDD able to efficiently accommodate insertions?

-Adam
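
The immutability point can be illustrated with a small sketch, assuming a SparkContext is available. The object name ImmutabilityDemo, the method name countsAfterUnion, and the data are hypothetical. It calls union() on graph.vertices and then compares the vertex count of the original graph with the size of the unioned RDD:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Hypothetical example object; all IDs and attributes below are made up.
object ImmutabilityDemo {
  // Returns (vertex count of the original graph, size of the unioned RDD).
  def countsAfterUnion(sc: SparkContext): (Long, Long) = {
    val vertices: RDD[(VertexId, String)] =
      sc.parallelize(Seq((1L, "a"), (2L, "b")))
    val edges: RDD[Edge[Int]] =
      sc.parallelize(Seq(Edge(1L, 2L, 1)))
    val graph: Graph[String, Int] = Graph(vertices, edges)

    val extra: RDD[(VertexId, String)] = sc.parallelize(Seq((3L, "c")))

    // union() returns a new plain RDD; the graph itself is not touched.
    val unioned: RDD[(VertexId, String)] = graph.vertices.union(extra)

    (graph.numVertices, unioned.count())
  }
}
```

Under these assumptions the graph still reports its original two vertices, while the unioned RDD holds three elements; the new vertex only becomes part of a graph once a new Graph is built from it.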


On Mon, Mar 17, 2014 at 9:50 AM, Alessandro Lulli <[hidden email]> wrote:
Hi All,

Is somebody looking into this?
I think this is related to the discussion "Are there any plans to develop Graphx Streaming?".

Using union / subtract on a VertexRDD or EdgeRDD creates a new RDD but does NOT modify the RDD in the graph.
Is creating a new graph the only way to add / remove a vertex or edge?

Thanks
Alessandro



Re: Incrementally add/remove vertices in GraphX

Matei Zaharia
Administrator
In reply to this post by alelulli
I just meant that you call union() to create the RDDs that you pass to new Graph(). If you call it afterwards, it will just produce other RDDs rather than change the graph.

The Graph() constructor actually shuffles and “indexes” the data to make graph operations efficient, so it’s not easy to add elements afterwards. You could access graph.vertices and graph.edges to build new RDDs, and then call Graph() again to make a new graph. I’ve CCed Joey and Ankur to see if they have further ideas on how to optimize this. It would be cool to support more efficient union and subtraction of graphs once they’ve been partitioned by GraphX.

Matei
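
The "build new RDDs, then call Graph() again" route could look roughly like the sketch below. The object name RebuildGraph and the helper withAdded are hypothetical, not part of GraphX; only Graph(), graph.vertices, graph.edges, and RDD.union() are real API calls:

```scala
import scala.reflect.ClassTag
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Hypothetical helper, not part of GraphX.
object RebuildGraph {
  // Returns a NEW graph containing the old graph plus the given additions.
  // The original graph is left unchanged.
  def withAdded[VD: ClassTag, ED: ClassTag](
      graph: Graph[VD, ED],
      newVertices: RDD[(VertexId, VD)],
      newEdges: RDD[Edge[ED]]): Graph[VD, ED] = {
    val vertices: RDD[(VertexId, VD)] = graph.vertices.union(newVertices)
    val edges: RDD[Edge[ED]] = graph.edges.union(newEdges)
    // Graph() re-shuffles and re-indexes everything, so this is a full rebuild.
    // If a vertex ID occurs more than once, one of its attributes is picked
    // arbitrarily.
    Graph(vertices, edges)
  }
}
```

In an iterative algorithm this would mean reassigning the graph at the step where elements are added, for example `graph = RebuildGraph.withAdded(graph, moreVertices, moreEdges)`, and paying the re-indexing cost each time.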

On Mar 14, 2014, at 8:32 AM, alelulli <[hidden email]> wrote:

> Hi Matei,
>
> Could you please clarify why I must call union before creating the graph?
>
> What is the behavior if I call union / subtract after the graph has been created?
> Are the added / removed vertices processed?
>
> For example, if I'm implementing an iterative algorithm and at the 5th step I
> need to add some vertices / edges, can I call union / subtract on the
> VertexRDD, EdgeRDD, and triplets?
>
> Thanks
> Alessandro

Re: Incrementally add/remove vertices in GraphX

ankurdave
As Matei said, there's currently no support for incrementally adding vertices or edges to their respective partitions. Doing this efficiently would require extensive modifications to GraphX, so for now, the only options are to rebuild the indices on every graph modification, or to use the subgraph operator if the modification only involves removing vertices and edges.

However, Joey and I are working on GraphX streaming, which is currently in the very early stages but eventually will enable this.
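
For the removal-only case, the subgraph operator could be used along these lines. This is a sketch: the object name RemoveWithSubgraph and the helper removeVertices are hypothetical, while subgraph and its vpred parameter are part of the GraphX Graph API:

```scala
import scala.reflect.ClassTag
import org.apache.spark.graphx.{Graph, VertexId}

// Hypothetical helper, not part of GraphX.
object RemoveWithSubgraph {
  // Returns a new graph without the given vertices. Edges that touch a
  // removed vertex are dropped as well, because subgraph only keeps edges
  // whose two endpoints both satisfy the vertex predicate.
  def removeVertices[VD: ClassTag, ED: ClassTag](
      graph: Graph[VD, ED],
      toRemove: Set[VertexId]): Graph[VD, ED] =
    graph.subgraph(vpred = (id, _) => !toRemove.contains(id))
}
```

Edges alone can be filtered the same way by passing an epred predicate over edge triplets instead of vpred.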



On Tue, Mar 18, 2014 at 3:30 PM, Matei Zaharia <[hidden email]> wrote:
I just meant that you call union() to create the RDDs that you pass to new Graph(). If you call it afterwards, it will just produce other RDDs rather than change the graph.

The Graph() constructor actually shuffles and “indexes” the data to make graph operations efficient, so it’s not easy to add elements afterwards. You could access graph.vertices and graph.edges to build new RDDs, and then call Graph() again to make a new graph. I’ve CCed Joey and Ankur to see if they have further ideas on how to optimize this. It would be cool to support more efficient union and subtraction of graphs once they’ve been partitioned by GraphX.

Matei


Re: Incrementally add/remove vertices in GraphX

alelulli
Hi All,

Thanks for your answer.

Regarding GraphX streaming:
  • Is there an issue (or pull request) I can follow to keep track of progress?
  • Where can I find a description and details of what will be provided?

Thanks for your help and for taking the time to answer my questions.
Alessandro



On Wed, Mar 19, 2014 at 2:43 AM, Ankur Dave <[hidden email]> wrote:
As Matei said, there's currently no support for incrementally adding vertices or edges to their respective partitions. Doing this efficiently would require extensive modifications to GraphX, so for now, the only options are to rebuild the indices on every graph modification, or to use the subgraph operator if the modification only involves removing vertices and edges.

However, Joey and I are working on GraphX streaming, which is currently in the very early stages but eventually will enable this.




Re: Incrementally add/remove vertices in GraphX

vzaychik
In reply to this post by ankurdave
Any updates on GraphX Streaming? There was mention of this about a year ago, but nothing much since.
Thanks!

Re: Incrementally add/remove vertices in GraphX

mas
In reply to this post by alelulli
Dear All,

Is there any update regarding Graph Streaming? I want to update the graph, i.e., add vertices and edges after it has been created.

Any suggestions or recommendations on how to do that?

Thanks,