SPIP: DataFrame-based Property Graphs, Cypher Queries, and Algorithms


Xiangrui Meng
Hi all,

I want to re-send the previous SPIP on introducing a DataFrame-based graph component to collect more feedback. It supports property graphs, Cypher graph queries, and graph algorithms built on top of the DataFrame API. If you are a GraphX user or your workload is essentially graph queries, please help review and check how it fits into your use cases. Your feedback would be greatly appreciated!

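# Links to SPIP and design sketch:

* Jira issue for the SPIP: https://issues.apache.org/jira/browse/SPARK-25994
* Google Doc: https://docs.google.com/document/d/1ljqVsAh2wxTZS8XqwDQgRT6i_mania3ffYSYpEgLx9k/edit?usp=sharing
* Jira issue for a first design sketch: https://issues.apache.org/jira/browse/SPARK-26028
* Google Doc: https://docs.google.com/document/d/1Wxzghj0PvpOVu7XD1iA8uonRYhexwn18utdcTxtkxlI/edit?usp=sharing
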
# Sample code:

~~~
val graph = ...

// query
val result = graph.cypher("""
  MATCH (p:Person)-[r:STUDY_AT]->(u:University)
  RETURN p.name, r.since, u.name
""")

// algorithms
val ranks = graph.pageRank.run()
~~~

Best,
Xiangrui

Re: SPIP: DataFrame-based Property Graphs, Cypher Queries, and Algorithms

clarrob
Hi Xiangrui

+1.

I've been working with data and analytics technologies in the finance
industry for many years, and I think that getting a well-established graph
query language like Cypher to operate over SparkSQL-conformant property
graphs would be relevant for lots of use cases where people want to use a
graph approach. I've seen interest in using graphs to track and catalog
metadata and data flows, and also for business cases like suspicious
transaction analysis or fraud-ring detection.

Regards,

Robert




--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: [hidden email]


Re: SPIP: DataFrame-based Property Graphs, Cypher Queries, and Algorithms

dusanz
In reply to this post by Xiangrui Meng
I support this proposal - great idea, something that's been missing in the Spark
world. I'm a data architect working primarily in banking, with many years of
designing and tuning relational database systems and, more recently, Big
Data solutions, often including integration of old and new technologies. The
graph database model is becoming more and more recognised and present in the
world of finance. The idea of being able to take a property graph view over
dataframes and run graph queries makes a lot of sense from the integration
point of view as we want to use graph databases/services alongside existing
investments in the Spark ecosystem (typically deployed on Hadoop clusters,
typically implementing relational stuff). I can see use cases
(relational-meets-graph) in my world, specifically for
completeness/availability calculation dependency graphs, metadata and data
management in the space where enterprise architecture meets BCBS239
(taxonomy, provenance, lineage), plus of course unauthorised trading, fraud
detection, all that. An additional bonus here is that Cypher seems like a
good choice in light of its spread beyond Neo and its contribution to the
future official ISO standard Graph Query Language.




Re: SPIP: DataFrame-based Property Graphs, Cypher Queries, and Algorithms

木内満歳
In reply to this post by Xiangrui Meng
I support the proposal. I assist various companies as a Neo4j system integrator. Several Japanese telecommunications companies use it to quickly grasp the state of their network topology. In addition, many advertising companies link enormous amounts of metadata and use it for marketing with Neo4j and Cypher queries. These activities benefit from Cypher's flexible search and extraction mechanisms. Many manufacturers in Japan are also interested in the graph database model. We believe that Cypher query support in Apache Spark can give Japanese graph data users a more convenient path to distributed processing.
In Japan, communities that love Neo4j and Cypher are already active and communicate frequently ( https://jp-neo4j-usersgroup.connpass.com/ ). With Cypher query support in Apache Spark, they will be encouraged and will come to love Apache Spark. We are convinced that the Apache Spark developer community will expand further.

Regards,

--
Mitsutoshi Kiuchi



Re: SPIP: DataFrame-based Property Graphs, Cypher Queries, and Algorithms

Gourav Sengupta
In reply to this post by Xiangrui Meng
Hi,

This is fantastic, and it will be great to have this. Another place where we could use graph frames is data lineage. You will see 100% adoption of graph frames if we can send data from Catalyst to be stored somewhere as graphs of dependencies.

If you are including data lineage as well, please let me know; I would love to be part of the testing.

Regards,
Gourav Sengupta


Re: SPIP: DataFrame-based Property Graphs, Cypher Queries, and Algorithms

Andrea Santurbano
In reply to this post by Xiangrui Meng
+1

Graph analytics is now mainstream, and first-class Cypher support in Spark would let users work with highly connected datasets (fraud detection, epidemiology analysis, genomic analysis, and so on), going beyond the limits of joins when a dataset must be traversed.

On 2019/01/15 16:52:44, Xiangrui Meng <[hidden email]> wrote:
> Hi all,
>
> I want to re-send the previous SPIP on introducing a DataFrame-based graph
> component to collect more feedback. It supports property graphs, Cypher
> graph queries, and graph algorithms built on top of the DataFrame API. If
> you are a GraphX user or your workload is essentially graph queries, please
> help review and check how it fits into your use cases. Your feedback would
> be greatly appreciated!
>
> # Links to SPIP and design sketch:
>
> * Jira issue for the SPIP: https://issues.apache.org/jira/browse/SPARK-25994
> * Google Doc: https://docs.google.com/document/d/1ljqVsAh2wxTZS8XqwDQgRT6i_mania3ffYSYpEgLx9k/edit?usp=sharing
> * Jira issue for a first design sketch: https://issues.apache.org/jira/browse/SPARK-26028
> * Google Doc: https://docs.google.com/document/d/1Wxzghj0PvpOVu7XD1iA8uonRYhexwn18utdcTxtkxlI/edit?usp=sharing
>
> # Sample code:
>
> ~~~
> val graph = ...
>
> // query
> val result = graph.cypher("""
>   MATCH (p:Person)-[r:STUDY_AT]->(u:University)
>   RETURN p.name, r.since, u.name
> """)
>
> // algorithms
> val ranks = graph.pageRank.run()
> ~~~
>
> Best,
> Xiangrui

Re: SPIP: DataFrame-based Property Graphs, Cypher Queries, and Algorithms

Alistair Blair
In reply to this post by Xiangrui Meng
Hi Xiangrui

+1

It would be fantastic to see this functionality.

Regards

Alistair.



Re: SPIP: DataFrame-based Property Graphs, Cypher Queries, and Algorithms

HJSC
In reply to this post by Xiangrui Meng
Hi,
I am a software engineer who works with historical FAA and pilot data, and I
think that Cypher will be a good addition to the Spark ecosystem.

I support this proposal and am looking forward to giving it a try.

Regards,
HJ



