Find difference between two dataframes in spark structured streaming

act_coder
I am building a Spark Structured Streaming job in which I need to find the
difference between two DataFrames.

DataFrame 1 (D1):

[1, item1, value1]
[2, item2, value2]
[3, item3, value3]
[4, item4, value4]
[5, item5, value5]

DataFrame 2 (D2):

[4, item4, value4]
[5, item5, value5]

New DataFrame containing the difference D1 - D2:

[1, item1, value1]
[2, item2, value2]
[3, item3, value3]
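
To make the expected result concrete, here is a minimal sketch of the difference in plain Python. It is not Spark code and does not solve the streaming case; it only illustrates the row-level semantics being asked for. The function name `difference` and the tuple representation of rows are assumptions made for this illustration.

```python
from typing import Iterable, List, Tuple

# A row is modelled as a plain tuple (id, item, value); this is an
# illustrative stand-in for a DataFrame row, not a Spark type.
Row = Tuple[int, str, str]


def difference(left: Iterable[Row], right: Iterable[Row]) -> List[Row]:
    """Return the rows of `left` that do not appear in `right`.

    Order of `left` is preserved and duplicates in `left` are kept,
    which is closer to a left anti join on all columns than to
    except(), since except() also de-duplicates its result.
    """
    right_rows = set(right)
    return [row for row in left if row not in right_rows]


# The sample data from the question.
d1 = [(i, f"item{i}", f"value{i}") for i in range(1, 6)]
d2 = [(4, "item4", "value4"), (5, "item5", "value5")]

result = difference(d1, d2)
for row in result:
    print(row)
```

With the sample data, `result` holds the rows with ids 1, 2 and 3.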

I tried using except() and a left anti join, but neither is supported in
Spark Structured Streaming.

Is there a way to achieve this in Structured Streaming?


