How more than one spark job can write to same partition in the parquet file


Chetan Khatri
Hi Spark Users,
Is it possible for two concurrent Spark jobs, each with its own Spark session, to write to the same partition of a Parquet dataset?

Thanks

Re: How more than one spark job can write to same partition in the parquet file

ayan guha
No, we faced problems with that setup.

--
Best Regards,
Ayan Guha

Re: How more than one spark job can write to same partition in the parquet file

Chetan Khatri
Thanks. If you can share an alternative design, I would love to hear about it.


Re: How more than one spark job can write to same partition in the parquet file

ayan guha
We partitioned the data logically between the 2 jobs so that each one writes its own partitions. In our use case the split is based on geography.
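[Editor's note] A minimal sketch of that kind of split, assuming a PySpark DataFrame with a string column named `region`. The column name, the job names, and the `JOB_REGIONS` map are hypothetical and not from this thread. Each job writes only into partition directories that it owns, so the two jobs never target the same output directory.

```python
from posixpath import join as path_join

# Hypothetical ownership map: each job is the only writer for its regions.
JOB_REGIONS = {
    "job_emea": ["de", "fr"],
    "job_apac": ["in", "au"],
}


def check_disjoint(job_regions):
    """Return {region: job}; raise if a region is assigned to two jobs."""
    owner = {}
    for job, regions in job_regions.items():
        for region in regions:
            if region in owner:
                raise ValueError(
                    f"region {region!r} is owned by both "
                    f"{owner[region]!r} and {job!r}"
                )
            owner[region] = job
    return owner


def partition_dir(base_path, region):
    """Hive-style partition directory for one region, e.g. base/region=de."""
    return path_join(base_path, f"region={region}")


def write_regions(df, job, base_path, job_regions=JOB_REGIONS):
    """Write each region owned by `job` into its own partition directory.

    `df` is assumed to be a Spark DataFrame with a string column `region`.
    """
    check_disjoint(job_regions)
    for region in job_regions[job]:
        (
            df.filter(df["region"] == region)
            .drop("region")  # the value is already encoded in the directory name
            .write.mode("overwrite")
            .parquet(partition_dir(base_path, region))
        )
```

Each job would call `write_regions(df, "job_emea", base_path)` or `write_regions(df, "job_apac", base_path)` from its own session. Reading the base path back should recover `region` through partition discovery, since the directories follow the `region=<value>` naming.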


Re: How more than one spark job can write to same partition in the parquet file

Iqbal Singh
Hey Chetan,

I did not fully understand your question. Are you trying to write to a partition from two actions, or from two separate jobs? Apart from having to maintain state to track the completeness of the dataset in that case, I don't see any issues.

We write data to a partition using two different actions in a single Spark job. Note that "partition" here means an HDFS directory, not a Hive partition.
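[Editor's note] One way to track completeness when several writers share a directory is a per-writer marker file. The sketch below is an assumption, not the setup described in this thread: it uses the local filesystem through `pathlib` as a stand-in for HDFS, and the `_DONE_` naming and the writer ids are hypothetical. The leading underscore is chosen because Spark's file-based readers generally skip files whose names start with `_`, as they do with `_SUCCESS`.

```python
from pathlib import Path


def marker_name(writer_id):
    """Marker file name for one writer; the underscore keeps it out of reads."""
    return f"_DONE_{writer_id}"


def mark_done(partition_dir, writer_id):
    """Record that `writer_id` has finished writing into `partition_dir`."""
    path = Path(partition_dir) / marker_name(writer_id)
    path.touch()
    return path


def is_complete(partition_dir, expected_writers):
    """True only when every expected writer has left its marker."""
    directory = Path(partition_dir)
    return all(
        (directory / marker_name(writer)).exists()
        for writer in expected_writers
    )
```

Each action or job would call `mark_done` after its write succeeds, and downstream readers would check `is_complete` before consuming the directory.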


