Text processing in Spark (Spark job gets stuck for several minutes)


Donni Khan
Hi,
I'm applying preprocessing methods to a large text dataset using Spark with Java. I wrote my own NLP pipeline as plain Java code and call it in the map function like this:

MyRDD.map(call NLP pipeline for each row)

I run the job on a cluster of 14 machines (32 cores and about 140 GB each). The job runs correctly and distributes the documents across the executors, but it gets stuck on the last task for several minutes.
I looked at the job details and found that most of the documents are processed across several executors, but one task gets stuck on a small number of documents. It looks like the task is waiting for something; after 10-20 minutes it continues processing the remaining documents and finishes.

I also tried several different configurations, but the behavior stays the same.
Any help?

Thanks,
Donni


Re: Text processing in Spark (Spark job gets stuck for several minutes)

Jörn Franke
Please provide the source code and any exceptions that appear in the executor and/or driver logs.
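
If the logs show no exceptions, per-document timing may help narrow down where the last task spends its 10-20 minutes. The following is only a minimal sketch in plain Java, not the poster's actual code: the class name TimedPipeline, the 'pipeline' function standing in for the custom NLP pipeline, and the threshold value are all hypothetical. It wraps a per-document function and reports every document whose processing takes at least a given time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;
import java.util.function.LongSupplier;

/**
 * Hypothetical helper: wraps a per-document function and records
 * every document whose processing time reaches a threshold.
 */
final class TimedPipeline implements Function<String, String> {
    private final Function<String, String> pipeline; // stand-in for the NLP pipeline
    private final long thresholdNanos;
    private final LongSupplier nanoClock;            // injectable clock, eases testing
    private final List<String> slowDocuments =
            Collections.synchronizedList(new ArrayList<>());

    TimedPipeline(Function<String, String> pipeline, long thresholdMillis) {
        this(pipeline, thresholdMillis, System::nanoTime);
    }

    TimedPipeline(Function<String, String> pipeline, long thresholdMillis,
                  LongSupplier nanoClock) {
        this.pipeline = pipeline;
        this.thresholdNanos = thresholdMillis * 1_000_000L;
        this.nanoClock = nanoClock;
    }

    @Override
    public String apply(String document) {
        long start = nanoClock.getAsLong();
        try {
            return pipeline.apply(document);
        } finally {
            // Runs even if the pipeline throws, so failing documents are timed too.
            long elapsedNanos = nanoClock.getAsLong() - start;
            if (elapsedNanos >= thresholdNanos) {
                String report = "slow document: length=" + document.length()
                        + ", elapsedMs=" + (elapsedNanos / 1_000_000L);
                slowDocuments.add(report);
                System.err.println(report);
            }
        }
    }

    /** Returns a copy of the reports collected so far. */
    List<String> slowDocuments() {
        return new ArrayList<>(slowDocuments);
    }
}
```

Inside a Spark job the wrapper would be called from the map function in place of the direct pipeline call. That assumes the wrapper and the wrapped pipeline are made serializable, or are created on the executors, which this sketch does not do. In a typical deployment the lines written to System.err should end up in the executor's stderr log, which is one of the logs asked for above. If a handful of very long or unusual documents dominate the reports, that would point at the pipeline code rather than at the cluster configuration.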


> On 26. Oct 2017, at 08:42, Donni Khan <[hidden email]> wrote:
>
> Hi,
> I'm applying preprocessing methods to a large text dataset using Spark with Java. I wrote my own NLP pipeline as plain Java code and call it in the map function like this:
>
> MyRDD.map(call NLP pipeline for each row)
>
> I run the job on a cluster of 14 machines (32 cores and about 140 GB each). The job runs correctly and distributes the documents across the executors, but it gets stuck on the last task for several minutes.
> I looked at the job details and found that most of the documents are processed across several executors, but one task gets stuck on a small number of documents. It looks like the task is waiting for something; after 10-20 minutes it continues processing the remaining documents and finishes.
>
> I also tried several different configurations, but the behavior stays the same.
> Any help?
>
> Thanks,
> Donni
>
>
