Spark stage stuck



Manjunath Shetty H

I am running multiple jobs in the same driver with FAIR scheduling enabled. Intermittently, one of the stages gets stuck and does not complete even after a long time.

Each job's flow is something like this:
  • Create a JDBC RDD to load data from SQL Server
  • Create a temporary table
  • Query the temp table for a specific set of columns
  • Persist the DF
  • Write the DF to HDFS in ORC format
  • ...
Because writing the ORC is the first action, the job shows as stuck at the ORC write. Is there any way to debug this problem? Any pointers would be helpful.
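
For reference, the job flow above can be sketched roughly as follows in PySpark. This is only an illustration under assumptions, not the actual job code: it uses the DataFrame JDBC reader rather than a JDBC RDD, and the function name run_job, its parameters, the view name, and the scheduler pool name are all hypothetical. The SparkSession is passed in, so the function itself needs no imports.

```python
def run_job(spark, jdbc_url, source_table, view_name, columns, out_path, pool_name):
    """One job: JDBC read -> temp view -> column query -> persist -> ORC write."""
    # Assumption: each job thread is assigned to a FAIR scheduler pool.
    spark.sparkContext.setLocalProperty("spark.scheduler.pool", pool_name)

    # Transformation only: rows are not read from SQL Server at this point.
    source_df = (
        spark.read.format("jdbc")
        .option("url", jdbc_url)
        .option("dbtable", source_table)
        .load()
    )

    # Register the temporary table and query a specific set of columns.
    source_df.createOrReplaceTempView(view_name)
    selected_df = spark.sql(f"SELECT {', '.join(columns)} FROM {view_name}")

    # persist() only marks the DF for caching; nothing is computed yet.
    persisted_df = selected_df.persist()

    # First action: the JDBC read, the query, and the caching all execute here,
    # so a slow or hung step upstream is reported under the ORC write.
    persisted_df.write.orc(out_path)
    return persisted_df
```

Since everything before the write is lazy, the stage that looks stuck at the ORC write also includes the JDBC read from SQL Server and the caching of the DF.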