[Spark Structured Streaming on K8S]: Debug - File handles/descriptor (unix pipe) leaking

abhisheyke
Hello All!
I am using Spark 2.3.1 on Kubernetes to run a Structured Streaming job that reads a stream from Kafka, performs some window aggregation, and sinks the output back to Kafka.
After the job has been running for a few hours (5-6 hours), the executor pods crash with "Too many open files in system".
Digging in further with the "lsof" command, I can see that a lot of UNIX pipes are being opened.

# lsof -p 14 | tail 
java     14 root *112u  a_inode               0,10        0      8838 [eventpoll]
java     14 root *113r     FIFO                0,9      0t0 252556158 pipe
java     14 root *114w     FIFO                0,9      0t0 252556158 pipe
java     14 root *115u  a_inode               0,10        0      8838 [eventpoll]
java     14 root *119r     FIFO                0,9      0t0 252552868 pipe
java     14 root *120w     FIFO                0,9      0t0 252552868 pipe
java     14 root *121u  a_inode               0,10        0      8838 [eventpoll]
java     14 root *131r     FIFO                0,9      0t0 252561014 pipe
java     14 root *132w     FIFO                0,9      0t0 252561014 pipe
java     14 root *133u  a_inode               0,10        0      8838 [eventpoll]

The total count of open fds climbs to 85K (the increased hard ulimit) for each pod, and once it hits the hard limit, the executor pod crashes.
I can see that shuffling needs more fds, but in my case the open fd count keeps growing forever. I am not sure how to estimate how many fds would be adequate, or whether this is a bug.
Given that uncertainty, I increased the hard ulimit to a large number (85k), but no luck.
It seems like there is a file descriptor leak.
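For what it's worth, on Java 8/Linux each java.nio Selector shows up in lsof as one [eventpoll] entry plus a pipe read/write pair, which matches the pattern above (Spark's NIO/Netty networking layers use such selectors under the hood). A tiny standalone reproducer, purely illustrative and not the actual Spark code, that produces the same lsof signature when selectors are opened but never closed:

import java.nio.channels.Selector

object SelectorLeakDemo {
  def main(args: Array[String]): Unit = {
    // Each Selector.open() on Java 8/Linux allocates one epoll instance plus a
    // wakeup pipe (read + write ends): three fds per call. Without a matching
    // close(), `lsof -p <pid>` shows eventpoll/pipe entries piling up exactly
    // like the listing above.
    val selectors = (1 to 1000).map(_ => Selector.open())
    Thread.sleep(10 * 60 * 1000)      // keep the JVM alive so lsof can inspect it
    selectors.foreach(_.close())      // close() releases all three fds per Selector
  }
}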
 
This Spark job runs with the native Kubernetes support as the Spark cluster manager. I am currently using only two executors, each with 20 cores (request) and 10 GB (+6 GB memoryOverhead) of physical memory.
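For context, a rough sketch of that resource configuration expressed as SparkSession options; the property names are the Spark 2.3.x Kubernetes ones as I understand them, and in practice they would normally be passed as --conf flags to spark-submit:

import org.apache.spark.sql.SparkSession

// Sketch only: the values simply mirror the numbers in this post.
val spark = SparkSession.builder()
  .appName("kafka-window-agg")
  .config("spark.executor.instances", "2")
  .config("spark.executor.cores", "20")
  .config("spark.executor.memory", "10g")
  .config("spark.kubernetes.executor.memoryOverhead", "6144")  // assumed 6 GB overhead, in MiB; renamed to spark.executor.memoryOverhead in Spark 2.4+
  .getOrCreate()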

Has anyone else seen a similar problem?
Thanks for any suggestions.


Error details:  
Caused by: java.io.FileNotFoundException: /tmp/blockmgr-b530983c-39be-4c6d-95aa-3ad12a507843/24/temp_shuffle_bf774cf5-fadb-4ca7-a27b-a5be7b835eb6 (Too many open files in system)
at java.io.FileOutputStream.open0(Native Method)
at java.io.FileOutputStream.open(FileOutputStream.java:270)
at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
at org.apache.spark.storage.DiskBlockObjectWriter.initialize(DiskBlockObjectWriter.scala:103)
at org.apache.spark.storage.DiskBlockObjectWriter.open(DiskBlockObjectWriter.scala:116)
at org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:237)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:151)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.Task.run(Task.scala:109)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)

For more of the error log and some details about the file descriptors (lsof), please see the GitHub gist below:
https://gist.github.com/abhisheyke/27b073e7ac805ce9e6bb33c2b011bb5a

Code snippet:
https://gist.github.com/abhisheyke/6f838adf6651491bd4f263956f403c74
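For anyone who cannot open the gist, here is a minimal sketch of the kind of pipeline described above (Kafka source, windowed aggregation, Kafka sink with HDFS checkpointing); the topic names, schema, and window/watermark durations are illustrative assumptions, not the actual job:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, from_json, struct, to_json, window}
import org.apache.spark.sql.types.{StringType, StructType, TimestampType}

object KafkaWindowAgg {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("kafka-window-agg").getOrCreate()

    // Assumed input schema; the real schema is in the gist above.
    val schema = new StructType()
      .add("key", StringType)
      .add("eventTime", TimestampType)

    val input = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "kafka:9092")   // placeholder
      .option("subscribe", "input-topic")                // placeholder
      .load()
      .select(from_json(col("value").cast("string"), schema).as("data"))
      .select("data.*")

    // Windowed aggregation with a watermark so old state can be dropped.
    val counts = input
      .withWatermark("eventTime", "10 minutes")
      .groupBy(window(col("eventTime"), "5 minutes"), col("key"))
      .count()

    // Serialize each row to JSON and sink it back to Kafka,
    // checkpointing to HDFS as in the platform details below.
    counts
      .select(to_json(struct(col("*"))).as("value"))
      .writeStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "kafka:9092")                          // placeholder
      .option("topic", "output-topic")                                          // placeholder
      .option("checkpointLocation", "hdfs:///tmp/checkpoints/kafka-window-agg") // placeholder
      .outputMode("update")
      .start()
      .awaitTermination()
  }
}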

Platform details:
Kubernetes version: 1.9.2
Docker: 17.3.2
Spark version: 2.3.1
Kafka version: 2.11-0.10.2.1 (both topics have 20 partitions and receive almost 5k records/s)
Hadoop version (using HDFS for checkpointing): 2.7.2

Thank you for any help

Best Regards,
Abhishek Tripathi


Re: [Spark Structured Streaming on K8S]: Debug - File handles/descriptor (unix pipe) leaking

abhisheyke
Hello Dev!
A Spark Structured Streaming job with a simple window aggregation is leaking file descriptors in a setup that uses Kubernetes as the cluster manager. It seems like a bug.
I am using HDFS as the filesystem for checkpointing.
Has anyone observed the same? Thanks for any help.

Please find more details in my original message above.


For more of the error log and some details about the file descriptors (lsof), please see the GitHub gist below:
https://gist.github.com/abhisheyke/27b073e7ac805ce9e6bb33c2b011bb5a

Code snippet:
https://gist.github.com/abhisheyke/6f838adf6651491bd4f263956f403c74
Thanks.
 
Best Regards,
Abhishek Tripathi




Re: [Spark Structured Streaming on K8S]: Debug - File handles/descriptor (unix pipe) leaking

Yuval.Itzchakov
We're experiencing the exact same issue while running load tests on Spark
2.3.1 with Structured Streaming and `mapGroupsWithState`.
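
For reference, a minimal sketch of the kind of `mapGroupsWithState` pipeline involved; the state type, timeout policy, and console sink here are illustrative assumptions, not the actual load-test code:

import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout, OutputMode}

object MapGroupsWithStateSketch {
  // Illustrative state type; the real job's state will differ.
  case class RunningCount(count: Long)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("mgws-sketch").getOrCreate()
    import spark.implicits._

    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "kafka:9092")  // placeholder
      .option("subscribe", "input-topic")               // placeholder
      .load()
      .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value")

    // Keep a running count per key in state, as a stand-in for the real logic.
    val counts = events
      .groupByKey(row => Option(row.getString(0)).getOrElse(""))
      .mapGroupsWithState(GroupStateTimeout.ProcessingTimeTimeout) {
        (key: String, rows: Iterator[Row], state: GroupState[RunningCount]) =>
          val previous = state.getOption.map(_.count).getOrElse(0L)
          val updated = RunningCount(previous + rows.size)
          state.update(updated)
          state.setTimeoutDuration("30 minutes")
          (key, updated.count)
      }

    // Console sink for the sketch; a real job would use a Kafka sink and an
    // HDFS checkpointLocation, as in the original post.
    counts.writeStream
      .outputMode(OutputMode.Update())
      .format("console")
      .start()
      .awaitTermination()
  }
}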


