spark.lapply

spark.lapply

Junior Alvarez

Hi!

 

I’m using spark.lapply() in SparkR on a Mesos service, and I get the following crash at random. spark.lapply() is called around 150 times; sometimes it crashes after 16 calls, other times after 25 calls, and so on. The failure point is completely random, even though the data passed to the function is the same in all 150 calls:

 

18/09/26 07:30:42 INFO TaskSetManager: Finished task 129.0 in stage 78.0 (TID 1192) in 98 ms on 10.255.0.18 (executor 0) (121/143)
18/09/26 07:30:42 WARN TaskSetManager: Lost task 128.0 in stage 78.0 (TID 1191, 10.255.0.18, executor 0): org.apache.spark.SparkException: R computation failed with
 7f327f4dd000-7f327f500000 r-xp 00000000 08:11 174916727                  /lib/x86_64-linux-gnu/ld-2.19.so
7f327f51c000-7f327f6f2000 rw-p 00000000 00:00 0 
7f327f6fc000-7f327f6fd000 rw-p 00000000 00:00 0 
7f327f6fd000-7f327f6ff000 rw-p 00000000 00:00 0 
7f327f6ff000-7f327f700000 r--p 00022000 08:11 174916727                  /lib/x86_64-linux-gnu/ld-2.19.so
7f327f700000-7f327f701000 rw-p 00023000 08:11 174916727                  /lib/x86_64-linux-gnu/ld-2.19.so
7f327f701000-7f327f702000 rw-p 00000000 00:00 0 
7fff6070f000-7fff60767000 rw-p 00000000 00:00 0                          [stack]
7fff6077f000-7fff60781000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
*** buffer overflow detected ***: /usr/local/lib/R/bin/exec/R terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x7329f)[0x7f327db9529f]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x5c)[0x7f327dc3087c]
/lib/x86_64-linux-gnu/libc.so.6(+0x10d750)[0x7f327dc2f750]

 

Of course, if I use the native R lapply(), everything works fine.

 

I wonder whether this is a known issue, and whether there is a way to avoid it when using SparkR.
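
The call pattern described above might look roughly like the sketch below. It is not the code from this post: the application name, the input list, the worker function, and the loop bound are placeholders chosen for illustration. The 143-element input is only an assumption made to mirror the (121/143) task count in the log, and the session is assumed to pick up its master and Mesos settings from the environment.

```r
library(SparkR)

# Assumption: master/Mesos settings come from the environment or spark-defaults;
# the post does not show how the session is created.
sparkR.session(appName = "spark-lapply-repro")

# Placeholder input: 143 elements, chosen to mirror the task count in the log.
inputs <- as.list(seq_len(143))

# Placeholder worker; the real function is not shown in the post.
worker <- function(x) {
  x * 2
}

# Call spark.lapply() repeatedly on identical input (~150 calls, as described).
for (i in seq_len(150)) {
  distributed <- spark.lapply(inputs, worker)
  cat(sprintf("call %d returned %d results\n", i, length(distributed)))
}

# Local baseline with native lapply(), which the post reports as working.
local_result <- lapply(inputs, worker)

sparkR.session.stop()
```

Printing the call index on each iteration makes it easy to see after how many calls a given run fails, which is the figure reported above as varying between runs.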

 

Br

/Junior

 


Re: spark.lapply

Felix Cheung
It looks like the native R process was terminated because of a buffer overflow. Do you know how much data is involved?

 

From: Junior Alvarez <[hidden email]>
Sent: Wednesday, September 26, 2018 7:33 AM
To: [hidden email]
Subject: spark.lapply
 



RE: spark.lapply

Junior Alvarez

Around 500 KB each time I call the function (~150 times).

 

From: Felix Cheung <[hidden email]>
Sent: 26 September 2018 14:57
To: Junior Alvarez <[hidden email]>; [hidden email]
Subject: Re: spark.lapply

 
