Problem in Spark Streaming

Problem in Spark Streaming

nilmish
I am running a Spark Streaming job that counts the top 10 hashtags over the last 5 minutes in a sliding window, querying every 1 second.

It takes under about 1.4 s (end-to-end delay) to answer most queries, but there are a few instances in between when it takes considerably longer (around 15 s), which also pushes up the response time of subsequent queries. I am not able to debug the reason for these spikes. The data rate is nearly constant, so the spikes are not due to a sudden increase in the data rate.

Also, is there any way to put a bound on the time taken by a particular query? For example, if a query takes more than, say, 2 s, it should be killed and the job should move on to the next query, so that one slow query does not affect future queries.
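For context, the windowed top-N computation described above can be sketched in plain Python (this only simulates the logic the Spark job computes; it is not Spark code, and all names are made up):

```python
from collections import Counter, deque

class SlidingTopN:
    """Count hashtags over a sliding time window and report the top N."""

    def __init__(self, window_s=300, top_n=10):
        self.window_s = window_s       # e.g. 300 s = 5-minute window
        self.top_n = top_n
        self.events = deque()          # (timestamp, hashtag) in arrival order
        self.counts = Counter()        # live counts inside the window

    def add(self, ts, hashtag):
        self.events.append((ts, hashtag))
        self.counts[hashtag] += 1

    def query(self, now):
        # Evict events that have fallen out of the window, mirroring
        # what a 5-minute window queried every second computes.
        while self.events and self.events[0][0] <= now - self.window_s:
            _, old = self.events.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]
        return self.counts.most_common(self.top_n)
```

In Spark Streaming the equivalent is typically a reduceByKeyAndWindow over the hashtag stream; each query here costs only the evicted events plus the distinct-tag sort.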

Thanks,
Nilesh

Re: Problem in Spark Streaming

Yingjun Wu
Hi Nilmish,

I am facing the same problem. I am wondering: how do you measure the latency?

Regards,
Yingjun

Re: Problem in Spark Streaming

nilmish
You can measure the latency from the logs. Search for phrases like "Total delay" in the driver logs; this denotes the total end-to-end delay for a particular query.
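As a sketch, those lines can be scraped from the driver log for plotting; the regex below is shaped after the "Total delay: ... s for time ... ms (execution: ... s)" lines quoted later in this thread:

```python
import re

# Matches Spark Streaming JobScheduler lines such as:
#   Total delay: 136.983 s for time 1402409331000 ms (execution: 1.711 s)
DELAY_RE = re.compile(
    r"Total delay: ([\d.]+) s for time (\d+) ms \(execution: ([\d.]+)\s*s\)")

def parse_delays(lines):
    """Return (batch_time_ms, total_delay_s, execution_s) for each match."""
    rows = []
    for line in lines:
        m = DELAY_RE.search(line)
        if m:
            rows.append((int(m.group(2)), float(m.group(1)), float(m.group(3))))
    return rows
```

Plotting total delay against batch time makes both the spikes and any steady growth (which means the job cannot keep up with the batch interval) easy to see.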

Re: Problem in Spark Streaming

BD
In reply to this post by nilmish
Hi Nilmish,

What's the data rate/node when you see the high latency? (It seems the latency keeps increasing.) Do you still see it if you lower the data rate or the frequency of the windowed query?

Re: Problem in Spark Streaming

nilmish
How can I measure the data rate per node?

I am feeding the data through the Kafka API, so I only know the total inflow rate, which remains almost constant. How can I figure out how much of the data is distributed to each node in my cluster?

The latency does not keep increasing indefinitely. It goes up for a moment and then drops back down to the normal level. I want to get rid of these spikes.

Re: Problem in Spark Streaming

BD
Oh, I meant the average data rate per node.

But when I want to know the input activity on each node (I use a custom receiver instead of Kafka), I usually search the logs for records like "BlockManagerInfo: Added input ... on [hostname:port] (size: xxx KB)" to get a sense of it.
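A rough way to turn those lines into a per-node input volume (the pattern is illustrative, shaped after the log line above; adjust it to what your Spark version actually emits):

```python
import re
from collections import defaultdict

# Shaped after "BlockManagerInfo: Added input-... on host:port (size: ... KB)".
BLOCK_RE = re.compile(r"Added input\S* .*?on ([\w.\-]+):\d+ \(size: ([\d.]+) (KB|MB)")

def bytes_per_host(lines):
    """Sum the received input block sizes per node from BlockManagerInfo lines."""
    totals = defaultdict(float)
    for line in lines:
        m = BLOCK_RE.search(line)
        if m:
            host, size, unit = m.group(1), float(m.group(2)), m.group(3)
            totals[host] += size * (1024 if unit == "KB" else 1024 * 1024)
    return dict(totals)
```

Dividing each total by the log's time span gives an approximate data rate per node, which is enough to spot a skewed distribution.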

I also see some spikes in latency as I posted earlier: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-achieve-reasonable-performance-on-Spark-Streaming-tp7262.html
In my case it's even worse: when the data rate is a little high, the spikes cause the latency to increase without bound even though the machines are underutilized. I can't explain it either, and I'm not sure whether the cause is the same as yours.

Re: Problem in Spark Streaming

Yingjun Wu
Hi, when I search for the keyword "Total delay" in the console log, the delay keeps increasing. I am not sure what this "total delay" means. For example, if I run a windowed word count with windowSize=10000 ms and slidingStep=2000 ms, is the delay measured from the 10th second?

A sample log is shown as follows:
Total delay: 136.983 s for time 1402409331000 ms (execution: 1.711s) <--what is execution time?
Finished TID 490 in 14 ms on .... (progress: 1/6) <--what is TID? and what is the progress?
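On the windowing question: with windowSize=10000 ms and slidingStep=2000 ms, a new windowed RDD is produced every 2 s, each covering the preceding 10 s, so batches are reported from the start but windows are only fully populated from the 10th second on. "Total delay" is per batch: the time from the batch's nominal timestamp until its processing finishes, i.e. scheduling delay plus the "execution" (processing) time in parentheses. (In the second line, TID is the task ID, and "progress: 1/6" means 1 of the 6 tasks in that task set has finished.) A plain-Python sketch of the window boundaries, not Spark API:

```python
def window_coverage(window_ms, slide_ms, n):
    """Coverage of the first n windowed batches for a DStream window of
    length window_ms sliding every slide_ms: a new windowed RDD appears
    every slide_ms, covering the preceding window_ms of data."""
    return [(max(0, i * slide_ms - window_ms), i * slide_ms)
            for i in range(1, n + 1)]
```

For window_coverage(10000, 2000, 6), the first five windows start at 0 (partially filled), and the first full 10 s window ends at t = 10000 ms.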

Re: Problem in Spark Streaming

Ashish Rangole

Have you considered the impact of garbage collection, and whether it coincides with your latency spikes? You can enable GC logging by changing the Spark configuration for your job.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Problem-in-Spark-Streaming-tp7310p7329.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Problem in Spark Streaming

nilmish

I used these flags to print the GC timings: -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps

Following is the output I got on the standard output :

4.092: [GC 4.092: [ParNew: 274752K->27199K(309056K), 0.0421460 secs] 274752K->27199K(995776K), 0.0422720 secs] [Times: user=0.33 sys=0.11, real=0.04 secs]

16.630: [GC 16.630: [ParNew: 301951K->17854K(309056K), 0.0686940 secs] 301951K->23624K(995776K), 0.0689110 secs] [Times: user=0.36 sys=0.05, real=0.07 secs]
 
32.440: [GC 32.441: [ParNew: 292606K->14985K(309056K), 0.0206040 secs] 298376K->20755K(995776K), 0.0208320 secs] [Times: user=0.20 sys=0.00, real=0.02 secs]
 
42.626: [GC 42.626: [ParNew: 289737K->15467K(309056K), 0.0138100 secs] 295507K->21237K(995776K), 0.0139830 secs] [Times: user=0.10 sys=0.00, real=0.01 secs]
 
56.633: [GC 56.633: [ParNew: 290219K->17334K(309056K), 0.0170930 secs] 295989K->23105K(995776K), 0.0173130 secs] [Times: user=0.12 sys=0.01, real=0.02 secs]

Can anyone help me understand these GC messages?
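Reading the first line: the ParNew entries are minor (young-generation) collections. At 4.092 s of JVM uptime, the young generation went from 274752K used to 27199K (capacity 309056K) during a 0.042 s pause, and the whole heap went from 274752K to 27199K (capacity 995776K); "real" is the wall-clock time. All these pauses are well under 100 ms, so on their own they are unlikely to explain 15 s spikes; a long full GC would be the thing to look for. A small parser for these lines, as a sketch (the field names are mine):

```python
import re

# Parses HotSpot ParNew lines such as:
#   4.092: [GC 4.092: [ParNew: 274752K->27199K(309056K), 0.0421460 secs]
#   274752K->27199K(995776K), 0.0422720 secs] [Times: ... real=0.04 secs]
GC_RE = re.compile(
    r"^([\d.]+): \[GC .*?\[ParNew: (\d+)K->(\d+)K\((\d+)K\).*?\] "
    r"(\d+)K->(\d+)K\((\d+)K\), ([\d.]+) secs\].*real=([\d.]+)")

def parse_gc(line):
    m = GC_RE.search(line)
    if not m:
        return None
    return {
        "uptime_s": float(m.group(1)),        # JVM uptime when the GC ran
        "young_before_k": int(m.group(2)),    # young gen used before collection
        "young_after_k": int(m.group(3)),     # young gen used after
        "young_capacity_k": int(m.group(4)),  # young gen capacity
        "heap_before_k": int(m.group(5)),     # whole heap used before
        "heap_after_k": int(m.group(6)),      # whole heap used after
        "heap_capacity_k": int(m.group(7)),   # whole heap capacity
        "pause_s": float(m.group(8)),         # stop-the-world pause length
        "real_s": float(m.group(9)),          # wall-clock time from [Times: ...]
    }
```

Summing pause_s over the window of a latency spike shows quickly whether GC can account for it.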

Re: Problem in Spark Streaming

vinay Bajaj


On Wed, Jun 11, 2014 at 12:53 PM, nilmish <[hidden email]> wrote:

I used these commands to show the GC timings : -verbose:gc
-XX:-PrintGCDetails -XX:+PrintGCTimeStamps

Following is the output I got on the standard output :

4.092: [GC 4.092: [ParNew: 274752K->27199K(309056K), 0.0421460 secs]
274752K->27199K(995776K), 0.0422720 secs] [Times: user=0.33 sys=0.11,
real=0.04 secs]

16.630: [GC 16.630: [ParNew: 301951K->17854K(309056K), 0.0686940 secs]
301951K->23624K(995776K), 0.0689110 secs] [Times: user=0.36 sys=0.05,
real=0.07 secs]

32.440: [GC 32.441: [ParNew: 292606K->14985K(309056K), 0.0206040 secs]
298376K->20755K(995776K), 0.0208320 secs] [Times: user=0.20 sys=0.00,
real=0.02 secs]

42.626: [GC 42.626: [ParNew: 289737K->15467K(309056K), 0.0138100 secs]
295507K->21237K(995776K), 0.0139830 secs] [Times: user=0.10 sys=0.00,
real=0.01 secs]

56.633: [GC 56.633: [ParNew: 290219K->17334K(309056K), 0.0170930 secs]
295989K->23105K(995776K), 0.0173130 secs] [Times: user=0.12 sys=0.01,
real=0.02 secs]

Can anyone help me to understand these messgaes related to GC ?



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Problem-in-Spark-Streaming-tp7310p7384.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.