shuffling using netty in spark streaming

shuffling using netty in spark streaming

BD
Hi,

1. Does Netty perform better than the basic method for shuffling? I found that the latency caused by shuffling in a streaming job is not stable with the basic method.

2. However, after I turned on Netty for shuffling, I can only see the results for the first two batches, and then there is no output at all. I'm not sure whether the way I turned on Netty is correct:

val conf = new SparkConf().set("spark.shuffle.use.netty", "true")

Thanks.

Boduo Li

RE: shuffling using netty in spark streaming

Shao, Saisai

Hi,

1. The performance depends on your hardware and system configuration, so it is best to test it yourself. In my tests, the two shuffle implementations show no significant performance difference in the latest version.

2. That is the correct way to turn on Netty-based shuffle. Note that Netty-based shuffle does not report shuffle-fetch metrics, so you may not see those metrics in the web UI.

Thanks

Jerry
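
For anyone who wants to try the "test it yourself" suggestion above, here is a minimal sketch of one way to compare the two shuffle paths. It is not taken from either poster's code. The object name ShuffleTimingSketch, the synthetic workload, and the partition count are hypothetical, and the master URL is assumed to be supplied by spark-submit on a real cluster (a local run would not exercise remote shuffle fetches). Only the "spark.shuffle.use.netty" key comes from this thread.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits, needed on Spark 1.x

// Hypothetical timing harness: name and workload are illustrative only.
object ShuffleTimingSketch {
  def main(args: Array[String]): Unit = {
    // Pass "true" or "false" as the first argument to toggle the setting.
    val useNetty = if (args.nonEmpty) args(0) else "false"

    // The setting must be applied before the context is created;
    // changing a SparkConf after that point has no effect on the running job.
    val conf = new SparkConf()
      .setAppName("shuffle-timing-sketch")
      .set("spark.shuffle.use.netty", useNetty)

    val sc = new SparkContext(conf)
    try {
      // Synthetic key/value data; reduceByKey forces a shuffle.
      val pairs = sc.parallelize(1 to 1000000, 8).map(i => (i % 1000, 1L))

      val start = System.nanoTime()
      val distinctKeys = pairs.reduceByKey(_ + _).count()
      val elapsedMs = (System.nanoTime() - start) / 1e6

      println(s"use.netty=$useNetty keys=$distinctKeys elapsed=$elapsedMs ms")
    } finally {
      sc.stop()
    }
  }
}
```

Running it several times with each value, on the same cluster and data size as the real job, should give a rough idea of whether the two settings differ in latency or stability. A single run is not a reliable measurement.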

 

From: onpoq l [mailto:[hidden email]]
Sent: Thursday, June 12, 2014 2:35 PM
To: [hidden email]
Subject: shuffling using netty in spark streaming

 
