Task skew or Data skew problem for Spark Standalone (2.3.1)


linrick

Dear all,

 

I have a problem with task skew and data skew when processing real-time data (via Kafka) with Spark Streaming.

 

When one of the executors crashes, task skew and data skew occur in my project, as shown in Figure 1.

For example, on ubuntu8, apparently because three executors crashed (I am not sure about this), 50,000 records were placed on a single executor, ubuntu8:34168.

Figure 1 executors crash

 

For most streaming windows, performance is normal (no crashed executors):

Figure 2 normal performance

 

Figure 3 poor performance

 

 

 

The experimental setup of my project is as follows.

Real-time data rate (via Kafka): 100,000 records per second

Read one topic: kafkasink2

Kafka broker: 2.10-0.10.1.1 (Scala 2.10, Kafka 0.10.1.1)

  Broker node at ubuntu7

    One topic: kafkasink2 (number of partitions: 8)

 

The runtime environment on my PCs:

OS: Ubuntu 14.04.4 LTS

Versions of the related tools:

Java version: "1.8.0_151"

Spark version: 2.3.1 Standalone mode

  Execution condition:

  Master/Driver node: ubuntu7

  Worker nodes: ubuntu8 (4 Executors); ubuntu9 (4 Executors)

Number of executors: 8
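
As a rough sanity check on the numbers above, the expected load per executor can be estimated with simple arithmetic. The sketch below is illustrative only: the 1-second batch interval, the even spread of records across partitions, and the executor labels are assumptions that are not stated in this post. Under those assumptions, four partitions landing on one executor would account for roughly the 50,000 records observed on ubuntu8:34168.

```python
# Rough per-executor load estimate; a sketch, not measured data.
# Assumptions (not stated in the post): 1-second batches, records spread
# evenly across partitions, and hypothetical executor labels.

RECORDS_PER_SEC = 100_000  # input rate from the post
PARTITIONS = 8             # partitions of topic kafkasink2
BATCH_SECONDS = 1          # assumed batch interval


def records_per_partition(rate, partitions, batch_seconds):
    """Records each partition contributes to one batch."""
    return rate * batch_seconds // partitions


def load_per_executor(partitions_by_executor, per_partition):
    """Map each executor to the records it handles in one batch."""
    return {name: count * per_partition
            for name, count in partitions_by_executor.items()}


per_partition = records_per_partition(RECORDS_PER_SEC, PARTITIONS, BATCH_SECONDS)

# Healthy case: 8 executors, one partition each.
healthy = load_per_executor(
    {f"executor-{i}": 1 for i in range(8)}, per_partition)

# Degraded case (assumed): three ubuntu8 executors are gone and their
# partitions are all picked up by the one survivor on that host.
degraded = load_per_executor(
    {"ubuntu8-survivor": 4, **{f"ubuntu9-{i}": 1 for i in range(4)}},
    per_partition)

print(per_partition)                 # 12500
print(degraded["ubuntu8-survivor"])  # 50000
```

Whether partitions are really reassigned this way depends on the Kafka integration and the locality settings in use, so this is only a consistency check on the reported figures.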

 

Driver settings (spark-defaults.conf):

spark.cores.max=8

 

spark.executor.instances=8

spark.executor.cores=1

spark.executor.memory=2048m

 

spark.default.parallelism=8

 

spark.driver.cores=4

spark.driver.memory=2048m

 

spark.executor.extraJavaOptions=-XX:+UseConcMarkSweepGC

spark.executor.extraJavaOptions=-Xss100M

 

spark.shuffle.consolidateFiles=true

spark.streaming.unpersist=true

spark.streaming.stopGracefullyOnShutdown=true

 

spark.blacklist.enabled=true
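
One detail in the settings above is that spark.executor.extraJavaOptions is listed twice. The sketch below is a simplified model of reading key=value settings, not Spark's actual loader; it assumes that a later entry for the same key replaces the earlier one, which is how properties-style files commonly behave. The function name parse_conf is hypothetical.

```python
# Simplified model of reading key=value settings; NOT Spark's actual loader.
# Assumption: a later entry for the same key replaces the earlier one.

CONF_TEXT = """\
spark.executor.extraJavaOptions=-XX:+UseConcMarkSweepGC
spark.executor.extraJavaOptions=-Xss100M
spark.blacklist.enabled=true
"""


def parse_conf(text):
    """Return (settings, duplicates) for key=value lines in text."""
    settings = {}
    duplicates = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        key = key.strip()
        if key in settings and key not in duplicates:
            duplicates.append(key)  # remember keys defined more than once
        settings[key] = value.strip()  # last definition wins
    return settings, duplicates


settings, duplicates = parse_conf(CONF_TEXT)
print(duplicates)
print(settings["spark.executor.extraJavaOptions"])  # -Xss100M
```

If that assumption holds for this setup, only -Xss100M would reach the executors and the CMS flag would be dropped. Putting both options on one line, separated by a space (spark.executor.extraJavaOptions=-XX:+UseConcMarkSweepGC -Xss100M), would be one way to keep both.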

 

 

If anyone can point us in a direction that helps overcome this problem, we would appreciate it.

Thanks.

 

Rick

 


