Spark Streaming in Wait mode


khajaasmath786
Hi,

I am running a Spark Streaming job and it is not picking up the next batches, but the job still shows as running on YARN.

Is this expected behavior if there is no data, or is it waiting for data to arrive?

I am almost 4 hours behind on batches (30-minute batch interval).
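Just to quantify the lag (a quick back-of-the-envelope sketch; the 1800 s value matches the `BatchInterval=1800` entry in the config below):

```python
# Rough backlog estimate: how many batches are queued after a 4-hour lag
# with a 30-minute (1800 s) batch interval.
lag_seconds = 4 * 60 * 60        # 4 hours behind
batch_interval_seconds = 1800    # BatchInterval=1800 (30 minutes)
pending_batches = lag_seconds // batch_interval_seconds
print(pending_batches)  # → 8
```

So roughly 8 batches are pending.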


(two inline images not included)

hadoop.security.authentication=kerberos
spark.executor.memory=12g
spark.yarn.am.memory=8g
spark.yarn.am.memoryOverhead=8g
spark.scheduler.mode=FAIR
spark.shuffle.compress=true
spark.shuffle.spill.compress=true
spark.broadcast.compress=true
spark.io.compression.codec=snappy
spark.dynamicAllocation.enabled=false
spark.streaming.dynamicAllocation.enabled=true
## HIVE JDBC ######################################
java.security.krb5.conf=krb5.conf
javax.security.auth.useSubjectCredsOnly=true
hive.jdbc.url=jdbc:XXXXXa;principal=hive/_[hidden email];ssl=true
hive.jdbc.driver=org.apache.hive.jdbc.HiveDriver
keytab.file=va_dflt.keytab
spark.sql.parquet.binaryAsString=true
spark.sql.parquet.mergeSchema=true
spark.sql.parquet.compression.codec=snappy
spark.rdd.compress=true
spark.sql.tungsten.enabled=false
spark.sql.codegen=false
spark.sql.unsafe.enabled=false
index=15
includeIndex=true
BatchInterval=1800
CheckPointDir=hdfs://prodnameservice1/user/yyy1k78/KafkaCheckPoint
KafkaBrokerList=XXXXXXXXXXX
KafkaTopics=occlocation
##################################
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.locality.wait=10
spark.task.maxFailures=8
spark.ui.killEnabled=false
spark.logConf=true
# SPARK STREAMING CONFIGURATION
spark.streaming.blockInterval=200
spark.streaming.receiver.writeAheadLog.enable=true
spark.streaming.backpressure.enabled=true
#spark.streaming.backpressure.pid.minRate=10
#spark.streaming.receiver.maxRate=100
#spark.streaming.kafka.maxRatePerPartition=100
#spark.streaming.backpressure.initialRate=30
spark.yarn.maxAppAttempts=8
spark.yarn.am.attemptFailuresValidityInterval=1h
spark.yarn.executor.failuresValidityInterval=1h
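One note on the two dynamic-allocation flags above: as I understand it, Spark's streaming dynamic allocation only starts when core dynamic allocation is disabled, so the combination in this config (`false`/`true`) should be the valid one. A minimal sketch of that check (a hypothetical validator, not Spark's actual code):

```python
# Hypothetical sanity check (not Spark's own code): streaming dynamic
# allocation requires core dynamic allocation to be disabled, so the
# false/true combination used in the config above passes.
conf = {
    "spark.dynamicAllocation.enabled": "false",
    "spark.streaming.dynamicAllocation.enabled": "true",
}

def check_dynamic_allocation(conf):
    core = conf.get("spark.dynamicAllocation.enabled") == "true"
    streaming = conf.get("spark.streaming.dynamicAllocation.enabled") == "true"
    if core and streaming:
        return "conflict: disable spark.dynamicAllocation.enabled"
    return "ok"

print(check_dynamic_allocation(conf))  # prints "ok"
```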

Any suggestions on why the batches are not running? Is this expected behavior?

Thanks,
Asmath