Best place to persist offsets into Zookeeper

ravidspark
Hi All,

I have the following problem with Spark Kafka streaming.

Environment:
Spark-2.2.0

Problem:
We have written our own offset-management logic in ZooKeeper for streaming
data with Spark + Kafka. In the normal case everything works fine: when the
app fails, we prevent it from committing offsets to the ZooKeeper node, and
so we achieve zero message loss when the app is restarted. But sometimes,
after an unexpected exception, we see that offsets are still committed to
ZooKeeper for at least the next 3 batches. We have not been able to figure
out how to prevent this. Right now we commit the offsets to ZooKeeper at the
end of every batch.
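
To make the description concrete, here is a minimal sketch of the
commit-at-end-of-batch pattern in plain Python. It is not the code from our
application: the in-memory dict stands in for the ZooKeeper node, and the
names BatchCommitter, run_batch and process are hypothetical. The "failed"
flag is only one possible guard, shown as an assumption about how commits
from later batches could be blocked after a failure.

```python
from typing import Callable, Dict, List, Tuple

# (topic, partition) -> next offset to read.
# Hypothetical in-memory stand-in for the ZooKeeper offset node.
OffsetStore = Dict[Tuple[str, int], int]


class BatchCommitter:
    """Commits a batch's end offsets only if it and all earlier batches succeeded."""

    def __init__(self, store: OffsetStore) -> None:
        self.store = store
        self.failed = False  # set once any batch fails; blocks every later commit

    def run_batch(
        self,
        records: List[str],
        end_offsets: OffsetStore,
        process: Callable[[List[str]], None],
    ) -> bool:
        if self.failed:
            # An earlier batch failed: skip processing and do not commit.
            return False
        try:
            process(records)
        except Exception:
            # Processing failed: remember it and leave the stored offsets untouched.
            self.failed = True
            return False
        # Commit only at the very end of a successful batch.
        self.store.update(end_offsets)
        return True


def process(records: List[str]) -> None:
    # Hypothetical batch logic that raises on an unexpected record.
    if "bad" in records:
        raise ValueError("unexpected record")


store: OffsetStore = {("events", 0): 0}
committer = BatchCommitter(store)

committer.run_batch(["a", "b"], {("events", 0): 2}, process)  # succeeds, commits 2
committer.run_batch(["bad"], {("events", 0): 3}, process)     # fails, no commit
committer.run_batch(["c"], {("events", 0): 4}, process)       # blocked, no commit
print(store)
```

In this sketch the stored offset stays at 2 after the failing batch, so a
restart would resume from the last fully processed batch. What we observe in
the real job is the opposite: batches after the failure still commit.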

I am happy to share the code.

Can you help me solve this problem?


Thanks,
Ravi



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
