Structured Streaming: multiple sinks


1) We are consuming from Kafka using Structured Streaming and writing the processed dataset to S3.
Going forward, we also want to write the processed data to Kafka. Is it possible to do that from the same streaming query? (Spark version 2.1.1)

2) In the logs, I see the streaming query progress output, and I have included a sample durationMs JSON from the log below. Can someone please clarify the difference between addBatch and getBatch?

3) triggerExecution: is it the time taken to both process the fetched data and write it to the sink?
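
For what it's worth, here is a quick arithmetic check I ran on the durationMs values quoted at the end of this post. It is only a sketch: it assumes every entry other than triggerExecution is a separate, non-overlapping phase of the trigger, which is exactly the part I am unsure about. The variable names are my own.

```python
# Values copied from the durationMs sample in this post (milliseconds).
duration_ms = {
    "addBatch": 2263426,
    "getBatch": 12,
    "getOffset": 273,
    "queryPlanning": 13,
    "triggerExecution": 2264288,
    "walCommit": 552,
}

total = duration_ms["triggerExecution"]

# Assumption: every other entry is a non-overlapping phase of the trigger.
phases = {name: ms for name, ms in duration_ms.items() if name != "triggerExecution"}
phase_sum = sum(phases.values())

# Time inside the trigger that no listed phase accounts for.
unaccounted = total - phase_sum

# Fraction of the whole trigger spent in addBatch.
add_batch_share = duration_ms["addBatch"] / total

print(phase_sum, unaccounted, round(add_batch_share, 4))
```

Under that assumption the listed phases add up to 2264276 ms, only 12 ms short of triggerExecution, and addBatch alone accounts for nearly all of the trigger time.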

"durationMs" : {
    "addBatch" : 2263426,
    "getBatch" : 12,
    "getOffset" : 273,
    "queryPlanning" : 13,
    "triggerExecution" : 2264288,
    "walCommit" : 552