I have a Spark Streaming job that reads from several Kinesis streams and unions them together in a single streaming context.

val streams = => {

import spark.sqlContext.implicits._
  .foreachRDD(jsonRdd => ...)

I see the correct number of records in the Streaming tab of the Spark UI. However, the number of records actually processed by foreachRDD is lower.
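To make that comparison concrete, one option is to log a per-batch record count from inside foreachRDD and line it up against the batch times shown in the UI. This is only a minimal sketch, not the job's actual code: the object name `BatchCountCheck` and the helper `logBatchCounts` are hypothetical, and it assumes the unioned DStream is available as a value that can be passed in.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.Time
import org.apache.spark.streaming.dstream.DStream

// Hypothetical helper, not part of the original job.
object BatchCountCheck {
  // Registers an output operation that prints how many records each batch
  // actually hands to foreachRDD, keyed by batch time.
  def logBatchCounts[T](stream: DStream[T]): Unit = {
    stream.foreachRDD { (rdd: RDD[T], time: Time) =>
      // count() triggers a Spark job over the batch; the println runs on the driver.
      val records = rdd.count()
      println(s"batch=${time.milliseconds} records=$records partitions=${rdd.getNumPartitions}")
    }
  }
}
```

Calling `BatchCountCheck.logBatchCounts(unioned)` before `ssc.start()` (where `unioned` and `ssc` stand in for the job's own DStream and StreamingContext) adds a second output operation, so each batch is counted once in addition to the existing processing. If the logged counts match the UI but the downstream results are still short, the loss is more likely inside the foreachRDD body than in the receivers.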

In the executor logs I see many ProvisionedThroughputExceededException errors. These should be benign, however, since the KCL should retry those records.

Unfortunately, I am not seeing the missing records processed later on. Where should I look next?

