Are there pitfalls in my Spark Structured Streaming code that cause it to slow down after running for several hours?
The Spark job's functions and logic are correct. However, after running for several hours, it becomes slower and slower. Are there any pitfalls in the code below? Thanks!
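import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types.{BooleanType, LongType, StructType}

// Subquery pushed down to the database through JDBC
val query = "(select * from meta_table) as meta_data"

// Schema of the JSON stored in the "config" column
val meta_schema = new StructType()
  .add("config_id", BooleanType)
  .add("threshold", LongType)

var meta_df = spark.read.jdbc(url, query, connectionProperties)

// Parse the JSON, then flatten the struct; "config.*" expands to config_id and threshold
var meta_df_explode = meta_df
  .select(col("id"), from_json(col("config"), meta_schema).as("config"))
  .select("config.*")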
// rules_imsi_df: the Kafka ingestion joined with meta_df_explode
// rules_monitoring_df: static DataFrame used for monitoring