However, if the dataset is a Structured Streaming dataset, Spark reports that "Stream-stream outer join between two streaming DataFrame/Datasets is not supported without a watermark in the join keys". Since the dataset is joined with itself, I was thinking of using an arbitrary time interval as the watermark to create two streaming datasets and then joining them:
I am stuck and don't know how to do for this streaming dataset what I did for the static datasets. The join seems to mean something different once I add the time-interval watermark, since the original join was between tables with different rows. Can someone explain how I can implement the original logic on a streaming dataset? Perhaps I don't even need a join?