something happened to MemoryStream after spark 2.3

Koert Kuipers
hi,
we just started testing internally with spark 2.4 snapshots, and it seems our streaming tests are broken.

i believe it has to do with MemoryStream.

before, we were able to create a MemoryStream, add data to it, convert it to an unbounded streaming DataFrame, and use that DataFrame repeatedly. by "repeatedly" i mean running this cycle several times: create a query (with a random uuid name) from the DataFrame, process all available data, stop the query. every time we did this, all the data in the MemoryStream would be processed.
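a minimal sketch of the pattern described above, for illustration only. the object name `MemoryStreamReuse`, the helper names `runOnce` and `countsForTwoRuns`, the `Int` element type, the `local[2]` master, and the use of the memory sink to count processed rows are all assumptions and not taken from our actual tests.

```scala
import java.util.UUID

import org.apache.spark.sql.{DataFrame, SQLContext, SparkSession}
import org.apache.spark.sql.execution.streaming.MemoryStream

// Hypothetical repro sketch: names and sink choice are illustrative assumptions.
object MemoryStreamReuse {

  // Start a query with a random name, process everything available, stop it,
  // and return how many rows ended up in the memory sink for that query.
  def runOnce(spark: SparkSession, df: DataFrame): Long = {
    // Hyphens are not valid in an unquoted table name, so strip them from the UUID.
    val name = "q_" + UUID.randomUUID().toString.replace("-", "")
    val query = df.writeStream
      .format("memory")      // assumption: memory sink, so results can be read back as a table
      .queryName(name)
      .outputMode("append")
      .start()
    try {
      query.processAllAvailable()
      spark.table(name).count()
    } finally {
      query.stop()
    }
  }

  // Build one MemoryStream, then run two independent queries over the same DataFrame.
  def countsForTwoRuns(): (Long, Long) = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("memorystream-reuse")
      .getOrCreate()
    import spark.implicits._
    implicit val sqlContext: SQLContext = spark.sqlContext

    val input = MemoryStream[Int]
    input.addData(1, 2, 3)
    val df = input.toDF()

    try {
      val first = runOnce(spark, df)
      val second = runOnce(spark, df)
      (first, second)
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (first, second) = countsForTwoRuns()
    println(s"first run: $first rows, second run: $second rows")
  }
}
```

the two numbers printed by `main` are the thing to compare across spark versions: the question is whether the second query sees the same rows as the first.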

now with spark 2.4.0-SNAPSHOT, the second time we create a query no data is processed at all. it is as if the MemoryStream is empty. is this expected? should we refactor our tests?