Spark Streaming Memory


Spark Streaming Memory

andraskolbert
Hi,

I have a streaming job (Spark 2.4.4) in which the memory usage keeps increasing over time.

Periodically (every 20-25 minutes) the executors fall over with org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 6987, due to running out of memory. In the UI I can see the memory usage increasing batch by batch, even though I do not keep more data in memory (I do keep unpersisting, checkpointing and caching new data frames), and the Storage tab shows only the expected 4 objects over time.
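For context, the per-batch caching pattern looks roughly like this. It is a simplified sketch with placeholder names (process_batch, the groupBy aggregation and the state dict are illustrative; the real code is in the attachment):

     # Simplified sketch of the per-batch swap: checkpoint and cache the new
     # result, materialize it, then release the previous batch's blocks.
     # Assumes a checkpoint directory has been set via setCheckpointDir().
     def process_batch(new_df, state):
         old_df = state.get("aggregates")

         updated = new_df.groupBy("key").count()  # placeholder aggregation
         updated = updated.checkpoint()           # eager checkpoint truncates the lineage
         updated.cache()
         updated.count()                          # materialize before dropping the old copy

         if old_df is not None:
             old_df.unpersist()                   # free the previous batch's storage

         state["aggregates"] = updated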

I hope I have just missed a parameter in the Spark configuration (garbage collection, reference tracking, etc.) that would solve my issue. I have seen a few JIRA tickets about memory leaks (SPARK-19644, SPARK-29055, SPARK-29321); could it be the same issue?

     ("spark.cleaner.referenceTracking.cleanCheckpoints", "true"),
     ('spark.cleaner.periodicGC.interval', '1min'),
     ('spark.cleaner.referenceTracking','true'),
     ('spark.cleaner.referenceTracking.blocking.shuffle','true'),
     ('spark.sql.streaming.minBatchesToRetain', '2'),
     ('spark.sql.streaming.maxBatchesToRetainInMemory', '5'),
     ('spark.ui.retainedJobs','50' ),
     ('spark.ui.retainedStages','50'),
     ('spark.ui.retainedTasks','500'),
     ('spark.worker.ui.retainedExecutors','50'),
     ('spark.worker.ui.retainedDrivers','50'),
     ('spark.sql.ui.retainedExecutions','50'),
     ('spark.streaming.ui.retainedBatches','1440'),
     ('spark.executor.JavaOptions','-XX:+UseG1GC -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps')
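These pairs go into SparkConf; roughly how they are applied, as a sketch with a placeholder app name:

     from pyspark import SparkConf
     from pyspark.sql import SparkSession

     # Apply all the key/value pairs listed above in one call.
     conf = SparkConf().setAll([
         ("spark.cleaner.referenceTracking.cleanCheckpoints", "true"),
         ("spark.cleaner.periodicGC.interval", "1min"),
         # ... remaining pairs from the list above ...
     ])

     spark = (SparkSession.builder
              .appName("streaming-job")  # placeholder name
              .config(conf=conf)
              .getOrCreate())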

I've tried lowering spark.streaming.ui.retainedBatches to 8, but it did not help.

The application works fine apart from the fact that processing some batches takes longer (when the executors fall over).

[two screenshots attached]


Any ideas?

I've attached my code.


Thanks,
Andras



Attachment: application.txt (29K)

Re: Spark Streaming Memory

AliGouta
The Spark UI is misleading in Spark 2.4.4; I moved to Spark 2.4.5 and that fixed it. So your problem should be somewhere else, probably related to memory consumption, but not the usage you see in the UI.

Best regards,
Ali Gouta.
