GC issues


GC issues

Livni, Dana

Hi,

When running a map task I got the following exception.

This is new; I have run this code many times in the past, and this is the first time it has happened.

Any ideas why? Or how can I monitor when it happens?
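
For monitoring, would enabling the JVM's GC logging on the workers be the right approach? A sketch, assuming worker JVM flags are passed through SPARK_JAVA_OPTS in conf/spark-env.sh as in Spark 0.9-era setups:

        # conf/spark-env.sh (sketch): log every collection with details and timestamps
        SPARK_JAVA_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"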

 

Thanks Dana.

 

14/02/11 16:15:56 ERROR executor.Executor: Exception in task ID 128
java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.lang.StringBuilder.toString(StringBuilder.java:430)
        at java.io.ObjectInputStream$BlockDataInputStream.readUTFBody(ObjectInputStream.java:3023)
        at java.io.ObjectInputStream$BlockDataInputStream.readUTF(ObjectInputStream.java:2819)
        at java.io.ObjectInputStream.readString(ObjectInputStream.java:1598)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1319)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
        at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1946)
        at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1870)
        at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1752)
        at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1328)
        at java.io.ObjectInputStream.readObject(ObjectInputStream.java:350)
        at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:39)
        at org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:101)
        at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
        at scala.collection.Iterator$$anon$21.hasNext(Iterator.scala:440)
        at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:26)
        at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:27)
        at org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:40)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$combineByKey$3.apply(PairRDDFunctions.scala:103)
        at org.apache.spark.rdd.PairRDDFunctions$$anonfun$combineByKey$3.apply(PairRDDFunctions.scala:102)
        at org.apache.spark.rdd.RDD$$anonfun$3.apply(RDD.scala:465)
        at org.apache.spark.rdd.RDD$$anonfun$3.apply(RDD.scala:465)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:34)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:237)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:226)
        at org.apache.spark.rdd.MappedValuesRDD.compute(MappedValuesRDD.scala:32)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:237)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:226)
        at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:29)



Re: GC issues

sowen

This is just Java's way of saying 'out of memory'. Your workers need more heap.
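
If the cluster has memory to spare, one way to give the workers more heap is to raise spark.executor.memory when creating the context. A minimal sketch; the app name and the "4g" value are placeholders, not Sean's exact suggestion:

        import org.apache.spark.{SparkConf, SparkContext}

        // Sketch: request 4 GB of heap per executor instead of the default
        val conf = new SparkConf()
          .setAppName("MyJob")                  // placeholder name
          .set("spark.executor.memory", "4g")   // executor JVM heap size
        val sc = new SparkContext(conf)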



Re: GC issues

Andrew Ash
Alternatively, Spark's estimate of how much heap you are using may be lower than the true figure, so it runs out of memory while it thinks it still has breathing room.

If you don't have more physical memory available to raise the Xmx setting, try lowering spark.storage.memoryFraction from its default of 0.6 to something like 0.5; that makes Spark more conservative with memory use within the JVM.
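
A minimal sketch of that change (the app name is a placeholder):

        import org.apache.spark.{SparkConf, SparkContext}

        // Sketch: reserve less of the heap for cached RDDs (default is 0.6)
        val conf = new SparkConf()
          .setAppName("MyJob")                          // placeholder name
          .set("spark.storage.memoryFraction", "0.5")
        val sc = new SparkContext(conf)

This leaves more headroom for task execution and deserialization, at the cost of some cache capacity.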

