Spark run error (cluster.ClusterTaskSetManager: Loss was due to java.lang.OutOfMemoryError: GC overhead limit exceeded)

henry
Hi everyone,
I have a question about Spark. When I run the ConnectedComponent job on about 400 KB of input data, Spark runs fine, but when I run it on larger data, Spark fails with errors like this:
14/02/17 18:18:43 INFO storage.BlockManagerMasterActor$BlockManagerInfo: Registering block manager hw077:35742 with 323.9 MB RAM
14/02/17 18:18:43 INFO storage.BlockManagerMasterActor$BlockManagerInfo: Registering block manager hw074:53799 with 323.9 MB RAM
14/02/17 18:20:26 INFO cluster.ClusterTaskSetManager: Lost TID 2 (task 0.0:2)
14/02/17 18:20:26 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.OutOfMemoryError
java.lang.OutOfMemoryError: Java heap space
        at scala.LowPriorityImplicits.wrapRefArray(LowPriorityImplicits.scala:67)
        at cn.ac.ict.bigdatabench.ConnectedComponent$$anonfun$1.apply(ConnectedComponent.scala:142)
        at cn.ac.ict.bigdatabench.ConnectedComponent$$anonfun$1.apply(ConnectedComponent.scala:140)
        at scala.collection.Iterator$$anon$19.next(Iterator.scala:401)
        at scala.collection.Iterator$class.foreach(Iterator.scala:772)
        at scala.collection.Iterator$$anon$19.foreach(Iterator.scala:399)
        at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:102)
        at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:250)
        at scala.collection.Iterator$$anon$19.toBuffer(Iterator.scala:399)
        at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:237)
        at scala.collection.Iterator$$anon$19.toArray(Iterator.scala:399)
        at org.apache.spark.rdd.RDD$$anonfun$1.apply(RDD.scala:560)
        at org.apache.spark.rdd.RDD$$anonfun$1.apply(RDD.scala:560)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:758)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:758)
        at org.apache.spark.scheduler.ResultTask.run(ResultTask.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:158)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
14/02/17 18:20:26 INFO cluster.ClusterTaskSetManager: Starting task 0.0:2 as TID 24 on executor 1: hw077 (NODE_LOCAL)
14/02/17 18:20:26 INFO cluster.ClusterTaskSetManager: Serialized task 0.0:2 as 1861 bytes in 0 ms
14/02/17 18:20:26 INFO cluster.ClusterTaskSetManager: Lost TID 11 (task 0.0:11)
14/02/17 18:20:26 INFO cluster.ClusterTaskSetManager: Loss was due to java.lang.OutOfMemoryError
java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.regex.Pattern.compile(Pattern.java:1452)
        at java.util.regex.Pattern.<init>(Pattern.java:1133)
        at java.util.regex.Pattern.compile(Pattern.java:823)
        at java.lang.String.split(String.java:2292)
        at java.lang.String.split(String.java:2334)
        at cn.ac.ict.bigdatabench.ConnectedComponent$$anonfun$1.apply(ConnectedComponent.scala:141)
        at cn.ac.ict.bigdatabench.ConnectedComponent$$anonfun$1.apply(ConnectedComponent.scala:140)
        at scala.collection.Iterator$$anon$19.next(Iterator.scala:401)
        at scala.collection.Iterator$class.foreach(Iterator.scala:772)
        at scala.collection.Iterator$$anon$19.foreach(Iterator.scala:399)
        at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
        at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:102)
        at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:250)
        at scala.collection.Iterator$$anon$19.toBuffer(Iterator.scala:399)
        at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:237)
        at scala.collection.Iterator$$anon$19.toArray(Iterator.scala:399)
        at org.apache.spark.rdd.RDD$$anonfun$1.apply(RDD.scala:560)
        at org.apache.spark.rdd.RDD$$anonfun$1.apply(RDD.scala:560)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:758)
        at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:758)
        at org.apache.spark.scheduler.ResultTask.run(ResultTask.scala:99)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:158)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
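
For reference, the parsing step that shows up in the stack trace (ConnectedComponent.scala lines 140-142, String.split, then iterator.toArray inside SparkContext.runJob) looks to me roughly like the sketch below. The field layout and the collect() call are my guesses from the trace, not the actual BigDataBench source:

import org.apache.spark.SparkContext

object ConnectedComponentSketch {
  def main(args: Array[String]) {
    // args(0): master URL, args(1): input path (placeholders, not the real arguments)
    val sc = new SparkContext(args(0), "ConnectedComponentSketch")
    // guess: each input line is "srcId dstId", split on whitespace (String.split in the trace)
    val edges = sc.textFile(args(1)).map { line =>
      val fields = line.split(" ")
      (fields(0).toLong, fields(1).toLong)
    }
    // iter.toArray in the trace suggests whole partitions get materialized, e.g. via collect();
    // with larger input, every record of a partition must fit in the executor heap at once
    val allEdges = edges.collect()
    println(allEdges.length)
  }
}

If the real code does something like this, would that explain why 400 KB works but bigger input blows the heap?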
My spark-env.sh is set up like this:
export SPARK_WORKER_MEMORY=30G
export SCALA_HOME=/usr/lib/scala-2.9.3
export SPARK_JAVA_OPTS="-Xmx1024M -Xms512M -XX:MaxPermSize=256m"
export SPARK_DAEMON_JAVA_OPTS="-Xms10G -Xmx=10G -XX:-UseGCOverheadLimit"
export JAVA_OPTS='-Xms512m -Xmx4096m -XX:MaxPermSize=128m -XX:-UseGCOverheadLimit -XX:+UseConcMarkSweepGC'
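
Also, the block managers in the log register with only about 324 MB, so I think the executors may still be running with the default heap rather than the 30 GB I gave to SPARK_WORKER_MEMORY. My (possibly wrong) understanding is that per-executor memory comes from the spark.executor.memory property, which in Spark 0.8.x can be set in the driver before creating the SparkContext, roughly like this:

// sketch for Spark 0.8.x standalone mode; "4g" and the master URL are just placeholders
System.setProperty("spark.executor.memory", "4g")
val sc = new SparkContext("spark://master:7077", "ConnectedComponent")

Is that the right way to do it, or should the memory settings in spark-env.sh be enough?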
I really do not know what is wrong. If anybody knows, please help me.