KryoSerializer Exception

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|

KryoSerializer Exception

Andrea Esposito
Hi there,

sorry if i'm posting a lot lately.

i'm trying to add the KryoSerializer but i receive this exception:
2014 - 05 - 06 11: 45: 23 WARN TaskSetManager: 62 - Loss was due to java.io.EOFException
java.io.EOFException
at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala: 105)
at org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala: 165)
at org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala: 56)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java: 57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java: 43)
at java.lang.reflect.Method.invoke(Method.java: 606)


I set the serializer as:
System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
System.setProperty("spark.kryo.registrator", "test.TestKryoRegistrator")


With or without register my custom registrator it throws the exception.

Seems something related to broadcast.. but isn't Kryo already ok out of the box just setting it as default serializer?
MSc student @ UniPi
Reply | Threaded
Open this post in threaded view
|

Re: KryoSerializer Exception

Andrea Esposito
UP, doesn't anyone know something about it? ^^


2014-05-06 12:05 GMT+02:00 Andrea Esposito <[hidden email]>:
Hi there,

sorry if i'm posting a lot lately.

i'm trying to add the KryoSerializer but i receive this exception:
2014 - <a href="tel:05%20-%2006%2011" value="+39050611" target="_blank">05 - 06 11: 45: 23 WARN TaskSetManager: 62 - Loss was due to java.io.EOFException
java.io.EOFException
at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala: 105)
at org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala: 165)
at org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala: 56)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java: 57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java: 43)
at java.lang.reflect.Method.invoke(Method.java: 606)


I set the serializer as:
System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
System.setProperty("spark.kryo.registrator", "test.TestKryoRegistrator")


With or without register my custom registrator it throws the exception.

Seems something related to broadcast.. but isn't Kryo already ok out of the box just setting it as default serializer?

MSc student @ UniPi
Reply | Threaded
Open this post in threaded view
|

Re: KryoSerializer Exception

Andrew Ash
Hi Andrea,

What version of Spark are you using?  There were some improvements in how Spark uses Kryo in 0.9.1 and to-be 1.0 that I would expect to improve this.

Also, can you share your registrator's code?

Another possibility is that Kryo can have some difficulty serializing very large objects.  Do you have a sense of how large the serialized items in your RDD are?

Andrew


On Sat, May 10, 2014 at 6:32 AM, Andrea Esposito <[hidden email]> wrote:
UP, doesn't anyone know something about it? ^^


2014-05-06 12:05 GMT+02:00 Andrea Esposito <[hidden email]>:

Hi there,

sorry if i'm posting a lot lately.

i'm trying to add the KryoSerializer but i receive this exception:
2014 - <a href="tel:05%20-%2006%2011" value="+39050611" target="_blank">05 - 06 11: 45: 23 WARN TaskSetManager: 62 - Loss was due to java.io.EOFException
java.io.EOFException
at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala: 105)
at org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala: 165)
at org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala: 56)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java: 57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java: 43)
at java.lang.reflect.Method.invoke(Method.java: 606)


I set the serializer as:
System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
System.setProperty("spark.kryo.registrator", "test.TestKryoRegistrator")


With or without register my custom registrator it throws the exception.

Seems something related to broadcast.. but isn't Kryo already ok out of the box just setting it as default serializer?


Reply | Threaded
Open this post in threaded view
|

Re: KryoSerializer Exception

jaranda
This post was updated on .
I am experiencing the same issue (I tried both using Kryo as serializer and increasing the buffer size up to 256M, my objects are much smaller though). I share my registrator class just in case.

Any hints would be highly appreciated (see related so question).

Thanks,
Reply | Threaded
Open this post in threaded view
|

Re: KryoSerializer Exception

Andrea Esposito
Hi,

i just migrate to 1.0. Still having the same issue.

Either with or without the custom registrator. Just the usage of the KryoSerializer triggers the exception immediately.

I set the kryo settings through the property:
System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
System.setProperty("spark.kryo.registrator", "it.unipi.thesis.andrea.esposito.onjag.test.TestKryoRegistrator")

The registrator is just a sequence of:
kryo.register(classOf[MyClass])

I tried also with very small RDD (few MB of serialized data) and the problem still occurs.

The problem seems about broadcast but i'm completely stuck.

Following complete log:
2014-05-30 15:47:36 WARN  TaskSetManager:70 - Lost TID 5 (task 3.0:1)
2014-05-30 15:47:36 WARN  TaskSetManager:70 - Loss was due to java.io.EOFException
java.io.EOFException
    at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:119)
    at org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala:205)
    at org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala:89)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
    at org.apache.spark.scheduler.ShuffleMapTask$.deserializeInfo(ShuffleMapTask.scala:63)
    at org.apache.spark.scheduler.ShuffleMapTask.readExternal(ShuffleMapTask.scala:135)
    at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
    at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
    at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:85)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:169)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2014-05-30 15:47:36 WARN  TaskSetManager:70 - Lost TID 4 (task 3.0:0)
2014-05-30 15:47:36 WARN  TaskSetManager:70 - Lost TID 6 (task 3.0:1)
2014-05-30 15:47:36 WARN  TaskSetManager:70 - Lost TID 7 (task 3.0:0)
2014-05-30 15:47:36 WARN  TaskSetManager:70 - Lost TID 8 (task 3.0:1)
2014-05-30 15:47:36 WARN  TaskSetManager:70 - Lost TID 9 (task 3.0:0)
2014-05-30 15:47:36 WARN  TaskSetManager:70 - Lost TID 10 (task 3.0:1)
2014-05-30 15:47:36 ERROR TaskSetManager:74 - Task 3.0:1 failed 4 times; aborting job
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 3.0:1 failed 4 times, most recent failure: Exception failure in TID 10 on host Andrea-Laptop.unipi.it: java.io.EOFException
        org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:119)
        org.apache.spark.broadcast.HttpBroadcast$.read(HttpBroadcast.scala:205)
        org.apache.spark.broadcast.HttpBroadcast.readObject(HttpBroadcast.scala:89)
        sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
        sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        java.lang.reflect.Method.invoke(Method.java:606)
        java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
        java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
        java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
        org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
        org.apache.spark.scheduler.ShuffleMapTask$.deserializeInfo(ShuffleMapTask.scala:63)
        org.apache.spark.scheduler.ShuffleMapTask.readExternal(ShuffleMapTask.scala:135)
        java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
        java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
        java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
        java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
        org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
        org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:85)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:169)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        java.lang.Thread.run(Thread.java:745)
Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:633)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
    at akka.actor.ActorCell.invoke(ActorCell.scala:456)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
    at akka.dispatch.Mailbox.run(Mailbox.scala:219)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)



2014-05-27 16:25 GMT+02:00 jaranda <[hidden email]>:
I am experiencing the same issue (I tried both using Kryo as serializer and
increasing the buffer size up to 256M, my objects are much smaller though).
I share my registrator class just in case:

https://gist.github.com/JordiAranda/5cc16cf102290c413c82

Any hints would be highly appreciated.

Thanks,




--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/KryoSerializer-Exception-tp5435p6428.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

MSc student @ UniPi
Reply | Threaded
Open this post in threaded view
|

Re: KryoSerializer Exception

ugupta
This post has NOT been accepted by the mailing list yet.
This post was updated on .
Hi,
I am also facing same issue . I am using spark & shark .Any hints abt the fix ?


Regards,
Umamaheshwar Gupta K