HDFS Server/Client IPC version mismatch while trying to access HDFS files using Spark-0.9.1


HDFS Server/Client IPC version mismatch while trying to access HDFS files using Spark-0.9.1

bijoy deb
Hi all,

I have built Shark-0.9.1 with sbt, using the command below:

SPARK_HADOOP_VERSION=2.0.0-mr1-cdh4.6.0 sbt/sbt assembly

My Hadoop cluster is also running version 2.0.0-mr1-cdh4.6.0.

But when I execute the commands below from the Spark shell to read a file from HDFS, I get a "Server IPC version 7 cannot communicate with client version 4" error from the org.apache.hadoop.hdfs.DFSClient class:


scala> val s = sc.textFile("hdfs://host:port/test.txt")
scala> s.count()
14/06/10 23:42:59 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/06/10 23:42:59 WARN snappy.LoadSnappy: Snappy native library not loaded
org.apache.hadoop.ipc.RemoteException: Server IPC version 7 cannot communicate with client version 4
    at org.apache.hadoop.ipc.Client.call(Client.java:1070)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
    at com.sun.proxy.$Proxy9.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
    at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
    at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
    at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
    at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:176)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:208)


Apparently this error is caused by a version mismatch of the hadoop-hdfs jar between the client (the one referenced by Spark) and the server (the Hadoop cluster). What I don't understand is why there is a mismatch at all, since I built Spark against the correct Hadoop version.

Any suggestions would be highly appreciated.

Thanks
Bijoy

Re: HDFS Server/Client IPC version mismatch while trying to access HDFS files using Spark-0.9.1

bijoy deb
Any suggestions from anyone?

Thanks
Bijoy




Re: HDFS Server/Client IPC version mismatch while trying to access HDFS files using Spark-0.9.1

Marcelo Vanzin
The error is saying that your client libraries are older than what
your server is using (2.0.0-mr1-cdh4.6.0 is IPC version 7).

Try double-checking that your build is actually using that version
(e.g., by looking at the hadoop jar files in lib_managed/jars).
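For a Spark 0.9.x sbt build, one quick way to do that check is a sketch like the following (`lib_managed/jars` is where sbt placed managed dependencies for that release; adjust the path if your layout differs):

```shell
# List the Hadoop jars the sbt build actually pulled in.
# A hadoop-core-1.x jar here would explain the "client version 4" side of
# the mismatch; you want to see 2.0.0-mr1-cdh4.6.0 artifacts instead.
ls lib_managed/jars/ | grep -i hadoop
```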




--
Marcelo

Re: HDFS Server/Client IPC version mismatch while trying to access HDFS files using Spark-0.9.1

bijoy deb
Hi,

The problem was due to a pre-built/binary Tachyon-0.4.1 jar on the SPARK_CLASSPATH; that Tachyon jar had been built against Hadoop-1.0.4. Rebuilding Tachyon against Hadoop-2.0.0 resolved the issue.

Thanks

