Hi All,
I am running Spark Structured Streaming along with Kafka. Below is the relevant part of the pom.xml:

    <properties>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
        <encoding>UTF-8</encoding>
        <!-- Put the Scala version of the cluster -->
        <scalaVersion>2.12.10</scalaVersion>
        <sparkVersion>3.0.1</sparkVersion>
    </properties>

    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka-clients</artifactId>
        <version>2.1.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.12</artifactId>
        <version>${sparkVersion}</version>
        <scope>provided</scope>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.12</artifactId>
        <version>${sparkVersion}</version>
        <scope>provided</scope>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-sql-kafka-0-10 -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql-kafka-0-10_2.12</artifactId>
        <version>${sparkVersion}</version>
    </dependency>

I am building the fat jar with the shade plugin. The jar runs as expected in my local setup with the command:

    spark-submit --master local[*] --class com.stream.Main --num-executors 3 --driver-memory 2g --executor-cores 2 --executor-memory 3g prism-event-synch-rta.jar
But when I try to run the same jar on the Spark cluster using YARN with the command:

    spark-submit --master yarn --deploy-mode cluster --class com.stream.Main --num-executors 4 --driver-memory 2g --executor-cores 1 --executor-memory 4g gs://jars/prism-event-synch-rta.jar
I am getting this exception:

    at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:355)
    at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:245)
    Caused by: org.apache.kafka.common.config.ConfigException: Missing required configuration "partition.assignment.strategy" which has no default value.
        at org.apache.kafka.common.config.ConfigDef.parse(ConfigDef.java:124)
I have tried setting "partition.assignment.strategy" explicitly, but it still does not work. Please help.
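For reference, this is roughly how the option is set, as a minimal sketch only: the topic and broker values below are placeholders, and note that Spark forwards consumer properties to the Kafka consumer only when they carry the "kafka." prefix:

    import org.apache.spark.sql.SparkSession

    object Main {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("prism-event-synch-rta").getOrCreate()

        val df = spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker1:9092") // placeholder broker
          .option("subscribe", "events")                     // placeholder topic
          // "kafka."-prefixed options are handed to the underlying consumer;
          // RangeAssignor is the strategy kafka-clients defaults to.
          .option("kafka.partition.assignment.strategy",
            "org.apache.kafka.clients.consumer.RangeAssignor")
          .load()

        df.writeStream.format("console").start().awaitTermination()
      }
    }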
Regards Amit Joshi
Hi,
I think the issue is that you are overriding the kafka-clients version that comes with spark-sql-kafka-0-10_2.12.
I'd try removing the kafka-clients dependency and see if it works.
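To confirm which kafka-clients version actually ends up on the classpath (a quick check, assuming a standard Maven build), the dependency tree shows what Maven resolves after conflict mediation:

    # List every kafka-clients occurrence and the version Maven picks.
    mvn dependency:tree -Dincludes=org.apache.kafka:kafka-clients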
+1 on the mentioned change. Spark uses the following kafka-clients library:
<kafka.version>2.4.1</kafka.version>
G
Hi All,
Thanks for the reply. I did try removing the kafka-clients dependency, but got the same exception.
One point, though: some of the dependent artifacts I am using contain a reference to that Kafka client version. I am trying to build an uber jar, which will choose the closest version.
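If several of those artifacts drag in different kafka-clients versions, one option (a sketch only, pinning the 2.4.1 version Gabor mentioned for Spark 3.0.1) is to force a single version through dependencyManagement instead of declaring the dependency directly:

    <!-- Sketch: make Maven resolve one kafka-clients version everywhere,
         matching the version Spark 3.0.1 was built against. -->
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.apache.kafka</groupId>
                <artifactId>kafka-clients</artifactId>
                <version>2.4.1</version>
            </dependency>
        </dependencies>
    </dependencyManagement>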
Thanks
+ Adding back user list.
I've had a look at the Spark code and it's not modifying "partition.assignment.strategy" so the problem must be either in your application or in your cluster setup.
G
It's super interesting because that field has a default value: org.apache.kafka.clients.consumer.RangeAssignor
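A quick way to see which kafka-clients actually got loaded at runtime (a debugging sketch; AppInfoParser is the kafka-clients class that carries the build version, and the code-source lookup may return null under some classloaders):

    import org.apache.kafka.common.utils.AppInfoParser

    object KafkaClientCheck {
      def main(args: Array[String]): Unit = {
        // Version string compiled into the kafka-clients jar on the classpath.
        println(s"kafka-clients version: ${AppInfoParser.getVersion}")
        // Jar location the class was actually loaded from.
        println(s"loaded from: ${classOf[AppInfoParser].getProtectionDomain.getCodeSource.getLocation}")
      }
    }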
Hi Gabor,
The code is a very simple Kafka consumption of data. I guess it may be the cluster. Can you please point out the possible problems to look for in the cluster?
Regards Amit
Well, I can't do miracles without cluster and log access. What I don't understand is why you need a fat jar at all: Spark libraries normally need provided scope because they must exist on all machines anyway. I would take a look at the driver and executor logs, which contain the consumer configs, and I would also check the exact version of the consumer (this is printed in the same log).
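On YARN, one way to pull those configs from the aggregated logs (a sketch, assuming log aggregation is enabled; <applicationId> is a placeholder for your application's ID):

    # kafka-clients prints its full configuration at startup under
    # "ConsumerConfig values:"; grab that block from the container logs.
    yarn logs -applicationId <applicationId> | grep -A 40 "ConsumerConfig values"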
G
Hi Gabor,
Please find the logs attached. These are truncated logs.
Command used:

    spark-submit --verbose --packages org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.1,com.typesafe:config:1.4.0 --master yarn --deploy-mode cluster --class com.stream.Main --num-executors 2 --driver-memory 2g --executor-cores 1 --executor-memory 4g --files gs://x/jars_application.conf,gs://x/log4j.properties gs://x/a-synch-r-1.0-SNAPSHOT.jar
For this I used a snapshot jar, not a fat jar.
Regards Amit
Hi All,
Can someone please help with this?
Thanks