Strange codegen error for SortMergeJoin in Spark 2.2.1

Strange codegen error for SortMergeJoin in Spark 2.2.1

Rico B.

Hi!

I get a strange error when executing a complex SQL query that left-outer-joins four tables:

Caused by: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 37, Column 18: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 37, Column 18: No applicable constructor/method found for actual parameters "int"; candidates are: "org.apache.spark.sql.execution.ExternalAppendOnlyUnsafeRowArray(org.apache.spark.memory.TaskMemoryManager, org.apache.spark.storage.BlockManager, org.apache.spark.serializer.SerializerManager, org.apache.spark.TaskContext, int, long, int, int)", "org.apache.spark.sql.execution.ExternalAppendOnlyUnsafeRowArray(int, int)"

...

/* 037 */     smj_matches = new org.apache.spark.sql.execution.ExternalAppendOnlyUnsafeRowArray(2147483647);

The same query works with Spark 2.2.0.

I checked the Spark source code and saw that in 2.2.1 a second int parameter was added to the constructor of ExternalAppendOnlyUnsafeRowArray.

But looking at the code generation part of SortMergeJoinExec:

// A list to hold all matched rows from right side.
val matches = ctx.freshName("matches")
val clsName = classOf[ExternalAppendOnlyUnsafeRowArray].getName

val spillThreshold = getSpillThreshold
val inMemoryThreshold = getInMemoryThreshold

ctx.addMutableState(clsName, matches,
  s"$matches = new $clsName($inMemoryThreshold, $spillThreshold);")

the constructor call should get two parameters, not just one.
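
The reasoning above can be sketched in plain Java: a string template with two placeholders always emits a two-argument call, so a one-argument call in generated.java must come from a different template. This is only an illustration. The class CodegenTemplateDemo, both helper methods, and the in-memory threshold 4096 are hypothetical; only the variable name, the class name, and 2147483647 are taken from the error output.

```java
public class CodegenTemplateDemo {

    // Mirrors the two-placeholder template quoted from SortMergeJoinExec above.
    static String twoArg(String var, String cls, int inMemoryThreshold, int spillThreshold) {
        return var + " = new " + cls + "(" + inMemoryThreshold + ", " + spillThreshold + ");";
    }

    // Hypothetical one-placeholder template, included only for comparison.
    static String oneArg(String var, String cls, int spillThreshold) {
        return var + " = new " + cls + "(" + spillThreshold + ");";
    }

    public static void main(String[] args) {
        String cls = "org.apache.spark.sql.execution.ExternalAppendOnlyUnsafeRowArray";
        // 4096 is an assumed value, chosen only to make the two arguments distinguishable.
        System.out.println(twoArg("smj_matches", cls, 4096, Integer.MAX_VALUE));
        // This line has the same shape as line 037 of the failing generated code.
        System.out.println(oneArg("smj_matches", cls, Integer.MAX_VALUE));
    }
}
```

Since only the one-placeholder form reproduces the failing line, the code that generated it was probably not the 2.2.1 source quoted here.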


Does anyone have an idea?


Best,

Rico.

Re: Strange codegen error for SortMergeJoin in Spark 2.2.1

Kazuaki Ishizaki
Thank you for reporting a problem.
Would it be possible to create a JIRA entry with a small program that can reproduce this problem?

Best Regards,
Kazuaki Ishizaki



From:        Rico Bergmann <[hidden email]>
To:        "[hidden email]" <[hidden email]>
Date:        2018/06/05 19:58
Subject:        Strange codegen error for SortMergeJoin in Spark 2.2.1






Re: Strange codegen error for SortMergeJoin in Spark 2.2.1

Rico B.

Hi!


I finally found the problem. I was not aware that the program was running in client mode. The client used version 2.2.0, and that version mismatch caused the problem.
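
For anyone hitting a similar mismatch, one possible check is to use reflection to see which constructors a class on the local classpath exposes and where it was loaded from. The following is a minimal sketch, not part of Spark: ConstructorCheck and its helpers are made-up names, and RowArrayStandIn is a hypothetical stand-in that only imitates a two-int constructor.

```java
import java.lang.reflect.Constructor;
import java.security.CodeSource;
import java.util.Arrays;

public class ConstructorCheck {

    // Hypothetical stand-in for a class whose constructor takes two ints;
    // it is NOT the real Spark class.
    static class RowArrayStandIn {
        RowArrayStandIn(int inMemoryThreshold, int spillThreshold) { }
    }

    /** Returns true if cls declares a constructor with exactly these parameter types. */
    static boolean hasConstructor(Class<?> cls, Class<?>... paramTypes) {
        for (Constructor<?> c : cls.getDeclaredConstructors()) {
            if (Arrays.equals(c.getParameterTypes(), paramTypes)) {
                return true;
            }
        }
        return false;
    }

    /** Describes the jar or directory cls was loaded from, if the JVM exposes it. */
    static String loadedFrom(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        // Pass a fully qualified class name to inspect a class on your own classpath.
        Class<?> cls = args.length > 0 ? Class.forName(args[0]) : RowArrayStandIn.class;
        System.out.println("Loaded from: " + loadedFrom(cls));
        for (Constructor<?> c : cls.getDeclaredConstructors()) {
            System.out.println("  " + c);
        }
        System.out.println("(int) constructor:      " + hasConstructor(cls, int.class));
        System.out.println("(int, int) constructor: " + hasConstructor(cls, int.class, int.class));
    }
}
```

Running this on the client with the Spark jars on the classpath and the real class name as the argument should show which jar the class comes from, and therefore which Spark version the client is actually using.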

Best,

Rico.


On 07.06.2018 at 08:49, Kazuaki Ishizaki wrote:
Thank you for reporting a problem.
Would it be possible to create a JIRA entry with a small program that can reproduce this problem?

Best Regards,
Kazuaki Ishizaki


