Why were changes of SPARK-9241 removed?

马阳阳
Hi,
I wonder why the changes made in "[SPARK-9241][SQL] Supporting multiple DISTINCT columns (2) - Rewriting Rule" are no longer present in Spark (version 2.4). This makes count distinct much slower in Spark 2.4 than in Spark 1.6 and Hive (Spark 2.4.4: more than 18 minutes; Hive: about 80 seconds; Spark 1.6: about 3 minutes).



Re: Why were changes of SPARK-9241 removed?

Xiao Li-2
I do not think we intentionally dropped it. Could you open a ticket in Spark JIRA with your query? 

Cheers,

Xiao

On Thu, Mar 12, 2020 at 8:24 PM 马阳阳 <[hidden email]> wrote:
> Hi,
> I wonder why the changes made in "[SPARK-9241][SQL] Supporting multiple DISTINCT columns (2) - Rewriting Rule" are no longer present in Spark (version 2.4). This makes count distinct much slower in Spark 2.4 than in Spark 1.6 and Hive (Spark 2.4.4: more than 18 minutes; Hive: about 80 seconds; Spark 1.6: about 3 minutes).



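For readers unfamiliar with the rewriting rule discussed in this thread, here is a rough sketch of the general idea behind an expand-based rewrite of multiple DISTINCT aggregates, written in plain Python rather than Spark. It is only loosely modelled on the approach named in the SPARK-9241 title and is not Spark's actual implementation. The function names, the `gid` tag, and the sample column names are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


def count_distinct_naive(rows, key, distinct_cols):
    """Reference version: keep one set of seen values per (group, column)."""
    seen = defaultdict(lambda: {c: set() for c in distinct_cols})
    for row in rows:
        group = seen[row[key]]  # make sure the group exists even if all values are None
        for c in distinct_cols:
            if row[c] is not None:  # COUNT(DISTINCT ...) ignores NULLs
                group[c].add(row[c])
    return {k: {c: len(v) for c, v in cols.items()} for k, cols in seen.items()}


def count_distinct_expand(rows, key, distinct_cols):
    """Sketch of an expand-based rewrite (hypothetical, simplified)."""
    # Step 1 (expand): emit one tagged row per distinct column.
    # gid identifies which distinct column the value belongs to.
    expanded = [
        (row[key], gid, row[c])
        for row in rows
        for gid, c in enumerate(distinct_cols)
    ]

    # Step 2 (first aggregate): group by (key, gid, value), which removes duplicates.
    deduped = set(expanded)

    # Step 3 (second aggregate): group by key and count non-null values per gid.
    result = defaultdict(lambda: {c: 0 for c in distinct_cols})
    for k, gid, value in deduped:
        counts = result[k]  # make sure the group exists even if all values are None
        if value is not None:
            counts[distinct_cols[gid]] += 1
    return {k: dict(v) for k, v in result.items()}
```

Both functions should return the same counts for the same input. The point of the second form is that every distinct column is handled by the same two plain aggregations, instead of one separate deduplication per column.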