Fw:multiple group by action

Fw:multiple group by action

崔苗






-------- Forwarding messages --------
From: "崔苗" <[hidden email]>
Date: 2018-08-25 10:54:31
To: [hidden email]
Subject: multiple group by action
Hi,
We have some user data with the columns (userId, company, client, country, region, city),
and we want to count distinct userIds grouped by several column combinations, such as:
select count(distinct userId) group by company
select count(distinct userId) group by company,client
select count(distinct userId) group by company,client,country
select count(distinct userId) group by company,client,country,region
etc.
Each of these queries triggers its own shuffle stage, even though the grouping columns of one query (company, client) contain those of another (company).
Is there a way to reduce the number of shuffle stages?

Thanks for any replies.



Re: Fw:multiple group by action

rxin
Use rollup and cube.
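
As a rough sketch of what that could look like for the queries in the original message: a single Spark SQL statement using WITH ROLLUP produces the counts for every prefix of the grouping columns. The table name user_data is an assumption (the original message does not name one); the column names are taken from the question.

```sql
-- Assumes the data is registered as a table or view named user_data
-- (hypothetical name; not given in the original message).
-- WITH ROLLUP produces one result set covering these groupings:
--   (company, client, country, region)
--   (company, client, country)
--   (company, client)
--   (company)
--   ()  -- grand total, which the original queries did not ask for
SELECT company, client, country, region,
       COUNT(DISTINCT userId) AS distinct_users
FROM user_data
GROUP BY company, client, country, region WITH ROLLUP;
```

In the result, a column that has been rolled up shows NULL, so a row with only company filled in corresponds to "group by company". If the data itself can contain NULLs in these columns, grouping_id() can be added to the select list to tell the two cases apart. WITH CUBE in place of WITH ROLLUP would cover every combination of the four columns rather than only the prefixes. Whether this is actually cheaper than separate queries is worth checking on the real data, since Spark has to replicate each input row once per grouping set.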
