Hi,
I'm using the Spark DataFrame API.
I'm trying to pass sum() a list parameter containing column names as strings.
When I put the column names directly into the function, the script works.
When I try to provide them to the function as a parameter of type list, I get this error:
"
py4j.protocol.Py4JJavaError: An error occurred while calling o155.sum.
: java.lang.ClassCastException: java.util.ArrayList cannot be cast to
java.lang.String
"
Using the same kind of list parameter with groupBy() works.
This is my script:
groupBy_cols = ['date_expense_int', 'customer_id']
agged_cols_list = ['total_customer_exp_last_m','total_customer_exp_last_3m']
df = df.groupBy(groupBy_cols).sum(agged_cols_list)
When I write it like this, it works:
df = df.groupBy(groupBy_cols).sum('total_customer_exp_last_m', 'total_customer_exp_last_3m')
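I guess I could unpack the list myself, since sum() seems to want the names as separate string arguments; something like this untested sketch, using the same agged_cols_list as above:

    # unpack the list of column-name strings into separate arguments
    df = df.groupBy(groupBy_cols).sum(*agged_cols_list)

But I'd like to understand why passing the list itself fails for sum() when it works for groupBy().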
I also tried to give sum() a list of Column objects:

    from pyspark.sql.functions import col

    agged_cols_list2 = []
    for i in agged_cols_list:
        agged_cols_list2.append(col(i))

That also didn't work.
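If sum() only accepts column-name strings, I assume a workaround would be to use agg() and build the sum expressions from the list myself; just a sketch based on the same agged_cols_list, not what I'd prefer:

    from pyspark.sql import functions as F

    # build one sum expression per column name and pass them all to agg()
    df = df.groupBy(groupBy_cols).agg(*[F.sum(c) for c in agged_cols_list])

Still, I'd like to know if there is a way to pass the list of names to sum() directly.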