'requirement failed: OneHotEncoderModel expected x categorical values for input column label, but the input column had metadata specifying n values.'


Mina Aslani
Hi,

I am getting the following exception when using OneHotEncoderEstimator with MultilayerPerceptronClassifier in PySpark (version 2.4.4):

'requirement failed: OneHotEncoderModel expected x categorical values for input column label, but the input column had metadata specifying n values.'

Using LogisticRegression, RandomForestClassifier or LinearRegression works fine for the same data and OneHotEncoderEstimator.

Any insight on how to resolve this?

Regards,
Mina




Re: 'requirement failed: OneHotEncoderModel expected x categorical values for input column label, but the input column had metadata specifying n values.'

Mina Aslani
Hi,

The metadata on the label column is the same for both LogisticRegression and MultilayerPerceptronClassifier:


{'ml_attr': {'vals': ['bad', 'good', '__unknown'], 'type': 'nominal', 'name': 'label'}}


And the label values are:

+-----+
|label|
+-----+
|  0.0|
|  1.0|
+-----+


How come MultilayerPerceptronClassifier throws 'java.lang.IllegalArgumentException: requirement failed: OneHotEncoderModel expected 2 categorical values for input column label, but the input column had metadata specifying 3 values.' while LogisticRegression does not? How can I resolve this?
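
To make the two numbers in the message concrete, here is a minimal plain-Python sketch (no Spark needed) that counts the values declared in the metadata quoted above and compares them with the distinct labels shown in the table. The helper names are hypothetical and the comparison is only an illustration of the mismatch, not the check Spark itself performs. In PySpark the same dict could be read from `df.schema["label"].metadata`, where `df` is an assumed DataFrame name.

```python
# Column metadata exactly as quoted in this post. In PySpark it would come
# from df.schema["label"].metadata (df is a hypothetical DataFrame name).
label_metadata = {
    "ml_attr": {
        "vals": ["bad", "good", "__unknown"],
        "type": "nominal",
        "name": "label",
    }
}

# Distinct label values as shown in the table in this post.
observed_labels = [0.0, 1.0]


def count_metadata_categories(metadata):
    """Return how many nominal values the ml_attr metadata declares."""
    return len(metadata.get("ml_attr", {}).get("vals", []))


def describe_mismatch(metadata, labels):
    """Compare the declared category count with the distinct labels in the data."""
    declared = count_metadata_categories(metadata)
    observed = len(set(labels))
    return declared, observed, declared != observed


declared, observed, mismatch = describe_mismatch(label_metadata, observed_labels)
print(f"metadata declares {declared} values, data contains {observed}, mismatch={mismatch}")
```

The metadata declares three values because of the extra `__unknown` entry, while the data only contains two labels. That entry looks like what StringIndexer adds when `handleInvalid` is set to "keep", but the indexing step is not shown here, so that part is an assumption.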


Regards,

Mina



On Tue, Nov 5, 2019 at 3:55 PM Mina Aslani <[hidden email]> wrote:
Hi,

I am getting the following exception when using OneHotEncoderEstimator with MultilayerPerceptronClassifier in PySpark (version 2.4.4):

'requirement failed: OneHotEncoderModel expected x categorical values for input column label, but the input column had metadata specifying n values.'

Using LogisticRegression, RandomForestClassifier or LinearRegression works fine for the same data and OneHotEncoderEstimator.

Any insight on how to resolve this?

Regards,
Mina