[PySpark join] Resolved attribute(s) missing from... Attribute(s) with the same name appear in the operation


Buckler, Christine

Does anyone know what would cause this type of error? I can’t find anything wrong with the dataframes or the join method.

 

Dataframe columns:

neighbors_for_one.columns: ['style_id', 'colorcode', 'asset_id', 'distance']

apparel_seeds.columns: ['style_number', 'style_id', 'colorcode', 'parent_pt_id', 'pt_id', 'gender', 'age', 'asset_id', 'avail', 'price', 'price_facet', 'lookid', 'slotid']

Job: neighbors_for_one.join(neighbors_for_one, on=['style_id', 'colorcode'], how='left')

 

Error:

'Resolved attribute(s) style_id#16377 missing from avail#16494,style_id#16496,colorcode#16497,seed#731,seed_styleid#733,asset_id#16495,price_facet#16513,price#16514,colorcode#727,slotid#729,looktype#730,style_id#357,pt_id#16499,sub#732,parent_pt_id#16498,seed_colorcode#734,lookid#728,age#16502,gender#16504,style_number#16507 in operator !Join LeftOuter, ((style_id#16377 = style_id#16496) && (colorcode#727 = colorcode#16497)). Attribute(s) with the same name appear in the operation: style_id. Please check if the right attribute(s) are used.;;\nProject [style_id#357, colorcode#340, asset_id#12, distance#15330, asset_id#16495, distance#16317]\n+- Join LeftOuter, ((style_id#357 = style_id#16377) && (colorcode#340 = colorcode#16437))\n :- Project [style_id#357, colorcode#340, asset_id#12, distance#15330]\n : +- GlobalLimit 5\n : +- LocalLimit 5\n : +- Sort [distance#15330 ASC NULLS FIRST], true\n : +- Project [asset_id#12, style_number#24, style_id#357, colorcode#340, parent_pt_id#15, pt_id#16, gender#21, avail#11, resNet50#1762, feature_vector#3096, feature_cbrt#3107, feature_pca#3119, feature_vector_transformed#3132, feature_style#3146, hashes#15313, UDF(feature_style#3146) AS distance#15330]\n : +- Filter UDF(hashes#15313)\n : +- Project [asset_id#12, style_number#24, style_id#357, colorcode#340, parent_pt_id#15, pt_id#16, gender#21, avail#11, resNet50#1762, feature_vector#3096, feature_cbrt#3107, feature_pca#3119, feature_vector_transformed#3132, feature_style#3146, UDF(feature_style#3146) AS hashes#15313]\n : +- Project [asset_id#12, style_number#24, style_id#357, colorcode#340, parent_pt_id#15, pt_id#16, gender#21, avail#11, resNet50#1762, feature_vector#3096, feature_cbrt#3107, feature_pca#3119, feature_vector_transformed#3132, <lambda>(feature_vector_transformed#3132) AS feature_style#3146]\n : +- Project [asset_id#12, style_number#24, style_id#357, colorcode#340, parent_pt_id#15, pt_id#16, gender#21, avail#11, resNet50#1762, feature_vector#3096, 
feature_cbrt#3107, feature_pca#3119, UDF(feature_pca#3119) AS feature_vector_transformed#3132]\n : +- Project [asset_id#12, style_number#24, style_id#357, colorcode#340, parent_pt_id#15, pt_id#16, gender#21, avail#11, resNet50#1762, feature_vector#3096, feature_cbrt#3107, UDF(feature_cbrt#3107) AS feature_pca#3119]\n : +- Project [asset_id#12, style_number#24, style_id#357, colorcode#340, parent_pt_id#15, pt_id#16, gender#21, avail#11, resNet50#1762, feature_vector#3096, <lambda>(feature_vector#3096) AS feature_cbrt#3107]\n : +- Project [asset_id#12, style_number#24, style_id#357, colorcode#340, parent_pt_id#15, pt_id#16, gender#21, avail#11, resNet50#1762, <lambda>(resNet50#1762) AS feature_vector#3096]\n : +- Filter isnotnull(resNet50#1762)\n : +- Project [asset_id#12, style_number#24, style_id#357, colorcode#340, parent_pt_id#15, pt_id#16, gender#21, avail#11, resNet50#1762]\n : +- Join LeftOuter, (asset_id#12 = asset_id#1767)\n : :- Project [style_number#24, style_id#357, colorcode#340, parent_pt_id#15, pt_id#16, gender#21, asset_id#12, avail#11]\n : : +- Deduplicate [style_id#357, colorcode#340]\n : : +- Filter (avail#11 = true)\n : : +- Project [style_number#24, style_id#357, colorcode#340, parent_pt_id#15, pt_id#16, gender#21, age#19, asset_id#12, avail#11, price#31, price_facet#30, lookid#341, slotid#342]\n : : +- Filter (((looktype#343 = Apparel) && (seed#344 = 1)) && (check_looktype#703 = apparel))\n : : +- Project [lookid#341, style_id#357, colorcode#340, slotid#342, looktype#343, seed#344, sub#345, seed_styleid#346, seed_colorcode#347, style_number#24, parent_pt_id#15, pt_id#16, gender#21, age#19, asset_id#12, avail#11, price#31, price_facet#30, check_looktype#703]\n : : +- Join LeftOuter, (lookid#341 = lookid#728)\n : : :- Project [style_id#357, colorcode#340, lookid#341, slotid#342, looktype#343, seed#344, sub#345, seed_styleid#346, seed_colorcode#347, style_number#24, parent_pt_id#15, pt_id#16, gender#21, age#19, asset_id#12, avail#11, price#31, 
price_facet#30]\n : : : +- Join LeftOuter, ((style_id#357 = style_id#13) && (colorcode#340 = colorcode#14))\n : : : :- Project [styleid#339 AS style_id#357, colorcode#340, lookid#341, slotid#342, looktype#343, seed#344, sub#345, seed_styleid#346, seed_colorcode#347]\n : : : : +- Relation[styleid#339,colorcode#340,lookid#341,slotid#342,looktype#343,seed#344,sub#345,seed_styleid#346,seed_colorcode#347] csv\n : : : +- Project [style_number#24, style_id#13, colorcode#14, parent_pt_id#15, pt_id#16, gender#21, age#19, asset_id#12, avail#11, price#31, price_facet#30]\n : : : +- RepartitionByExpression [style_id#13, colorcode#14], 200\n : : : +- Project [style_number#24, style_id#13, colorcode#14, parent_pt_id#15, pt_id#16, parent_pt#17, pt#18, gender#21, age#19, asset_id#12, avail#11, price#31, price_facet#30]\n : : : +- Deduplicate [style_id#13, colorcode#14]\n : : : +- Relation[_c0#10,avail#11,asset_id#12,style_id#13,colorcode#14,parent_pt_id#15,pt_id#16,parent_pt#17,pt#18,age#19,size#20,gender#21,brand_id#22,brand#23,style_number#24,url#25,has_flat#26,main_shottype#27,amp_rgb#28,colorfamily#29,price_facet#30,price#31,min_price#32,max_price#33] csv\n : : +- Project [lookid#728, check_looktype#703]\n : : +- Project [lookid#728, parent_pts_in_lookid#694, gender_in_lookid#696, age_in_lookid#698, check_looktype#703, look_gender#709, <lambda>(age_in_lookid#698) AS look_age#716]\n : : +- Project [lookid#728, parent_pts_in_lookid#694, gender_in_lookid#696, age_in_lookid#698, check_looktype#703, <lambda>(gender_in_lookid#696) AS look_gender#709]\n : : +- Project [lookid#728, parent_pts_in_lookid#694, gender_in_lookid#696, age_in_lookid#698, <lambda>(parent_pts_in_lookid#694) AS check_looktype#703]\n : : +- Aggregate [lookid#728], [lookid#728, collect_list(parent_pt_id#15, 0, 0) AS parent_pts_in_lookid#694, collect_list(gender#21, 0, 0) AS gender_in_lookid#696, collect_list(age#19, 0, 0) AS age_in_lookid#698]\n : : +- Filter ((looktype#730 = Apparel) && (seed#731 = 1))\n : : +- 
Project [style_id#357, colorcode#727, lookid#728, slotid#729, looktype#730, seed#731, sub#732, seed_styleid#733, seed_colorcode#734, style_number#24, parent_pt_id#15, pt_id#16, gender#21, age#19, asset_id#12, avail#11, price#31, price_facet#30]\n : : +- Join LeftOuter, ((style_id#357 = style_id#13) && (colorcode#727 = colorcode#14))\n : : :- Project [styleid#726 AS style_id#357, colorcode#727, lookid#728, slotid#729, looktype#730, seed#731, sub#732, seed_styleid#733, seed_colorcode#734]\n : : : +- Relation[styleid#726,colorcode#727,lookid#728,slotid#729,looktype#730,seed#731,sub#732,seed_styleid#733,seed_colorcode#734] csv\n : : +- Project [style_number#24, style_id#13, colorcode#14, parent_pt_id#15, pt_id#16, gender#21, age#19, asset_id#12, avail#11, price#31, price_facet#30]\n : : +- RepartitionByExpression [style_id#13, colorcode#14], 200\n : : +- Project [style_number#24, style_id#13, colorcode#14, parent_pt_id#15, pt_id#16, parent_pt#17, pt#18, gender#21, age#19, asset_id#12, avail#11, price#31, price_facet#30]\n : : +- Deduplicate [style_id#13, colorcode#14]\n : : +- Relation[_c0#10,avail#11,asset_id#12,style_id#13,colorcode#14,parent_pt_id#15,pt_id#16,parent_pt#17,pt#18,age#19,size#20,gender#21,brand_id#22,brand#23,style_number#24,url#25,has_flat#26,main_shottype#27,amp_rgb#28,colorfamily#29,price_facet#30,price#31,min_price#32,max_price#33] csv\n : +- RepartitionByExpression [asset_id#1767], 200\n : +- Deduplicate [asset_id#1767]\n : +- Project [assetId#1761 AS asset_id#1767, resNet50#1762]\n : +- Relation[assetId#1761,resNet50#1762,stylegroupColors#1763] json\n +- Project [style_id#16377, colorcode#16437, asset_id#16495, distance#16317]\n +- GlobalLimit 5\n +- LocalLimit 5\n +- Sort [distance#16317 ASC NULLS FIRST], true\n +- Project [asset_id#16495, style_number#16507, style_id#16377, colorcode#16437, parent_pt_id#16498, pt_id#16499, gender#16504, avail#16494, resNet50#1762, feature_vector#3096, feature_cbrt#3107, feature_pca#3119, 
feature_vector_transformed#3132, feature_style#3146, hashes#15313, UDF(feature_style#3146) AS distance#16317]\n +- Filter UDF(hashes#15313)\n +- Project [asset_id#16495, style_number#16507, style_id#16377, colorcode#16437, parent_pt_id#16498, pt_id#16499, gender#16504, avail#16494, resNet50#1762, feature_vector#3096, feature_cbrt#3107, feature_pca#3119, feature_vector_transformed#3132, feature_style#3146, UDF(feature_style#3146) AS hashes#15313]\n +- Project [asset_id#16495, style_number#16507, style_id#16377, colorcode#16437, parent_pt_id#16498, pt_id#16499, gender#16504, avail#16494, resNet50#1762, feature_vector#3096, feature_cbrt#3107, feature_pca#3119, feature_vector_transformed#3132, <lambda>(feature_vector_transformed#3132) AS feature_style#3146]\n +- Project [asset_id#16495, style_number#16507, style_id#16377, colorcode#16437, parent_pt_id#16498, pt_id#16499, gender#16504, avail#16494, resNet50#1762, feature_vector#3096, feature_cbrt#3107, feature_pca#3119, UDF(feature_pca#3119) AS feature_vector_transformed#3132]\n +- Project [asset_id#16495, style_number#16507, style_id#16377, colorcode#16437, parent_pt_id#16498, pt_id#16499, gender#16504, avail#16494, resNet50#1762, feature_vector#3096, feature_cbrt#3107, UDF(feature_cbrt#3107) AS feature_pca#3119]\n +- Project [asset_id#16495, style_number#16507, style_id#16377, colorcode#16437, parent_pt_id#16498, pt_id#16499, gender#16504, avail#16494, resNet50#1762, feature_vector#3096, <lambda>(feature_vector#3096) AS feature_cbrt#3107]\n +- Project [asset_id#16495, style_number#16507, style_id#16377, colorcode#16437, parent_pt_id#16498, pt_id#16499, gender#16504, avail#16494, resNet50#1762, <lambda>(resNet50#1762) AS feature_vector#3096]\n +- Filter isnotnull(resNet50#1762)\n +- Project [asset_id#16495, style_number#16507, style_id#16377, colorcode#16437, parent_pt_id#16498, pt_id#16499, gender#16504, avail#16494, resNet50#1762]\n +- Join LeftOuter, (asset_id#16495 = asset_id#1767)\n :- Project [style_number#16507, 
style_id#16377, colorcode#16437, parent_pt_id#16498, pt_id#16499, gender#16504, asset_id#16495, avail#16494]\n : +- Deduplicate [style_id#16377, colorcode#16437]\n : +- Filter (avail#16494 = true)\n : +- Project [style_number#16507, style_id#16377, colorcode#16437, parent_pt_id#16498, pt_id#16499, gender#16504, age#16502, asset_id#16495, avail#16494, price#16514, price_facet#16513, lookid#16438, slotid#16439]\n : +- Filter (((looktype#16440 = Apparel) && (seed#16441 = 1)) && (check_looktype#703 = apparel))\n : +- Project [lookid#16438, style_id#16377, colorcode#16437, slotid#16439, looktype#16440, seed#16441, sub#16442, seed_styleid#16443, seed_colorcode#16444, style_number#16507, parent_pt_id#16498, pt_id#16499, gender#16504, age#16502, asset_id#16495, avail#16494, price#16514, price_facet#16513, check_looktype#703]\n : +- Join LeftOuter, (lookid#16438 = lookid#728)\n : :- Project [style_id#16377, colorcode#16437, lookid#16438, slotid#16439, looktype#16440, seed#16441, sub#16442, seed_styleid#16443, seed_colorcode#16444, style_number#16507, parent_pt_id#16498, pt_id#16499, gender#16504, age#16502, asset_id#16495, avail#16494, price#16514, price_facet#16513]\n : : +- Join LeftOuter, ((style_id#16377 = style_id#16496) && (colorcode#16437 = colorcode#16497))\n : : :- Project [styleid#16436 AS style_id#16377, colorcode#16437, lookid#16438, slotid#16439, looktype#16440, seed#16441, sub#16442, seed_styleid#16443, seed_colorcode#16444]\n : : : +- Relation[styleid#16436,colorcode#16437,lookid#16438,slotid#16439,looktype#16440,seed#16441,sub#16442,seed_styleid#16443,seed_colorcode#16444] csv\n : : +- Project [style_number#16507, style_id#16496, colorcode#16497, parent_pt_id#16498, pt_id#16499, gender#16504, age#16502, asset_id#16495, avail#16494, price#16514, price_facet#16513]\n : : +- RepartitionByExpression [style_id#16496, colorcode#16497], 200\n : : +- Project [style_number#16507, style_id#16496, colorcode#16497, parent_pt_id#16498, pt_id#16499, parent_pt#16500, 
pt#16501, gender#16504, age#16502, asset_id#16495, avail#16494, price#16514, price_facet#16513]\n : : +- Deduplicate [style_id#16496, colorcode#16497]\n : : +- Relation[_c0#16493,avail#16494,asset_id#16495,style_id#16496,colorcode#16497,parent_pt_id#16498,pt_id#16499,parent_pt#16500,pt#16501,age#16502,size#16503,gender#16504,brand_id#16505,brand#16506,style_number#16507,url#16508,has_flat#16509,main_shottype#16510,amp_rgb#16511,colorfamily#16512,price_facet#16513,price#16514,min_price#16515,max_price#16516] csv\n : +- Project [lookid#728, check_looktype#703]\n : +- Project [lookid#728, parent_pts_in_lookid#694, gender_in_lookid#696, age_in_lookid#698, check_looktype#703, look_gender#709, <lambda>(age_in_lookid#698) AS look_age#716]\n : +- Project [lookid#728, parent_pts_in_lookid#694, gender_in_lookid#696, age_in_lookid#698, check_looktype#703, <lambda>(gender_in_lookid#696) AS look_gender#709]\n : +- Project [lookid#728, parent_pts_in_lookid#694, gender_in_lookid#696, age_in_lookid#698, <lambda>(parent_pts_in_lookid#694) AS check_looktype#703]\n : +- Aggregate [lookid#728], [lookid#728, collect_list(parent_pt_id#16498, 0, 0) AS parent_pts_in_lookid#694, collect_list(gender#16504, 0, 0) AS gender_in_lookid#696, collect_list(age#16502, 0, 0) AS age_in_lookid#698]\n : +- Filter ((looktype#730 = Apparel) && (seed#731 = 1))\n : +- !Project [style_id#16377, colorcode#727, lookid#728, slotid#729, looktype#730, seed#731, sub#732, seed_styleid#733, seed_colorcode#734, style_number#16507, parent_pt_id#16498, pt_id#16499, gender#16504, age#16502, asset_id#16495, avail#16494, price#16514, price_facet#16513]\n : +- !Join LeftOuter, ((style_id#16377 = style_id#16496) && (colorcode#727 = colorcode#16497))\n : :- Project [styleid#726 AS style_id#357, colorcode#727, lookid#728, slotid#729, looktype#730, seed#731, sub#732, seed_styleid#733, seed_colorcode#734]\n : : +- 
Relation[styleid#726,colorcode#727,lookid#728,slotid#729,looktype#730,seed#731,sub#732,seed_styleid#733,seed_colorcode#734] csv\n : +- Project [style_number#16507, style_id#16496, colorcode#16497, parent_pt_id#16498, pt_id#16499, gender#16504, age#16502, asset_id#16495, avail#16494, price#16514, price_facet#16513]\n : +- RepartitionByExpression [style_id#16496, colorcode#16497], 200\n : +- Project [style_number#16507, style_id#16496, colorcode#16497, parent_pt_id#16498, pt_id#16499, parent_pt#16500, pt#16501, gender#16504, age#16502, asset_id#16495, avail#16494, price#16514, price_facet#16513]\n : +- Deduplicate [style_id#16496, colorcode#16497]\n : +- Relation[_c0#16493,avail#16494,asset_id#16495,style_id#16496,colorcode#16497,parent_pt_id#16498,pt_id#16499,parent_pt#16500,pt#16501,age#16502,size#16503,gender#16504,brand_id#16505,brand#16506,style_number#16507,url#16508,has_flat#16509,main_shottype#16510,amp_rgb#16511,colorfamily#16512,price_facet#16513,price#16514,min_price#16515,max_price#16516] csv\n +- RepartitionByExpression [asset_id#1767], 200\n +- Deduplicate [asset_id#1767]\n +- Project [assetId#1761 AS asset_id#1767, resNet50#1762]\n +- Relation[assetId#1761,resNet50#1762,stylegroupColors#1763] json\n'
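
The first sentence of the message is easier to read once the attributes are grouped by name. Below is a minimal sketch in plain Python that does this. The helper name summarize_missing is illustrative and not part of Spark, and the excerpt is a shortened copy of the message above. It only parses the text and makes no claim about why the analyzer produced these IDs.

```python
import re
from collections import defaultdict

# Matches Catalyst-style attribute references such as style_id#16377.
ATTR = re.compile(r"([A-Za-z_]\w*)#(\d+)")


def summarize_missing(message):
    """Split a 'Resolved attribute(s) ... missing from ...' message into
    two dicts (missing, available), each mapping name -> sorted list of IDs."""
    # Only the part before " in operator" lists the attributes of interest.
    head = message.split(" in operator", 1)[0]
    missing_part, _, available_part = head.partition(" missing from ")

    def group(text):
        ids_by_name = defaultdict(set)
        for name, attr_id in ATTR.findall(text):
            ids_by_name[name].add(int(attr_id))
        return {name: sorted(ids) for name, ids in ids_by_name.items()}

    return group(missing_part), group(available_part)


# Shortened excerpt of the error above (assumption: the omitted attributes
# do not matter for this illustration).
excerpt = (
    "Resolved attribute(s) style_id#16377 missing from "
    "avail#16494,style_id#16496,colorcode#16497,colorcode#727,style_id#357 "
    "in operator !Join LeftOuter, ((style_id#16377 = style_id#16496) "
    "&& (colorcode#727 = colorcode#16497))."
)

missing, available = summarize_missing(excerpt)
print("missing:  ", missing)
print("style_id: ", available["style_id"])
print("colorcode:", available["colorcode"])
```

On this excerpt, the join condition asks for style_id#16377, while the inputs listed after "missing from" only offer style_id#357 and style_id#16496. This matches the plan above, where the left child of the failing !Join still projects styleid#726 AS style_id#357.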

 

Thanks,

Christine