You should check the row-group size in your Parquet files and possibly tweak it.
If I remember correctly, when Parquet detects too much cardinality in a row group, predicate push-down is not enabled and you are forced to read the full row group even if you only need a single row.
Check the schema of your Parquet files with parquet-tools (you don't need Spark for this) and tune how Spark writes them:
hadoop jar /.../parquet-tools-<VERSION>.jar <command> my_parquet_file.parquet
You may also have a look at your hadoopConfiguration, and in particular at:
On Fri, Sep 20, 2019 at 3:37 PM, Tomas Bartalos <[hidden email]> wrote:
I forgot to mention an important part: I'm issuing the same query to both Parquet files, selecting only one column: