Which predicate pushdowns work or do not work with Parquet?

Manuel Vonthron
Hi all,

I am trying to determine which predicate pushdowns work and which do not with Spark+Parquet (mostly for versions 2.1.0 and/or 2.2.0).

I've read a lot of messages in pull request comments, JIRA tickets, and even the comments in Parquet's source, but it's hard to get a clear picture of when a pushdown is honoured depending on:
  - the data type (Int? String? Timestamp?)
  - operator involved (isNull, >=, ...)
  - and even the column name (is there a "." in it or not?) 
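
One way to compare cases across types, operators and column names is to look at the `PushedFilters: [...]` entry that Spark prints in the physical plan (for example via `df.filter(...).explain()`). Below is a minimal sketch of a helper that extracts that list from plan text. The helper name `pushed_filters` and the sample plan string are hypothetical and only for illustration; the plan line imitates the usual Spark 2.x layout and was not taken from a real run.

```python
# Hypothetical helper: extract the entries of "PushedFilters: [...]"
# from the text of a Spark physical plan (e.g. pasted from explain()).
MARKER = "PushedFilters: ["


def pushed_filters(plan_text):
    """Return the filters listed after 'PushedFilters:' as a list of strings."""
    start = plan_text.find(MARKER)
    if start < 0:
        return []  # no file scan with pushed filters in this plan
    filters, current, depth = [], [], 0
    for ch in plan_text[start + len(MARKER):]:
        if ch in "([":
            depth += 1
        elif ch in ")]":
            if depth == 0:  # this is the bracket closing the PushedFilters list
                break
            depth -= 1
        if ch == "," and depth == 0:
            # a top-level comma separates two filters
            filters.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        filters.append(tail)
    return filters


# Illustrative plan line (assumed format, not real output).
SAMPLE_PLAN = (
    "*FileScan parquet [a#0] Batched: true, Format: Parquet, "
    "Location: InMemoryFileIndex[file:/tmp/example], PartitionFilters: [], "
    "PushedFilters: [IsNotNull(a), GreaterThan(a,5)], ReadSchema: struct<a:int>"
)

print(pushed_filters(SAMPLE_PLAN))  # ['IsNotNull(a)', 'GreaterThan(a,5)']
```

Note that this only shows which filters Spark hands to the data source. As far as I understand, a filter appearing there is necessary but not sufficient: whether Parquet actually uses it to skip row groups for a given type is a separate question, so an empty list is the more conclusive result.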

The only types I consistently got working, both in my tests and from what I've read, are "regular numbers", but support for Strings and Timestamps is crucial for my use case.

Do you have any "reference" on this subject?

Additionally, here is a test I've been running, with its results:

There might be errors or misconfigurations, but the TL;DR is: I only got INTs and BOOLs to work reliably with no weirdness :|


