Within each partition, I'd like the first 4 rows to return null/NaN, since there aren't enough rows to form a true "last 5"; this is the default behavior of a rolling mean in pandas. Instead, Spark computes the mean of whatever rows happen to fall inside the window, even if the window contains only one row.
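To show what I mean, here's the pandas behavior I'm referring to (hypothetical data, with a `group` column standing in for the Spark partition key) — `rolling(5)` defaults `min_periods` to the window size, so the first 4 rows of each group come back as NaN:

```python
import pandas as pd

# Hypothetical example: "group" stands in for the Spark partition key.
df = pd.DataFrame({
    "group": ["a"] * 6,
    "value": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

# rolling(5) defaults min_periods to 5, so rows 1-4 of each group are NaN.
rolled = df.groupby("group")["value"].rolling(5).mean().reset_index(drop=True)
print(rolled.tolist())  # → [nan, nan, nan, nan, 3.0, 4.0]
```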
Is there a simple way to do this already built into Spark? It seems like a common need, so I wonder if I'm missing something.