Hi,

I was exploring Spark and, in the process, tried to search a column containing URLs. Basically, we are applying a contains operation on the column. The query takes more than 3 minutes to return results. Is there any way to optimize it?
.filter(line => line.contains("someUrl"))

I currently have a system in standalone mode with 8 GB of RAM. Everything is stored in memory in deserialized format. The data size in memory (deserialized) is around 1 GB.
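
For reference, the setup described above might look roughly like the following sketch. It is only an illustration under stated assumptions: the application name, the object and helper names, and the input path are placeholders, not the actual job, and the master URL is assumed to come from spark-submit. It caches the lines in deserialized form, runs one action to fill the cache, and then times the contains filter against the cached data.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object UrlContainsFilter {
  // Hypothetical helper: the same substring test used in the filter above.
  def containsUrl(line: String, url: String): Boolean = line.contains(url)

  def main(args: Array[String]): Unit = {
    // App name is a placeholder; the master URL is expected from spark-submit.
    val conf = new SparkConf().setAppName("UrlContainsFilter")
    val sc = new SparkContext(conf)

    // Placeholder path. MEMORY_ONLY keeps partitions in memory as deserialized objects.
    val lines = sc.textFile("/path/to/urls.txt").persist(StorageLevel.MEMORY_ONLY)

    // The first action reads the input and fills the cache.
    lines.count()

    // Time the substring filter against the cached data only.
    val start = System.nanoTime()
    val matches = lines.filter(line => containsUrl(line, "someUrl")).count()
    val elapsedMs = (System.nanoTime() - start) / 1000000
    println(s"matches=$matches elapsedMs=$elapsedMs")

    sc.stop()
  }
}
```

Timing the filter separately from the first action shows whether the time goes into loading the data or into the scan itself.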
Any suggestions?

Thanks in advance.

Regards,
SB