How to validate orc vectorization is working within spark application?

umargeek
Hi Folks,

I have enabled the configurations listed below in my Spark Streaming
application, but I did not see any performance benefit after setting them.
Is there a way to validate whether ORC vectorization is enabled and working
as expected?

Note: I am using Spark 2.3 and have converted all the data in my application
to ORC format.

    sparkSqlCtx.setConf("spark.sql.orc.filterPushdown", "true")
    sparkSqlCtx.setConf("spark.sql.orc.enabled", "true")
    sparkSqlCtx.setConf("spark.sql.hive.convertMetastoreOrc", "true")
    sparkSqlCtx.setConf("spark.sql.orc.char.enabled", "true")
    sparkSqlCtx.setConf("spark.sql.orc.impl","native")
    sparkSqlCtx.setConf("spark.sql.orc.enableVectorizedReader","true")

Thanks,
Umar



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: [hidden email]

Re: How to validate orc vectorization is working within spark application?

umargeek
Hi Folks,

I would appreciate a few pointers on the above question about ORC
vectorization. Looking forward to support from the community.

Thanks,
Umar



Re: How to validate orc vectorization is working within spark application?

Jörn Franke
Can you share the full code? What performance did you expect, and what are
you actually seeing? What is the use case?

> On 20. Jun 2018, at 05:33, umargeek <[hidden email]> wrote:
>
> Hi Folks,
>
> I would just require few pointers on the above query w.r.t vectorization
> looking forward for support from the community.
>
> Thanks,
> Umar

Re: How to validate orc vectorization is working within spark application?

umargeek
Hello Jörn,

I am unable to post the entire code because of data-sharing restrictions.

Use case: every minute I read data from an HDFS file and perform
aggregations on it. I would like to understand how to run this with
vectorization enabled and what the prerequisites are for enabling it
successfully.

Thanks,
Umar
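
One way to check the question raised in this thread is to inspect the physical
plan: for a file-based ORC scan, Spark prints a "Batched" entry on the FileScan
node, and "Batched: true" suggests the vectorized (columnar batch) reader was
selected. The sketch below is only an illustration under assumptions: the
object name, the local master, the /tmp path and the generated sample data are
all hypothetical, and it uses a plain batch read rather than the streaming job
discussed above. As far as I know, in Spark 2.3 the native reader only batches
when every column in the read schema is an atomic type (no struct, array or
map) and whole-stage codegen is enabled, so a "Batched: false" result is worth
checking against those conditions.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical stand-alone check; names, path and data are illustrative only.
object OrcVectorizationCheck {

  // True when the plan text reports a batched (vectorized) file scan.
  def isBatched(planText: String): Boolean =
    planText.contains("Batched: true")

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("orc-vectorization-check")
      .master("local[*]") // assumption: local run for a quick test
      .config("spark.sql.orc.impl", "native")
      .config("spark.sql.orc.enableVectorizedReader", "true")
      .getOrCreate()

    // Assumption: a scratch location; replace with a real ORC path.
    val path = "/tmp/orc_vectorization_check"

    // Write a small ORC data set with atomic (long) columns only.
    spark.range(0, 1000)
      .selectExpr("id", "id % 10 AS bucket")
      .write.mode("overwrite").orc(path)

    // A simple aggregation over the ORC data, similar in shape to the use case.
    val df = spark.read.orc(path).groupBy("bucket").count()

    // Prints the physical plan; look for "FileScan orc ... Batched: ...".
    df.explain()

    // Same check done programmatically on the plan text.
    val planText = df.queryExecution.executedPlan.toString
    println(s"Vectorized ORC scan reported by plan: ${isBatched(planText)}")

    spark.stop()
  }
}
```

For tables read through the Hive metastore, the same plan inspection applies,
but the scan only appears as a "FileScan orc" node when
spark.sql.hive.convertMetastoreOrc takes effect; otherwise the plan shows a
Hive table scan and the native vectorized reader is not involved.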


