YARN log aggregation of Spark Streaming job

By default, YARN aggregates logs only after an application completes. However,
I am trying to aggregate logs for a Spark Streaming job, which in theory will
run forever. I have set the following properties for log aggregation and
restarted YARN by restarting hadoop-yarn-nodemanager on the core and task nodes
and hadoop-yarn-resourcemanager on the master node of my EMR cluster. I can see
my changes at http://node-ip:8088/conf.

yarn.log-aggregation-enable => true
yarn.log-aggregation.retain-seconds => 172800
yarn.log-aggregation.retain-check-interval-seconds => -1
yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds => 3600
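
For anyone who wants to double-check that these values are really the ones a daemon has loaded, here is a rough Python sketch that parses the XML served by a /conf page and compares it with the four settings above. The function names (parse_conf, find_mismatches, fetch_conf) are made up for illustration, and the sketch assumes the page returns the usual Hadoop <configuration>/<property>/<name>/<value> layout. Note that port 8088 is the ResourceManager; since the roll-monitoring property is read by the NodeManager, it may be worth pointing the same check at each NodeManager's web port as well (assumed to expose /conf in the same way).

```python
import urllib.request
import xml.etree.ElementTree as ET

# The four settings from this post, as strings (the /conf page serves text values).
EXPECTED = {
    "yarn.log-aggregation-enable": "true",
    "yarn.log-aggregation.retain-seconds": "172800",
    "yarn.log-aggregation.retain-check-interval-seconds": "-1",
    "yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds": "3600",
}


def parse_conf(xml_text):
    """Turn Hadoop-style configuration XML into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    conf = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            conf[name.strip()] = (prop.findtext("value") or "").strip()
    return conf


def find_mismatches(conf, expected=EXPECTED):
    """Return {name: (expected, actual)} for every setting that differs.

    A missing property shows up with actual == None.
    """
    return {
        name: (want, conf.get(name))
        for name, want in expected.items()
        if conf.get(name) != want
    }


def fetch_conf(url, timeout=10):
    """Download a /conf page (hypothetical helper) and parse it."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_conf(resp.read().decode("utf-8"))
```

Calling find_mismatches(fetch_conf("http://node-ip:8088/conf")) should return an empty dict if all four values are active on that daemon. Anything it does return is a setting that is missing or different from what was intended.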

All the articles and resources I have found say only to set the
yarn.nodemanager.log-aggregation.roll-monitoring-interval-seconds property,
after which YARN will start aggregating logs for running jobs. But it is not
working in my case.
