Spark events log behavior in interactive vs batch job

Sriram Ganesh
Hi,

I am writing Spark event logs and application logs to blob storage, using similar paths for both.
For example: spark.eventLog.dir = wasb://<containerName>@<storageAccountURI>/logs and the application log dir = wasb://<containerName>@<storageAccountURI>/logs/app/.
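
To make the layout concrete, here is a minimal sketch of how the two paths relate. The helper name `build_log_paths`, the `app.log.dir` key, and the sample container and account values are hypothetical and only for illustration; `spark.eventLog.enabled` and `spark.eventLog.dir` are the actual Spark properties involved.

```python
def build_log_paths(container: str, storage_account_uri: str) -> dict:
    """Build the event log and application log paths for one WASB container."""
    # Shared root used by both Spark event logs and application logs.
    root = f"wasb://{container}@{storage_account_uri}/logs"
    return {
        "spark.eventLog.enabled": "true",
        "spark.eventLog.dir": root,
        # Hypothetical key for illustration; this is not a Spark property.
        "app.log.dir": f"{root}/app/",
    }


# Sample values are placeholders, not real resources.
paths = build_log_paths("mycontainer", "myaccount.blob.core.windows.net")
for key, value in paths.items():
    print(f"{key}={value}")
```

Because the application log directory is nested under the event log directory in this layout, anything that creates the application log folder also brings the parent logs folder into existence.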

Since I'm using blob storage, the root directory has to be created with a placeholder file; otherwise the write fails. However, I'm not creating a placeholder file.

For a batch job, where the Spark driver runs in cluster mode, this works because my application logging takes care of creating the folders. For an interactive job, which runs in client mode, it fails because I'm not creating a placeholder file.

I would like to understand how Spark emits events in an interactive job versus a batch job. I suspect the point in time at which the events are emitted is causing this issue.

Can someone help me understand this better?

--
Sriram G
Tech