I am required to use a specific user id to save files on a remote HDFS cluster. Remote in the sense that the Spark jobs run on EMR and write to a CDH cluster, so I cannot change hdfs-site.xml etc. to point to the destination cluster. As a result I am using WebHDFS to save the files into it.
There are a few challenges I have with this approach:
1. I cannot use the nameservice of the NameNode and have to specify the IP address of the active NameNode, which is risky in the event of a failover
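For reference, the write I am doing is the standard two-step WebHDFS CREATE: the first PUT to the NameNode returns a 307 redirect whose Location header points at a DataNode, and the second PUT sends the bytes there. A minimal stdlib sketch of that flow (the NameNode address, port 50070, path, and user below are placeholders for illustration; Hadoop 3 clusters listen on 9870 instead):

```python
import urllib.error
import urllib.parse
import urllib.request

def create_url(namenode: str, hdfs_path: str, user: str) -> str:
    """Build the WebHDFS CREATE URL for the first (NameNode) request."""
    query = urllib.parse.urlencode(
        {"op": "CREATE", "user.name": user, "overwrite": "true"})
    return f"http://{namenode}/webhdfs/v1{hdfs_path}?{query}"

class _NoRedirect(urllib.request.HTTPRedirectHandler):
    # WebHDFS answers the first PUT with a 307 pointing at a DataNode;
    # we must capture that Location header rather than follow it blindly.
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None

def webhdfs_write(namenode: str, hdfs_path: str, user: str, data: bytes) -> None:
    opener = urllib.request.build_opener(_NoRedirect)
    req = urllib.request.Request(
        create_url(namenode, hdfs_path, user), method="PUT")
    try:
        opener.open(req)
        raise RuntimeError("expected a 307 redirect from the NameNode")
    except urllib.error.HTTPError as e:
        if e.code != 307:
            raise
        datanode_url = e.headers["Location"]
    # Second PUT streams the bytes to the DataNode; 201 Created on success.
    req2 = urllib.request.Request(datanode_url, data=data, method="PUT")
    urllib.request.urlopen(req2)

# Hypothetical usage against the active NameNode's IP (the problem: this
# address is hard-coded, so a failover to the standby breaks the job):
# webhdfs_write("10.0.0.1:50070", "/user/etl_user/out/part-0000.csv",
#               "etl_user", b"some,csv,rows\n")
```

The first request points at whichever NameNode is currently active, which is exactly why the hard-coded IP in challenge 1 is a problem.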