Problem in persisting file in S3 using Spark: xxx file does not exist Exception


Marco Mistroni
Hi all,
I am using the following code to persist data to S3 (the AWS keys are already stored in environment variables):

dataFrame.coalesce(1).write.format("com.databricks.spark.csv").save(fileName)

However, I keep receiving an exception saying that the file does not exist.

Here is what the logs show:

18/04/24 22:15:32 INFO Persiste: Persisting data to text file: s3://ec2-bucket-mm-spark/form4-results-2404.results
Exception in thread "main" java.io.IOException: /form4-results-2404.results doesn't exist

It seems that Spark expects the file to be there before writing, which seems bizarre.

I have even tried removing the coalesce, but still got the same exception.
Could anyone help, please?
Kind regards,
Marco

Re: Problem in persisting file in S3 using Spark: xxx file does not exist Exception

Paul Tremblay
I would like to see the full error. However, S3 can give misleading messages if you don't have the correct permissions. 

On Tue, Apr 24, 2018, 2:28 PM Marco Mistroni <[hidden email]> wrote:

Re: Problem in persisting file in S3 using Spark: xxx file does not exist Exception

Marco Mistroni
Hi,
Sorted. I just replaced s3 with s3a. I think I recall similar issues in the past with the AWS libraries.
Thanks anyway for getting back.
Kind regards
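
For anyone landing on this thread later, the fix described above might look roughly like the sketch below. It is only an illustration, not the poster's actual code: `to_s3a` and `write_csv` are hypothetical helper names, and it assumes a PySpark DataFrame and a Hadoop build that includes the S3A connector. The thread does not establish why the `s3://` scheme failed, only that switching to `s3a://` made the write succeed.

```python
def to_s3a(uri):
    """Rewrite an s3:// URI to s3a://; any other URI is returned unchanged.

    Hypothetical helper, not part of Spark or of the original post.
    """
    prefix = "s3://"
    if uri.startswith(prefix):
        return "s3a://" + uri[len(prefix):]
    return uri


def write_csv(data_frame, file_name):
    """Same write call as in the original post, with only the URI scheme changed.

    Assumes data_frame is a Spark DataFrame and that the
    com.databricks.spark.csv format is available, as in the post.
    """
    target = to_s3a(file_name)
    data_frame.coalesce(1).write.format("com.databricks.spark.csv").save(target)
```

Called as `write_csv(dataFrame, fileName)`, this would send the output from the log above to `s3a://ec2-bucket-mm-spark/form4-results-2404.results` instead of the `s3://` location.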

On Wed, May 2, 2018, 4:57 PM Paul Tremblay <[hidden email]> wrote: