I am trying to use Spark with Hadoop on Google Compute Engine. Google provides a connector to Google Cloud Storage (analogous to S3), described here: GCS Connector
Everything works fine, except I can't load a textFile using it in pyspark:
f = sc.textFile("gs://mybucketname")
This throws "java.io.IOException: No FileSystem for scheme: gs". I emailed the Google Hadoop devs and they said the URIs have to use the "gs" scheme. Is support for this scheme something that can easily be patched/added to Spark?
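For what it's worth, Hadoop resolves a URI scheme to a FileSystem implementation via a `fs.<scheme>.impl` configuration property, so the error usually means that mapping (or the connector jar itself) isn't on Spark's classpath. A minimal sketch of registering the connector's class from pyspark, assuming the GCS connector jar is already distributed to the workers (the `spark.hadoop.` prefix is Spark's way of passing properties through to the Hadoop configuration):

```python
# Filesystem class shipped in the GCS connector jar; Hadoop needs the
# "fs.gs.impl" property to map the "gs://" scheme to this class.
GCS_FS_CLASS = "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem"

try:
    from pyspark import SparkConf, SparkContext

    # "spark.hadoop.*" properties are forwarded into the Hadoop Configuration
    # that sc.textFile() uses to resolve URI schemes.
    conf = SparkConf().set("spark.hadoop.fs.gs.impl", GCS_FS_CLASS)
    sc = SparkContext(conf=conf)
    f = sc.textFile("gs://mybucketname")
except ImportError:
    # pyspark not installed in this environment; the property above is the
    # relevant part either way.
    pass
```

The same property can also be set in core-site.xml on each node, which is how the connector's own install scripts wire it up; either way the connector jar must be visible to both the driver and the executors.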