Apache Spark User List

This forum is an archive for the mailing list user@spark.apache.org. Messages posted here will be sent to this mailing list.
Topics (8245)
Replies Last Post Views
Spark 1.4 RDD to DF fails with toDF() by stati
1
by deepikakhera
Why is huge data shuffling in Spark when using union()/coalesce(1,false) on DataFrame? by unk1102
0
by unk1102
New to Spark - Partitioning Question by mmike87
0
by mmike87
Python Spark Streaming example with textFileStream does not work. Why? by Kamilbek
0
by Kamilbek
Mongodb update through forEach Operation - Task Not Serializable by vibhavsr
0
by vibhavsr
[spark-streaming] New directStream API reads topic's partitions sequentially. Why? by ponkin
0
by ponkin
Drools integration with Spark by Shiva Moorthy
0
by Shiva Moorthy
Does Spark.ml LogisticRegression assumes only Double valued features? by njoshi
0
by njoshi
Running Examples by cetaylor
2
by delbert
Spark UDF for columns more than 22 columns by wesley
0
by wesley
SparkSQL without access to arrays? by Terry
0
by Terry
spark.cassandra.input.split.size_in_mb is not a valid Spark Cassandra Connector variable by Munai Das Udasin
0
by Munai Das Udasin
PySpark usage by Munai Das Udasin
0
by Munai Das Udasin
spark-submit throws errors, unlike pyspark by kraster
0
by kraster
LZO-compressed files by Bertrand
0
by Bertrand
exception handling during an exception in an RDD by saikrishnagopu
1
by Himanshu Mehra
What should be the optimal value for spark.sql.shuffle.partition? by unk1102
5
by oubrik
wild cards in spark sql by Hafiz Mujadid
1
by Anas Sherwani
Spark DataFrame saveAsTable with partitionBy creates no ORC file in HDFS by unk1102
0
by unk1102
Spark MLlib Decision Tree Node Accuracy by derechan
0
by derechan
Inferring JSON schema from a JSON string in a dataframe column by mstang
0
by mstang
Simple join of two Spark DataFrame failing with “org.apache.spark.sql.AnalysisException: Cannot resolve column name” by steve.felsheim
0
by steve.felsheim
Question about Google Books Ngrams with pyspark (1.4.1) by Bertrand
4
by Bertrand
java.lang.NoClassDefFoundError: org/apache/hadoop/crypto/key/KeyProvider by wangwei
0
by wangwei
Save dataframe into hbase by Hafiz Mujadid
0
by Hafiz Mujadid
Question regarding SparkStreaming in SparkR by Niharika
0
by Niharika
Spark UI returning error 500 in yarn-client mode by mosheeshel
2
by rajeevpra
Does pyspark have a function to partition data by column value? by yanze
2
by yanze
Hung spark executors don't count toward worker memory limit by Keith Simmons
3
by hai
Conditionally do things different on the first minibatch vs subsequent minibatches in a dstream by steve_ash
0
by steve_ash
FAILURE TO PRINT THE OUTPUT OF STREAMED LINEAR REGRESSION by shem
0
by shem
Creating in-memory JavaPairInputDStream for testing. by anujojha
2
by anujojha
Potential NPE while exiting spark-shell by nasokan
1
by nasokan
Using groupByKey() with many values per key by christoph.pirkl-2
0
by christoph.pirkl-2
reading multiple parquet file using spark sql by Hafiz Mujadid
0
by Hafiz Mujadid