Apache Spark User List

This forum is an archive for the mailing list user@spark.apache.org. Messages posted here will be sent to this mailing list.
Topics (13344)
Replies Last Post Views
spark df.write.partitionBy run very slow by JF Chen
10
by Shyam P
common logging in spark by kumar.rajat20del
0
by kumar.rajat20del
Spark SQL LIMIT Gets Stuck by Shahab Yunus
0
by Shahab Yunus
Best notebook for developing for apache spark using scala on Amazon EMR Cluster by V0lleyBallJunki3
1
by zjffdu
Spark Structured Streaming | Highly reliable de-duplication strategy by Akshay Bhardwaj
3
by Akshay Bhardwaj
Error while using spark-avro module in pyspark 2.4 by kanchan tewary
0
by kanchan tewary
Turning off Jetty Http Options Method by ankit jain
4
by ankit jain
spark on kubernetes driver pod phase changed from running to pending and starts another container in pod by zyfo2
0
by zyfo2
handling skewness issues by kumar.rajat20del
3
by Yeikel
Issue with offset management using Spark on Dataproc by Austin Weaver
4
by Shixiong(Ryan) Zhu
Re: How to specify number of Partition using newAPIHadoopFile() by Prateek Rajput
0
by Prateek Rajput
Fwd: How to specify number of Partition using newAPIHadoopFile() by Vatsal Patel
0
by Vatsal Patel
Handle Null Columns in Spark Structured Streaming Kafka by SNEHASISH DUTTA
4
by SNEHASISH DUTTA
unsubscribe by Amrit Jangid
1
by Arne Zachlod
spark.sql.hive.exec.dynamic.partition description by Mike Chan
0
by Mike Chan
spark hive concurrency by CPC
1
by Mich Talebzadeh
K8s-Spark client mode : Executor image not able to download application jar from driver by Nishant Ranjan
4
by Nishant Ranjan
unsubscribe by mayur bhole
0
by mayur bhole
unsubscribe by Byron Lee
1
by Song Yang
[Spark SQL]: Slow insertInto overwrite if target table has many partitions by Juho Autio
8
by van den Heever, Chri...
Different query result between spark thrift server and spark-shell by Jun Zhu-2
1
by Jun Zhu-2
[GraphX] Preserving Partitions when reading from HDFS by Mbilal
2
by Mbilal
repartition in df vs partitionBy in df by kumar.rajat20del
4
by moqi
[pyspark] Use output of one aggregated function for another aggregated function within the same groupby by rishishah.star
1
by geoHeil
Use derived column for other derived column in the same statement by rishishah.star
3
by rishishah.star
Is it possible to obtain the full command to be invoked by SparkLauncher? by Jeff Evans
4
by Jeff Evans
Re: DataFrameWriter does not adjust spark.sql.session.timeZone offset while writing orc files by Shubham Chaurasia
1
by Wenchen Fan
'No plan for EventTimeWatermark' error while using structured streaming with column pruning (spark 2.3.1) by kineret M
0
by kineret M
autoBroadcastJoinThreshold not working as expected by Mike Chan
2
by Mike Chan
Fwd: Issue with spark while reading from avro file by Prateek Rajput
1
by Prateek Rajput
spark stddev() giving '?' as output how to handle it ? i.e replace null/0 by Shyam P
1
by Shyam P
Handle empty partitions in pyspark by kanchan tewary
0
by kanchan tewary
Spark LogisticRegression got stuck on dataset with millions of columns by Qian He
3
by Weichen Xu
spark 2.4.1 -> 3.0.0-SNAPSHOT mllib by Koert Kuipers
0
by Koert Kuipers
toDebugString - RDD Logical Plan by kanchan tewary
2
by kanchan tewary