Apache Spark User List

This forum is an archive for the mailing list user@spark.apache.org. Messages posted here will be sent to this mailing list.
Topics (13417)
Replies Last Post Views
Spark SQL Teradata load is very slow by khajaasmath786
1
by Shyam P
Update / Delete records in Parquet by Chetan Khatri
5
by Chetan Khatri
Getting EOFFileException while reading from sequence file in spark by Prateek Rajput
3
by Prateek Rajput
What is Spark context cleaner in structured streaming by Akshay Bhardwaj
1
by kanchan tewary
spark df.write.partitionBy run very slow by JF Chen
10
by Shyam P
common logging in spark by kumar.rajat20del
0
by kumar.rajat20del
Spark SQL LIMIT Gets Stuck by Shahab Yunus
0
by Shahab Yunus
Best notebook for developing for apache spark using scala on Amazon EMR Cluster by V0lleyBallJunki3
1
by zjffdu
Spark Structured Streaming | Highly reliable de-duplication strategy by Akshay Bhardwaj
3
by Akshay Bhardwaj
Error while using spark-avro module in pyspark 2.4 by kanchan tewary
0
by kanchan tewary
Turning off Jetty Http Options Method by ankit jain
4
by ankit jain
spark on kubernetes driver pod phase changed from running to pending and starts another container in pod by zyfo2
0
by zyfo2
handling skewness issues by kumar.rajat20del
3
by Yeikel
Issue with offset management using Spark on Dataproc by Austin Weaver
4
by Shixiong(Ryan) Zhu
Re: How to specify number of Partition using newAPIHadoopFile() by Prateek Rajput
0
by Prateek Rajput
Fwd: How to specify number of Partition using newAPIHadoopFile() by Vatsal Patel
0
by Vatsal Patel
Handle Null Columns in Spark Structured Streaming Kafka by SNEHASISH DUTTA
4
by SNEHASISH DUTTA
unsubscribe by Amrit Jangid
1
by Arne Zachlod
spark.sql.hive.exec.dynamic.partition description by Mike Chan
0
by Mike Chan
spark hive concurrency by CPC
1
by Mich Talebzadeh
K8s-Spark client mode : Executor image not able to download application jar from driver by Nishant Ranjan
4
by Nishant Ranjan
unsubscribe by mayur bhole
0
by mayur bhole
unsubscribe by Byron Lee
1
by Song Yang
[Spark SQL]: Slow insertInto overwrite if target table has many partitions by Juho Autio
8
by van den Heever, Chri...
Different query result between spark thrift server and spark-shell by Jun Zhu-2
1
by Jun Zhu-2
[GraphX] Preserving Partitions when reading from HDFS by Mbilal
2
by Mbilal
repartition in df vs partitionBy in df by kumar.rajat20del
4
by moqi
[pyspark] Use output of one aggregated function for another aggregated function within the same groupby by rishishah.star
1
by geoHeil
Use derived column for other derived column in the same statement by rishishah.star
3
by rishishah.star
Is it possible to obtain the full command to be invoked by SparkLauncher? by Jeff Evans
4
by Jeff Evans
Re: DataFrameWriter does not adjust spark.sql.session.timeZone offset while writing orc files by Shubham Chaurasia
1
by Wenchen Fan
'No plan for EventTimeWatermark' error while using structured streaming with column pruning (spark 2.3.1) by kineret M
0
by kineret M
autoBroadcastJoinThreshold not working as expected by Mike Chan
2
by Mike Chan
Fwd: Issue with spark while reading from avro file by Prateek Rajput
1
by Prateek Rajput
spark stddev() giving '?' as output how to handle it ? i.e replace null/0 by Shyam P
1
by Shyam P