Apache Spark User List

This forum is an archive for the mailing list user@spark.apache.org. Messages posted here will be sent to this mailing list.
Topics (13210)
Replies Last Post Views
Writing to Aerospike from Spark with bulk write with user authentication fails by Mich Talebzadeh
0
by Mich Talebzadeh
Spark job running for long time by kumar.rajat20del
4
by kumar.rajat20del
Difference between 'cores' config params: spark submit on k8s by Battini Lakshman
1
by Li Gao-2
How to execute non-timestamp-based aggregations in spark structured streaming? by Stephen Boesch
0
by Stephen Boesch
repartition in df vs partitionBy in df by kumar.rajat20del
0
by kumar.rajat20del
toDebugString - RDD Logical Plan by kanchan tewary
1
by Dylan Guedes
Feature engineering ETL for machine learning by Subash Prabakar
0
by Subash Prabakar
--jars vs --spark.executor.extraClassPath vs --spark.driver.extraClassPath by kumar.rajat20del
3
by jasonnerothin@gmail....
Not able to convert Image binary to an image by swastik mittal
0
by swastik mittal
K8s-Spark client mode : Executor image not able to download application jar from driver by Nikhil Chinnapa
1
by Stavros Kontopoulos
writing into oracle database is very slow by Lian Jiang
6
by Mich Talebzadeh
Difference between Checkpointing and Persist by Subash Prabakar
3
by gene.pang
Error: NoSuchFieldError: HIVE_STATS_JDBC_TIMEOUT while running a Spark-Hive Job by Rishikesh Gawade
1
by rajiv shah
BigDL and Analytics Zoo talks at upcoming Spark+AI Summit and Strata London by Jason Dai
1
by Khare, Ankit
Spark-submit and no java log file generated by Mann Du
0
by Mann Du
[Spark SQL]: Slow insertInto overwrite if target table has many partitions by Juho Autio
0
by Juho Autio
autoBroadcastJoinThreshold not working as expected by Mike Chan
0
by Mike Chan
cache table vs. parquet table performance by Tomas Bartalos
5
by Bin Fan
Boto3 library send to pyspark by Gorka Bravo Martinez
4
by Gourav Sengupta
Spark SQL API taking longer time than DF API. by neerajbhadani
7
by Yeikel
Parallelize Join Problem by Paul.Bauriegel
2
by asma zgolli
An alternative logic to collaborative filtering works fine but we are facing run time issues in executing the job by Balakumar iyer S
1
by Ankit Khettry
How to use same SparkSession in another app? by Rishikesh Gawade
2
by Anthony, Olufemi
Reading RDD by (key, data) from s3 by Gorka Bravo Martinez
1
by yujhe.li
Dynamic executor scaling spark/Kubernetes by purna pradeep
0
by purna pradeep
Fwd: Issue with spark while reading from avro file by Prateek Rajput
0
by Prateek Rajput
[GraphX] Preserving Partitions when reading from HDFS by Mbilal
1
by Manu Zhang
JvmPauseMonitor by Eugene Koifman
1
by Arun Mahadevan
How to speedup your Spark ML training by chris_inaccel
0
by chris_inaccel
ApacheCon NA 2019 Call For Proposal and help promoting Spark project by Felix Cheung
1
by Felix Cheung
How to print DataFrame.show(100) to text file at HDFS by Chetan Khatri
4
by Yeikel
Best Practice for Writing data into a Hive table by Debabrata Ghosh
1
by Yeikel
Question about relationship between number of files and initial tasks(partitions) by Arthur Li
4
by Yeikel
Offline state manipulation tool for structured streaming query by Jungtaek Lim
0
by Jungtaek Lim
Is there any spark API function to handle a group of companies at once in this scenario? by Shyam P
9
by Shyam P