Apache Spark User List

This forum is an archive for the mailing list user@spark.apache.org. Messages posted here will be sent to this mailing list.
Topics (13916)
Replies Last Post Views
In Spark Streaming, Direct Kafka Consumers are not evenly distributed across executors by sd.hrishi
0
by sd.hrishi
Convert each partition of RDD to Dataframe by Manjunath Shetty H
7
by Manjunath Shetty H
Structured Streaming: mapGroupsWithState UDT serialization does not work by bryan.jeffrey@gmail....
5
by bryan.jeffrey@gmail....
Aggregating values by a key field in Spark Streaming by Something Something
0
by Something Something
Pyspark Convert Struct Type to Map Type by anbutech
0
by anbutech
Compute the Hash of each row in new column by Chetan Khatri
2
by Enrico Minack
dropDuplicates and watermark in structured streaming by shicheng31604@gmail....
4
by shicheng31604@gmail....
Spark Streaming: Aggregating values across batches by Something Something
1
by Tathagata Das
Spark join: grouping of records having same value for a particular column in same partition by ARAVIND ARUMUGHAM SE...
0
by ARAVIND ARUMUGHAM SE...
Standard practices for building dashboards for spark processed data by Aniruddha P Tekade
3
by Breno Arosa
[Spark SQL] Memory problems with packing too many joins into the same WholeStageCodegen by Jianneng Li-2
5
by Liu Genie
dataframe null safe joins given a list of columns by Marcelo Valle
1
by Enrico Minack
Integration testing Framework Spark SQL Scala by Ruijing Li
1
by Ruijing Li
What options do I have to handle third party classes that are not serializable? by Yeikel
1
by Jeff Evans
[Spark SQL] NegativeArraySizeException When Parse InternalRow to DTO Field with Type Array[String] by Proust (Feng Guizhou...
3
by Proust (Feng Guizhou...
[SPARK Dependencies] Security Vulnerability with Xerces version < 2.12 by Anthony Poncet
0
by Anthony Poncet
setting initial state for mapGroupsWithState by dpristin
0
by dpristin
Spark reading from Hbase throws java.lang.NoSuchMethodError: org.json4s.jackson.JsonMethods by Mich Talebzadeh
8
by Jörn Franke
Does dataframe spark API write/create a single file instead of directory as a result of write operation. by Kshitij
5
by Nicolas Paris-2
PowerIterationClustering by Monish R
0
by Monish R
Serialization error when using scala kernel with Jupyter by Nikhil Goyal
1
by Apostolos N. Papadop...
Spark RDD output path for data lineage by ard3nte
0
by ard3nte
Better way to debug serializable issues by Ruijing Li
2
by Ruijing Li
CBO not working? by Aelur Sadgod
1
by Aelur Sadgod
Spark Streaming job having issue with Java Flight Recorder (JFR) by Pramod Biligiri
0
by Pramod Biligiri
unsubscribe by julio.cesare
2
by Alexey Kovyazin
Questions about count() performance with dataframes and parquet files by WranglingData
8
by Nicolas Paris-2
Apache Arrow support for Apache Spark by Subash Prabakar
1
by Chris Teoh
[ML] [How-to]: How to unload the loaded W2V model in Pyspark? by Zhefu PENG
0
by Zhefu PENG
Connected components using GraphFrames is significantly slower than GraphX? by kant kodali
0
by kant kodali
Best way to read batch from Kafka and Offsets by Ruijing Li
12
by Burak Yavuz-2
Spark 2.4.4 has bigger memory impact than 2.3? by Ruijing Li
0
by Ruijing Li
Spark 2.4.4 with Hive 2.3.6 by Vinod Kancharana
0
by Vinod Kancharana
Environment variable for deleting .sparkStaging by mailfordebu
1
by mailfordebu
Start a standalone server as root and use it with user accounts by Ben Caine
1
by WranglingData