Apache Spark User List

This forum is an archive for the mailing list user@spark.apache.org. Messages posted here will be sent to this mailing list.
Topics (13986)
Topic | Replies | Last post by
setup pom.xml by Zahid Rahman | 0 | Zahid Rahman
configuration error by Zahid Rahman | 1 | Zahid Rahman
Structured Streaming: mapGroupsWithState UDT serialization does not work by bryan.jeffrey@gmail.... | 8 | bryan.jeffrey@gmail....
Convert each partition of RDD to Dataframe by Manjunath Shetty H | 7 | Manjunath Shetty H
Aggregating values by a key field in Spark Streaming by Something Something | 0 | Something Something
Pyspark Convert Struct Type to Map Type by anbutech | 0 | anbutech
dropDuplicates and watermark in structured streaming by shicheng31604@gmail.... | 4 | shicheng31604@gmail....
Spark Streaming: Aggregating values across batches by Something Something | 1 | Tathagata Das
Spark join: grouping of records having same value for a particular column in same partition by ARAVIND ARUMUGHAM SE... | 0 | ARAVIND ARUMUGHAM SE...
Standard practices for building dashboards for spark processed data by Aniruddha P Tekade | 3 | Breno Arosa
[Spark SQL] Memory problems with packing too many joins into the same WholeStageCodegen by Jianneng Li-2 | 5 | Liu Genie
dataframe null safe joins given a list of columns by Marcelo Valle | 1 | Enrico Minack
Integration testing Framework Spark SQL Scala by Ruijing Li | 1 | Ruijing Li
What options do I have to handle third party classes that are not serializable? by Yeikel | 1 | Jeff Evans
[Spark SQL] NegativeArraySizeException When Parse InternalRow to DTO Field with Type Array[String] by Proust (Feng Guizhou... | 3 | Proust (Feng Guizhou...
[SPARK Dependencies] Security Vulnerability with Xerces version < 2.12 by Anthony Poncet | 0 | Anthony Poncet
setting initial state for mapGroupsWithState by dpristin | 0 | dpristin
Spark reading from Hbase throws java.lang.NoSuchMethodError: org.json4s.jackson.JsonMethods by Mich Talebzadeh | 8 | Jörn Franke
Does dataframe spark API write/create a single file instead of directory as a result of write operation. by Kshitij | 5 | Nicolas Paris-2
PowerIterationClustering by Monish R | 0 | Monish R
Serialization error when using scala kernel with Jupyter by Nikhil Goyal | 1 | Apostolos N. Papadop...
Spark RDD ouput path for data lineage by ard3nte | 0 | ard3nte
Better way to debug serializable issues by Ruijing Li | 2 | Ruijing Li
CBO not working? by Aelur Sadgod | 1 | Aelur Sadgod
Spark Streaming job having issue with Java Flight Recorder (JFR) by Pramod Biligiri | 0 | Pramod Biligiri
unsubscribe by julio.cesare | 2 | Alexey Kovyazin
Questions about count() performance with dataframes and parquet files by WranglingData | 8 | Nicolas Paris-2
Apache Arrow support for Apache Spark by Subash Prabakar | 1 | Chris Teoh
[ML] [How-to]: How to unload the loaded W2V model in Pyspark? by Zhefu PENG | 0 | Zhefu PENG
Connected components using GraphFrames is significantly slower than GraphX? by kant kodali | 0 | kant kodali
Best way to read batch from Kafka and Offsets by Ruijing Li | 12 | Burak Yavuz-2
Spark 2.4.4 has bigger memory impact than 2.3? by Ruijing Li | 0 | Ruijing Li
Spark 2.4.4 with Hive 2.3.6 by Vinod Kancharana | 0 | Vinod Kancharana
Environment variable for deleting .sparkStaging by mailfordebu | 1 | mailfordebu
Start a standalone server as root and use it with user accounts by Ben Caine | 1 | WranglingData