Apache Spark User List

This forum is an archive for the mailing list user@spark.apache.org. Messages posted here will be sent to this mailing list.
Topics (13417)
Topic / Replies / Last Post
A basic question by Shyam P
3
by Shyam P
Filter cannot be pushed via a Join by William Wong
2
by Gourav Sengupta
Exposing JIRA issue types at GitHub PRs by Dongjoon Hyun-2
7
by Gabor Somogyi
Spark yarn-client encounter HTTP ERROR 500 when accessing spark.driver.appUIAddress by Pipster Neko
0
by Pipster Neko
Spark read csv option - capture exception in a column in permissive mode by ajay.thompson
3
by Anselmi Rodriguez, A...
Creating Spark buckets that Presto / Athena / Hive can leverage by Daniel Mateus Pires
1
by Gourav Sengupta
[Pyspark 2.3+] Timeseries with Spark by rishishah.star
2
by rishishah.star
Spark 2.4.3 - Structured Streaming - high on Storage Memory by puneetloya
1
by puneetloya
unsubscribe by Humberto Marchezi
0
by Humberto Marchezi
Spark Kafka Streaming stopped by Amit Sharma
0
by Amit Sharma
[pyspark 2.3+] CountDistinct by rishishah.star
0
by rishishah.star
Spark on Yarn - Dynamically getting a list of archives from --archives in spark-submit by Tommy Li
0
by Tommy Li
[Spark Core]: What is the release date for Spark 3 ? by Alex Dettinger
2
by Vadim Semenov-2
best docker image to use by Marcelo Valle
2
by Marcelo Valle
High level explanation of dropDuplicates by Yeikel
4
by Yeikel
Spark Dataframe NTILE function by Subash Prabakar
0
by Subash Prabakar
Getting driver logs in Standalone Cluster by tkrol
2
by tkrol
[StructuredStreaming] HDFSBackedStateStoreProvider is leaking .crc files. by maasg
2
by Jungtaek Lim
Re: Clean up method for DataSourceReader by Shubham Chaurasia
0
by Shubham Chaurasia
Performance difference between Dataframe and Dataset especially on parquet data. by Shivam Sharma
0
by Shivam Sharma
unsubscribe by Sonu Jyotshna
1
by B2B Web ID
Employment opportunities. by Prashant Sharma
0
by Prashant Sharma
Why my spark job STATE--> Running FINALSTATE --> Undefined. by Shyam P
1
by Akshay Bhardwaj
What is the compatibility between releases? by Yeikel
0
by Yeikel
[pyspark 2.3+] count distinct returns different value every time it is run on the same dataset by rishishah.star
0
by rishishah.star
Spark 2.4.1 on Kubernetes - DNS resolution of driver fails by Olivier Girardot-2
3
by Prudhvi Chennuru (CO...
Spark on Kubernetes - log4j.properties not read by Dave Jaffe-2
2
by Dave Jaffe
Fwd: [Spark SQL Thrift Server] Persistence errors with PostgreSQL and MySQL in 2.4.3 by rmartine
1
by rmartine
AWS EMR slow write to HDFS by femibyte
0
by femibyte
Fwd: Spark kafka streaming job stopped by Amit Sharma
1
by Amit Sharma
Read hdfs files in spark streaming by Deepak Sharma
6
by nitin jain
Spark structured streaming leftOuter join not working as I expect by Joe Ammann
5
by Jungtaek Lim
[Pyspark 2.4] Best way to define activity within different time window by rishishah.star
4
by geoHeil
Spark 2.2 With Column usage by anbutech
3
by Jacek Laskowski
ARM CI for spark by huangtianhua
1
by Youngwoo Kim (김영우)