Apache Spark User List

This forum is an archive for the mailing list user@spark.apache.org. Messages posted here will be sent to this mailing list.
Topics (11290)
Replies Last Post Views
Infer JSON schema in structured streaming Kafka. by satyajit vegesna
2
by satyajit vegesna
Why Spark 2.2.1 still bundles old Hive jars? by An Qin
0
by An Qin
Loading a spark dataframe column into T-Digest using java by Himasha de Silva
0
by Himasha de Silva
pyspark + from_json(col("col_name"), schema) returns all null by salemi
2
by Jacek Laskowski
Save hive table from spark in hive 2.1.0 by Alejandro Reina
4
by Alejandro Reina
Row Encoder For DataSet by Sandip Mehta
5
by Tomasz Dudek
UDF issues with spark by Afshin, Bardia
1
by Daniel Haviv
Weight column values not used in Binary Logistic Regression Summary by Stephen Boesch
1
by Sea aj
[CFP] DataWorks Summit Europe 2018 - Call for abstracts by Yanbo Liang-2
0
by Yanbo Liang-2
Spark + AI Summit CfP Open by Jules Damji
0
by Jules Damji
Structured Streaming + Kafka 0.10. connectors + valueDecoder and messageHandler with python by salemi
0
by salemi
JDBC to hive batch use case in spark by hokam chauhan
2
by ayan guha
ML Transformer: create feature that uses multiple columns by davideanastasia
1
by Filipp Zhinkin
Best way of shipping self-contained pyspark jobs with 3rd-party dependencies by Sergey Zhemzhitsky
0
by Sergey Zhemzhitsky
Programmatically get status of job (WAITING/RUNNING) by bsikander
16
by bsikander
Question on using pseudo columns in spark jdbc options by ☼ R Nair (रविशंकर ना...
2
by ☼ R Nair (रविशंकर ना...
[Spark SQL]: Dataset<Row> can not map into Dataset<Integer> in java by Himasha de Silva
0
by Himasha de Silva
RDD[internalRow] -> DataSet by satyajit vegesna
0
by satyajit vegesna
Spark job only starts tasks on a single node by Ji Yan
5
by Ji Yan
Streaming Analytics/BI tool to connect Spark SQL by umargeek
1
by Pierce Lamb
How to write dataframe to kafka topic in spark streaming application using pyspark other than collect? by umargeek
1
by umargeek
sparkSession.sql("sql query") vs df.sqlContext().sql(this.query) ? by kant kodali
1
by khathiravan raj maad...
Do I need to do .collect inside forEachRDD by kant kodali
10
by Qiao, Richard
LDA and evaluating topic number by cbuntain
1
by Stephen Boesch
Spark ListenerBus by khajaasmath786
0
by khajaasmath786
Json Parsing. by satyajit vegesna
3
by satyajit vegesna
Explode schema name question by tj5527
0
by tj5527
Buffer/cache exhaustion Spark standalone inside a Docker container by Stein Welberg
0
by Stein Welberg
A possible bug? Must call persist to make code run by kwunlyou
0
by kwunlyou
[ML] LogisticRegression and dataset's standardization before training by Filipp Zhinkin
0
by Filipp Zhinkin
unable to connect to cluster 2.2.0 by Imran Rajjad
2
by Imran Rajjad
How to export the Spark SQL jobs from the HiveThriftServer2 by wenxing zheng
1
by wenxing zheng
Access to Applications metrics by Nick Dimiduk
4
by Holden Karau