[Spark SQL]pyspark to count total number of days-no of holidays by using sql


Lakshmi Nivedita

I have one table (A) with columns date1, date2, and ctry, and another table (B) with one row per holiday. I want something like:

df1 = select date1, date2, ctry,
             datediff(date2, date1) - df2.holidays as totalnumberofdays
      from A;

df2 = select count(holidate) as holidays
      from B
      where holidate >= A.date1
        and holidate <= A.date2
        and country = A.ctry;

Apart from country, no other column is a unique key.
sample data would be 
For a particular order the dates are date1 = 26.12.2012 and date2 = 06.01.2013 in a given country. Total number of days: 06.01.2013 - 26.12.2012 = 10.

I then have to subtract the number of holidays falling between those dates. Suppose in India there are 2 such holidays; the final result would be 10 - 2 = 8. I have to do this in pyspark.
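(Not the original poster's code, just a sketch of one possible answer.) Spark SQL does not support a correlated scalar subquery with range predicates like the df2 query above; the usual pattern is a range join between the two tables followed by a group-by. Below, the query string shows that shape (table and column names A, B, order_id, holidate, country are assumptions about the schema), and the small pure-Python helper mirrors the same arithmetic so the logic can be checked without a Spark session:

```python
from datetime import date

# Assumed Spark SQL equivalent (run with spark.sql(query)); the schema
# names here are guesses based on the question, not the real tables.
query = """
SELECT a.order_id,
       DATEDIFF(a.date2, a.date1) - COUNT(b.holidate) AS totalnumberofdays
FROM   A a
LEFT JOIN B b
  ON  b.country  =  a.ctry
  AND b.holidate >= a.date1
  AND b.holidate <= a.date2
GROUP BY a.order_id, a.date1, a.date2
"""

def working_days(date1, date2, holidays):
    """Days between date1 and date2, minus holidays inside that range."""
    total = (date2 - date1).days                      # like Spark's DATEDIFF
    n_holidays = sum(date1 <= h <= date2 for h in holidays)
    return total - n_holidays

# 25 Dec falls before date1, so only 1 Jan is counted as a holiday here.
holidays = [date(2012, 12, 25), date(2013, 1, 1)]
print(working_days(date(2012, 12, 26), date(2013, 1, 6), holidays))  # -> 10
```

Note that Spark's DATEDIFF(date2, date1) returns 11 for this pair of dates; whether the endpoints should count depends on the business rule, so the boundary comparisons (>= / <=) may need adjusting.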

Thanks in advance. Any help is appreciated.
Thanks
Nivedita

--
k.Lakshmi Nivedita