Streaming JSON string from REST Api in Spring

Streaming JSON string from REST Api in Spring

sonyjv
Hi,

I am very new to Spark and am trying to implement a use case. We have a JSON-based REST API implemented in Spring that receives around 50 calls/sec. I would like to stream these JSON strings to Spark for processing and aggregation. We have a strict SLA and would like to know the best way to design the interface between the REST API and Spark.

Also, the processing has several steps, and I am thinking of using multiple Spark jobs to perform them. What is the best way to trigger one job from another and pass data between these jobs?

Thanks,
Sony
 
Re: Streaming JSON string from REST Api in Spring

Mayur Rustagi
You can create an RDD of JSON credentials, then run a mapper that takes these credentials, queries the API, and stores the results in another RDD.
You can pass that RDD from task to task for further computation steps.
There are three issues here:
1. Throttling: how is the number of calls per second throttled? If you want Spark to throttle it, create an RDD of appropriate size. If your API throttles it, you should still size your Spark job appropriately, as slower responses will delay your overall processing.
2. Failed query responses: these can be handled by filtering the RDD and storing the failed responses in some disk location for debugging.
3. Pipelining: do you expect processing to take long? If so, are you planning to pipeline the processing (basically, running tasks on already-downloaded data while you download fresh data)? This is a tough one. Most likely you can do it with threading, but there is no guarantee that you will get pipelining benefits.
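
A minimal sketch of the split described in point 2, written with plain Java collections rather than Spark so that it runs without a cluster. The `ApiResult` class, its fields, and the 2xx success rule are assumptions made for illustration and are not part of the original discussion. In Spark itself the same separation would typically be two `filter` passes over the response RDD, with the failed subset written to disk.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Stand-in for filtering a response RDD into successes and failures. */
final class FailedResponseSplitSketch {

    /** Hypothetical result of one API call; not a Spark or Spring type. */
    static final class ApiResult {
        final String request;
        final int status;

        ApiResult(String request, int status) {
            this.request = request;
            this.status = status;
        }

        /** Assumption: any 2xx status counts as success. */
        boolean succeeded() {
            return status >= 200 && status < 300;
        }
    }

    /** Key true holds the successful results, key false holds the failed ones. */
    static Map<Boolean, List<ApiResult>> split(List<ApiResult> results) {
        return results.stream()
                .collect(Collectors.partitioningBy(ApiResult::succeeded));
    }

    public static void main(String[] args) {
        List<ApiResult> results = Arrays.asList(
                new ApiResult("{\"id\":1}", 200),
                new ApiResult("{\"id\":2}", 503),
                new ApiResult("{\"id\":3}", 200));

        Map<Boolean, List<ApiResult>> parts = split(results);

        // Successes continue to the next processing step.
        System.out.println("to process: " + parts.get(true).size());
        // Failures would be persisted for debugging instead of being dropped.
        for (ApiResult failed : parts.get(false)) {
            System.out.println("failed: " + failed.request + " status=" + failed.status);
        }
    }
}
```

Keeping the failed subset as data, rather than throwing inside the mapper, means one bad response does not abort the whole batch.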


Mayur Rustagi
Ph: +1 (760) 203 3257


In reply to sonyjv's message of Thu, Mar 6, 2014 at 10:21 AM.

Re: Streaming JSON string from REST Api in Spring

sonyjv
Thanks Mayur for your response.

I think I need to clarify the first part of my query. The JSON-based REST API will be called by external interfaces. These requests need to be processed in streaming mode in Spark. I am not clear about the following points:

1. How can the JSON request strings (50 per second) be continuously streamed to Spark?
2. The processing of each request in Spark will not take long, but it needs to be split into multiple steps so that a fast initial response can be rendered. To coordinate the Spark jobs, do I have to use Kafka or another queue, or can I stream directly from one job to another?

Regards,
Sony

Re: Streaming JSON string from REST Api in Spring

Mayur Rustagi
The easiest way is to use a queue, Kafka for example. Push your JSON request strings into Kafka,
connect Spark Streaming to Kafka, pull the data from it, and process it.
Spark Streaming will split up the jobs and pipeline the data.
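
The hand-off described above can be sketched in-process, assuming a bounded `BlockingQueue` plays the role of Kafka and a periodic drain plays the role of one Spark Streaming micro-batch. The class and method names here (`QueueHandoffSketch`, `publish`, `drainBatch`) are hypothetical, and the capacities are arbitrary. It only illustrates the decoupling pattern and is not a substitute for the real Kafka and Spark Streaming APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** In-process stand-in: the queue plays Kafka, drainBatch plays one micro-batch. */
final class QueueHandoffSketch {

    // Bounded, so a slow consumer is noticed by the producer instead of exhausting memory.
    private final BlockingQueue<String> queue;

    QueueHandoffSketch(int capacity) {
        this.queue = new LinkedBlockingQueue<>(capacity);
    }

    /** Called from the REST handler. Never blocks; returns false if the queue is full. */
    boolean publish(String json) {
        return queue.offer(json);
    }

    /** Called by the consumer. Removes up to maxBatch queued messages without blocking. */
    List<String> drainBatch(int maxBatch) {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch, maxBatch);
        return batch;
    }

    public static void main(String[] args) {
        QueueHandoffSketch handoff = new QueueHandoffSketch(1000);

        // Roughly one second of traffic at the 50 calls/sec mentioned in the thread.
        for (int i = 0; i < 50; i++) {
            handoff.publish("{\"id\":" + i + "}");
        }

        // The consumer picks up whatever has accumulated since the last batch.
        List<String> batch = handoff.drainBatch(100);
        System.out.println("processing batch of " + batch.size() + " requests");
    }
}
```

Because `publish` returns immediately, the REST endpoint can answer the caller without waiting for the processing steps, which is the main reason to put a queue between the API and Spark under a strict SLA.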

Mayur Rustagi
Ph: +1 (760) 203 3257


In reply to sonyjv's message of Thu, Mar 6, 2014 at 6:24 PM.

Re: Streaming JSON string from REST Api in Spring

sonyjv
Thanks Mayur for your clarification.