Records - Input Byte

Records - Input Byte

danilopds
Hi,

I was reading the Spark Streaming paper,
"Discretized Streams: Fault-Tolerant Streaming Computation at Scale",
and saw that its performance evaluation used 100-byte input records for the Grep and WordCount tests.

I don't have much experience, and I'd like to know how I can control this record size in my own input (for example, the words in an input file). Can anyone suggest something to get me started? My rough guess is sketched below.
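
What I had in mind is padding or truncating each line of a generated input file to exactly 100 ASCII characters, so every record is 100 bytes on disk (the path, record count, and payload text below are just placeholders I made up):

    import java.io.PrintWriter

    // Hypothetical generator: each line is padded/truncated to exactly
    // 100 ASCII characters, so every record is 100 bytes (the trailing
    // newline delimiter is not counted as part of the record).
    object GenerateRecords {
      def main(args: Array[String]): Unit = {
        val recordSize = 100
        val out = new PrintWriter("/tmp/streaming-input/words.txt") // placeholder path
        for (i <- 1 to 100000) { // placeholder record count
          val payload = s"record-$i the quick brown fox "
          val line = (payload + "x" * recordSize).take(recordSize)
          out.println(line)
        }
        out.close()
      }
    }

Is that a reasonable approach?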

Thanks!

Re: Records - Input Byte

Mayur Rustagi
What do you mean by "control your input"? Are you trying to pace your Spark Streaming job by the number of words? If so, that is not supported as of now; you can only control the batch interval, and Spark Streaming consumes all files that arrive within that time period.
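
A minimal sketch of what that looks like (the app name, master, input directory, and search pattern are placeholders, not anything specific to your setup):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // The batch interval (here 1 second) is the only pacing control:
    // every file that lands in the monitored directory during a batch
    // is consumed in that batch, regardless of how many records it holds.
    val conf = new SparkConf().setAppName("GrepTest").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))

    val lines = ssc.textFileStream("/tmp/streaming-input") // placeholder directory
    lines.filter(_.contains("pattern")).count().print()

    ssc.start()
    ssc.awaitTermination()
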
--
Regards,
Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi

