Having trouble with streaming (updateStateByKey)

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Having trouble with streaming (updateStateByKey)

mcampbell
I'm having a little trouble getting an "updateStateByKey()" call to work; was wondering if anyone could help.

In my chain of calls from getting Kafka messages out of the queue to converting the message to a set of "things", then pulling out 2 attributes of those things to a Tuple2, everything works.

So what I end up with is about a 1 second dump of things like this (this is crufted up data, but it's basically 2 IPV6 addresses...)

-------------------------------------------
Time: 1402507839000 ms
-------------------------------------------
(::ffff:a14:b03,::ffff:a0a:2bd4)
(::ffff:a14:b03,::ffff:a0a:2bd4)
(::ffff:a0a:25a7,::ffff:a14:b03)
(::ffff:a14:b03,::ffff:a0a:2483)
(::ffff:a14:b03,::ffff:a0a:2480)
(::ffff:a0a:2d96,::ffff:a14:b03)
(::ffff:a0a:abb5,::ffff:a14:100)
(::ffff:a0a:dcd7,::ffff:a14:28)
(::ffff:a14:28,::ffff:a0a:da94)
(::ffff:a14:b03,::ffff:a0a:2def)
...


This works ok.

The problem is when I add a call to updateStateByKey - the streaming app runs and runs and runs and never outputs anything.  When I debug, I can't confirm that my state update passed-in function is ever actually getting called.

Indeed I have breakpoints and println statements in my updateFunc and it LOOKS like it's never getting called.  I can confirm that the updateStateByKey function IS getting called (via it stopping on a breakpoint).

Is there something obvious I'm missing?
Reply | Threaded
Open this post in threaded view
|

Re: Having trouble with streaming (updateStateByKey)

mcampbell
I rearranged my code to do a reduceByKey which I think is working.  I also don't think the problem was that updateState call, but something else; unfortunately I changed a lot in looking for this issue, so not sure what the actual fix might have been, but I think it's working now.


On Wed, Jun 11, 2014 at 1:47 PM, Michael Campbell <[hidden email]> wrote:
I'm having a little trouble getting an "updateStateByKey()" call to work; was wondering if anyone could help.

In my chain of calls from getting Kafka messages out of the queue to converting the message to a set of "things", then pulling out 2 attributes of those things to a Tuple2, everything works.

So what I end up with is about a 1 second dump of things like this (this is crufted up data, but it's basically 2 IPV6 addresses...)

-------------------------------------------
Time: 1402507839000 ms
-------------------------------------------
(::ffff:a14:b03,::ffff:a0a:2bd4)
(::ffff:a14:b03,::ffff:a0a:2bd4)
(::ffff:a0a:25a7,::ffff:a14:b03)
(::ffff:a14:b03,::ffff:a0a:2483)
(::ffff:a14:b03,::ffff:a0a:2480)
(::ffff:a0a:2d96,::ffff:a14:b03)
(::ffff:a0a:abb5,::ffff:a14:100)
(::ffff:a0a:dcd7,::ffff:a14:28)
(::ffff:a14:28,::ffff:a0a:da94)
(::ffff:a14:b03,::ffff:a0a:2def)
...


This works ok.

The problem is when I add a call to updateStateByKey - the streaming app runs and runs and runs and never outputs anything.  When I debug, I can't confirm that my state update passed-in function is ever actually getting called.

Indeed I have breakpoints and println statements in my updateFunc and it LOOKS like it's never getting called.  I can confirm that the updateStateByKey function IS getting called (via it stopping on a breakpoint).

Is there something obvious I'm missing?