It seems it never occurred to the developers of Apache Spark that people might want to use two of their libraries together to perform graph analysis on streamed Twitter data.

Why can't I simply use the streamed batches to create and extend a graph, and run graph algorithms on it continuously?

  • 0
    Because you're not Kafkaesque enough. Create two topics and a handler to condense the set.
  • 0
    I'm not really familiar with Kafka, but I don't think this can help me: GraphX graphs can only be constructed from RDDs, and I can't combine the RDDs coming from the stream, only their keys/values.
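
For what it's worth, inside `foreachRDD` each batch arrives as an ordinary RDD, and that function runs on the driver, so a batch can be unioned into an RDD that is kept across batches. Below is a minimal sketch of that idea, not a tested solution. It assumes the tweets have already been mapped to a `DStream[Edge[Int]]`; the names `StreamingGraphSketch`, `run`, `edgeStream`, and `allEdges` are made up for illustration, and PageRank is only a stand-in for whatever algorithm is actually needed.

```scala
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.dstream.DStream

object StreamingGraphSketch {
  // Assumption: `edgeStream` already carries one Edge per interaction
  // (e.g. user A mentions user B), built elsewhere from the Twitter stream.
  def run(edgeStream: DStream[Edge[Int]]): Unit = {
    // Driver-side reference to every edge seen so far, starting empty.
    var allEdges: RDD[Edge[Int]] =
      edgeStream.context.sparkContext.emptyRDD[Edge[Int]]

    // The body of foreachRDD runs on the driver once per batch,
    // so reassigning the var here is safe.
    edgeStream.foreachRDD { batch =>
      // Extend the accumulated edge set with the new batch.
      allEdges = allEdges.union(batch).cache()

      // Rebuild the graph from all edges; vertices get a default attribute of 0.
      val graph = Graph.fromEdges(allEdges, defaultValue = 0)

      // Stand-in algorithm: PageRank run to a convergence tolerance.
      val ranks = graph.pageRank(tol = 0.001).vertices
      ranks.take(5).foreach(println)
    }
  }
}
```

The graph is rebuilt from scratch on every batch, and the lineage of `allEdges` grows with each `union`, so a long-running job would probably need periodic checkpointing or a bounded window of edges.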