mission statement : unified

Hulio andres
 
Apache Spark's mission statement is: "Apache Spark™ is a unified analytics engine for large-scale data processing."

What does the word "unified" refer to?

--------------------------------------------------------------------- To unsubscribe e-mail: [hidden email]

Re: mission statement : unified

Gourav Sengupta
Hi,

I think it is just a marketing statement. But with Spark 3.x, now that Spark is seen as no more than just another distributed data processing engine, they are trying to integrate data pre-processing directly into ML pipelines. I might call that unified.

But you get the same with several other frameworks now as well, so I am not quite sure how "unified" creates unique brand value.


Regards,
Gourav Sengupta 

On Sun, Oct 18, 2020 at 6:40 PM Hulio andres <[hidden email]> wrote:
 
Apache Spark's mission statement is: "Apache Spark™ is a unified analytics engine for large-scale data processing."

What does the word "unified" refer to?

Re: mission statement : unified

Sonal Goyal
My thought is that Spark supports analytics for both structured and unstructured data, in batch as well as in real time. This was pretty revolutionary when Spark first came out. That's where the term "unified" came from, I think. Even after all these years, Spark remains the trusted framework for enterprise analytics.

On Mon, 19 Oct 2020, 11:24 Gourav Sengupta <[hidden email]> wrote:
Hi,

I think it is just a marketing statement. But with Spark 3.x, now that Spark is seen as no more than just another distributed data processing engine, they are trying to integrate data pre-processing directly into ML pipelines. I might call that unified.

But you get the same with several other frameworks now as well, so I am not quite sure how "unified" creates unique brand value.

Regards,
Gourav Sengupta


Re: mission statement : unified

Khalid Mammadov
Correct. Also, as explained in the book Learning Spark 2.0 by Databricks:

Unified Analytics
While the notion of unification is not unique to Spark, it is a core component of its design philosophy and evolution. In November 2016, the Association for Computing Machinery (ACM) recognized Apache Spark and conferred upon its original creators the prestigious ACM Award for their paper describing Apache Spark as a “Unified Engine for Big Data Processing.” The award-winning paper notes that Spark replaces all the separate batch processing, graph, stream, and query engines like Storm, Impala, Dremel, Pregel, etc. with a unified stack of components that addresses diverse workloads under a single distributed fast engine.
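
To make the "one engine, diverse workloads" idea a little more concrete, here is a toy sketch in plain Python. It is not Spark's API and not taken from the book or the paper; the function names (`count_words`, `run_batch`, `run_streaming`) and the sample data are made up for illustration. The only point is that the same logic, written once, serves both a batch run and a micro-batch streaming run, which is roughly the property that Spark's DataFrame API offers across batch and Structured Streaming sources.

```python
from collections import Counter
from typing import Iterable, Iterator, List


def count_words(lines: Iterable[str]) -> Counter:
    """The business logic, written once (hypothetical example)."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def run_batch(lines: List[str]) -> Counter:
    """Batch mode: process the whole dataset in one pass."""
    return count_words(lines)


def run_streaming(micro_batches: Iterable[List[str]]) -> Iterator[Counter]:
    """Streaming mode: apply the same logic to each micro-batch and
    yield a snapshot of the running total after every batch."""
    running = Counter()
    for batch in micro_batches:
        running += count_words(batch)
        yield Counter(running)  # copy, so earlier snapshots stay unchanged


data = ["spark is unified", "unified batch and stream", "stream and batch"]

batch_result = run_batch(data)

# Feed the same records one line at a time, as if they arrived over time.
stream_results = list(run_streaming([line] for line in data))

# The final streaming state agrees with the batch answer.
assert stream_results[-1] == batch_result
```

With separate engines, the batch job and the streaming job would typically be written against two different APIs and kept in sync by hand; here only the driver function differs.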

Khalid

On 19 Oct 2020, at 07:03, Sonal Goyal <[hidden email]> wrote:


My thought is that Spark supports analytics for both structured and unstructured data, in batch as well as in real time. This was pretty revolutionary when Spark first came out. That's where the term "unified" came from, I think. Even after all these years, Spark remains the trusted framework for enterprise analytics.


Re: mission statement : unified

Stephen Boesch
While the core of Spark is and has been quite solid and a go-to infrastructure, the streaming part of the story was still quite weak, at least through the middle of last year. I went into depth on both Structured Streaming and the older DStream API. Structured Streaming in particular was difficult to use, both in terms of the limitations on what it supports and in its documentation and examples. Have there been meaningful advancements in the past twelve-plus months?

On Sun, 25 Oct 2020 at 13:58, Khalid Mammadov <[hidden email]> wrote:
Correct. Also, as explained in the book Learning Spark 2.0 by Databricks:

Unified Analytics
While the notion of unification is not unique to Spark, it is a core component of its design philosophy and evolution. In November 2016, the Association for Computing Machinery (ACM) recognized Apache Spark and conferred upon its original creators the prestigious ACM Award for their paper describing Apache Spark as a “Unified Engine for Big Data Processing.” The award-winning paper notes that Spark replaces all the separate batch processing, graph, stream, and query engines like Storm, Impala, Dremel, Pregel, etc. with a unified stack of components that addresses diverse workloads under a single distributed fast engine.

Khalid
