how to decide broadcast join data size

Selvam Raman
Hi,

I could not find a useful formula or any documentation that would help me decide how large the data for a broadcast join can be, based on the cluster size.

Please let me know if there is a rule of thumb for this.

For example:
Cluster size: 20 nodes, with 32 GB of memory and 8 cores per node.

executor-memory = 8 GB, executor-cores = 4

Memory:
8 GB per executor, minus about 40% (0.4) for internal use, leaves 4.8 GB for actual computation and storage. Assuming I have not persisted anything, I could use the full 4.8 GB per executor.
Is it possible for me to use a 400 MB file for a broadcast join in this case?
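
To make the arithmetic above concrete, here is a rough Python sketch of the estimate. The function names are only illustrative, the 0.4 internal share is the simplification used in this post rather than Spark's exact memory model, and the 400 MB file size is assumed to equal the in-memory size of the broadcast data, which may not hold in practice.

```python
def usable_memory_gb(executor_memory_gb, internal_fraction=0.4):
    """Memory left for computation and storage after the internal share.

    internal_fraction=0.4 is the simplified figure from the post,
    not an exact model of Spark's memory manager.
    """
    return executor_memory_gb * (1.0 - internal_fraction)


def broadcast_share(broadcast_mb, usable_gb):
    """Fraction of usable executor memory one broadcast copy would occupy.

    Assumes the broadcast data takes broadcast_mb in memory (an assumption:
    the size on disk and the size in memory can differ).
    """
    return broadcast_mb / (usable_gb * 1024.0)


usable = usable_memory_gb(8)          # 8 GB executor-memory from the example
share = broadcast_share(400, usable)  # 400 MB file from the question

print(f"usable per executor: {usable:.1f} GB")
print(f"share taken by one 400 MB broadcast copy: {share:.1%}")
```

Under these assumptions a single copy of the broadcast data would take under a tenth of the usable memory on each executor. Whether that is acceptable in practice is exactly the question being asked.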

--
Selvam Raman
"Shun bribery, hold your head high"