Using newAPIHadoopRDD for reading from HBase


Biplob Biswas
Hi,

I have a few questions about how newAPIHadoopRDD reads data from HBase.

1. Does it load all the data returned by a scan directly into memory?
2. My understanding is that data from different regions is loaded onto different executors. Is that correct?
3. If it does load all the data from the scan, what happens when the data is larger than executor memory?
4. What happens when a single row has a huge number of column qualifiers?
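
For context, the kind of read these questions refer to can be sketched roughly as below. This is only a minimal sketch, not the actual job: the table name `my_table`, the column family `cf`, the application name, and the caching and batch values are placeholders chosen for illustration, and it assumes the Spark and HBase client/mapreduce jars are on the classpath.

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.spark.{SparkConf, SparkContext}

object HBaseScanSketch {
  def main(args: Array[String]): Unit = {
    // Placeholder application name; the master is expected to come from spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("hbase-scan-sketch"))

    // Picks up hbase-site.xml from the classpath if present.
    val hbaseConf = HBaseConfiguration.create()
    // "my_table" and "cf" are hypothetical names used only for this sketch.
    hbaseConf.set(TableInputFormat.INPUT_TABLE, "my_table")
    hbaseConf.set(TableInputFormat.SCAN_COLUMN_FAMILY, "cf")
    // Rows fetched per scanner RPC (illustrative value).
    hbaseConf.set(TableInputFormat.SCAN_CACHEDROWS, "500")
    // Maximum cells per returned Result, relevant for very wide rows (illustrative value).
    hbaseConf.set(TableInputFormat.SCAN_BATCHSIZE, "100")

    // Each record is a (row key, Result) pair produced by TableInputFormat.
    val rdd = sc.newAPIHadoopRDD(
      hbaseConf,
      classOf[TableInputFormat],
      classOf[ImmutableBytesWritable],
      classOf[Result])

    // Printing the partition count lets it be compared against the table's region count.
    println(s"partitions: ${rdd.getNumPartitions}")
    println(s"records: ${rdd.count()}")

    sc.stop()
  }
}
```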


Thanks & Regards
Biplob Biswas

Re: Using newAPIHadoopRDD for reading from HBase

Biplob Biswas
Can someone please help me out here, or point me to some documentation on this? I could find almost nothing.

Thanks & Regards
Biplob Biswas


On Thu, Jun 28, 2018 at 11:13 AM Biplob Biswas <[hidden email]> wrote:
Hi,

I have a few questions about how newAPIHadoopRDD reads data from HBase.

1. Does it load all the data returned by a scan directly into memory?
2. My understanding is that data from different regions is loaded onto different executors. Is that correct?
3. If it does load all the data from the scan, what happens when the data is larger than executor memory?
4. What happens when a single row has a huge number of column qualifiers?


Thanks & Regards
Biplob Biswas