Jdbc Hook in Spark Batch Application

lec ssmi
Hi all,
I have some Spark programs that perform database connection operations. I want to capture the connection information, such as the JDBC connection properties, without being too intrusive to the code.
Any good ideas? Could a Java agent do this?
  

Re: Jdbc Hook in Spark Batch Application

Gabor Somogyi
One can wrap the JDBC driver; that way everything can be sniffed.
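A minimal sketch of what such a wrapper could look like, assuming a made-up URL prefix `jdbc:sniff:` and a hypothetical class name `SniffingDriver` (neither comes from Spark or from any real driver). It records the target URL and the names of the connection properties, then delegates to whichever real driver handles the underlying URL:

```java
import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.DriverPropertyInfo;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.logging.Logger;

/**
 * Hypothetical wrapper driver: accepts "jdbc:sniff:<rest of real url>",
 * records what was requested, then delegates to the real driver.
 * Declare it public in its own file for real use.
 */
class SniffingDriver implements Driver {
    // Assumed prefix, chosen only for this illustration
    static final String PREFIX = "jdbc:sniff:";
    // Captured "url [property names]" entries; a real tool would ship these elsewhere
    static final List<String> CAPTURED = new CopyOnWriteArrayList<>();

    /** Turns "jdbc:sniff:postgresql://h/db" back into "jdbc:postgresql://h/db". */
    static String realUrl(String url) {
        return "jdbc:" + url.substring(PREFIX.length());
    }

    @Override
    public boolean acceptsURL(String url) {
        return url != null && url.startsWith(PREFIX);
    }

    @Override
    public Connection connect(String url, Properties info) throws SQLException {
        if (!acceptsURL(url)) {
            return null; // JDBC contract: return null for URLs this driver does not handle
        }
        String target = realUrl(url);
        Properties props = (info == null) ? new Properties() : info;
        // Record property names only, so passwords do not end up in the capture
        CAPTURED.add(target + " " + props.stringPropertyNames());
        // Delegate to whichever registered driver accepts the real URL
        return DriverManager.getConnection(target, props);
    }

    @Override
    public DriverPropertyInfo[] getPropertyInfo(String url, Properties info) {
        return new DriverPropertyInfo[0];
    }

    @Override
    public int getMajorVersion() {
        return 1;
    }

    @Override
    public int getMinorVersion() {
        return 0;
    }

    @Override
    public boolean jdbcCompliant() {
        return false;
    }

    @Override
    public Logger getParentLogger() throws SQLFeatureNotSupportedException {
        throw new SQLFeatureNotSupportedException("no java.util.logging parent logger");
    }
}
```

With a driver like this registered (for example via `DriverManager.registerDriver(new SniffingDriver())` or a `META-INF/services/java.sql.Driver` entry), the application classes stay untouched, but the JDBC URL it is given still has to change to carry the prefix, so some configuration change remains.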

On Thu, 24 Dec 2020, 03:51 lec ssmi, <[hidden email]> wrote:
Hi all,
I have some Spark programs that perform database connection operations. I want to capture the connection information, such as the JDBC connection properties, without being too intrusive to the code.
Any good ideas? Could a Java agent do this?
  

Re: Jdbc Hook in Spark Batch Application

lec ssmi
Thanks.
But the problem is that the classes referenced in the code would then need to be modified. I would like to avoid changing the existing code.

On Fri, 25 Dec 2020 at 00:16, Gabor Somogyi <[hidden email]> wrote:
One can wrap the JDBC driver; that way everything can be sniffed.


Re: Jdbc Hook in Spark Batch Application

Gabor Somogyi
AFAIK there is no other way. The latest release adds a JDBC connection provider API, but using it also requires some code modification. BTW, if there were a hook API, code changes would be needed too, right?

On Fri, 25 Dec 2020, 02:35 lec ssmi, <[hidden email]> wrote:
Thanks.
But the problem is that the classes referenced in the code would then need to be modified. I would like to avoid changing the existing code.


Re: Jdbc Hook in Spark Batch Application

Mich Talebzadeh
In reply to this post by lec ssmi
If I understand correctly, you can store the JDBC connection properties in a configuration file and refer to them from the code in your Scala/Python module.

Example:

# oracle variables
driverName = "oracle.jdbc.OracleDriver"
_username = "user"
_password = "......"
_dbschema = "schema"
_dbtable = "table"
oracleHost = "host"
oraclePort = "1521"
oracleDB = "mydb"
url = "jdbc:oracle:thin:@" + oracleHost + ":" + oraclePort + ":" + oracleDB
serviceName = oracleDB + ".abc.global"

Then import the config file into the Python module and refer to the connection properties:

import sys

import cx_Oracle

import conf.file as v  # the config module shown above


def run_oracle_module():
    dsn_tns = cx_Oracle.makedsn(v.oracleHost, v.oraclePort, service_name=v.serviceName)
    # Bind the table name rather than concatenating it into the SQL text
    sqlTable = "SELECT COUNT(1) FROM USER_TABLES WHERE TABLE_NAME = :tname"
    sql = ("SELECT ID, CLUSTERED, SCATTERED, RANDOMISED, RANDOM_STRING, SMALL_VC, PADDING FROM "
           + v._dbschema + "." + v._dbtable + " WHERE ROWNUM <= 10")
    # Check Oracle is accessible: connection failures are raised by connect()
    try:
        conn = cx_Oracle.connect(v._username, v._password, dsn_tns)
    except cx_Oracle.DatabaseError as e:
        error, = e.args  # cx_Oracle error object carrying .message and .code
        print("Error: {0} [{1}]".format(error.message, error.code))
        sys.exit(1)
    else:
        cursor = conn.cursor()
        # Check if table exists
        cursor.execute(sqlTable, tname=v._dbtable)
        if cursor.fetchone()[0] == 1:
            print("\nTable " + v._dbschema + "." + v._dbtable + " exists\n")
            cursor.execute(sql)
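
For a JVM-based (Scala/Java) Spark job, the same idea of keeping connection properties in a file can be sketched with `java.util.Properties`. The class name `JdbcSettings` and the property keys (`oracle.host`, `oracle.port`, `oracle.db`, `oracle.user`, `oracle.password`, `oracle.driver`) are assumptions made up for this illustration:

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper that keeps JDBC settings outside the job code. */
class JdbcSettings {
    final String url;
    final Properties connectionProperties = new Properties();

    JdbcSettings(Properties conf) {
        // Same host:port:SID URL shape as in the Python config above
        this.url = "jdbc:oracle:thin:@" + require(conf, "oracle.host")
                + ":" + conf.getProperty("oracle.port", "1521")
                + ":" + require(conf, "oracle.db");
        connectionProperties.setProperty("user", require(conf, "oracle.user"));
        connectionProperties.setProperty("password", require(conf, "oracle.password"));
        connectionProperties.setProperty("driver",
                conf.getProperty("oracle.driver", "oracle.jdbc.OracleDriver"));
    }

    /** Fails fast with a clear message when a mandatory key is missing. */
    private static String require(Properties conf, String key) {
        String value = conf.getProperty(key);
        if (value == null) {
            throw new IllegalArgumentException("missing config key: " + key);
        }
        return value;
    }

    /** Parses properties-file text; a real job would read it from a file instead. */
    static JdbcSettings parse(String text) {
        Properties conf = new Properties();
        try {
            conf.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new JdbcSettings(conf);
    }
}
```

The resulting `url` and `connectionProperties` could then be handed to Spark's `DataFrameReader.jdbc(url, table, connectionProperties)`, so the job code only refers to the settings object and never hard-codes the values.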

HTH,




On Thu, 24 Dec 2020 at 02:51, lec ssmi <[hidden email]> wrote:
Hi all,
I have some Spark programs that perform database connection operations. I want to capture the connection information, such as the JDBC connection properties, without being too intrusive to the code.
Any good ideas? Could a Java agent do this?