The Hortonworks Hive ODBC Driver is used for direct SQL and HiveQL access to Apache
Hadoop / Hive distributions, enabling Business Intelligence (BI), analytics, and reporting on
Hadoop / Hive-based data. The driver efficiently transforms an application’s SQL query into
the equivalent form in HiveQL.
Setup/Configuration
Download Hive Driver
Download Hive driver from below location and extract on your desired location.
https://www.cloudera.com/downloads/connectors/hive/odbc/2-5-12.html
https://www.progress.com/download/thank-you?interface=odbc&ds=apache-hadoop-hive&os=linux-64
Installing the Driver
Make sure that you use the version of the driver that matches the bitness (32/64) of the client application. The Hortonworks ODBC Driver files are installed in the following directories:
• /usr/lib/hive/lib/native/hiveodbc contains the release notes and the Hive ODBC Driver Installation and Configuration Guide.
• /usr/lib/hive/lib/native/hiveodbc/ErrorMessages contains error message files required by the driver.
• /usr/lib/hive/lib/native/hiveodbc/Setup contains sample configuration files named odbc.ini and odbcinst.ini.
• /usr/lib/hive/lib/native/Linux-amd64-64 contains the 64-bit shared libraries and the hortonworks.hiveodbc.ini configuration file.
[root@en01 hadoopsw]# yum --nogpgcheck localinstall hive-odbc-native-2.1.12.1017-1.el7.x86_64.rpm
Loaded plugins: langpacks, product-id, search-disabled-repos, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Repository 'local' is missing name in configuration, using id
Examining hive-odbc-native-2.1.12.1017-1.el7.x86_64.rpm: hive-odbc-native-2.1.12.1017-1.x86_64
Marking hive-odbc-native-2.1.12.1017-1.el7.x86_64.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package hive-odbc-native.x86_64 0:2.1.12.1017-1 will be installed
--> Finished Dependency Resolution
ambari-2.5.0.3 | 2.9 kB 00:00:00
epel/x86_64/metalink | 6.4 kB 00:00:00
epel/x86_64 | 4.7 kB 00:00:00
epel/x86_64/updateinfo | 858 kB 00:00:28
https://ftp.yzu.edu.tw/Linux/Fedora-EPEL/7/x86_64/repodata/b2f68d3910acd4e6b1f2c96337129e288afaf2003daefdd78bb7c8a1da1ae45f-primary.sqlite.bz2: [Errno 14] curl#56 - "TCP connection reset by peer"
Trying other mirror.
https://mirrors.tongji.edu.cn/epel/7/x86_64/repodata/b2f68d3910acd4e6b1f2c96337129e288afaf2003daefdd78bb7c8a1da1ae45f-primary.sqlite.bz2: [Errno 12] Timeout on https://mirrors.tongji.edu.cn/epel/7/x86_64/repodata/b2f68d3910acd4e6b1f2c96337129e288afaf2003daefdd78bb7c8a1da1ae45f-primary.sqlite.bz2: (28, 'Connection timed out after 30000 milliseconds')
Trying other mirror.
epel/x86_64/primary_db | 6.2 MB 00:00:30
google-chrome | 951 B 00:00:00
google-chrome/primary | 1.9 kB 00:00:00
local | 2.9 kB 00:00:00
mysql-connectors-community/x86_64 | 2.5 kB 00:00:00
mysql-tools-community/x86_64 | 2.5 kB 00:00:00
mysql57-community/x86_64 | 2.5 kB 00:00:00
Dependencies Resolved
=================================================================================================================================================
Package Arch Version Repository Size
=================================================================================================================================================
Installing:
hive-odbc-native x86_64 2.1.12.1017-1 /hive-odbc-native-2.1.12.1017-1.el7.x86_64 53 M
Transaction Summary
=================================================================================================================================================
Install 1 Package
Total size: 53 M
Installed size: 53 M
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : hive-odbc-native-2.1.12.1017-1.x86_64 1/1
google-chrome 3/3
Verifying : hive-odbc-native-2.1.12.1017-1.x86_64 1/1
Installed:
hive-odbc-native.x86_64 0:2.1.12.1017-1
Complete!
Verifying the Version Number
[root@en01 hadoopsw]# whereis hive
hive: /usr/lib/hive
[ODBC]
# Specify any global ODBC configuration here such as ODBC tracing.
[ODBC Data Sources]
Hortonworks Hive 32-bit=Hortonworks Hive ODBC Driver 32-bit
Hortonworks Hive 64-bit=Hortonworks Hive ODBC Driver 64-bit
# add a new entry for Hive Data Source Name (DSN),
HiveDSN=Hortonworks Hive ODBC Driver 64-bit
[HiveDSN]
Description=Hortonworks Hive ODBC Driver (64-bit) DSN
Driver=/usr/lib/hive/lib/native/Linux-amd64-64/libhortonworkshiveodbc64.so
HOST=dn04
PORT=10000
HiveServerType=2
ThriftTransport=SASL
Schema=scott
....
[ODBC Drivers]
Hortonworks Hive ODBC Driver 32-bit=Installed
Hortonworks Hive ODBC Driver 64-bit=Installed
[Hortonworks Hive ODBC Driver 32-bit]
Description=Hortonworks Hive ODBC Driver (32-bit)
Driver=/usr/lib/hive/lib/native/Linux-i386-32/libhortonworkshiveodbc32.so
[Hortonworks Hive ODBC Driver 64-bit]
Description=Hortonworks Hive ODBC Driver (64-bit)
Driver=/usr/lib/hive/lib/native/Linux-amd64-64/libhortonworkshiveodbc64.so
## The option below is for using unixODBC when compiled with -DSQL_WCHART_CONVERT.
## Execute 'odbc_config --cflags' to determine if you need to uncomment it.
# IconvEncoding=UCS-4LE
root@en01 hadoopsw]# isql -v HiveDSN
+---------------------------------------+
| Connected! |
| |
| sql-statement |
| help [tablename] |
| quit |
| |
+---------------------------------------+
SQL> show tables
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tab_name |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| dept |
| dual |
| emp |
| emptalend |
| hive_vw_syslogs_current_regex |
| metadata |
| presto_view_syslogs_regex |
| presto_vw_syslogs2 |
| scottmetadata |
| testorc |
| testtext |
| v1 |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
SQLRowCount returns -1
12 rows fetched
SQL>
Configure Oracle OS User
vi /home/oracle/.bash_profile
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/.local/bin:$HOME/bin
export PATH
# Oracle Settings
TMP=/tmp; export TMP
TMPDIR=$TMP; export TMPDIR
ORACLE_HOSTNAME=en01; export ORACLE_HOSTNAME
ORACLE_UNQNAME=hadob; export ORACLE_UNQNAME
ORACLE_BASE=/u01/app/oracle; export ORACLE_BASE
ORACLE_HOME=$ORACLE_BASE/product/12.2.0.1/db_1; export ORACLE_HOME
ORACLE_SID=hadob; export ORACLE_SID
export NLS_LANG=American_America.AL32UTF8
PATH=/usr/sbin:$PATH; export PATH
PATH=$ORACLE_HOME/bin:$PATH; export PATH
LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib; export LD_LIBRARY_PATH
CLASSPATH=$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib; export CLASSPATH
##For Presto ODBC Settings
export ODBCINI=/opt/teradata/prestoodbc/Setup/odbc.ini
export ODBCSYSINI=/opt/teradata/prestoodbc/Setup/odbcinst.ini
export TERADATAPRESTOINI=/opt/teradata/prestoodbc/lib/64/teradata.prestoodbc.ini
export LD_PRELOAD=/usr/lib64/libodbcinst.so
##For Hive ODBC Settings
export ODBCINI=/usr/lib/hive/lib/native/hiveodbc/Setup/odbc.ini
export ODBCSYSINI=/usr/lib/hive/lib/native/hiveodbc/Setup/odbcinst.ini
export HORTONWORKSHORTONWORKSHIVEODBCINI=/usr/lib/hive/lib/native/Linux-amd64-64/hortonworks.hiveodbc.ini
export LD_PRELOAD=/usr/lib64/libodbcinst.so
Configure Oracle Network
Congrats! You have successfully communicated with Hivefrom Oracle database.
Updated on 20-Dec-2017
This update due to getting error while running aggregate function from Oracle
There are two types of queries run on Hive , query which runs without mapreduce and query which requires mapreduce. I ran the below query from SQPPlus and found the error.
SQL> select count(*) from dept@hivelnk;
select count(*) from dept@hivelnk
*
ERROR at line 1:
ORA-28500: connection from ORACLE to a non-Oracle system returned this message:
[Hortonworks][Hardy] (35) Error from server: error code: '1' error message:
'Error while processing statement: FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.tez.TezTask'. {HY000,NativeErr = 35}
ORA-02063: preceding 2 lines from HIVELNK
In order to investigate this error I checked the same query with iSQL and found the same issue as below.
[oracle@te1-hdp-rp-en01 ~]$ isql -v HiveDSN
As above with success, it means now in Oracle there should not be any problem, so tested in Oracle but got another error which quiet self explanatory.
SQL> select count(1) from scott.dept@hivelnk
2 /
select count(1) from scott.dept@hivelnk
*
ERROR at line 1:
ORA-28500: connection from ORACLE to a non-Oracle system returned this message:
[Hortonworks][Hardy] (35) Error from server: error code: '1' error message:
'Error while processing statement: FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.mr.MapRedTask. Permission denied: user=HIVE,
access=WRITE, inode="/user/HIVE/.staging":hdfs:hdfs:drwxr-xr-x
at org.apache.hadoop.hdfs.server.name'. {HY000,NativeErr = 35}
ORA-02063: preceding 3 lines from HIVELNK
Observe the error it is because of the user configured in dblnik definition so I made it correct.
SQL> drop public database link hivelnk;
Database link dropped.
SQL> Create public database link hivelnk connect to "anonymous" identified by anonymous using 'HIVECON';
Database link created.
SQL> select count(1) from scott.dept@hivelnk;
COUNT(1)
----------
4
Now this time got success :)
After this I did some test to run the query from SQLPlus with Presto (already DSN configured in other post), Hive MapReduce and Hive with Tez and below are findings.
-- With Map Reduce
SQL> set timing on;
SQL> select count(1) from scott.dept@hivelnk;
COUNT(1)
----------
4
Elapsed: 00:00:22.19
-- With Presto
SQL> select count(1) from scott.dept@prestolnk;
COUNT(1)
----------
4
Elapsed: 00:00:00.87
--Witn TEZ
SQL> select count(1) from scott.dept@hivelnk;
COUNT(1)
----------
4
Elapsed: 00:00:05.74
Test Join - Hadoop and Oracle
select d."dname", e.name from dept@hivelnk d,scott.employee e where d."deptno"=e.deptno
[root@en01 hadoopsw]# yum --nogpgcheck localinstall hive-odbc-native-2.1.12.1017-1.el7.x86_64.rpm
Loaded plugins: langpacks, product-id, search-disabled-repos, subscription-manager
This system is not registered to Red Hat Subscription Management. You can use subscription-manager to register.
Repository 'local' is missing name in configuration, using id
Examining hive-odbc-native-2.1.12.1017-1.el7.x86_64.rpm: hive-odbc-native-2.1.12.1017-1.x86_64
Marking hive-odbc-native-2.1.12.1017-1.el7.x86_64.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package hive-odbc-native.x86_64 0:2.1.12.1017-1 will be installed
--> Finished Dependency Resolution
ambari-2.5.0.3 | 2.9 kB 00:00:00
epel/x86_64/metalink | 6.4 kB 00:00:00
epel/x86_64 | 4.7 kB 00:00:00
epel/x86_64/updateinfo | 858 kB 00:00:28
https://ftp.yzu.edu.tw/Linux/Fedora-EPEL/7/x86_64/repodata/b2f68d3910acd4e6b1f2c96337129e288afaf2003daefdd78bb7c8a1da1ae45f-primary.sqlite.bz2: [Errno 14] curl#56 - "TCP connection reset by peer"
Trying other mirror.
https://mirrors.tongji.edu.cn/epel/7/x86_64/repodata/b2f68d3910acd4e6b1f2c96337129e288afaf2003daefdd78bb7c8a1da1ae45f-primary.sqlite.bz2: [Errno 12] Timeout on https://mirrors.tongji.edu.cn/epel/7/x86_64/repodata/b2f68d3910acd4e6b1f2c96337129e288afaf2003daefdd78bb7c8a1da1ae45f-primary.sqlite.bz2: (28, 'Connection timed out after 30000 milliseconds')
Trying other mirror.
epel/x86_64/primary_db | 6.2 MB 00:00:30
google-chrome | 951 B 00:00:00
google-chrome/primary | 1.9 kB 00:00:00
local | 2.9 kB 00:00:00
mysql-connectors-community/x86_64 | 2.5 kB 00:00:00
mysql-tools-community/x86_64 | 2.5 kB 00:00:00
mysql57-community/x86_64 | 2.5 kB 00:00:00
Dependencies Resolved
=================================================================================================================================================
Package Arch Version Repository Size
=================================================================================================================================================
Installing:
hive-odbc-native x86_64 2.1.12.1017-1 /hive-odbc-native-2.1.12.1017-1.el7.x86_64 53 M
Transaction Summary
=================================================================================================================================================
Install 1 Package
Total size: 53 M
Installed size: 53 M
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
Installing : hive-odbc-native-2.1.12.1017-1.x86_64 1/1
google-chrome 3/3
Verifying : hive-odbc-native-2.1.12.1017-1.x86_64 1/1
Installed:
hive-odbc-native.x86_64 0:2.1.12.1017-1
Complete!
Verifying the Version Number
[root@en01 hadoopsw]# odbcinst --version
unixODBC 2.3.1
[root@en01 hadoopsw]# rpm -qa | grep hive-odbc-native
hive-odbc-native-2.1.12.1017-1.x86_64
[root@en01 hadoopsw]# rpm -qlp hive-odbc-native-2.1.12.1017-1.el7.x86_64.rpm
/usr/lib/hive
/usr/lib/hive/lib
/usr/lib/hive/lib/native
/usr/lib/hive/lib/native/Linux-amd64-64
/usr/lib/hive/lib/native/Linux-amd64-64/HiveODBC.did
/usr/lib/hive/lib/native/Linux-amd64-64/cacerts.pem
/usr/lib/hive/lib/native/Linux-amd64-64/hortonworks.hiveodbc.ini
/usr/lib/hive/lib/native/Linux-amd64-64/libhortonworkshiveodbc64.so
/usr/lib/hive/lib/native/hiveodbc
/usr/lib/hive/lib/native/hiveodbc/EULA.txt
/usr/lib/hive/lib/native/hiveodbc/ErrorMessages
/usr/lib/hive/lib/native/hiveodbc/ErrorMessages/en-US
/usr/lib/hive/lib/native/hiveodbc/ErrorMessages/en-US/DSMessages.xml
/usr/lib/hive/lib/native/hiveodbc/ErrorMessages/en-US/HiveODBCMessages.xml
/usr/lib/hive/lib/native/hiveodbc/ErrorMessages/en-US/ODBCMessages.xml
/usr/lib/hive/lib/native/hiveodbc/ErrorMessages/en-US/SQLEngineMessages.xml
/usr/lib/hive/lib/native/hiveodbc/ErrorMessages/en-US/ThriftExtensionMessages.xml
/usr/lib/hive/lib/native/hiveodbc/Hortonworks Hive ODBC Driver User Guide.pdf
/usr/lib/hive/lib/native/hiveodbc/Release Notes.txt
/usr/lib/hive/lib/native/hiveodbc/Setup
/usr/lib/hive/lib/native/hiveodbc/Setup/odbc.ini
/usr/lib/hive/lib/native/hiveodbc/Setup/odbcinst.ini
hive: /usr/lib/hive
Configuring ODBC Connection
Configuration Files
ODBC driver managers use configuration files to define and configure ODBC data sources and drivers. By default, the following configuration files are used:
• .odbc.ini is used to define ODBC data sources, and it is required for DSNs.
• .odbcinst.ini is used to define ODBC drivers, and it is optional.
The driver installation contains abobe configuration files in the Setup directory.
These files are located in the user's home directory. Also, by default the Hive ODBC Driver is configured using the hortonworks.hiveodbc.ini file, which is required. This file is located in one of the following directories depending on the version of the driver that you are using:
• /usr/lib/hive/lib/native/Linux-amd32-32 for the 32-bit driver on Linux.
• /usr/lib/hive/lib/native/Linux-amd64-64 for the 64-bit driver on Linux.
Configuring the Environment
Use three environment variables, ODBCINI, ODBCSYSINI, and HORTONWORKSHORTONWORKSHIVEODBCINI, to specify different locations for the odbc.ini, odbcinst.ini, and hortonworks.hiveodbc.ini configuration files.
export ODBCINI=/usr/lib/hive/lib/native/hiveodbc/Setup/odbc.ini
export ODBCSYSINI=/usr/lib/hive/lib/native/hiveodbc/Setup/odbcinst.ini
export HORTONWORKSHORTONWORKSHIVEODBCINI=/usr/lib/hive/lib/native/Linux-amd64-64/hortonworks.hiveodbc.ini
Defining DSNs in odbc.ini
ODBC Data Source Names (DSNs) are defined in the odbc.ini configuration file. This file is divided into several sections:
• [ODBC] is optional. This section is used to control global ODBC configuration, such as ODBC tracing.
• [ODBC Data Sources] is required. This section lists the DSNs and associates them with a driver. The driver can be specified by the name given in odbcinst.ini or by the path to the driver shared object file.
• A section having the same name as the data source specified in the [ODBC Data Sources] section is required to configure the data source.
vi /usr/lib/hive/lib/native/hiveodbc/Setup/odbc.ini
vi /usr/lib/hive/lib/native/hiveodbc/Setup/odbc.ini
# Specify any global ODBC configuration here such as ODBC tracing.
[ODBC Data Sources]
Hortonworks Hive 32-bit=Hortonworks Hive ODBC Driver 32-bit
Hortonworks Hive 64-bit=Hortonworks Hive ODBC Driver 64-bit
# add a new entry for Hive Data Source Name (DSN),
HiveDSN=Hortonworks Hive ODBC Driver 64-bit
[HiveDSN]
Description=Hortonworks Hive ODBC Driver (64-bit) DSN
Driver=/usr/lib/hive/lib/native/Linux-amd64-64/libhortonworkshiveodbc64.so
HOST=dn04
PORT=10000
HiveServerType=2
ThriftTransport=SASL
Schema=scott
Specifying ODBC Drivers in odbcinst.ini
ODBC drivers are defined in the odbcinst.ini configuration file. This configuration file is optional because drivers can be specified directly in the odbc.ini configuration file.
The odbcinst.ini file is divided into the following sections:
• [ODBC Drivers] lists the names of all the installed ODBC drivers.
• For each driver, a section having the same name as the driver name specified in the [ODBC Drivers] section lists the driver attributes and values.
vi /usr/lib/hive/lib/native/hiveodbc/Setup/odbcinst.ini
[ODBC Drivers]
Hortonworks Hive ODBC Driver 32-bit=Installed
Hortonworks Hive ODBC Driver 64-bit=Installed
[Hortonworks Hive ODBC Driver 32-bit]
Description=Hortonworks Hive ODBC Driver (32-bit)
Driver=/usr/lib/hive/lib/native/Linux-i386-32/libhortonworkshiveodbc32.so
[Hortonworks Hive ODBC Driver 64-bit]
Description=Hortonworks Hive ODBC Driver (64-bit)
Driver=/usr/lib/hive/lib/native/Linux-amd64-64/libhortonworkshiveodbc64.so
## The option below is for using unixODBC when compiled with -DSQL_WCHART_CONVERT.
## Execute 'odbc_config --cflags' to determine if you need to uncomment it.
# IconvEncoding=UCS-4LE
Configuring Driver Settings in hortonworks.hiveodbc.ini
The hortonworks.hiveodbc.ini file contains configuration settings for the hive ODBC Driver. Settings that you define in this file apply to all connections that use the driver.
You do not need to modify the settings in the hortonworks.hiveodbc.ini file to use the driver and connect to your data source. However, to help troubleshoot issues, you can configure the hortonworks.hiveodbc.ini file to enable logging in the driver.
vi /usr/lib/hive/lib/native/Linux-amd64-64/hortonworks.hiveodbc.ini
[Driver]
ODBCInstLib=/usr/lib64/libodbcinst.so
ErrorMessagesPath=/usr/lib/hive/lib/native/hiveodbc/ErrorMessages/
LogLevel=0
LogPath=
SwapFilePath=/tmp
ODBCInstLib=/usr/lib64/libodbcinst.so
ErrorMessagesPath=/usr/lib/hive/lib/native/hiveodbc/ErrorMessages/
LogLevel=0
LogPath=
SwapFilePath=/tmp
Testing the Connection
To test the connection, you can use an ODBC-enabled client application. For a basic connection test, you can also use the test utilities that are packaged with your driver manager installation. For example, the iODBC driver manager includes simple utilities called iodbctest and iodbctestw. Similarly, the unixODBC driver manager includes simple utilities called isql and iusql.
root@en01 hadoopsw]# iodbctest DSN=HiveDSN
iODBC Demonstration program
This program shows an interactive SQL processor
Driver Manager: 03.52.0709.0909
Driver: 2.1.12.1017 (Hortonworks Hive ODBC Driver)
SQL>
root@en01 hadoopsw]# iodbctest DSN=HiveDSN
iODBC Demonstration program
This program shows an interactive SQL processor
Driver Manager: 03.52.0709.0909
Driver: 2.1.12.1017 (Hortonworks Hive ODBC Driver)
SQL>
root@en01 hadoopsw]# isql -v HiveDSN
+---------------------------------------+
| Connected! |
| |
| sql-statement |
| help [tablename] |
| quit |
| |
+---------------------------------------+
SQL> show tables
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tab_name |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| dept |
| dual |
| emp |
| emptalend |
| hive_vw_syslogs_current_regex |
| metadata |
| presto_view_syslogs_regex |
| presto_vw_syslogs2 |
| scottmetadata |
| testorc |
| testtext |
| v1 |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
SQLRowCount returns -1
12 rows fetched
SQL>
Configuring Oracle to use ODBC Driver
Oracle Database supports heterogeneous connectivity for non-oracle database sources. To establish such heterogeneous connectivity Oracle database need Oracle Gateways and ODBC drivers. We have already configured ODBC driver and now we configure Oracle ODBC Gateway. We are using Oracle 12c for the purpose of this post and you can see the odbc gateway executable (dg3odbc) in ORACLE_HOME/bin. If you are using 11g then you can download and install and use the same way as we using for 12c.
Oracle needs system DSN to communicate from it, we have already configured the system DSN above.
Configure Oracle OS User
vi /home/oracle/.bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/.local/bin:$HOME/bin
export PATH
# Oracle Settings
TMP=/tmp; export TMP
TMPDIR=$TMP; export TMPDIR
ORACLE_HOSTNAME=en01; export ORACLE_HOSTNAME
ORACLE_UNQNAME=hadob; export ORACLE_UNQNAME
ORACLE_BASE=/u01/app/oracle; export ORACLE_BASE
ORACLE_HOME=$ORACLE_BASE/product/12.2.0.1/db_1; export ORACLE_HOME
ORACLE_SID=hadob; export ORACLE_SID
export NLS_LANG=American_America.AL32UTF8
PATH=/usr/sbin:$PATH; export PATH
PATH=$ORACLE_HOME/bin:$PATH; export PATH
LD_LIBRARY_PATH=$ORACLE_HOME/lib:/lib:/usr/lib; export LD_LIBRARY_PATH
CLASSPATH=$ORACLE_HOME/jlib:$ORACLE_HOME/rdbms/jlib; export CLASSPATH
##For Presto ODBC Settings
export ODBCINI=/opt/teradata/prestoodbc/Setup/odbc.ini
export ODBCSYSINI=/opt/teradata/prestoodbc/Setup/odbcinst.ini
export TERADATAPRESTOINI=/opt/teradata/prestoodbc/lib/64/teradata.prestoodbc.ini
export LD_PRELOAD=/usr/lib64/libodbcinst.so
##For Hive ODBC Settings
export ODBCINI=/usr/lib/hive/lib/native/hiveodbc/Setup/odbc.ini
export ODBCSYSINI=/usr/lib/hive/lib/native/hiveodbc/Setup/odbcinst.ini
export HORTONWORKSHORTONWORKSHIVEODBCINI=/usr/lib/hive/lib/native/Linux-amd64-64/hortonworks.hiveodbc.ini
export LD_PRELOAD=/usr/lib64/libodbcinst.so
[oracle@en01 ~]$ source .bash_profile
Configure Oracle Network
In Oracle Gateways home configure the listener.ora file, this file will be responsible to provide connection to non-oracle data sources (ie; Hive in our case)
Configure listener.ora file
LSNRCTL for Linux: Version 12.2.0.1.0 - Production on 19-DEC-2017 13:47:22
Copyright (c) 1991, 2016, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=en01)(PORT=1521)))
STATUS of the LISTENER
------------------------
Alias LISTENER
Version TNSLSNR for Linux: Version 12.2.0.1.0 - Production
Start Date 14-DEC-2017 14:34:02
Uptime 4 days 23 hr. 13 min. 19 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /u01/app/oracle/product/12.2.0.1/db_1/network/admin/listener.ora
Listener Log File /u01/app/oracle/diag/tnslsnr/en01/listener/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=en01)(PORT=1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=en01)(PORT=5500))(Security=(my_wallet_directory=/u01/app/oracle/admin/hadob/xdb_wallet))(Presentation=HTTP)(Session=RAW))
Services Summary...
Service "HiveDSN" has 1 instance(s).
Instance "HiveDSN", status UNKNOWN, has 1 handler(s) for this service...
Service "PrestoDSN" has 1 instance(s).
Instance "PrestoDSN", status UNKNOWN, has 1 handler(s) for this service...
Service "hadob" has 1 instance(s).
Instance "hadob", status READY, has 1 handler(s) for this service...
Service "hadobXDB" has 1 instance(s).
Instance "hadob", status READY, has 1 handler(s) for this service...
The command completed successfully
Configure tnsnames.ora file
[oracle@en01 ~]$ tnsping hivecon
TNS Ping Utility for Linux: Version 12.2.0.1.0 - Production on 19-DEC-2017 13:48:10
Copyright (c) 1997, 2016, Oracle. All rights reserved.
Used parameter files:
/u01/app/oracle/product/12.2.0.1/db_1/network/admin/sqlnet.ora
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = en01)(PORT = 1521)) (CONNECT_DATA = (SID = HiveDSN)) (HS = OK))
OK (0 msec)
# This is a sample agent init file that contains the HS parameters that are
# needed for the Database Gateway for ODBC
#
# HS init parameters
#
HS_FDS_CONNECT_INFO = "HiveDSN"
HS_FDS_TRACE_LEVEL = DEBUG
HS_FDS_SHAREABLE_NAME=/usr/lib/hive/lib/native/Linux-amd64-64/libhortonworkshiveodbc64.so
HS_FDS_SQLLEN_INTERPRETATION=64
HS_TRANSACTION_MODEL=READ_ONLY_AUTOCOMMIT
HS_LANGUAGE=AMERICAN_AMERICA.WE8MSWIN1252
HS_FDS_SUPPORT_STATISTICS = FALSE
#HS_KEEP_REMOTE_COLUMN_SIZE=ALL
#HS_NLS_NCHAR=UCS2
#
# ODBC specific environment variables
#
set ODBCINI=/usr/lib/hive/lib/native/hiveodbc/Setup/odbc.ini
set ODBCSYSINI=/usr/lib/hive/lib/native/hiveodbc/Setup/odbcinst.ini
#
# Environment variables required for the non-Oracle system
#
#set=
Query via DBLink
# listener.ora Network Configuration File: /u01/app/oracle/product/12.2.0.1/db_1/network/admin/listener.ora
# Generated by Oracle configuration tools.
LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = en01)(PORT = 1521))
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1521))
)
)
SID_LIST_LISTENER=
(SID_LIST =
(SID_DESC =
(SID_NAME = PrestoDSN)
(ORACLE_HOME = /u01/app/oracle/product/12.2.0.1/db_1)
(PROGRAM = dg4odbc)
(ARGS = ENVS=LD_LIBRARY_PATH=/opt/teradata/prestoodbc/lib/64/libteradataprestoodbc64.so)
)
(SID_DESC =
(SID_NAME = HiveDSN)
(ORACLE_HOME = /u01/app/oracle/product/12.2.0.1/db_1)
(PROGRAM = dg4odbc)
(ARGS = ENVS=LD_LIBRARY_PATH=/usr/lib/hive/lib/native/Linux-amd64-64/libhortonworkshiveodbc64.so)
)
)
SID_NAME should be the same name as you have configured your DSN Name.
Now stop and start the listener to take change effect.
[oracle@en01 ~]$ lsnrctl stop
[oracle@en01 ~]$ lsnrctl start
[oracle@en01 ~]$ lsnrctl status
Copyright (c) 1991, 2016, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=en01)(PORT=1521)))
STATUS of the LISTENER
------------------------
Alias LISTENER
Version TNSLSNR for Linux: Version 12.2.0.1.0 - Production
Start Date 14-DEC-2017 14:34:02
Uptime 4 days 23 hr. 13 min. 19 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listener Parameter File /u01/app/oracle/product/12.2.0.1/db_1/network/admin/listener.ora
Listener Log File /u01/app/oracle/diag/tnslsnr/en01/listener/alert/log.xml
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=en01)(PORT=1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=en01)(PORT=5500))(Security=(my_wallet_directory=/u01/app/oracle/admin/hadob/xdb_wallet))(Presentation=HTTP)(Session=RAW))
Services Summary...
Service "HiveDSN" has 1 instance(s).
Instance "HiveDSN", status UNKNOWN, has 1 handler(s) for this service...
Service "PrestoDSN" has 1 instance(s).
Instance "PrestoDSN", status UNKNOWN, has 1 handler(s) for this service...
Service "hadob" has 1 instance(s).
Instance "hadob", status READY, has 1 handler(s) for this service...
Service "hadobXDB" has 1 instance(s).
Instance "hadob", status READY, has 1 handler(s) for this service...
The command completed successfully
Configure tnsnames.ora file
# tnsnames.ora Network Configuration File: /u01/app/oracle/product/12.2.0.1/db_1/network/admin/tnsnames.ora
# Generated by Oracle configuration tools.
HADOB =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = en01)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = hadob)
)
)
LISTENER_HADOB =
(ADDRESS = (PROTOCOL = TCP)(HOST = en01)(PORT = 1521))
PRESTOCON =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = en01)(PORT = 1521))
(CONNECT_DATA =
(SID = PrestoDSN)
)
(HS = OK)
)
HIVECON =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = en01)(PORT = 1521))
(CONNECT_DATA =
(SID = HiveDSN)
)
(HS = OK)
)
Make sure, you specify "HS=OK" in tns entry then test your tns entry.
TNS Ping Utility for Linux: Version 12.2.0.1.0 - Production on 19-DEC-2017 13:48:10
Copyright (c) 1997, 2016, Oracle. All rights reserved.
Used parameter files:
/u01/app/oracle/product/12.2.0.1/db_1/network/admin/sqlnet.ora
Used TNSNAMES adapter to resolve the alias
Attempting to contact (DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = en01)(PORT = 1521)) (CONNECT_DATA = (SID = HiveDSN)) (HS = OK))
OK (0 msec)
Configure Oracle HS/ODBC Gateway
Configure odbc gateway parameter file
This file contains information about the source database name which you want to connect and the location of driver which is responsible for communicating wit
h Oracle heterogeneous service and establishing connection to a non-oracle data source. File is created as init.ora in the $ORACLE_HOME/hs/admin directory
h Oracle heterogeneous service and establishing connection to a non-oracle data source. File is created as init
vi /u01/app/oracle/product/12.2.0.1/db_1/hs/admin/initHiveDSN.ora
# This is a sample agent init file that contains the HS parameters that are
# needed for the Database Gateway for ODBC
#
# HS init parameters
#
HS_FDS_CONNECT_INFO = "HiveDSN"
HS_FDS_TRACE_LEVEL = DEBUG
HS_FDS_SHAREABLE_NAME=/usr/lib/hive/lib/native/Linux-amd64-64/libhortonworkshiveodbc64.so
HS_FDS_SQLLEN_INTERPRETATION=64
HS_TRANSACTION_MODEL=READ_ONLY_AUTOCOMMIT
HS_LANGUAGE=AMERICAN_AMERICA.WE8MSWIN1252
HS_FDS_SUPPORT_STATISTICS = FALSE
#HS_KEEP_REMOTE_COLUMN_SIZE=ALL
#HS_NLS_NCHAR=UCS2
#
# ODBC specific environment variables
#
set ODBCINI=/usr/lib/hive/lib/native/hiveodbc/Setup/odbc.ini
set ODBCSYSINI=/usr/lib/hive/lib/native/hiveodbc/Setup/odbcinst.ini
set HORTONWORKSHORTONWORKSHIVEODBCINI=/usr/lib/hive/lib/native/Linux-amd64-64/hortonworks.hiveodbc.ini
set LD_LIBRARY_PATH=/usr/lib64/#
# Environment variables required for the non-Oracle system
#
#set
Configure Oracle Database
Create DBLink
Create a database link that points to the tnsnames entry created earlier:
SQL> create public database link hivelnk connect to hive identified by hive using 'HIVECON';
Database link created.
SQL> select sysdate from scott.dual@hivelnk;
SYSDATE
------------------
19-DEC-17
Congrats! You have successfully communicated with Hivefrom Oracle database.
Updated on 20-Dec-2017
This update due to getting error while running aggregate function from Oracle
There are two types of queries run on Hive , query which runs without mapreduce and query which requires mapreduce. I ran the below query from SQPPlus and found the error.
SQL> select count(*) from dept@hivelnk;
select count(*) from dept@hivelnk
*
ERROR at line 1:
ORA-28500: connection from ORACLE to a non-Oracle system returned this message:
[Hortonworks][Hardy] (35) Error from server: error code: '1' error message:
'Error while processing statement: FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.tez.TezTask'. {HY000,NativeErr = 35}
ORA-02063: preceding 2 lines from HIVELNK
In order to investigate this error I checked the same query with iSQL and found the same issue as below.
[oracle@te1-hdp-rp-en01 ~]$ isql -v HiveDSN
SQL> select count(*) from scott.dept
[S1000][Hortonworks][Hardy] (35) Error from server: error code: '1' error message: 'Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask'.
[ISQL]ERROR: Could not SQLExecute
I enabled logging on the ODBC level and on Oracle HS level too but could not get any useful information.
With an idea I just changed the execution engine in Ambari from TEZ to MAPREDUCE and tried again from iSQL and observed that now error is changed and got that issue is occurring due to permission on HDFS location.
[oracle@te1-hdp-rp-en01 ~]$ isql -v HiveDSN
+---------------------------------------+
| Connected! |
| |
| sql-statement |
| help [tablename] |
| quit |
| |
+---------------------------------------+
SQL> select count(*) from scott.dept
[S1000][Hortonworks][Hardy] (35) Error from server: error code: '1' error message: 'Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask.Permission denied: user=anonymous, access=WRITE, inode="/user/anonymous/.staging":hdfs:hdfs:drwxr-xr-x
at org.apache.hadoop.hdfs.s'.
[ISQL]ERROR: Could not SQLExecute
After observing above error I performed (self expalnatory) the below steps on the HDFS side.
[hdfs@te1-hdp-rp-dn04 ~]$ hdfs dfs -mkdir /user/anonymous
[hdfs@te1-hdp-rp-dn04 ~]$ hdfs dfs -ls /user
Found 11 items
drwxrwxrwx - admin hdfs 0 2017-08-01 11:47 /user/admin
drwxrwx--- - ambari-qa hdfs 0 2017-08-20 12:08 /user/ambari-qa
drwxr-xr-x - hdfs hdfs 0 2017-12-20 14:54 /user/anonymous
[hdfs@te1-hdp-rp-dn04 ~]$ hdfs dfs -chmod -R 777 /user/anonymous
[hdfs@te1-hdp-rp-dn04 ~]$ hdfs dfs -ls /user
Found 11 items
drwxrwxrwx - admin hdfs 0 2017-08-01 11:47 /user/admin
drwxrwx--- - ambari-qa hdfs 0 2017-08-20 12:08 /user/ambari-qa
drwxrwxrwx - hdfs hdfs 0 2017-12-20 14:54 /user/anonymous
Now tested again using iSQL and this time it went with success.
[oracle@te1-hdp-rp-en01 ~]$ isql -v HiveDSN
+---------------------------------------+
| Connected! |
| |
| sql-statement |
| help [tablename] |
| quit |
| |
+---------------------------------------+
SQL> select count(*) from scott.dept
+---------------------+
| EXPR_1 |
+---------------------+
| 4 |
+---------------------+
SQLRowCount returns -1
1 rows fetched
SQL>
SQL> select count(1) from scott.dept@hivelnk
2 /
select count(1) from scott.dept@hivelnk
*
ERROR at line 1:
ORA-28500: connection from ORACLE to a non-Oracle system returned this message:
[Hortonworks][Hardy] (35) Error from server: error code: '1' error message:
'Error while processing statement: FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.mr.MapRedTask. Permission denied: user=HIVE,
access=WRITE, inode="/user/HIVE/.staging":hdfs:hdfs:drwxr-xr-x
at org.apache.hadoop.hdfs.server.name'. {HY000,NativeErr = 35}
ORA-02063: preceding 3 lines from HIVELNK
Observe the error it is because of the user configured in dblnik definition so I made it correct.
SQL> drop public database link hivelnk;
Database link dropped.
SQL> Create public database link hivelnk connect to "anonymous" identified by anonymous using 'HIVECON';
Database link created.
SQL> select count(1) from scott.dept@hivelnk;
COUNT(1)
----------
4
Now this time got success :)
I again changed the execution engine from MAPREDUCE to TEZ and ran the above query and this time it went with success also. So changing from TEZ to MAPREDUCE just helped us to diagnose the real issue behind failure.
After this I did some test to run the query from SQLPlus with Presto (already DSN configured in other post), Hive MapReduce and Hive with Tez and below are findings.
-- With Map Reduce
SQL> set timing on;
SQL> select count(1) from scott.dept@hivelnk;
COUNT(1)
----------
4
Elapsed: 00:00:22.19
-- With Presto
SQL> select count(1) from scott.dept@prestolnk;
COUNT(1)
----------
4
Elapsed: 00:00:00.87
--Witn TEZ
SQL> select count(1) from scott.dept@hivelnk;
COUNT(1)
----------
4
Elapsed: 00:00:05.74
Test Join - Hadoop and Oracle
select d."dname", e.name from dept@hivelnk d,scott.employee e where d."deptno"=e.deptno

No comments:
Post a Comment