Hive JDBC: A Getting-Started Example
Install Hadoop
Install Hive
Using Hive's JDBC Interface
(1) Create a Maven project and add the following dependencies:
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-exec</artifactId>
    <version>2.1.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-metastore</artifactId>
    <version>2.1.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-service</artifactId>
    <version>2.1.0</version>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>2.1.0</version>
</dependency>
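The JDBC client code follows. It exists only to walk through the whole process, so it does one simple thing: it counts the records in the u1_data table.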
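import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class HiveJdbcClient {
    public static void main(String[] args) throws SQLException {
        try {
            // Load the HiveServer2 JDBC driver.
            Class.forName("org.apache.hive.jdbc.HiveDriver");
        } catch (ClassNotFoundException e) {
            e.printStackTrace();
        }
        String tableName = "u1_data";
        // try-with-resources closes the connection, statement, and result set.
        try (Connection con = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "", "");
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery("select count(*) from " + tableName)) {
            if (rs.next()) {
                System.out.println(rs.getString(1));
            }
        }
    }
}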
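The client above connects with an empty user name, and the error messages later in this post mention the user "anonymous". As a variation, here is a minimal sketch that builds the JDBC URL from its parts and passes an explicit user name. The class name HiveRowCount and the helpers buildUrl and countRows are made up for illustration, and falling back to the operating system user is an assumption, not something the original setup requires. The sketch does not try to replace the proxy user settings in step (2).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class HiveRowCount {
    // Builds a HiveServer2 JDBC URL such as jdbc:hive2://localhost:10000/default.
    public static String buildUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    // Runs select count(*) on the given table and returns the count.
    // The table name is concatenated as in the original client, so it must be trusted input.
    public static long countRows(String url, String user, String table) throws SQLException {
        try (Connection con = DriverManager.getConnection(url, user, "");
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery("select count(*) from " + table)) {
            if (!rs.next()) {
                throw new SQLException("count(*) returned no row");
            }
            return rs.getLong(1);
        }
    }

    public static void main(String[] args) throws SQLException, ClassNotFoundException {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // Assumption: use the first argument as the user, else the OS user name.
        String user = args.length > 0 ? args[0] : System.getProperty("user.name");
        String url = buildUrl("localhost", 10000, "default");
        System.out.println(countRows(url, user, "u1_data"));
    }
}
```

Keeping the URL construction in its own method makes it easy to point the same client at another host or database without touching the query code.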
(2) Configure the proxy user in Hadoop's core-site.xml (use your own user name; mine here is "vonzhou"). Otherwise errors like the following occur:
Caused by: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException: User: vonzhou is not allowed to impersonate anonymous
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.authorize.AuthorizationException): Unauthorized connection for super-user: vonzhou from IP 127.0.0.1
The core-site.xml configuration is shown here; replace vonzhou with your own proxy user:
<property>
    <name>hadoop.proxyuser.vonzhou.groups</name>
    <value>*</value>
    <description>Allow the superuser vonzhou to impersonate any members of the group</description>
</property>
<property>
    <name>hadoop.proxyuser.vonzhou.hosts</name>
    <value>127.0.0.1,localhost</value>
    <description>The superuser can connect only the host to impersonate a user</description>
</property>
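A quick way to catch a typo in the property names before restarting Hadoop is to read the file back and look for both keys. The following is only a sketch using the JDK's XML parser; ProxyUserCheck, readProperties, and hasProxyUser are hypothetical names, and Hadoop's own configuration loading remains the authority on what the cluster actually sees.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ProxyUserCheck {
    // Reads every <property> name/value pair from a Hadoop-style configuration file.
    public static Map<String, String> readProperties(InputStream xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
            NodeList props = doc.getElementsByTagName("property");
            Map<String, String> result = new HashMap<>();
            for (int i = 0; i < props.getLength(); i++) {
                Element property = (Element) props.item(i);
                NodeList names = property.getElementsByTagName("name");
                NodeList values = property.getElementsByTagName("value");
                if (names.getLength() > 0 && values.getLength() > 0) {
                    result.put(names.item(0).getTextContent().trim(),
                               values.item(0).getTextContent().trim());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse configuration", e);
        }
    }

    // True when both proxy user keys for the given user are present.
    public static boolean hasProxyUser(Map<String, String> props, String user) {
        return props.containsKey("hadoop.proxyuser." + user + ".hosts")
            && props.containsKey("hadoop.proxyuser." + user + ".groups");
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: java ProxyUserCheck /path/to/core-site.xml vonzhou
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            Map<String, String> props = readProperties(in);
            System.out.println(hasProxyUser(props, args[1])
                ? "proxy user settings found"
                : "proxy user settings missing");
        }
    }
}
```

The check only confirms that the keys exist in the file; it says nothing about whether the running NameNode has picked them up.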
(3) Start Hadoop:
➜  sbin git:(master) ./start-all.sh
(4) Start HiveServer2 (the hiveserver2 service is used here) and check that it is listening on port 10000:
➜  apache-hive-2.1.0-bin git:(master) bin/hive --service hiveserver2
➜  apache-hive-2.1.0-bin git:(master) lsof -i:10000
COMMAND  PID    USER   FD   TYPE             DEVICE SIZE/OFF NODE NAME
java    6614 vonzhou  442u  IPv4 0xd15bdc45c5568f1d      0t0  TCP *:ndmp (LISTEN)
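In the lsof output the port appears as ndmp because that is the service name registered for port 10000; it is still the HiveServer2 process. The same check can be done from Java before opening a JDBC connection. This is a minimal sketch; the class PortCheck and the two-second timeout are arbitrary choices for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortCheck {
    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    public static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host not reachable.
            return false;
        }
    }

    public static void main(String[] args) {
        boolean open = isListening("localhost", 10000, 2000);
        System.out.println(open
            ? "something is listening on port 10000"
            : "nothing is listening on port 10000");
    }
}
```

A successful connection only shows that some process accepts TCP connections on the port, so it complements the JDBC client rather than replacing it.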
(5) On my machine, the following safe mode exception then occurred:
Caused by: org.apache.hive.service.cli.HiveSQLException: Failed to open new session: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.SafeModeException): Cannot create directory /tmp/hive/anonymous. Name node is in safe mode.
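The fix is to take the NameNode out of safe mode. As the output notes, the hadoop script is deprecated for this; hdfs dfsadmin -safemode leave is the current form: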
➜  sbin git:(master) hadoop dfsadmin -safemode leave
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
16/09/05 21:28:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Safe mode is OFF
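On the client side, the safe mode failure arrives buried in a chain of nested exceptions. A small helper can walk that chain and print a hint instead of a bare stack trace. This is a sketch only: SafeModeHint and isSafeModeError are hypothetical names, and matching on message text is a heuristic based on the error shown in step (5), not an API offered by Hive or Hadoop.

```java
import java.sql.SQLException;

public class SafeModeHint {
    // Walks the exception and its causes looking for the NameNode safe mode marker.
    public static boolean isSafeModeError(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            String message = t.getMessage();
            if (message != null
                    && (message.contains("SafeModeException")
                        || message.contains("Name node is in safe mode"))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Sample message shortened from the error in step (5).
        SQLException sample = new SQLException(
            "Failed to open new session: Cannot create directory /tmp/hive/anonymous. "
            + "Name node is in safe mode.");
        if (isSafeModeError(sample)) {
            System.out.println("NameNode is in safe mode; try: hdfs dfsadmin -safemode leave");
        }
    }
}
```

In a real client the helper would be called from the catch block around DriverManager.getConnection.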
(6) Finally, run the program. It prints the result:
100000
References
Apache Hadoop 2.7.2
Setting Up HiveServer2