Hive: Chapter 2 (Basic Hive Operations)

04-30

Background:

Common Hive commands

1. Create a new table
CREATE TABLE t_hive (a int, b int, c int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
For example:
create table user_info (user_id int, cid string, ckid string, username string)

row format delimited
fields terminated by '\t'
lines terminated by '\n'
;
The data file to be loaded must match this format: fields are separated by tabs and rows by newlines.
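To make the delimited format concrete, here is a minimal Python sketch of how a text file with tab-separated fields and newline-separated rows maps onto typed columns. It is only an approximation of Hive's text-file handling, not its implementation: `to_int` and `parse_delimited` are hypothetical helper names, the sample data is made up, and skipping empty lines is a simplification. It does reflect that a field which cannot be parsed as the declared type, or which is missing, is read as NULL.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def to_int(text: str) -> Optional[int]:
    """Cast a field to int; an unparsable value becomes None (NULL in Hive)."""
    try:
        return int(text)
    except ValueError:
        return None


def parse_delimited(
    data: str,
    casts: Sequence[Callable[[str], object]],
    field_sep: str = "\t",
    line_sep: str = "\n",
) -> List[Tuple[object, ...]]:
    """Split text into rows and fields, then cast each field to its column type."""
    rows = []
    for line in data.split(line_sep):
        if not line:
            continue  # simplification: this sketch skips empty lines
        fields: List[Optional[str]] = list(line.split(field_sep))
        # Missing trailing fields are padded with None; zip() drops extra fields.
        fields += [None] * (len(casts) - len(fields))
        rows.append(
            tuple(None if f is None else cast(f) for f, cast in zip(fields, casts))
        )
    return rows


# Made-up sample for a table declared as (a int, b int, c int).
sample = "16\t2\t3\n61\t12\t13\nx\t5\n"
rows = parse_delimited(sample, [to_int, to_int, to_int])
print(rows)  # [(16, 2, 3), (61, 12, 13), (None, 5, None)]
```

The third sample row shows why the file must match the declared delimiters and types: loading does not fail on a mismatch, the affected columns simply come back as NULL when queried.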
2. Load the data file t_hive.txt into the t_hive table
LOAD DATA LOCAL INPATH '/home/cos/demo/t_hive.txt' OVERWRITE INTO TABLE t_hive;
3. Match table names with a wildcard pattern

show tables '*t*';
4. Add a column
ALTER TABLE t_hive ADD COLUMNS (new_col String);
5. Rename a table
ALTER TABLE t_hive RENAME TO t_hadoop;
6. Load data from HDFS
LOAD DATA INPATH '/user/hive/warehouse/t_hive/t_hive.txt' OVERWRITE INTO TABLE t_hive2;
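The two LOAD DATA forms differ in what happens to the source file: with LOCAL the file is copied from the local file system, while without LOCAL the file is moved within HDFS. The Python sketch below imitates that difference at the file level using temporary directories as stand-ins for the local disk and the warehouse. `load_data` and `demo` are hypothetical names, and the directory layout is an assumption made for illustration only.

```python
import shutil
import tempfile
from pathlib import Path


def load_data(src: Path, table_dir: Path, local: bool, overwrite: bool = True) -> None:
    """Imitate LOAD DATA [LOCAL] INPATH ... [OVERWRITE] INTO TABLE on plain files."""
    table_dir.mkdir(parents=True, exist_ok=True)
    if overwrite:
        for old in table_dir.iterdir():
            old.unlink()  # OVERWRITE replaces whatever the table held before
    if local:
        shutil.copy(src, table_dir / src.name)  # LOCAL: the source file is copied
    else:
        shutil.move(str(src), str(table_dir / src.name))  # no LOCAL: the file is moved


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        local_file = root / "local" / "t_hive.txt"
        local_file.parent.mkdir()
        local_file.write_text("1\t2\t3\n")
        t_hive = root / "warehouse" / "t_hive"
        t_hive2 = root / "warehouse" / "t_hive2"

        load_data(local_file, t_hive, local=True)  # like command 2
        source_kept = local_file.exists()
        load_data(t_hive / "t_hive.txt", t_hive2, local=False)  # like command 6
        return {
            "local_source_kept": source_kept,
            "t_hive_files": sorted(p.name for p in t_hive.iterdir()),
            "t_hive2_files": sorted(p.name for p in t_hive2.iterdir()),
        }


result = demo()
print(result)
```

In this sketch the stand-in for t_hive ends up empty after the second load, because the file it pointed at was moved rather than copied. The same thing is worth keeping in mind when the HDFS path being loaded sits inside another table's directory, as in command 6.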

7. Import data from another table
INSERT OVERWRITE TABLE t_hive2 SELECT * FROM t_hive;
8. Create a table and populate it from another table
CREATE TABLE t_hive AS SELECT * FROM t_hive2;
9. Copy only the table structure, without the data
CREATE TABLE t_hive3 LIKE t_hive;

10. Export from Hive to the local file system
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/t_hive' SELECT * FROM t_hive;
11. Query with HiveQL
from ( select b,c as c2 from t_hive) t select t.b, t.c2 limit 2;
select b,c from t_hive limit 2;
12. Create a view
CREATE VIEW v_hive AS SELECT a,b FROM t_hive;

13. Drop a table
drop table if exists t_hft;
14. Create a partitioned table
DROP TABLE IF EXISTS t_hft;
CREATE TABLE t_hft(
SecurityID STRING,
tradeTime STRING,
PreClosePx DOUBLE
) PARTITIONED BY (tradeDate INT)

ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
15. Load data into a partition
load data local inpath '/home/BlueBreeze/data/t_hft_1.csv' overwrite into table t_hft partition(tradeDate=20130627);
16. List the partitions of a table
SHOW PARTITIONS t_hft;
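A partition is stored as a subdirectory of the table's directory, named `column=value`. The following Python sketch builds the directory that the load in command 15 would be expected to fill. It assumes the table lives in the default database under the warehouse path `/user/hive/warehouse` that appears in command 6, that Hive stores identifiers in lower case, and that the partition value needs no escaping. `partition_dir` is a hypothetical helper, not a Hive API.

```python
from typing import Dict


def partition_dir(warehouse: str, table: str, partition: Dict[str, object]) -> str:
    """Build the directory for one partition: <warehouse>/<table>/<col>=<value>/..."""
    # Assumption: table and column names are stored in lower case.
    parts = [f"{col.lower()}={value}" for col, value in partition.items()]
    return "/".join([warehouse.rstrip("/"), table.lower(), *parts])


path = partition_dir("/user/hive/warehouse", "t_hft", {"tradeDate": 20130627})
print(path)  # /user/hive/warehouse/t_hft/tradedate=20130627
```

Because each partition value has its own directory, a query that filters on tradeDate only needs to read the matching directories rather than the whole table.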

2.1 Basic Hive operations

1. Test: loading a local file into a table

1) Create a local "vocabulary list" file

Commands and contents:

vim vocab.txt

------------------------------Contents------------------------------

1.ability

2.ambition

3.headquarters

4.industrialize

------------------------------Contents------------------------------

2) Enter the Hive shell

Command:

hive

Note: this command can be run directly only after the environment variables have been set.

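3) Create the table, then check that it exists and inspect its structure

Create a table to hold the words from the vocabulary list, with fields separated by ".".

Commands:

create table VOCAB (num int, word string) row format delimited fields terminated by '.';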

show tables;

desc VOCAB;

4) Load the data into the table

Command:

load data local inpath '/home/hadoop/vocab.txt' overwrite into table VOCAB;

5) Query the table contents

Command:

select * from VOCAB;

2. Word frequency statistics

1) Create a local word-frequency file in which the counts differ

Commands and contents:

vim wordCount.txt

------------------------------Contents------------------------------

I,100

have,1000

a,200

pen,3000

you,2222

are,777

amazing,9999

------------------------------Contents------------------------------

2) Enter the Hive shell

Command:

hive

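3) Create the table, then check that it exists and inspect its structure

Create a table to hold the words and their differing counts, with fields separated by ",".

Commands:

create table WOCO (word string, count int) row format delimited fields terminated by ',';

show tables;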

desc WOCO;

4) Load the data into the table

Command:

load data local inpath '/home/hadoop/wordCount.txt' overwrite into table WOCO;

5) Query the table contents

Command:

select * from WOCO;

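6) Run filtering queries (Hive may execute these as MapReduce jobs)

Commands:

select WOCO.word from WOCO;

select * from WOCO where WOCO.count>1000; -- words that occur more than 1000 times

select * from WOCO sort by count desc limit 3; -- sort by count in descending order and keep three rows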
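As a cross-check on what these three queries should return for the wordCount.txt contents shown earlier, the same logic can be written in a few lines of Python. This is only a sketch over an in-memory list, not how Hive evaluates the queries. In particular, `sorted` produces one global ordering, which corresponds to ORDER BY; SORT BY orders rows within each reducer, so the two agree when a single reducer is used.

```python
# Rows of the WOCO table, taken from the wordCount.txt contents above.
WOCO = [
    ("I", 100),
    ("have", 1000),
    ("a", 200),
    ("pen", 3000),
    ("you", 2222),
    ("are", 777),
    ("amazing", 9999),
]

# select WOCO.word from WOCO;
words = [word for word, _count in WOCO]

# select * from WOCO where WOCO.count>1000;
over_1000 = [(word, count) for word, count in WOCO if count > 1000]

# select * from WOCO ... desc limit 3; (global sort, see the note above)
top3 = sorted(WOCO, key=lambda row: row[1], reverse=True)[:3]

print(over_1000)  # [('pen', 3000), ('you', 2222), ('amazing', 9999)]
print(top3)       # [('amazing', 9999), ('pen', 3000), ('you', 2222)]
```

Note that "have" is excluded from the filtered result because its count is exactly 1000 and the condition is strictly greater than 1000.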