Impala Shell Commands


Compiled from the Atguigu (尚硅谷) Impala course notes, with each command tried out by hand.

1. Impala's External Shell

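Option	Description
-h, --help	Show help information
-v, --version	Show version information
-i hostname, --impalad=hostname	Connect to the host running the impalad daemon. The default port is 21000.
-q query, --query=query	Pass a query from the command line. The shell exits as soon as the statement has run.
-f query_file, --query_file=query_file	Run the SQL queries stored in a file. Statements in the file must be separated by semicolons.
-o filename, --output_file=filename	Save all query results to the specified file. Typically used to save the result of a single query run from the command line with -q.
-c, --ignore_query_failure	Continue executing when a query fails
-d default_db, --database=default_db	Select the database to use at startup, equivalent to running a use statement after connecting. If not specified, the default database is used.
-r, --refresh_after_connect	Refresh Impala metadata after connecting
-p, --show_profiles	Show the query execution plan and a low-level execution profile for every query run in the shell
-B, --delimited	Print results as plain delimited text instead of a formatted table
--output_delimiter=character	Field delimiter to use with -B (tab by default)
--print_header	Print column names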

[1] Pass a query from the command line (-q). The shell exits as soon as the statement has run.

[root@CM-Agent-202 ~]# impala-shell -q "select * from wx.wx_test2";
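The -q option combines naturally with -i and -d from the table above. The following is a minimal sketch, not a definitive wrapper: the helper names impalad_addr and run_query are hypothetical, the host, database, and table are the ones used in this article, and the script only prints a message when impala-shell is not installed.

```shell
# Print "host:port" for -i, falling back to the default port 21000.
impalad_addr() {
    printf '%s:%s\n' "$1" "${2:-21000}"
}

# Run one query against a host and default database, then exit.
# $1 = impalad host, $2 = default database, $3 = query text
run_query() {
    if command -v impala-shell >/dev/null 2>&1; then
        impala-shell -i "$(impalad_addr "$1")" -d "$2" -q "$3"
    else
        echo "impala-shell not found; would run: $3" >&2
        return 127
    fi
}

# Because -d selects the wx database, the table name needs no qualifier.
run_query "CM-Agent-202" "wx" "select * from wx_test2" || echo "query did not complete"
```

Keeping the address logic in its own function makes it easy to point the same script at another impalad, for example run_query "hadoop103" "wx" "select count(*) from wx_test2".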

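[2] Run the SQL queries stored in a file (-f). Statements in the file must be separated by semicolons.

Continue executing when a query fails (-c).

[Note] Create the file as the hdfs user (the statement select * from wx.wx_test3; will fail).
[hdfs@hadoop103 ~]$ vim impala.sql    # created on the local filesystem, by default at /var/lib/hadoop-hdfs/impala.sql
select * from wx.wx_test2;
select * from wx.wx_test3;
select * from wx.wx_test2;
[hdfs@hadoop103 ~]$ impala-shell -f impala.sql;    # fails with an error
[hdfs@hadoop103 ~]$ impala-shell -c -f impala.sql;    # reports the error and continues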
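The same experiment can be scripted instead of typed into vim. This is a sketch under the article's assumptions (wx.wx_test2 can be queried, wx.wx_test3 makes the second statement fail); the variable names are illustrative, and the impala-shell calls are skipped when the binary is not on the PATH.

```shell
# Write the three statements to a local file; the second one is expected to fail.
sql_file="impala.sql"
cat > "$sql_file" <<'EOF'
select * from wx.wx_test2;
select * from wx.wx_test3;
select * from wx.wx_test2;
EOF

# Each statement sits on its own line and ends with a semicolon.
stmt_count=$(grep -c ';' "$sql_file")
echo "statements in $sql_file: $stmt_count"

if command -v impala-shell >/dev/null 2>&1; then
    # Without -c the run is expected to stop at the failing statement.
    impala-shell -f "$sql_file" || echo "stopped at the first failure"
    # With -c the error is reported and the remaining statement still runs.
    impala-shell -c -f "$sql_file" || echo "finished with at least one failed statement"
else
    echo "impala-shell not found; skipping execution"
fi
```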

[3] Refresh Impala metadata after connecting (-r)

[hdfs@hadoop103 ~]$ impala-shell -r

[4] Show the query execution plan for every query run in the shell (-p)

[hdfs@hadoop103 ~]$ impala-shell -p
[hadoop103:21000] > select * from wx.wx_test2;

This prints a large amount of output.

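[5] Save all query results to the specified file (-o). This is typically used to save the result of a single query run from the command line with -q.

[root@CM-Agent-202 ~]# impala-shell -q "select * from wx.wx_test2" -o output.txt

When run as root, the file is saved on the server's local filesystem at /root/output.txt, formatted as a table.

When run as hdfs, the file is saved on the server's local filesystem at /var/lib/hadoop-hdfs/output.txt, formatted as a table.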

[root@CM-Agent-202 hadoop-hdfs]# vi /var/lib/hadoop-hdfs/output.txt

+----+------+
| id | name |
+----+------+
| 1  | jack |
| 2  | tom  |
+----+------+

[root@CM-Agent-202 ~]# impala-shell -B -q "select * from wx.wx_test2" -o output1.txt
[root@CM-Agent-202 ~]# vi /root/output1.txt

1 jack
2 tom
10 10

[root@CM-Agent-202 ~]# impala-shell -B -q "select * from wx.wx_test2" -o output2.txt --output_delimiter=#
[root@CM-Agent-202 ~]# vi /root/output2.txt

1#jack
2#tom
10#10
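Delimited output is mainly useful because other tools can split it. The sketch below recreates the three rows shown for output2.txt in a separate sample file (the name sample_output2.txt is made up for illustration) and parses it with awk. It then shows, guarded behind a check for impala-shell, how -B, --print_header, and --output_delimiter could be combined to write a CSV file; output.csv is likewise an assumed file name.

```shell
# Sample content copied from the output2.txt listing in this article.
printf '1#jack\n2#tom\n10#10\n' > sample_output2.txt

# Split each line on '#' and keep only the second column (name).
names=$(awk -F'#' '{ print $2 }' sample_output2.txt)
echo "$names"

# Count the rows in the file.
rows=$(wc -l < sample_output2.txt | tr -d ' ')
echo "rows: $rows"

if command -v impala-shell >/dev/null 2>&1; then
    # Comma-delimited output with a header line of column names.
    impala-shell -B --print_header --output_delimiter=',' \
        -q "select * from wx.wx_test2" -o output.csv || echo "export failed"
else
    echo "impala-shell not found; skipping the CSV export"
fi
```

Note that this is plain field joining, not full CSV quoting, so values that themselves contain the delimiter would need a different delimiter or post-processing.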

2. Impala's Internal Shell

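Command	Description
help	Show help information
explain <sql>	Show the execution plan
profile	(Run after a query completes) show low-level details of the most recent query
shell <shell>	Run a shell command without leaving impala-shell
version	Show version information (same as impala-shell -v)
connect	Connect to an impalad host, default port 21000 (same as impala-shell -i)
refresh <tablename>	Incrementally refresh the metadata of one table
invalidate metadata	Fully refresh all metadata (use with caution) (same as impala-shell -r)
history	Show command history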

impala-shell

[1] View the execution plan

[CM-Agent-202:21000] > explain select * from wx.wx_test2;
Query: explain select * from wx.wx_test2
+------------------------------------------------------------------------------------+
| Explain String                                                                     |
+------------------------------------------------------------------------------------+
| Max Per-Host Resource Reservation: Memory=0B                                       |
| Per-Host Resource Estimates: Memory=32.00MB                                        |
| WARNING: The following tables are missing relevant table and/or column statistics. |
| wx.wx_test2                                                                        |
|                                                                                    |
| PLAN-ROOT SINK                                                                     |
| |                                                                                  |
| 01:EXCHANGE [UNPARTITIONED]                                                        |
| |                                                                                  |
| 00:SCAN HDFS [wx.wx_test2]                                                         |
|    partitions=1/1 files=4 size=47B                                                 |
+------------------------------------------------------------------------------------+
Fetched 11 row(s) in 0.02s
[CM-Agent-202:21000] >

[2] Show low-level details of the most recent query
[CM-Agent-202:21000] > select count(*) from wx.wx_test2;
Query: select count(*) from wx.wx_test2
Query submitted at: 2020-03-26 11:16:51 (Coordinator: http://CM-Agent-202:25000)
Query progress can be monitored at: http://CM-Agent-202:25000/query_plan?query_id=234ba07a5c6d9c63:b815c42300000000
+----------+
| count(*) |
+----------+
| 10       |
+----------+
Fetched 1 row(s) in 0.12s
[CM-Agent-202:21000] > profile;
Query Runtime Profile:
Query (id=234ba07a5c6d9c63:b815c42300000000):
Summary:
Session ID: f42f30a75af19b5:d8b7e8c86c3eeb8
Session Type: BEESWAX

(Well over 10,000 characters of output omitted here...)

CodeGen:(Total: 32.017ms, non-child: 32.017ms, % non-child: 100.00%)
   - CodegenTime: 1.003ms
   - CompileTime: 4.948ms
   - LoadTime: 0.000ns
   - ModuleBitcodeSize: 1.95 MB (2039944)
   - NumFunctions: 22 (22)
   - NumInstructions: 267 (267)
   - OptimizationTime: 8.977ms
   - PeakMemoryUsage: 133.50 KB (136704)
   - PrepareTime: 17.536ms

[3] Run shell commands without leaving impala-shell

List the HDFS and the local Linux file systems:
[CM-Agent-202:21000] > shell hadoop fs -ls /;
Found 6 items
drwxr-xr-x - hbase hbase 0 2020-03-23 11:18 /hbase
drwxrwxr-x - solr solr 0 2020-03-12 13:03 /solr
drwxrwxrwt - hdfs supergroup 0 2020-03-23 14:26 /tmp
drwxrwxrwx - hdfs supergroup 0 2020-03-20 20:17 /user
drwxrwxrwx - hdfs supergroup 0 2020-03-19 17:07 /wx
drwxr-xr-x - hdfs supergroup 0 2020-03-12 16:35 /yxh
--------
Executed in 3.53s

[CM-Agent-202:21000] > shell ls -al ./;
total 412
dr-xr-x---. 11 root root 4096 Mar 26 11:12 .
dr-xr-xr-x. 22 root root 4096 Mar 26 11:18 ..
-rw-------. 1 root root 1624 Apr 10 2018 anaconda-ks.cfg
-rw-------. 1 root root 24387 Mar 25 18:48 .bash_history
-rw-r--r--. 1 root root 18 Dec 29 2013 .bash_logout

[4] Refresh the metadata of a specific table after new data has been added through Hive.
hive> load data local inpath "/opt/module/datas/student.txt" into table student;
[hadoop103:21000] > select * from student;
[hadoop103:21000] > refresh student;
[hadoop103:21000] > select * from student;

[5] View command history

[CM-Agent-202:21000] > history;
[1]: show databases;
[2]: quit;
[3]: select * from wx.wx_test2;
[4]: select * from wx.wx_test2;
[5]: quit;
[6]: select * from wx.wx_test2;
[7]: quit;
[8]: explain select * from wx.wx_test2;
[9]: select count(*) from wx.wx_test2;
[10]: select count(*) from wx.wx_test2;
[11]: profile;
[12]: shell hadoop fs -ls /;
[13]: shell ls -al ./;
[14]: shell ls -al ./user;
[15]: history;
[16]: history;
[CM-Agent-202:21000] > history;
[1]: show databases;
[2]: quit;
[3]: SELECT F_GNMC,COUNT(1) AS count FROM SYS_OPLOG
where F_USER="9999"
GROUP BY F_GNMC
order by count desc
limit 5;
[4]: SELECT F_GNMC,COUNT(1) AS count FROM wx.SYS_OPLOG
where F_USER="9999"
GROUP BY F_GNMC
order by count desc
limit 5;
[5]: quit;
[6]: select * from wx.wx_test2;
[7]: quit;
[8]: explain select * from wx.wx_test2;
[9]: select count(*) from wx.wx_test2;
[10]: select count(*) from wx.wx_test2;
[11]: profile;
[12]: shell hadoop fs -ls /;
[13]: shell ls -al ./;
[14]: shell ls -al ./user;
[15]: history;
[16]: history;