- Base environment for Hive 3.1.2: JDK 1.8, Hadoop 3.2.1, and HBase 2.4.6
- Hive version used: 3.1.2
- Make sure Hadoop is running properly

```
[root@hadoop0 ~]# jps
1576 NameNode
2184 ResourceManager
2316 NodeManager
1710 DataNode
3039 HQuorumPeer
89950 Jps
```
- Install Hive into the /usr/local directory
```
cd /usr/local
# Download
wget https://mirror.bit.edu.cn/apache/hive/hive-3.1.2/apache-hive-3.1.2-bin.tar.gz
# Extract, then rename the directory to hive-3.1.2
tar -zxvf apache-hive-3.1.2-bin.tar.gz
mv apache-hive-3.1.2-bin hive-3.1.2
# Configure environment variables
vim /etc/profile
# Append the following to profile
# hive3.1.2
export HIVE_HOME=/usr/local/hive-3.1.2
export PATH=$PATH:$HIVE_HOME/bin
```
Make the environment variables take effect immediately:

```
source /etc/profile
```
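A quick sanity check that the variables are active in the current shell:

```
echo $HIVE_HOME   # should print /usr/local/hive-3.1.2
which hive        # should resolve to /usr/local/hive-3.1.2/bin/hive
```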
- Configure Hive
```
cd /usr/local/hive-3.1.2/conf
mv hive-env.sh.template hive-env.sh
mv hive-default.xml.template hive-site.xml
# Configure hive-config.sh under $HIVE_HOME/bin
cd /usr/local/hive-3.1.2/bin/
vim hive-config.sh
# Append the following at the end of the file to pin the Java, Hadoop, and Hive install directories
export JAVA_HOME=/usr/local/jdk
export HADOOP_HOME=/usr/local/hadoop-3.2.1
export HIVE_HOME=/usr/local/hive-3.1.2
```
- Create a temporary directory

```
mkdir /usr/local/hive-3.1.2/tmp
```
- Configure the hive-site.xml file under $HIVE_HOME/conf to use a MySQL metastore database
Find the ConnectionURL, ConnectionPassword, ConnectionDriverName, and ConnectionUserName properties in the file and change them as follows:

```
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>123456</value>
  <description>password to use against metastore database</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://192.168.1.86:3306/hive?createDatabaseIfNotExist=true</value>
  <description>JDBC connect string for ...</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.cj.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
  <description>Username to use against metastore database</description>
</property>
```

- Configure the hive-site.xml file under $HIVE_HOME/conf (replace the system placeholders)
Replace every occurrence of ${system:java.io.tmpdir} with /usr/local/hive-3.1.2/tmp (4 occurrences in total).
Replace every occurrence of ${system:user.name} with root (3 occurrences in total).
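The replacements can be made in vim, or non-interactively; a sed sketch, assuming hive-site.xml sits in the conf directory used above:

```
cd /usr/local/hive-3.1.2/conf
# Replace both placeholders everywhere in hive-site.xml, in place
sed -i 's#${system:java.io.tmpdir}#/usr/local/hive-3.1.2/tmp#g' hive-site.xml
sed -i 's#${system:user.name}#root#g' hive-site.xml
```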
- Upload the MySQL driver (matching MySQL 8.0)

```
# Move the MySQL driver into the /usr/local/hive-3.1.2/lib directory
mv mysql-connector-java-8.0.25.jar /usr/local/hive-3.1.2/lib/
```
- Initialize the Hive metastore database
```
# Schema initialization command
schematool -dbType mysql -initSchema
```
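If initialization succeeds, the metastore schema can be sanity-checked directly in MySQL, using the connection settings configured above (assuming the mysql client is available on this host):

```
# List the metastore tables created by schematool
mysql -h 192.168.1.86 -u root -p123456 -e 'use hive; show tables;'
```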
If instead the command fails with the following error:
```
Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1357)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1338)
        at org.apache.hadoop.mapred.JobConf.setJar(JobConf.java:536)
        at org.apache.hadoop.mapred.JobConf.setJarByClass(JobConf.java:554)
        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:448)
        at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:5141)
        at org.apache.hadoop.hive.conf.HiveConf.<init>(HiveConf.java:5104)
        at org.apache.hive.beeline.HiveSchemaTool.<init>(HiveSchemaTool.java:96)
        at org.apache.hive.beeline.HiveSchemaTool.main(HiveSchemaTool.java:1473)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
```

Searching for this error shows it is a compatibility problem between the guava versions bundled with Hadoop and Hive; replace Hive's guava JAR with the one shipped by Hadoop.
```
# Locate the guava JARs
find /usr/local/ -name guava*.jar
# Output below; the versions do not match
/usr/local/hadoop-3.3.0/share/hadoop/yarn/csi/lib/guava-20.0.jar
/usr/local/hadoop-3.3.0/share/hadoop/common/lib/guava-27.0-jre.jar
/usr/local/hadoop-3.3.0/share/hadoop/hdfs/lib/guava-27.0-jre.jar
/usr/local/hadoop-3.3.1/share/hadoop/hdfs/lib/guava-27.0-jre.jar
/usr/local/hadoop-3.3.1/share/hadoop/common/lib/guava-27.0-jre.jar
/usr/local/hadoop-3.2.1/share/hadoop/common/lib/guava-27.0-jre.jar
/usr/local/hadoop-3.2.1/share/hadoop/hdfs/lib/guava-27.0-jre.jar
/usr/local/hbase-2.4.6/lib/guava-11.0.2.jar
/usr/local/hive-3.1.2/lib/guava-19.0.jar
# Back up Hive's own guava JAR
mv /usr/local/hive-3.1.2/lib/guava-19.0.jar /usr/local/hive-3.1.2/lib/guava-19.0.jar.bak
# Copy (not move) the guava JAR from the Hadoop actually in use (3.2.1 here) into Hive
cp /usr/local/hadoop-3.2.1/share/hadoop/common/lib/guava-27.0-jre.jar /usr/local/hive-3.1.2/lib/
```
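It is worth confirming afterwards that only the 27.0 guava JAR remains active under Hive's lib directory:

```
ls /usr/local/hive-3.1.2/lib/ | grep guava
# Expected after the steps above: guava-27.0-jre.jar plus the guava-19.0.jar.bak backup
```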
With the guava problem fixed, rerun the Hive initialization; it fails again :joy:, this time due to a pitfall in hive-site.xml: the file contains an illegal character that must be deleted (why the hive-default.xml.template template ships with a character that prevents startup is unclear).
```
at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
Caused by: com.ctc.wstx.exc.WstxParsingException: Illegal character entity: expansion character (code 0x8
 at [row,col,system-id]: [3215,96,"file:/usr/local/hive-3.1.2/conf/hive-site.xml"]
```

Fix: open hive-site.xml, go to row 3215, column 96, and delete the offending character entity.
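Rather than editing by hand, the entity can also be stripped non-interactively; a sed sketch, assuming the offending character is the literal entity text `&#8;` reported in the error:

```
# Remove every occurrence of the illegal character entity from hive-site.xml
sed -i 's/&#8;//g' /usr/local/hive-3.1.2/conf/hive-site.xml
```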
- Verify Hive after initialization
```
[root@hadoop0 ~]# hive
which: no hbase in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/usr/local/jdk/bin:/usr/local/hadoop-3.2.1/bin:/usr/local/hadoop-3.2.1/sbin:/usr/local/habase-2.4.6/bin:/usr/local/hive-3.1.2/bin:/root/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/hive-3.1.2/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/hadoop-3.2.1/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID = fb8d752c-cd4c-4f83-a0e4-2f4216ea5e41
Logging initialized using configuration in jar:file:/usr/local/hive-3.1.2/lib/hive-common-3.1.2.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Hive Session ID = 80ab9bfb-0ae0-4d74-b43d-dd3aebc03e11
hive>
```
Hive is now installed successfully.

- Export data from a Hive table

```
-- Switch to the test database
use test;
-- Create the tables
create table emp (ename string, deptid int, degree int) row format delimited fields terminated by '|';
create table dept (deptid int, dname string) row format delimited fields terminated by '|';
-- Load data from files
load data local inpath '/usr/local/hive-3.1.2/examples/files/emp.txt' overwrite into table emp;
load data local inpath '/usr/local/hive-3.1.2/examples/files/dept.txt' overwrite into table dept;
-- Export from the table to a directory, with '|' as the field delimiter
insert overwrite directory '/root/2' row format delimited fields terminated by '|' select distinct(ename), deptid, degree from emp;
```
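To inspect the export, list the target directory; Hive typically writes the result as files named 000000_0 and so on (the exact file names may vary):

```
ls /root/2
# View the exported '|'-delimited rows
cat /root/2/000000_0
```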
- Use Beeline in place of the Hive CLI
Beeline is based on Hive JDBC. To avoid permission problems, add the following to Hadoop's core-site.xml to grant root proxy access:

```
<property>
  <name>hadoop.proxyuser.root.hosts</name>
  <value>192.168.1.86</value>
</property>
<property>
  <name>hadoop.proxyuser.root.groups</name>
  <value>*</value>
</property>
```

Then restart Hadoop, start the hiveserver2 service, and log in with Beeline:
```
# 1. Restart Hadoop
stop-all.sh
start-all.sh
# 2. Start the hiveserver2 service in the background
hiveserver2 &
# Equivalent command:
# hive --service hiveserver2
# 3. Log in with Beeline
beeline -u jdbc:hive2://hadoop0:10000/default -n root -p
```
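hiveserver2 can take a while to come up; before logging in, it helps to confirm it is listening on its default port 10000 (one way to check, assuming netstat is installed):

```
netstat -nltp | grep 10000
```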
- Problem (and fix) when accessing Hive via Beeline or via JDBC from a program
Plain SQL queries that do not trigger MapReduce work fine:

```
select ename, deptid, degree from test.emp;
```

SQL that triggers MapReduce, such as aggregations, fails:

```
select count(0) from emp;
```

Error message:

```
org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:492)
java.lang.OutOfMemoryError: Java heap space
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
```
Fix: edit /usr/local/hive-3.1.2/conf/hive-env.sh.

```
vim /usr/local/hive-3.1.2/conf/hive-env.sh
# Set the heap size in the file to 2048
export HADOOP_HEAPSIZE=2048
# Restart the hiveserver2 service for the change to take effect
```
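hiveserver2 has no dedicated restart command; a minimal restart sketch, assuming it was started in the background as above (the PID placeholder is illustrative):

```
# HiveServer2 shows up as a RunJar process in jps
jps | grep RunJar
# Stop it by PID (substitute the PID printed above), then start it again
kill <pid>
hiveserver2 &
```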
Running the query again now returns the correct result.
- Hive operation statements

```
-- Change a table's comment
alter table tm_user_info set tblproperties('comment'='员工表');
```
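A quick way to confirm the comment change took effect, run from the shell (assuming the tm_user_info table exists):

```
hive -e "show tblproperties tm_user_info;"
```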