Check the Scala version

The Spark and Scala versions must match:

- Spark 1.4 → Scala 2.10
- Spark 1.6 → Scala 2.10
- Spark 2.0 → Scala 2.11

Check the lib directory

Hive support needs three jars: datanucleus-api-jdo-3.2.6.jar, datanucleus-core-3.2.10.jar, and datanucleus-rdbms-3.2.9.jar. If they are already present under lib/, there is no need to rebuild Spark. If you do need to rebuild, the source can be downloaded from https://github.com/apache/spark/releases/tag/v1.6.2

Copy the Hive/HDFS configuration files

```shell
cd /appl/hive-1.2.1/conf
cp hive-site.xml /appl/spark-1.6.2/conf/
cd /appl/hadoop-2.7.0/etc/hadoop
cp core-site.xml /appl/spark-1.6.2/conf/
cp hdfs-site.xml /appl/spark-1.6.2/conf/
```

(The datanucleus jars under the lib directory and hive-site.xml under the conf/ directory need to be available on the driver and all executors launched by the YARN cluster.)

Start the shell

Pass the MySQL JDBC driver so Spark can reach the Hive metastore:

```shell
./bin/spark-shell --jars /appl/hive-1.2.1/lib/mysql-connector-java-5.1.30-bin.jar
```

Test

```scala
import org.apache.spark.sql.hive.HiveContext

// HiveContext wraps the existing SparkContext (sc) and reads hive-site.xml
val sqlContext = new HiveContext(sc)

sqlContext.sql("CREATE TABLE IF NOT EXISTS test1 (id INT, name STRING)")
sqlContext.sql("LOAD DATA LOCAL INPATH '/mk/test/test1.txt' INTO TABLE test1")
sqlContext.sql("FROM test1 SELECT id, name").collect().foreach(println)

val df = sqlContext.sql("SELECT * FROM test1")
df.show()
```
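One pitfall with the test above: the CREATE TABLE statement declares no ROW FORMAT, so Hive falls back to its default field delimiter, Ctrl-A ('\u0001'). If test1.txt is tab- or comma-separated, LOAD DATA will succeed but every column will read back as NULL. A minimal pure-Scala sketch of how the default SerDe splits a line (the sample line "1\u0001mike" is a made-up example, not from the original):

```scala
object DelimiterDemo {
  // Split one raw text line into (id, name) the way Hive's default
  // LazySimpleSerDe would: fields are separated by Ctrl-A ('\u0001').
  def parseLine(line: String): (Int, String) = {
    val fields = line.split('\u0001')
    (fields(0).toInt, fields(1))
  }

  def main(args: Array[String]): Unit = {
    println(parseLine("1\u0001mike")) // prints (1,mike)
  }
}
```

If your input file uses a different separator, declare it explicitly when creating the table, e.g. ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'.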