Check the Scala version

The Spark and Scala versions must match:

- Spark 1.4 → Scala 2.10
- Spark 1.6 → Scala 2.10
- Spark 2.0 → Scala 2.11

Check the lib directory

Hive support needs three jars: datanucleus-api-jdo-3.2.6.jar, datanucleus-core-3.2.10.jar, and datanucleus-rdbms-3.2.9.jar. If they are already present under lib/, there is no need to rebuild Spark. If you do need to rebuild, the source can be downloaded from https://github.com/apache/spark/releases/tag/v1.6.2

Copy the Hive/HDFS configuration files

```shell
cd /appl/hive-1.2.1/conf
cp hive-site.xml /appl/spark-1.6.2/conf/
cd /appl/hadoop-2.7.0/etc/hadoop
cp core-site.xml /appl/spark-1.6.2/conf/
cp hdfs-site.xml /appl/spark-1.6.2/conf/
```

(The datanucleus jars under the lib directory and hive-site.xml under the conf/ directory need to be available on the driver and all executors launched by the YARN cluster.)

Start the shell

Pass the MySQL JDBC driver so Spark can reach the Hive metastore:

```shell
./bin/spark-shell --jars /appl/hive-1.2.1/lib/mysql-connector-java-5.1.30-bin.jar
```

Test

```scala
import org.apache.spark.sql.hive.HiveContext

// HiveContext wraps the existing SparkContext (sc) and reads hive-site.xml
val sqlContext = new HiveContext(sc)

sqlContext.sql("CREATE TABLE IF NOT EXISTS test1 (id INT, name STRING)")
sqlContext.sql("LOAD DATA LOCAL INPATH '/mk/test/test1.txt' INTO TABLE test1")
sqlContext.sql("FROM test1 SELECT id, name").collect().foreach(println)

val df = sqlContext.sql("SELECT * FROM test1")
df.show()
```
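One pitfall with the test above: the CREATE TABLE statement declares no ROW FORMAT, so Hive falls back to its default field delimiter, Ctrl-A ('\u0001'). If test1.txt is tab- or comma-separated, LOAD DATA will succeed but every column will read back as NULL. A minimal pure-Scala sketch of how the default SerDe splits a line (the sample line "1\u0001mike" is a made-up example, not from the original):

```scala
object DelimiterDemo {
  // Split one raw text line into (id, name) the way Hive's default
  // LazySimpleSerDe would: fields are separated by Ctrl-A ('\u0001').
  def parseLine(line: String): (Int, String) = {
    val fields = line.split('\u0001')
    (fields(0).toInt, fields(1))
  }

  def main(args: Array[String]): Unit = {
    println(parseLine("1\u0001mike")) // prints (1,mike)
  }
}
```

If your input file uses a different separator, declare it explicitly when creating the table, e.g. ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'.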