【Hadoop】Connecting to Hadoop from Windows for Development

Connecting to a Hadoop cluster from Windows requires some extra configuration; this post records the setup. The examples use Scala, but the configuration is identical in a Java environment. ScalaTest is used for the final test.

1. Extract hadoop-2.6.2 to D:\Programs\hadoop-2.6.2.
Copy hadoop.dll from %HADOOP_HOME%\bin to C:\Windows\System32.
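
Whether hadoop.dll was actually picked up can be verified from code with Hadoop's own NativeCodeLoader. A minimal sketch (assumes hadoop-common is already on the classpath):

import org.apache.hadoop.util.NativeCodeLoader

object NativeCheck extends App {
  // true when the native hadoop library (hadoop.dll on Windows) loaded successfully
  println("native hadoop loaded: " + NativeCodeLoader.isNativeCodeLoaded)
}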

2. Configure environment variables (a quick verification sketch follows this list):
HADOOP_HOME=D:\Programs\hadoop-2.6.2
HADOOP_BIN_PATH=%HADOOP_HOME%\bin
HADOOP_USER_NAME=hadoop
HADOOP_PREFIX=%HADOOP_HOME%
Append to Path: ;%HADOOP_HOME%\bin
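
A quick way to confirm the JVM actually sees these variables is to read them back from sys.env. A minimal sketch (variable names as configured above):

object EnvCheck extends App {
  // Print each Hadoop-related variable, or a placeholder if it is missing
  Seq("HADOOP_HOME", "HADOOP_BIN_PATH", "HADOOP_USER_NAME", "HADOOP_PREFIX")
    .foreach { name =>
      println(s"$name=${sys.env.getOrElse(name, "<not set>")}")
    }
}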

3. Copy the conf folder from the platform project to drive E: (the connection needs Hadoop's core-site.xml, hdfs-site.xml, mapred-site.xml and yarn-site.xml).

4. Configure the environment variable:
CONF_HOME=E:\conf
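
The implementation class below hardcodes E:\conf; CONF_HOME makes it possible to resolve the directory at runtime instead. A minimal sketch of that alternative, assuming the variable from step 4 (the fallback value is only illustrative):

object ConfLocator {
  // Resolve the Hadoop conf directory from CONF_HOME, falling back to E:\conf
  val confDir: String = sys.env.getOrElse("CONF_HOME", "E:\\conf")
}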

5. Test the connection from the project:

Implementation class:

package com.changtu.hdfs

/**
  * Created by lubinsu on 2016/6/8.
  * HDFS utility class.
  */

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs._

abstract class AbstractFSClient {
  // Placeholder base class for filesystem client objects
}

object HDFSClient extends AbstractFSClient {
  val conf = new Configuration()
  // Load the Hadoop configuration files (fall back to the cluster-side path on failure)
  try {
    conf.addResource(new Path("E:\\conf\\hdfs-site.xml"))
    conf.addResource(new Path("E:\\conf\\core-site.xml"))
    conf.addResource(new Path("E:\\conf\\yarn-site.xml"))
    conf.addResource(new Path("E:\\conf\\mapred-site.xml"))
  } catch {
    case _: IllegalArgumentException =>
      // Fall back to the cluster-side configuration directory
      conf.addResource(new Path("/appl/conf/hdfs-site.xml"))
      conf.addResource(new Path("/appl/conf/core-site.xml"))
      conf.addResource(new Path("/appl/conf/yarn-site.xml"))
      conf.addResource(new Path("/appl/conf/mapred-site.xml"))
  }

  // Obtain a FileSystem handle for the configured cluster
  val hdfs = FileSystem.get(conf)

  /**
    * Delete HDFS files.
    *
    * @param path      path to be deleted
    * @param recursive if the path is a directory and recursive is true, the
    *                  directory is deleted; otherwise an exception is thrown.
    *                  For a file, recursive can be either true or false.
    * @return true if the delete is successful, else false.
    */
  def delete(path: String, recursive: Boolean): Boolean = {

    val output = new Path(path)
    if (hdfs.exists(output)) {
      hdfs.delete(output, recursive)
    } else {
      true
    }
  }

  /**
    * Create a file at the given path. Note that FileSystem.create creates a
    * file, not a directory, despite this method's name.
    *
    * @param path    the path to be created
    * @param deleteF whether to delete the path first if it already exists
    * @return true if the path exists after creation, else false.
    */
  def createDirectory(path: String, deleteF: Boolean): Boolean = {

    val output = new Path(path)

    if (hdfs.exists(output) && deleteF) {
      delete(path, recursive = true)
    }
    // create returns an FSDataOutputStream that must be closed,
    // otherwise the handle leaks
    hdfs.create(output).close()

    hdfs.exists(output)
  }

  /** Close the underlying FileSystem handle. */
  def release(): Unit = {
    hdfs.close()
  }
}
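
A minimal usage sketch of the client above (the path is only illustrative; this needs a reachable cluster):

import com.changtu.hdfs.HDFSClient

object HDFSClientDemo extends App {
  // Create (or recreate) a file, then remove it again
  println("created: " + HDFSClient.createDirectory("/tmp/hdfs-client-demo", deleteF = true))
  println("deleted: " + HDFSClient.delete("/tmp/hdfs-client-demo", recursive = false))
  HDFSClient.release()
}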

Test class:
Maven dependencies:
<!-- scalaTest -->
<dependency>
    <groupId>org.scalactic</groupId>
    <artifactId>scalactic_${scala.bigVersion}</artifactId>
    <version>3.0.0-M15</version>
</dependency>
<!-- http://mvnrepository.com/artifact/org.scalatest/scalatest_2.10 -->
<dependency>
    <groupId>org.scalatest</groupId>
    <artifactId>scalatest_${scala.bigVersion}</artifactId>
    <version>3.0.0-M15</version>
    <scope>test</scope>
</dependency>
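
The scala.bigVersion property used above is assumed to be defined in the pom's <properties> block; a sketch of what that definition might look like (2.11 is only an example value):

<properties>
    <scala.bigVersion>2.11</scala.bigVersion>
</properties>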


package com.changtu

import com.changtu.hdfs.HDFSClient
import org.scalatest.{FlatSpec, Matchers}

/**
  * Created by lubinsu on 2016/6/11.
  * Tests for the HDFS client utility class.
  */
class HDFSClientSpec extends FlatSpec with Matchers {

  "HDFS client" should "create a file" in {
    HDFSClient.createDirectory("/user/hadoop/test", deleteF = true) should be(true)
    HDFSClient.release()
  }
}
