HDFS API

HDFS File Upload

  1. Java code
    // Imports used by the snippets on this page:
    // import org.apache.hadoop.conf.Configuration;
    // import org.apache.hadoop.fs.*;
    // import java.io.IOException;
    // import java.net.URI;
    // import java.net.URISyntaxException;

    public void putFileToHDFS() throws URISyntaxException, IOException, InterruptedException {
        //1. Create the configuration object
        Configuration conf = new Configuration();

        //2. Set optional parameters
        conf.set("dfs.replication", "2");

        //3. Connect to HDFS
        //final URI uri, final Configuration conf, String user
        FileSystem fs = FileSystem.get(new URI("hdfs://vmaster:9000"), conf, "root");

        //4. Local source path
        Path src = new Path("F:\\Course\\aliyun\\transaction_details.csv");

        //5. Destination path on HDFS
        Path dest = new Path("hdfs://vmaster:9000/");

        //6. Upload by copying, src -> dest
        fs.copyFromLocalFile(src, dest);

        //7. Close the file system
        fs.close();

        System.out.println("Upload done");
    }
  1. Command
hadoop fs -put ./p /
./p  the source file (localsrc)
/    the destination directory; here the HDFS root (dst)

-put [-f] [-p] [-l] [-d] <localsrc> ... <dst> :
Copy files from the local file system into fs. Copying fails if the file already
exists, unless the -f flag is given.
Flags:

-p  Preserves access and modification times, ownership and the mode.
-f  Overwrites the destination if it already exists.
-l  Allow DataNode to lazily persist the file to disk. Forces
    replication factor of 1. This flag will result in reduced
    durability. Use with care.

-d  Skip creation of temporary file(<dst>._COPYING_).
  1. Upload via the RESTful API
    • Send a request with the PUT method; -X PUT tells curl to issue a PUT request
    • Then send a second PUT request to the Location returned in the response
Step1: 
curl -i -X PUT "http://vmaster:50070/webhdfs/v1/1234.txt?op=create&noredirect=true"
HTTP/1.1 100 Continue

HTTP/1.1 200 OK
Cache-Control: no-cache
Expires: Mon, 11 Jan 2021 02:20:57 GMT
Date: Mon, 11 Jan 2021 02:20:57 GMT
Pragma: no-cache
Expires: Mon, 11 Jan 2021 02:20:57 GMT
Date: Mon, 11 Jan 2021 02:20:57 GMT
Pragma: no-cache
X-FRAME-OPTIONS: SAMEORIGIN
Content-Type: application/json
Transfer-Encoding: chunked

{"Location":"http://vslave2:50075/webhdfs/v1/1234.txt?op=CREATE&namenoderpcaddress=vmaster:9000&createflag=&createparent=true&overwrite=false"}

Step2:
curl -i -X PUT -T "F:\\Course\\BigData\\H.txt" "http://vslave2:50075/webhdfs/v1/1234.txt?op=CREATE&namenoderpcaddress=vmaster:9000&createflag=&createparent=true&overwrite=false"
HTTP/1.1 100 Continue

HTTP/1.1 201 Created
Location: hdfs://vmaster:9000/1234.txt
Content-Length: 0
Access-Control-Allow-Origin: *
Connection: close

How to turn these curl commands into requests sent directly from a program is discussed at https://zhuanlan.zhihu.com/p/33481273
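The two curl steps above can also be issued from plain Java with java.net.HttpURLConnection, with no Hadoop client dependency. This is a minimal, unhardened sketch: the host names, port, and file paths follow the curl examples above, and the buildCreateUrl helper and the regex-based Location extraction are illustrative assumptions, not part of any Hadoop API.

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;

public class WebHdfsUpload {

    // Hypothetical helper: builds the Step-1 URL against the NameNode.
    static String buildCreateUrl(String nameNode, String path) {
        return "http://" + nameNode + "/webhdfs/v1" + path + "?op=CREATE&noredirect=true";
    }

    // Crude extraction of the "Location" field from the Step-1 JSON reply.
    static String extractLocation(String json) {
        return json.replaceAll(".*\"Location\"\\s*:\\s*\"([^\"]+)\".*", "$1");
    }

    public static void main(String[] args) throws Exception {
        // Step 1: ask the NameNode where to write; the JSON body
        // carries a Location pointing at a DataNode.
        HttpURLConnection nn = (HttpURLConnection)
                new URL(buildCreateUrl("vmaster:50070", "/1234.txt")).openConnection();
        nn.setRequestMethod("PUT");
        String json = new String(nn.getInputStream().readAllBytes());
        String location = extractLocation(json);
        nn.disconnect();

        // Step 2: PUT the file bytes to the DataNode URL from Step 1.
        HttpURLConnection dn = (HttpURLConnection) new URL(location).openConnection();
        dn.setRequestMethod("PUT");
        dn.setDoOutput(true);
        try (OutputStream out = dn.getOutputStream()) {
            out.write(Files.readAllBytes(Paths.get("H.txt")));
        }
        System.out.println(dn.getResponseCode()); // 201 on success
        dn.disconnect();
    }
}
```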


HDFS Directory Creation

  1. Java code
public void mkdirHDFS() throws URISyntaxException, IOException, InterruptedException {
    //1. Create the configuration object
    Configuration conf = new Configuration();
    //2. Get the file system
    FileSystem fs = FileSystem.get(new URI("hdfs://vmaster:9000"), conf, "root");
    //3. Create the directory
    fs.mkdirs(new Path("hdfs://vmaster:9000/Good"));
    //4. Close the file system
    fs.close();
}
  1. Command
hadoop fs -mkdir /dir
Recursive creation:
hadoop fs -mkdir -p /dir/1/2/3
  1. RESTful API
curl -i -X PUT "http://vmaster:50070/webhdfs/v1/dir?op=MKDIRS"

HDFS Directory and File Deletion

  1. Java code
public void delHDFS() throws URISyntaxException, IOException, InterruptedException {
    //1. Create the configuration object
    Configuration conf = new Configuration();
    //2. Get the file system
    FileSystem fs = FileSystem.get(new URI("hdfs://vmaster:9000"), conf, "root");
    //3. Delete the file or directory
    //Path var1: the path to delete
    //boolean var2: whether to delete recursively
    fs.delete(new Path("hdfs://vmaster:9000/Good"), true);
    //4. Close the file system
    fs.close();
    System.out.println("Deleted");
}
  1. Command
hadoop fs -rm /LICENSE.txt
  1. RESTful
curl -i -X DELETE "http://vmaster:50070/webhdfs/v1/12.txt?op=DELETE"

HDFS File Download

  1. Java code
public void getFileFromHDFS() throws URISyntaxException, IOException, InterruptedException {
    //1. Create the configuration object
    Configuration conf = new Configuration();

    //2. Get the file system
    FileSystem fs = FileSystem.get(new URI("hdfs://vmaster:9000"), conf, "root");

    //3. Download the file
    //boolean delSrc: whether to delete the source file
    //Path src: the HDFS path to download
    //Path dst: the local destination
    //boolean useRawLocalFileSystem: if true, use the raw local file system
    //and skip writing the local .crc checksum file
    fs.copyToLocalFile(false, new Path("hdfs://vmaster:9000/transaction_details.csv"), new Path("F:\\Course\\BigData"), true);

    //4. Close the file system
    fs.close();
    System.out.println("Download done");
}
  1. Command
hadoop fs -get /NOTICE.txt
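Downloads also work over the RESTful API with op=OPEN: the NameNode answers with a redirect to a DataNode that serves the bytes. A minimal pure-JDK sketch, assuming the same vmaster:50070 endpoint as in the upload examples; the buildOpenUrl helper is an illustrative name, not a Hadoop API.

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;

public class WebHdfsDownload {

    // Hypothetical helper: builds the op=OPEN URL against the NameNode.
    static String buildOpenUrl(String nameNode, String path) {
        return "http://" + nameNode + "/webhdfs/v1" + path + "?op=OPEN";
    }

    public static void main(String[] args) throws Exception {
        HttpURLConnection conn = (HttpURLConnection)
                new URL(buildOpenUrl("vmaster:50070", "/NOTICE.txt")).openConnection();
        // HttpURLConnection follows the redirect to the DataNode automatically,
        // so the response body is the file content itself.
        try (InputStream in = conn.getInputStream()) {
            Files.copy(in, Paths.get("NOTICE.txt"));
        }
        conn.disconnect();
    }
}
```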

Viewing File Information

  1. Java code
public void readListFile() throws URISyntaxException, IOException, InterruptedException {
    //1. Create the configuration object
    Configuration conf = new Configuration();
    //2. Get the file system
    FileSystem fs = FileSystem.get(new URI("hdfs://vmaster:9000"), conf, "root");
    //3. Get a recursive file iterator over the root
    RemoteIterator<LocatedFileStatus> listFiles = fs.listFiles(new Path("hdfs://vmaster:9000/"), true);
    //4. Walk the iterator
    while (listFiles.hasNext()) {
        LocatedFileStatus fileStatus = listFiles.next();

        //file name
        System.out.println("name: " + fileStatus.getPath().getName());
        //block size
        System.out.println("block size: " + fileStatus.getBlockSize());
        //permissions
        System.out.println("permissions: " + fileStatus.getPermission());
        //file length
        System.out.println("length: " + fileStatus.getLen());

        BlockLocation[] locations = fileStatus.getBlockLocations();
        for (BlockLocation bl : locations) {
            System.out.println("block offset: " + bl.getOffset());
            String[] hosts = bl.getHosts();
            for (String host : hosts) {
                System.out.println(host);
            }
        }
        System.out.println("----------------------------------------");
    }
    //5. Close the file system
    fs.close();
}
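The same file information is also exposed over the RESTful API with op=GETFILESTATUS, which returns a FileStatus JSON object (length, permission, blockSize, type, and so on). A minimal sketch using the JDK 11 HttpClient, assuming the same vmaster:50070 endpoint as above; the statusUrl helper is an illustrative name, not a Hadoop API.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class WebHdfsStatus {

    // Hypothetical helper: builds the op=GETFILESTATUS URL.
    static String statusUrl(String nameNode, String path) {
        return "http://" + nameNode + "/webhdfs/v1" + path + "?op=GETFILESTATUS";
    }

    public static void main(String[] args) throws Exception {
        HttpResponse<String> resp = HttpClient.newHttpClient().send(
                HttpRequest.newBuilder(URI.create(statusUrl("vmaster:50070", "/1234.txt")))
                        .GET().build(),
                HttpResponse.BodyHandlers.ofString());
        // Prints a JSON document such as {"FileStatus":{"length":...,"type":"FILE",...}}
        System.out.println(resp.body());
    }
}
```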

File vs. Directory Check

public void checkFile() throws URISyntaxException, IOException, InterruptedException {
    //1. Create the configuration object
    Configuration conf = new Configuration();
    //2. Get the file system
    FileSystem fs = FileSystem.get(new URI("hdfs://vmaster:9000"), conf, "root");
    //3. List every entry under the root
    FileStatus[] status = fs.listStatus(new Path("/"));
    for (FileStatus status1 : status) {
        //check whether the entry is a file
        if (status1.isFile()) {
            System.out.println("file: " + status1.getPath().getName());
        } else {
            System.out.println("directory: " + status1.getPath().getName());
        }
    }

    //4. Close the file system
    fs.close();
}

RESTful API: see https://hadoop.apache.org/docs/r2.9.2/hadoop-project-dist/hadoop-hdfs/WebHDFS.html for more operations
Java docs: https://hadoop.apache.org/docs/r2.9.2/api/index.html
For command usage, run hadoop fs -help

