We recommend checking the replication factor with -ls. Running -ls on a file shows its replication factor: the second column of the output is the replication factor of the file, as shown in the example below.
[root@kcadmin]# hdfs dfs -ls
Found 3 items
drwx------   - root hadoop          0 2014-01-29 06:14 .staging
-rw-r--r--   3 root hadoop       1943 2014-01-24 01:01 passwd
drwxr-xr-x   - root hadoop          0 2014-04-22 12:45 test
Where:
passwd is a file with a replication factor of 3.
A '-' symbol in the second column indicates a directory.

Change the replication factor for the file passwd using the following command:
[root@kcadmin]# hdfs dfs -setrep 2 passwd
Replication 2 set: passwd
[root@kcadmin]# hdfs dfs -ls
Found 1 items
-rw-r--r--   2 root hadoop       1943 2014-01-24 01:01 passwd
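To check a replication factor from a script rather than by eye, the second column of the -ls output can be extracted with awk. The following is a minimal sketch, assuming the -ls layout shown above and paths without spaces; the function name replication_of is hypothetical and not part of Hadoop.

```shell
#!/bin/sh
# Hypothetical helper: read `hdfs dfs -ls` output on stdin and print
# "<path> <replication>" for every regular file.
# Skipped lines:
#   - the "Found N items" header (first field does not start with "-")
#   - directories (permissions start with "d", second column is "-")
# Assumes the path is the last whitespace-separated field.
replication_of() {
    awk '$1 ~ /^-/ && $2 ~ /^[0-9]+$/ { print $NF, $2 }'
}

# Usage against a live cluster (assumes the hdfs client is on PATH):
#   hdfs dfs -ls | replication_of
```

For a single file, hdfs dfs -stat %r passwd should print the same number without any parsing.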
Changing the replication factor for a directory affects only the files that already exist in it. New files added to the directory are created with the cluster's default replication factor (dfs.replication in hdfs-site.xml), as shown below:
[root@kcadmin]# hdfs dfs -ls test/
Found 1 items
-rw-r--r--   4 root hadoop        316 2014-04-29 01:57 test/host1
[root@kcadmin]# hdfs dfs -setrep -R 2 test
Replication 2 set: test/host1
[root@kcadmin]# hdfs dfs -ls test/
Found 1 items
-rw-r--r--   2 root hadoop        316 2014-04-29 01:57 test/host1
[root@kcadmin]# hdfs dfs -copyFromLocal /etc/passwd test/
[root@kcadmin]# hdfs dfs -ls test/
Found 2 items
-rw-r--r--   2 root hadoop        316 2014-04-29 01:57 test/host1
-rw-r--r--   4 root hadoop       1943 2014-04-29 02:11 test/passwd
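Because new files pick up the default rather than the value set on the directory, one way to keep a directory consistent is to pass the replication factor at upload time. The sketch below assumes the hdfs client accepts the generic -D option before the subcommand, which overrides dfs.replication for that one invocation; the wrapper name put_with_replication is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: upload a local file with an explicit replication
# factor instead of the cluster default.
#   $1 = replication factor, $2 = local source, $3 = HDFS destination
# Assumes -D dfs.replication=<n> is honoured as a per-command override.
put_with_replication() {
    hdfs dfs -D dfs.replication="$1" -copyFromLocal "$2" "$3"
}

# Usage (requires a reachable cluster):
#   put_with_replication 2 /etc/passwd test/
```

An alternative is to upload normally and then rerun hdfs dfs -setrep -R 2 test, which rewrites the replication factor of every file under the directory, including the new one.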