Creating a Directory in HDFS

Home directories within HDFS are stored in /user/$USER, where $USER is the name of the user. From the previous example with -ls, it can be seen that the /user directory does not currently exist. To create the /user directory within HDFS, use the -mkdir command:

$ hdfs dfs -mkdir /user

To make a home directory for the current user, hduser, use the -mkdir command again:

$ hdfs dfs -mkdir /user/hduser

Use the -ls command to verify that the previous directories were created:

$ hdfs dfs -ls -R /user
drwxr-xr-x - hduser supergroup 0 2015-09-22 18:01 /user/hduser
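
The two -mkdir calls above can be collapsed into one with the -p flag, which creates any missing parent directories, much like the Unix mkdir -p. The following is a minimal sketch of driving that from Python, assuming the hdfs executable is on the PATH; the function names mkdir_command and make_hdfs_dir are hypothetical helpers introduced here for illustration, not part of Hadoop.

```python
import subprocess


def mkdir_command(path, parents=True):
    """Build the argument list for `hdfs dfs -mkdir` (hypothetical helper)."""
    cmd = ["hdfs", "dfs", "-mkdir"]
    if parents:
        # -p creates missing parent directories, so /user need not exist yet
        cmd.append("-p")
    cmd.append(path)
    return cmd


def make_hdfs_dir(path, parents=True, run=subprocess.run):
    """Run the command; check=True raises CalledProcessError on failure.

    Assumes the `hdfs` executable is available on the PATH.
    """
    run(mkdir_command(path, parents), check=True)


# Show the command that would be executed, without contacting a cluster
print(" ".join(mkdir_command("/user/hduser")))
# prints: hdfs dfs -mkdir -p /user/hduser
```

Calling make_hdfs_dir("/user/hduser") on a machine with a configured Hadoop client would then create both /user and /user/hduser in a single step.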

Copying Data onto HDFS

After a directory has been created for the current user, data can be uploaded to the user’s HDFS home directory with the -put command:

$ hdfs dfs -put /home/hduser/input.txt /user/hduser

This command copies the file /home/hduser/input.txt from the local filesystem to /user/hduser/input.txt on HDFS.
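
By default, -put reports an error if the destination file already exists; the -f flag asks it to overwrite instead. The sketch below shows one way the upload might be scripted from Python, assuming the hdfs executable is on the PATH. The names put_command and upload are hypothetical helpers used only for illustration.

```python
import subprocess


def put_command(local_path, hdfs_path, overwrite=False):
    """Build the argument list for `hdfs dfs -put` (hypothetical helper)."""
    cmd = ["hdfs", "dfs", "-put"]
    if overwrite:
        # -f overwrites the destination if it already exists
        cmd.append("-f")
    cmd += [local_path, hdfs_path]
    return cmd


def upload(local_path, hdfs_path, overwrite=False, run=subprocess.run):
    """Copy a local file to HDFS; raises CalledProcessError on failure.

    Assumes the `hdfs` executable is available on the PATH.
    """
    run(put_command(local_path, hdfs_path, overwrite), check=True)


# Show the command for the upload described above, without running it
print(" ".join(put_command("/home/hduser/input.txt", "/user/hduser")))
# prints: hdfs dfs -put /home/hduser/input.txt /user/hduser
```

Passing overwrite=True makes a repeated upload replace the earlier copy rather than fail.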

Use the -ls command to verify that input.txt was copied to HDFS. With no path argument, -ls lists the current user's HDFS home directory:

$ hdfs dfs -ls
Found 1 items
-rw-r--r-- 1 hduser supergroup 52 2015-09-20 13:20 input.txt

Retrieving Data from HDFS

Multiple commands allow data to be retrieved from HDFS. To simply view the contents of a file, use the -cat command. -cat reads a file on HDFS and displays its contents to stdout. The following command uses -cat to display the contents of /user/hduser/input.txt:

$ hdfs dfs -cat input.txt
jack be nimble
jack be quick
jack jumped over the candlestick
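
Because -cat writes to stdout, its output can be captured by another program. The following is a minimal sketch of reading a small HDFS text file into a Python string, assuming the hdfs executable is on the PATH and the file fits comfortably in memory. The names cat_command and read_hdfs_text are hypothetical helpers, and the run parameter exists only so the function can be exercised without a cluster.

```python
import subprocess


def cat_command(hdfs_path):
    """Build the argument list for `hdfs dfs -cat` (hypothetical helper)."""
    return ["hdfs", "dfs", "-cat", hdfs_path]


def read_hdfs_text(hdfs_path, run=subprocess.run):
    """Return the contents of an HDFS file as a string.

    capture_output=True collects stdout instead of printing it, and
    text=True decodes the bytes to str. Assumes `hdfs` is on the PATH.
    """
    result = run(cat_command(hdfs_path), check=True,
                 capture_output=True, text=True)
    return result.stdout


def count_lines(text):
    """Count the lines in the retrieved text."""
    return len(text.splitlines())
```

With a working Hadoop client, count_lines(read_hdfs_text("input.txt")) would report the number of lines in the file shown above. As on the command line, a relative path such as input.txt is resolved against the user's HDFS home directory.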

Data can also be copied from HDFS to the local filesystem using the -get command. The -get command is the opposite of the -put command:

$ hdfs dfs -get input.txt /home/hduser

This command copies input.txt from /user/hduser on HDFS to /home/hduser on the local filesystem.
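
One practical wrinkle: -get normally refuses to overwrite a local file that already exists, which matters here because the original input.txt may still be in /home/hduser. The sketch below checks for that case before invoking the command, assuming the hdfs executable is on the PATH and that the destination is given as a full file path rather than a directory. The names get_command and fetch are hypothetical helpers introduced for illustration.

```python
import os
import subprocess


def get_command(hdfs_src, local_dst):
    """Build the argument list for `hdfs dfs -get` (hypothetical helper)."""
    return ["hdfs", "dfs", "-get", hdfs_src, local_dst]


def fetch(hdfs_src, local_dst, run=subprocess.run):
    """Copy a file from HDFS to an explicit local file path.

    Fails early with FileExistsError rather than letting the command
    report the collision. Assumes `hdfs` is on the PATH.
    """
    if os.path.exists(local_dst):
        raise FileExistsError(local_dst)
    run(get_command(hdfs_src, local_dst), check=True)
    return local_dst


# Show the command for a download to a new local name, without running it
print(" ".join(get_command("input.txt", "/home/hduser/input_copy.txt")))
# prints: hdfs dfs -get input.txt /home/hduser/input_copy.txt
```

Choosing a distinct local name, as in the printed command, avoids the collision entirely.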

Reproduced from Hadoop with Python free ebook
