We can copy files from the local file system to HDFS and vice versa. We can also append data to existing files in HDFS.
- `hadoop fs -copyFromLocal` or `hadoop fs -put`: copy files from the local file system to HDFS. The copy process was covered earlier: the file is divided into blocks based on the block size, and the blocks are stored on DataNodes in a distributed fashion according to the replication factor.
- `hadoop fs -copyToLocal` or `hadoop fs -get`: copy files from HDFS to the local file system. HDFS reads all the blocks in sequence, using the block index, and reassembles the file on the local file system.
- We can also use `hadoop fs -appendToFile` to append data to an existing file.
- However, we cannot update or fix data in place while a file is in HDFS. To fix any data, we have to copy the file to the local file system, fix the data, and then copy it back to HDFS.
- We can move files from the local file system to HDFS using `hadoop fs -moveFromLocal`, which deletes the local copy once the transfer is done. Although a `moveToLocal` command exists, its functionality is not implemented yet.
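
The commands above can be strung together into one round trip. The following is a minimal sketch under stated assumptions, not a definitive procedure: the HDFS directory `/user/demo/orders`, the file names, and the `hdfs_fs` helper are all hypothetical. The helper runs `hadoop fs` when the Hadoop CLI is installed and otherwise only prints each command, so the script can also be read as a dry run.

```shell
#!/bin/sh
# Hypothetical locations, chosen only for illustration.
HDFS_DIR="/user/demo/orders"
WORK_DIR="$(mktemp -d)"

# hdfs_fs (hypothetical helper): run "hadoop fs ..." when the hadoop CLI is
# on PATH; otherwise print the command so the script works as a dry run.
hdfs_fs() {
  if command -v hadoop >/dev/null 2>&1; then
    hadoop fs "$@"
  else
    echo "hadoop fs $*"
  fi
}

# Two small local files; "10O" (letter O) is a deliberate data error.
printf 'id,amount\n1,10O\n' > "$WORK_DIR/orders.csv"
printf '2,200\n' > "$WORK_DIR/more_orders.csv"

# Local -> HDFS. -copyFromLocal could be used in place of -put.
hdfs_fs -mkdir -p "$HDFS_DIR"
hdfs_fs -put "$WORK_DIR/orders.csv" "$HDFS_DIR/"

# Append the contents of a local file to the existing HDFS file.
hdfs_fs -appendToFile "$WORK_DIR/more_orders.csv" "$HDFS_DIR/orders.csv"

# HDFS -> local. -copyToLocal could be used in place of -get.
hdfs_fs -get "$HDFS_DIR/orders.csv" "$WORK_DIR/orders_from_hdfs.csv"

# Dry run only: nothing was downloaded, so stand in with the original file.
[ -f "$WORK_DIR/orders_from_hdfs.csv" ] ||
  cp "$WORK_DIR/orders.csv" "$WORK_DIR/orders_from_hdfs.csv"

# Fix the data locally, then overwrite the HDFS copy (-f replaces the target).
sed 's/10O/100/' "$WORK_DIR/orders_from_hdfs.csv" > "$WORK_DIR/orders_fixed.csv"
hdfs_fs -put -f "$WORK_DIR/orders_fixed.csv" "$HDFS_DIR/orders.csv"

# Move a local file into HDFS; the local source is deleted after the copy.
hdfs_fs -moveFromLocal "$WORK_DIR/more_orders.csv" "$HDFS_DIR/"
```

The second `-put` uses `-f` because `-put` refuses to overwrite an existing target by default. On a real cluster, append support and write permissions on the target directory are assumed.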