File Paths and Names
Let us quickly review file paths and names. When processing a file, we typically work with both a folder path and a file name.
- A file might be part of a directory structure with multiple levels.
- All the directories in the path are typically separated by `/`.
- The last part of the path is typically the name of a file or the name of a folder that contains multiple files.
- To Python's built-in file I/O functions, we typically pass the path to a file, so the last part of the path will be the name of the file. We pass these paths as strings.
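The points above can be illustrated with a minimal sketch that splits a path string into its parts. It uses `/data/retail_db/orders/part-00000`, the orders file listed later in this section, and assumes POSIX-style `/` separators. The variable names are chosen only for illustration, and no file is actually read.

```python
from pathlib import PurePosixPath

# Path string taken from this section; PurePosixPath only parses the text,
# so the file does not need to exist on your machine.
path = '/data/retail_db/orders/part-00000'

p = PurePosixPath(path)
parts = p.parts          # every component, beginning with the root '/'
folder = str(p.parent)   # everything before the last separator
file_name = p.name       # the last part of the path

print(parts)
print(folder)
print(file_name)
```

Here `folder` holds the directory portion and `file_name` the last part of the path, and both are ordinary strings.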
In [1]:
!ls -ltr /data/retail_db # Folder which contains multiple folders and files
total 20128
drwxrwxr-x 2 itversity itversity 24 Mar 8 02:04 categories
drwxrwxr-x 2 itversity itversity 24 Mar 8 02:04 customers
-rw-rw-r-- 1 itversity itversity 1748 Mar 8 02:04 create_db_tables_pg.sql
-rw-rw-r-- 1 itversity itversity 10303297 Mar 8 02:04 create_db.sql
drwxrwxr-x 2 itversity itversity 24 Mar 8 02:04 departments
drwxrwxr-x 2 itversity itversity 24 Mar 8 02:04 order_items
-rw-rw-r-- 1 itversity itversity 10297372 Mar 8 02:04 load_db_tables_pg.sql
drwxrwxr-x 2 itversity itversity 24 Mar 8 02:04 orders
drwxrwxr-x 2 itversity itversity 24 Mar 8 02:04 products
In [2]:
!ls -ltr /data/retail_db/orders
# The folder contains files related to orders data set
# As of now we have only one file
total 2932
-rw-rw-r-- 1 itversity itversity 2999944 Mar 8 02:04 part-00000
In [3]:
!ls -ltr /data/retail_db/orders/part-00000
# You can see that the directories in the hierarchy are separated by /
# The last part is a file
-rw-rw-r-- 1 itversity itversity 2999944 Mar 8 02:04 /data/retail_db/orders/part-00000
In [4]:
# You can use the head command to confirm that it is a text file
!head -5 /data/retail_db/orders/part-00000
1,2013-07-25 00:00:00.0,11599,CLOSED
2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT
3,2013-07-25 00:00:00.0,12111,COMPLETE
4,2013-07-25 00:00:00.0,8827,CLOSED
5,2013-07-25 00:00:00.0,11318,COMPLETE
Note: We can pass `/data/retail_db/orders/part-00000` as a string to Python's built-in file I/O functions to perform file I/O.
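As a rough sketch of passing a path as a string, the example below gives a path to Python's built-in `open` function. Because `/data/retail_db/orders/part-00000` may not exist on your machine, it first writes a small stand-in file under a temporary directory. The temporary layout and the two sample records (copied from the `head` output above) are assumptions made purely for illustration.

```python
import os
import tempfile

# Stand-in data: the first two records shown by the head command above.
sample = (
    '1,2013-07-25 00:00:00.0,11599,CLOSED\n'
    '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT\n'
)

with tempfile.TemporaryDirectory() as base_dir:
    # Hypothetical layout that mirrors retail_db/orders/part-00000
    orders_dir = os.path.join(base_dir, 'retail_db', 'orders')
    os.makedirs(orders_dir)
    file_path = os.path.join(orders_dir, 'part-00000')  # an ordinary str

    # Create the stand-in file so that there is something to read
    with open(file_path, 'w') as f:
        f.write(sample)

    # The path is passed to open() as a string
    with open(file_path) as f:
        lines = f.read().splitlines()

print(lines[0])
```

On a system where the real data set is available, the same `open` call should work with `'/data/retail_db/orders/part-00000'` in place of `file_path`.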