Recap of basic file IO
Let us quickly recap basic file IO using Python's built-in functions.
- We use open with mode 'r' to open a file for reading only.
- We can use open with mode 'w' to create a new file and write to it, or to overwrite an existing file.
- We can use open with mode 'a' to open a file in append mode. If the file does not exist, it will be created; if it does exist, the content will be appended at the end of the file.
- We might have to add '\n' (the newline character) after each element if we want to write a list of elements into a file.
- If we want to create a file only if it does not already exist, we can use open with mode 'x'. If the file exists, open raises an error (FileExistsError).
- Let us perform a task to read a file under /data/retail_db/orders into a collection and then write the collection into a target file under data/retail_db/orders. The data folder we are writing to should be under our current folder.
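The modes recapped above can be tried out safely in a temporary folder before working on the actual task. The following is a minimal sketch, not part of the original notebook; the file name demo.txt and the variable names are made up for illustration.

```python
import os
import tempfile

# Work in a throwaway directory so nothing real is touched.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, 'demo.txt')  # hypothetical file name

# 'w' creates the file, or truncates it if it already exists.
with open(demo_path, 'w') as demo_file:
    demo_file.write('first\n')

# 'a' appends to the end of the file (and would create it if missing).
with open(demo_path, 'a') as demo_file:
    demo_file.write('second\n')

# 'r' opens the file for reading only; it is also the default mode.
with open(demo_path, 'r') as demo_file:
    lines = demo_file.read().splitlines()

# 'x' creates the file only if it does not exist yet.
try:
    with open(demo_path, 'x') as demo_file:
        demo_file.write('never written\n')
    created = True
except FileExistsError:
    created = False

print(lines)
print(created)
```

Because demo.txt already exists by the time mode 'x' is used, the last open call fails and the existing content is left untouched.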
- Read data from /data/retail_db/orders/part-00000.
In [1]:
!ls -ltr /data/retail_db/orders/part-00000
-rw-rw-r-- 1 itversity itversity 2999944 Mar 8 02:04 /data/retail_db/orders/part-00000
In [2]:
!file /data/retail_db/orders/part-00000
/data/retail_db/orders/part-00000: CSV text
In [3]:
!head -5 /data/retail_db/orders/part-00000
1,2013-07-25 00:00:00.0,11599,CLOSED
2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT
3,2013-07-25 00:00:00.0,12111,COMPLETE
4,2013-07-25 00:00:00.0,8827,CLOSED
5,2013-07-25 00:00:00.0,11318,COMPLETE
In [4]:
orders = open('/data/retail_db/orders/part-00000').read().splitlines()
In [5]:
type(orders)
Out[5]:
list
In [6]:
len(orders)
Out[6]:
68883
In [7]:
orders[0]
Out[7]:
'1,2013-07-25 00:00:00.0,11599,CLOSED'
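The read cell above never closes the file explicitly. One common alternative is to wrap the read in a with block, which closes the file as soon as the block ends. The sketch below assumes a hypothetical helper named read_lines and uses a small temporary stand-in file holding the first two records shown earlier, so that it runs without /data/retail_db being present.

```python
import os
import tempfile


def read_lines(path):
    """Return the lines of a text file as a list, without trailing newlines."""
    # The with block closes the file even if reading raises an error.
    with open(path) as source_file:
        return source_file.read().splitlines()


# Small stand-in for /data/retail_db/orders/part-00000 (hypothetical temp file).
sample_path = os.path.join(tempfile.mkdtemp(), 'part-00000')
with open(sample_path, 'w') as sample_file:
    sample_file.write('1,2013-07-25 00:00:00.0,11599,CLOSED\n')
    sample_file.write('2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT\n')

sample_orders = read_lines(sample_path)
print(len(sample_orders))
print(sample_orders[0])
```

On the real data set, read_lines('/data/retail_db/orders/part-00000') should give the same list as the one-line version used in the notebook.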
- Write the list orders to the target file.
In [8]:
!rm -rf data/retail_db/orders
In [9]:
!mkdir -p data/retail_db/orders
In [10]:
orders_file = open('data/retail_db/orders/part-00000', 'w')
In [11]:
for order in orders:
    orders_file.write(f'{order}\n')
In [12]:
orders_file.close()
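The open, write and close steps above can also be combined in a with block, so the file is closed even if a write fails midway. This is a minimal sketch under stated assumptions: write_lines is a hypothetical helper, the target path lives in a temporary folder, and os.makedirs stands in for the mkdir -p shell command.

```python
import os
import tempfile


def write_lines(path, lines):
    """Write each element of lines to path, one element per line."""
    # Create the parent folder if there is one, similar to mkdir -p.
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    # The with block flushes and closes the file when it ends.
    with open(path, 'w') as target_file:
        target_file.writelines(f'{line}\n' for line in lines)


sample_orders = [
    '1,2013-07-25 00:00:00.0,11599,CLOSED',
    '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
]
target_path = os.path.join(
    tempfile.mkdtemp(), 'data', 'retail_db', 'orders', 'part-00000'
)
write_lines(target_path, sample_orders)

with open(target_path) as written_file:
    written = written_file.read()
print(written)
```

writelines does not add line separators on its own, which is why each element is still suffixed with '\n'.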
- Validate whether the file is created as expected.
In [13]:
!ls -ltr data/retail_db/orders
total 2932
-rw-r--r-- 1 itversity itversity 2999944 Mar 25 06:07 part-00000
In [14]:
!head -5 data/retail_db/orders/part-00000
1,2013-07-25 00:00:00.0,11599,CLOSED
2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT
3,2013-07-25 00:00:00.0,12111,COMPLETE
4,2013-07-25 00:00:00.0,8827,CLOSED
5,2013-07-25 00:00:00.0,11318,COMPLETE
In [15]:
!wc -l data/retail_db/orders/part-00000
68883 data/retail_db/orders/part-00000
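The same validation can be done from Python instead of shell commands. The sketch below is one possible approach: count_lines and same_content are hypothetical helpers, and two small temporary files stand in for the source and target files. The line count matches wc -l only when the file ends with a newline, which holds for files written with '\n' after every element.

```python
import os
import tempfile


def count_lines(path):
    """Count the lines in a text file by iterating over it."""
    with open(path) as text_file:
        return sum(1 for _ in text_file)


def same_content(source_path, target_path):
    """Return True when both files hold exactly the same bytes."""
    with open(source_path, 'rb') as source_file, \
            open(target_path, 'rb') as target_file:
        return source_file.read() == target_file.read()


# Temporary stand-ins for the source and target files (hypothetical names).
demo_dir = tempfile.mkdtemp()
source_path = os.path.join(demo_dir, 'source')
target_path = os.path.join(demo_dir, 'target')
records = [
    '1,2013-07-25 00:00:00.0,11599,CLOSED',
    '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
]
for path in (source_path, target_path):
    with open(path, 'w') as demo_file:
        demo_file.writelines(f'{record}\n' for record in records)

line_count = count_lines(target_path)
is_copy = same_content(source_path, target_path)
print(line_count, is_copy)
```

For the actual task, the two paths would be /data/retail_db/orders/part-00000 and data/retail_db/orders/part-00000; reading both files fully into memory is acceptable here because each is only about 3 MB.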