Write iterables to files using CSV¶
Let us understand how to write data in iterables such as list of tuples to delimited files using Python I/O functions as well as csv
. csv
can also be used to write iterable of dicts into delimited files.
In [1]:
orders = [(1, '2013-07-25 00:00:00.0', 11599, 'CLOSED'),
(2, '2013-07-25 00:00:00.0', 256, 'PENDING_PAYMENT'),
(3, '2013-07-25 00:00:00.0', 12111, 'COMPLETE'),
(4, '2013-07-25 00:00:00.0', 8827, 'CLOSED'),
(5, '2013-07-25 00:00:00.0', 11318, 'COMPLETE'),
(6, '2013-07-25 00:00:00.0', 7130, 'COMPLETE'),
(7, '2013-07-25 00:00:00.0', 4530, 'COMPLETE'),
(8, '2013-07-25 00:00:00.0', 2911, 'PROCESSING'),
(9, '2013-07-25 00:00:00.0', 5657, 'PENDING_PAYMENT'),
(10, '2013-07-25 00:00:00.0', 5648, 'PENDING_PAYMENT')]
orders_csv = map(lambda order: ','.join(map(lambda item: str(item), order)), orders)
In [2]:
list(orders_csv)
Out[2]:
['1,2013-07-25 00:00:00.0,11599,CLOSED', '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT', '3,2013-07-25 00:00:00.0,12111,COMPLETE', '4,2013-07-25 00:00:00.0,8827,CLOSED', '5,2013-07-25 00:00:00.0,11318,COMPLETE', '6,2013-07-25 00:00:00.0,7130,COMPLETE', '7,2013-07-25 00:00:00.0,4530,COMPLETE', '8,2013-07-25 00:00:00.0,2911,PROCESSING', '9,2013-07-25 00:00:00.0,5657,PENDING_PAYMENT', '10,2013-07-25 00:00:00.0,5648,PENDING_PAYMENT']
In [3]:
!rm -rf data/retail_db/orders
In [4]:
!mkdir -p data/retail_db/orders
In [5]:
orders_file = open('data/retail_db/orders/part-00000', 'w')
orders_csv = map(lambda order: ','.join(map(lambda item: str(item), order)), orders)
for order in orders_csv:
orders_file.write(f'{order}\n')
orders_file.close()
In [6]:
!ls -ltr data/retail_db/orders/part-00000
-rw-r--r-- 1 itversity itversity 402 Mar 25 06:14 data/retail_db/orders/part-00000
In [7]:
!cat data/retail_db/orders/part-00000
1,2013-07-25 00:00:00.0,11599,CLOSED 2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT 3,2013-07-25 00:00:00.0,12111,COMPLETE 4,2013-07-25 00:00:00.0,8827,CLOSED 5,2013-07-25 00:00:00.0,11318,COMPLETE 6,2013-07-25 00:00:00.0,7130,COMPLETE 7,2013-07-25 00:00:00.0,4530,COMPLETE 8,2013-07-25 00:00:00.0,2911,PROCESSING 9,2013-07-25 00:00:00.0,5657,PENDING_PAYMENT 10,2013-07-25 00:00:00.0,5648,PENDING_PAYMENT
- Writing iterables of tuples using
csv.writer
.
In [8]:
orders = [(1, '2013-07-25 00:00:00.0', 11599, 'CLOSED'),
(2, '2013-07-25 00:00:00.0', 256, 'PENDING_PAYMENT'),
(3, '2013-07-25 00:00:00.0', 12111, 'COMPLETE'),
(4, '2013-07-25 00:00:00.0', 8827, 'CLOSED'),
(5, '2013-07-25 00:00:00.0', 11318, 'COMPLETE'),
(6, '2013-07-25 00:00:00.0', 7130, 'COMPLETE'),
(7, '2013-07-25 00:00:00.0', 4530, 'COMPLETE'),
(8, '2013-07-25 00:00:00.0', 2911, 'PROCESSING'),
(9, '2013-07-25 00:00:00.0', 5657, 'PENDING_PAYMENT'),
(10, '2013-07-25 00:00:00.0', 5648, 'PENDING_PAYMENT')]
In [9]:
import csv
In [10]:
csv.writer?
Docstring: csv_writer = csv.writer(fileobj [, dialect='excel'] [optional keyword args]) for row in sequence: csv_writer.writerow(row) [or] csv_writer = csv.writer(fileobj [, dialect='excel'] [optional keyword args]) csv_writer.writerows(rows) The "fileobj" argument can be any object that supports the file API. Type: builtin_function_or_method
In [11]:
!rm data/retail_db/orders/part-00000
In [12]:
orders_file = open('data/retail_db/orders/part-00000', 'w')
In [13]:
csv_writer = csv.writer(orders_file)
In [14]:
csv_writer.writerows(orders)
In [15]:
orders_file.close()
In [16]:
!ls -ltr data/retail_db/orders/part-00000
-rw-r--r-- 1 itversity itversity 412 Mar 25 06:14 data/retail_db/orders/part-00000
In [17]:
!cat data/retail_db/orders/part-00000
1,2013-07-25 00:00:00.0,11599,CLOSED 2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT 3,2013-07-25 00:00:00.0,12111,COMPLETE 4,2013-07-25 00:00:00.0,8827,CLOSED 5,2013-07-25 00:00:00.0,11318,COMPLETE 6,2013-07-25 00:00:00.0,7130,COMPLETE 7,2013-07-25 00:00:00.0,4530,COMPLETE 8,2013-07-25 00:00:00.0,2911,PROCESSING 9,2013-07-25 00:00:00.0,5657,PENDING_PAYMENT 10,2013-07-25 00:00:00.0,5648,PENDING_PAYMENT
- We can also write into delimited text files using other delimiters than comma.
In [18]:
!rm data/retail_db/orders/part-00000
In [19]:
orders_file = open('data/retail_db/orders/part-00000', 'w')
csv_writer = csv.writer(orders_file, delimiter=';')
csv_writer.writerows(orders)
orders_file.close()
In [20]:
!ls -ltr data/retail_db/orders/part-00000
-rw-r--r-- 1 itversity itversity 412 Mar 25 06:14 data/retail_db/orders/part-00000
In [21]:
!cat data/retail_db/orders/part-00000
1;2013-07-25 00:00:00.0;11599;CLOSED 2;2013-07-25 00:00:00.0;256;PENDING_PAYMENT 3;2013-07-25 00:00:00.0;12111;COMPLETE 4;2013-07-25 00:00:00.0;8827;CLOSED 5;2013-07-25 00:00:00.0;11318;COMPLETE 6;2013-07-25 00:00:00.0;7130;COMPLETE 7;2013-07-25 00:00:00.0;4530;COMPLETE 8;2013-07-25 00:00:00.0;2911;PROCESSING 9;2013-07-25 00:00:00.0;5657;PENDING_PAYMENT 10;2013-07-25 00:00:00.0;5648;PENDING_PAYMENT
- We can also write list of dicts into delimited strings using
csv.DictWriter
In [22]:
orders = [
{'order_id': 1, 'order_date': '2013-07-25 00:00:00.0', 'order_customer_id': 11599, 'order_status': 'CLOSED'},
{'order_id': 2, 'order_date': '2013-07-25 00:00:00.0', 'order_customer_id': 256, 'order_status': 'PENDING_PAYMENT'},
{'order_id': 3, 'order_date': '2013-07-25 00:00:00.0', 'order_customer_id': 12111, 'order_status': 'COMPLETE'},
{'order_id': 4, 'order_date': '2013-07-25 00:00:00.0', 'order_customer_id': 8827, 'order_status': 'CLOSED'},
{'order_id': 5, 'order_date': '2013-07-25 00:00:00.0', 'order_customer_id': 11318, 'order_status': 'COMPLETE'},
{'order_id': 6, 'order_date': '2013-07-25 00:00:00.0', 'order_customer_id': 7130, 'order_status': 'COMPLETE'},
{'order_id': 7, 'order_date': '2013-07-25 00:00:00.0', 'order_customer_id': 4530, 'order_status': 'COMPLETE'},
{'order_id': 8, 'order_date': '2013-07-25 00:00:00.0', 'order_customer_id': 2911, 'order_status': 'PROCESSING'},
{'order_id': 9, 'order_date': '2013-07-25 00:00:00.0', 'order_customer_id': 5657, 'order_status': 'PENDING_PAYMENT'},
{'order_id': 10, 'order_date': '2013-07-25 00:00:00.0', 'order_customer_id': 5648, 'order_status': 'PENDING_PAYMENT'}
]
In [23]:
!rm data/retail_db/orders/part-00000
In [24]:
orders_file = open('data/retail_db/orders/part-00000', 'w')
In [25]:
csv.DictWriter?
Init signature: csv.DictWriter( f, fieldnames, restval='', extrasaction='raise', dialect='excel', *args, **kwds, ) Docstring: <no docstring> File: /usr/local/lib/python3.8/csv.py Type: type Subclasses:
In [26]:
csv_writer = csv.DictWriter(
orders_file,
fieldnames=['order_id', 'order_date', 'order_customer_id', 'order_status'],
delimiter=';'
)
In [27]:
csv_writer.writerows(orders)
In [28]:
orders_file.close()
In [29]:
!ls -ltr data/retail_db/orders/part-00000
-rw-r--r-- 1 itversity itversity 412 Mar 25 06:14 data/retail_db/orders/part-00000
In [30]:
!cat data/retail_db/orders/part-00000
1;2013-07-25 00:00:00.0;11599;CLOSED 2;2013-07-25 00:00:00.0;256;PENDING_PAYMENT 3;2013-07-25 00:00:00.0;12111;COMPLETE 4;2013-07-25 00:00:00.0;8827;CLOSED 5;2013-07-25 00:00:00.0;11318;COMPLETE 6;2013-07-25 00:00:00.0;7130;COMPLETE 7;2013-07-25 00:00:00.0;4530;COMPLETE 8;2013-07-25 00:00:00.0;2911;PROCESSING 9;2013-07-25 00:00:00.0;5657;PENDING_PAYMENT 10;2013-07-25 00:00:00.0;5648;PENDING_PAYMENT