Sai

Write Delimited Strings into Files

Let us understand how to write delimited strings into files. We will start with a list of tuples and convert each tuple into a delimited string before writing it to a file.

Here are the steps involved in writing a list of tuples into a file as delimited strings.

Convert the list of tuples into a list of delimited strings.
Open the file in write mode using w (overwrite) or in append mode using a.
Write the data into the file.
Validate the data in the file.
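The steps above can be sketched end to end as follows. This is a minimal sketch, not the notebook's own code: the `sample_` names and the temporary directory are assumptions made for illustration, and the `with` statement is used in place of explicit `open` and `close` calls so the file is closed even if a write fails.

```python
import os
import tempfile

# Two sample rows in the same shape as the orders used in this notebook
sample_orders = [
    (1, '2013-07-25 00:00:00.0', 11599, 'CLOSED'),
    (2, '2013-07-25 00:00:00.0', 256, 'PENDING_PAYMENT'),
]

# Step 1: convert each tuple into a comma-delimited string
sample_lines = [','.join(str(item) for item in row) for row in sample_orders]

# Steps 2 and 3: open the file in write mode and add the data
# (the path is a hypothetical scratch location, not data/retail_db/orders)
sample_path = os.path.join(tempfile.mkdtemp(), 'part-00000')
with open(sample_path, 'w') as sample_file:
    for line in sample_lines:
        sample_file.write(f'{line}\n')

# Step 4: validate by reading the file back
with open(sample_path) as sample_file:
    lines_read = sample_file.read().splitlines()

print(len(lines_read) == len(sample_orders))
print(lines_read[0])
```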

In [1]:

orders = [(1, '2013-07-25 00:00:00.0', 11599, 'CLOSED'),
 (2, '2013-07-25 00:00:00.0', 256, 'PENDING_PAYMENT'),
 (3, '2013-07-25 00:00:00.0', 12111, 'COMPLETE'),
 (4, '2013-07-25 00:00:00.0', 8827, 'CLOSED'),
 (5, '2013-07-25 00:00:00.0', 11318, 'COMPLETE'),
 (6, '2013-07-25 00:00:00.0', 7130, 'COMPLETE'),
 (7, '2013-07-25 00:00:00.0', 4530, 'COMPLETE'),
 (8, '2013-07-25 00:00:00.0', 2911, 'PROCESSING'),
 (9, '2013-07-25 00:00:00.0', 5657, 'PENDING_PAYMENT'),
 (10, '2013-07-25 00:00:00.0', 5648, 'PENDING_PAYMENT')]

In [2]:

type(orders)

Out[2]:

list

In [3]:

orders[0]

Out[3]:

(1, '2013-07-25 00:00:00.0', 11599, 'CLOSED')

In [4]:

type(orders[0])

Out[4]:

tuple

In [5]:

order = orders[0]

In [6]:

str.join?

Signature: str.join(self, iterable, /)
Docstring:
Concatenate any number of strings.

The string whose method is called is inserted in between each given string.
The result is returned as a new string.

Example: '.'.join(['ab', 'pq', 'rs']) -> 'ab.pq.rs'
Type:      method_descriptor

In [7]:

'hello'.join

Out[7]:

<function str.join(iterable, /)>

In [8]:

','.join(order) # raises TypeError because the first and third elements are of type int

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [8], in <cell line: 1>()
----> 1 ','.join(order)

TypeError: sequence item 0: expected str instance, int found

In [9]:

[str(item) for item in order]

Out[9]:

['1', '2013-07-25 00:00:00.0', '11599', 'CLOSED']

In [10]:

# Converting all the items in the tuple to strings using a list comprehension
','.join([str(item) for item in order])

Out[10]:

'1,2013-07-25 00:00:00.0,11599,CLOSED'

In [11]:

list(map(lambda item: str(item), order))

Out[11]:

['1', '2013-07-25 00:00:00.0', '11599', 'CLOSED']

In [12]:

# Converting all the items in the tuple to strings using the map function
','.join(map(lambda item: str(item), order))

Out[12]:

'1,2013-07-25 00:00:00.0,11599,CLOSED'

In [13]:

orders

Out[13]:

[(1, '2013-07-25 00:00:00.0', 11599, 'CLOSED'),
 (2, '2013-07-25 00:00:00.0', 256, 'PENDING_PAYMENT'),
 (3, '2013-07-25 00:00:00.0', 12111, 'COMPLETE'),
 (4, '2013-07-25 00:00:00.0', 8827, 'CLOSED'),
 (5, '2013-07-25 00:00:00.0', 11318, 'COMPLETE'),
 (6, '2013-07-25 00:00:00.0', 7130, 'COMPLETE'),
 (7, '2013-07-25 00:00:00.0', 4530, 'COMPLETE'),
 (8, '2013-07-25 00:00:00.0', 2911, 'PROCESSING'),
 (9, '2013-07-25 00:00:00.0', 5657, 'PENDING_PAYMENT'),
 (10, '2013-07-25 00:00:00.0', 5648, 'PENDING_PAYMENT')]

In [14]:

orders_csv = map(lambda order: ','.join(map(lambda item: str(item), order)), orders)

In [15]:

list(orders_csv)

Out[15]:

['1,2013-07-25 00:00:00.0,11599,CLOSED',
 '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
 '3,2013-07-25 00:00:00.0,12111,COMPLETE',
 '4,2013-07-25 00:00:00.0,8827,CLOSED',
 '5,2013-07-25 00:00:00.0,11318,COMPLETE',
 '6,2013-07-25 00:00:00.0,7130,COMPLETE',
 '7,2013-07-25 00:00:00.0,4530,COMPLETE',
 '8,2013-07-25 00:00:00.0,2911,PROCESSING',
 '9,2013-07-25 00:00:00.0,5657,PENDING_PAYMENT',
 '10,2013-07-25 00:00:00.0,5648,PENDING_PAYMENT']

In [16]:

orders_csv = map(lambda order: ','.join(map(lambda item: str(item), order)), orders)
order = list(orders_csv)[0]
order

Out[16]:

'1,2013-07-25 00:00:00.0,11599,CLOSED'
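Note that `orders_csv` is rebuilt in cell 16 even though it was already defined in cell 14. In Python 3, `map` returns an iterator that can be consumed only once, so `list(orders_csv)` in cell 15 leaves it empty. The small sketch below shows this behavior; the two-column `rows` sample is an assumption made to keep the example short.

```python
rows = [(1, 'CLOSED'), (2, 'COMPLETE')]  # hypothetical two-column sample
rows_csv = map(lambda row: ','.join(map(str, row)), rows)

first_pass = list(rows_csv)   # consumes the iterator
second_pass = list(rows_csv)  # nothing is left to yield

print(first_pass)   # ['1,CLOSED', '2,COMPLETE']
print(second_pass)  # []
```

If the delimited strings are needed more than once, one option is to materialize them into a list up front instead of recreating the `map` object each time.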

Writing CSV strings one at a time to the file.

In [17]:

!rm -rf data/retail_db/orders

In [18]:

!mkdir -p data/retail_db/orders

In [19]:

orders_file = open('data/retail_db/orders/part-00000', 'w')

In [20]:

orders_csv = map(lambda order: ','.join(map(lambda item: str(item), order)), orders)

In [21]:

for order in orders_csv:
    orders_file.write(f'{order}\n')

In [22]:

orders_file.close()

Writing as one big string. Because we open the file using w, the file is truncated as soon as it is opened, so its existing contents are replaced by the string we write.
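The difference between w and a can be seen with a small experiment. This is a sketch under assumptions: `demo_path` points to a hypothetical scratch file in a temporary directory, not to the orders file used in this notebook.

```python
import os
import tempfile

demo_path = os.path.join(tempfile.mkdtemp(), 'demo.txt')  # hypothetical scratch file

with open(demo_path, 'w') as demo_file:
    demo_file.write('first\n')

# Reopening with 'w' truncates the file before anything is written
with open(demo_path, 'w') as demo_file:
    demo_file.write('second\n')

with open(demo_path) as demo_file:
    after_overwrite = demo_file.read()

# Reopening with 'a' keeps the existing content and writes at the end
with open(demo_path, 'a') as demo_file:
    demo_file.write('third\n')

with open(demo_path) as demo_file:
    after_append = demo_file.read()

print(repr(after_overwrite))  # 'second\n'
print(repr(after_append))     # 'second\nthird\n'
```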

In [23]:

orders_csv = map(lambda order: ','.join(map(lambda item: str(item), order)), orders)

In [24]:

orders_string = '\n'.join(orders_csv)

In [25]:

orders_string

Out[25]:

'1,2013-07-25 00:00:00.0,11599,CLOSED\n2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT\n3,2013-07-25 00:00:00.0,12111,COMPLETE\n4,2013-07-25 00:00:00.0,8827,CLOSED\n5,2013-07-25 00:00:00.0,11318,COMPLETE\n6,2013-07-25 00:00:00.0,7130,COMPLETE\n7,2013-07-25 00:00:00.0,4530,COMPLETE\n8,2013-07-25 00:00:00.0,2911,PROCESSING\n9,2013-07-25 00:00:00.0,5657,PENDING_PAYMENT\n10,2013-07-25 00:00:00.0,5648,PENDING_PAYMENT'

In [26]:

orders_file = open('data/retail_db/orders/part-00000', 'w')

In [27]:

orders_file.write(orders_string)

Out[27]:

In [28]:

orders_file.close()
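To validate what was written, one option is to read the file back, parse each line, and compare the result with the original tuples. The sketch below assumes that no field contains a comma. The column names in `parse_order` (a hypothetical helper) are also assumptions, since the notebook does not name the columns, and the file is again written to a temporary directory.

```python
import os
import tempfile

expected = [
    (1, '2013-07-25 00:00:00.0', 11599, 'CLOSED'),
    (2, '2013-07-25 00:00:00.0', 256, 'PENDING_PAYMENT'),
]

# Write the rows as one big string, without a trailing newline
check_path = os.path.join(tempfile.mkdtemp(), 'part-00000')
with open(check_path, 'w') as check_file:
    check_file.write('\n'.join(','.join(map(str, row)) for row in expected))


def parse_order(line):
    """Split a line on commas and restore the two int columns (assumed layout)."""
    order_id, order_date, customer_id, status = line.split(',')
    return int(order_id), order_date, int(customer_id), status


# splitlines() copes with files that do or do not end with a newline
with open(check_path) as check_file:
    actual = [parse_order(line) for line in check_file.read().splitlines()]

print(actual == expected)  # True
```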
