Sorting a list of delimited strings
Let us sort employees by salary, using a list of employee records.
- We will create the employee list as comma-separated (delimited) strings.
- Each record contains an employee id, email and salary.
In [1]:
employees = [
'1,ktrett0@independent.co.uk,6998.95',
'2,khaddock1@deviantart.com,10572.4',
'3,ecraft2@dell.com,3967.35',
'4,drussam3@t-online.de,17672.44',
'5,graigatt4@github.io,11660.67',
'6,bjaxon5@salon.com,18614.93',
'7,araulston6@list-manage.com,11550.75',
'8,mcobb7@mozilla.com,17016.15',
'9,grobardley8@unesco.org,14141.25',
'10,bbuye9@vkontakte.ru,12193.2'
]
In [2]:
type(employees)
Out[2]:
list
In [3]:
employees[0]
Out[3]:
'1,ktrett0@independent.co.uk,6998.95'
- We need to sort the data by comparing the salaries of the employees.
- We can define custom comparison logic using the `key` argument.
- Each element or record in the list is comma separated.
- We need to extract the salary as a float for a correct comparison.
- Here is how we can extract the salary.
In [4]:
# Reading first element
employees[0]
Out[4]:
'1,ktrett0@independent.co.uk,6998.95'
In [5]:
emp = employees[0]
In [6]:
type(emp)
Out[6]:
str
In [7]:
emp
Out[7]:
'1,ktrett0@independent.co.uk,6998.95'
In [8]:
# We can use split with ',' as delimiter.
# It will create a list of strings.
# The list contains 3 elements - id, email and salary
# All 3 will be of type string
emp.split(',')
Out[8]:
['1', 'ktrett0@independent.co.uk', '6998.95']
In [9]:
emp_list = emp.split(',')
In [10]:
type(emp_list)
Out[10]:
list
In [11]:
emp_list
Out[11]:
['1', 'ktrett0@independent.co.uk', '6998.95']
In [12]:
for e in emp_list:
    print(f'Data type of {e} is {type(e)}')
Data type of 1 is <class 'str'>
Data type of ktrett0@independent.co.uk is <class 'str'>
Data type of 6998.95 is <class 'str'>
In [13]:
# Getting salary
emp_list[2]
Out[13]:
'6998.95'
In [14]:
# We can also use -1 to read the last element
emp_list[-1]
Out[14]:
'6998.95'
In [15]:
# We need to change the data type to float or decimal for a correct comparison.
float(emp_list[-1])
Out[15]:
6998.95
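The comment in the cell above mentions decimal as an alternative to float. Here is a minimal sketch of that option, assuming exact decimal values are preferred (for example for currency amounts). The name `salary_text` is introduced here only for illustration.

```python
from decimal import Decimal

# Hypothetical variable holding the salary field as a string
salary_text = '6998.95'

# float gives a binary floating point value
as_float = float(salary_text)

# Decimal keeps the digits exactly as written in the string
as_decimal = Decimal(salary_text)

print(type(as_float), as_float)
print(type(as_decimal), as_decimal)

# Decimal values compare numerically, just like floats
print(Decimal('3967.35') < as_decimal)
```

Either type works as a sort key. The rest of this section uses float.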
In [16]:
# Complete logic
float(emp.split(',')[-1])
Out[16]:
6998.95
In [17]:
# We can pass the comparison logic as the key argument of sorted
# The output is sorted in ascending order by salary.
sorted(employees, key=lambda emp: float(emp.split(',')[-1]))
Out[17]:
['3,ecraft2@dell.com,3967.35', '1,ktrett0@independent.co.uk,6998.95', '2,khaddock1@deviantart.com,10572.4', '7,araulston6@list-manage.com,11550.75', '5,graigatt4@github.io,11660.67', '10,bbuye9@vkontakte.ru,12193.2', '9,grobardley8@unesco.org,14141.25', '8,mcobb7@mozilla.com,17016.15', '4,drussam3@t-online.de,17672.44', '6,bjaxon5@salon.com,18614.93']
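The lambda passed as `key` can also be written as a named function, which may be easier to read and reuse. The sketch below assumes a helper called `salary_of` (a hypothetical name, not part of the original notebook) and uses three of the records for brevity.

```python
def salary_of(record):
    """Return the last comma-separated field of a record as a float."""
    return float(record.split(',')[-1])

# A small sample taken from the employees list
sample = [
    '1,ktrett0@independent.co.uk,6998.95',
    '2,khaddock1@deviantart.com,10572.4',
    '3,ecraft2@dell.com,3967.35',
]

# sorted calls salary_of once per record and orders by the returned values
by_salary = sorted(sample, key=salary_of)

for record in by_salary:
    print(record)
```

The function is passed without parentheses, because `sorted` needs the function itself rather than its result.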
In [18]:
# You can reverse the order by using the reverse keyword argument
# reverse is applied to the values returned by the key function
sorted(employees, key=lambda emp: float(emp.split(',')[-1]), reverse=True)
Out[18]:
['6,bjaxon5@salon.com,18614.93', '4,drussam3@t-online.de,17672.44', '8,mcobb7@mozilla.com,17016.15', '9,grobardley8@unesco.org,14141.25', '10,bbuye9@vkontakte.ru,12193.2', '5,graigatt4@github.io,11660.67', '7,araulston6@list-manage.com,11550.75', '2,khaddock1@deviantart.com,10572.4', '1,ktrett0@independent.co.uk,6998.95', '3,ecraft2@dell.com,3967.35']
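To illustrate that `reverse` acts on the values produced by `key`, the sketch below compares `reverse=True` with negating the numeric key. This is only an illustration on a three-record sample. Negation works here because the key is a number.

```python
sample = [
    '1,ktrett0@independent.co.uk,6998.95',
    '2,khaddock1@deviantart.com,10572.4',
    '3,ecraft2@dell.com,3967.35',
]

# Descending order using the reverse keyword argument
with_reverse = sorted(sample, key=lambda emp: float(emp.split(',')[-1]), reverse=True)

# Descending order by sorting on the negated salary instead
with_negation = sorted(sample, key=lambda emp: -float(emp.split(',')[-1]))

print(with_reverse)
print(with_reverse == with_negation)
```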
In [19]:
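# sort accepts the same key argument as sorted
# It sorts the list in place and does not return a new list
employees.sort(key=lambda emp: float(emp.split(',')[-1]))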
In [20]:
employees
Out[20]:
['3,ecraft2@dell.com,3967.35', '1,ktrett0@independent.co.uk,6998.95', '2,khaddock1@deviantart.com,10572.4', '7,araulston6@list-manage.com,11550.75', '5,graigatt4@github.io,11660.67', '10,bbuye9@vkontakte.ru,12193.2', '9,grobardley8@unesco.org,14141.25', '8,mcobb7@mozilla.com,17016.15', '4,drussam3@t-online.de,17672.44', '6,bjaxon5@salon.com,18614.93']
In [21]:
employees.sort(key=lambda emp: float(emp.split(',')[-1]), reverse=True)
In [22]:
employees
Out[22]:
['6,bjaxon5@salon.com,18614.93', '4,drussam3@t-online.de,17672.44', '8,mcobb7@mozilla.com,17016.15', '9,grobardley8@unesco.org,14141.25', '10,bbuye9@vkontakte.ru,12193.2', '5,graigatt4@github.io,11660.67', '7,araulston6@list-manage.com,11550.75', '2,khaddock1@deviantart.com,10572.4', '1,ktrett0@independent.co.uk,6998.95', '3,ecraft2@dell.com,3967.35']
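The cells above use both `sorted` and `sort`. The following sketch, on a three-record sample, shows the difference: `sorted` returns a new list and leaves the input unchanged, while `list.sort` reorders the list in place and returns `None`.

```python
sample = [
    '1,ktrett0@independent.co.uk,6998.95',
    '2,khaddock1@deviantart.com,10572.4',
    '3,ecraft2@dell.com,3967.35',
]

# sorted returns a new list; sample keeps its original order
ordered = sorted(sample, key=lambda emp: float(emp.split(',')[-1]))
print(sample[0])   # still the record with id 1
print(ordered[0])  # the record with the lowest salary (id 3)

# list.sort changes sample itself and returns None
result = sample.sort(key=lambda emp: float(emp.split(',')[-1]))
print(result)
print(sample == ordered)
```

Because `sort` returns `None`, assigning its result back to the list variable would discard the data.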
In [23]:
# Recreating the list in its original order
employees = [
'1,ktrett0@independent.co.uk,6998.95',
'2,khaddock1@deviantart.com,10572.4',
'3,ecraft2@dell.com,3967.35',
'4,drussam3@t-online.de,17672.44',
'5,graigatt4@github.io,11660.67',
'6,bjaxon5@salon.com,18614.93',
'7,araulston6@list-manage.com,11550.75',
'8,mcobb7@mozilla.com,17016.15',
'9,grobardley8@unesco.org,14141.25',
'10,bbuye9@vkontakte.ru,12193.2'
]
In [24]:
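# Without float, the salaries are compared as strings
# You can see the output. It is not in the correct order by salary.
sorted(employees, key=lambda emp: emp.split(',')[-1], reverse=True)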
Out[24]:
['1,ktrett0@independent.co.uk,6998.95', '3,ecraft2@dell.com,3967.35', '6,bjaxon5@salon.com,18614.93', '4,drussam3@t-online.de,17672.44', '8,mcobb7@mozilla.com,17016.15', '9,grobardley8@unesco.org,14141.25', '10,bbuye9@vkontakte.ru,12193.2', '5,graigatt4@github.io,11660.67', '7,araulston6@list-manage.com,11550.75', '2,khaddock1@deviantart.com,10572.4']
In [25]:
# Strings are compared character by character
# '6' is greater than '1', so the result is True
'6' > '18614'
Out[25]:
True
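The last two cells show why the conversion to float matters. As a small sketch, the same three salary strings can be sorted with and without `key=float` to compare the results. The list `salaries` is made up of values from the employee records.

```python
# Salary fields as they appear after split(','): all strings
salaries = ['6998.95', '18614.93', '3967.35']

# Default string comparison goes character by character,
# so '18614.93' comes first because '1' < '3' < '6'
as_strings = sorted(salaries)
print(as_strings)

# Converting each value with float compares them numerically
as_numbers = sorted(salaries, key=float)
print(as_numbers)
```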