Sorting Data in lists and sets
Let us go through the details about sorting data in lists and sets.
- We can use the `sorted` function to sort any collection: list, set, dict or tuple. For now, let's focus on list and set. `sorted` always returns a new list. We typically assign it to a new variable to process the newly created sorted list further.
- On a `list`, we can also invoke the `sort` method. While `sorted` creates a new list, `sort` updates the existing list. The sorting done by `sort` is also known as in-place sorting.
- While `sorted` returns a new list, `sort` on a list returns `None`.
- Both `sorted` and `sort` take the same keyword arguments, `key` and `reverse`.
- We can use `reverse=True` to sort in reverse order.
- We can also sort the data based on comparison logic passed using the `key` argument.
- We use `sorted` more often than `list.sort` for the following reasons.
  - `sorted` can be used on all types of collections: list, set, dict, tuple or any other iterable.
  - The original collection is not modified.
  - We can pass the result of `sorted` directly to other functions as part of chained calls. For example, `set(sorted(l))` builds a set from the sorted list, although the set itself does not preserve the sorted order; to get unique values in sorted order, use `sorted(set(l))`. Chaining is not possible with `list.sort` because it returns `None`.
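As a quick illustration of the points above, the following sketch applies `sorted` to a tuple, a set and a dict, and contrasts it with `list.sort`. The variable names (`t`, `s`, `d`, `l` and the result names) are chosen only for this example.

```python
# sorted accepts any iterable and always returns a new list
t = (3, 1, 2)
s = {3, 1, 2}
d = {'b': 2, 'a': 1}

sorted_t = sorted(t)   # list built from the tuple
sorted_s = sorted(s)   # list built from the set
sorted_d = sorted(d)   # iterating over a dict yields its keys

# list.sort works in place and returns None
l = [3, 1, 2, 1]
result = l.sort()

# Deduplicate first, then sort, to get unique values in sorted order
unique_sorted = sorted(set(l))

print(sorted_t, sorted_s, sorted_d)
print(result, l)
print(unique_sorted)
```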
In [1]:
# Run this to see the syntax of sorted
sorted?
Signature: sorted(iterable, /, *, key=None, reverse=False)
Docstring:
Return a new list containing all items from the iterable in ascending order.

A custom key function can be supplied to customize the sort order, and the
reverse flag can be set to request the result in descending order.
Type: builtin_function_or_method
In [2]:
# Run this to see the syntax of sort function
list.sort?
Signature: list.sort(self, /, *, key=None, reverse=False)
Docstring:
Sort the list in ascending order and return None.

The sort is in-place (i.e. the list itself is modified) and stable (i.e. the
order of two equal elements is maintained).

If a key function is given, apply it once to each list item and sort them,
ascending or descending, according to their function values.

The reverse flag can be set to sort in descending order.
Type: method_descriptor
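The docstring above says the sort is stable. A minimal sketch of what that means is shown below; the `records` data is made up for illustration and is not part of the original material.

```python
# Each record is (name, department); the values are hypothetical
records = [('asha', 'hr'), ('bob', 'it'), ('carl', 'hr'), ('dina', 'it')]

# Sort by department only. Records with the same department
# keep their original relative order (asha before carl, bob before dina).
by_dept = sorted(records, key=lambda rec: rec[1])

# Stability also holds with reverse=True: departments are reversed,
# but ties still keep their original relative order.
by_dept_desc = sorted(records, key=lambda rec: rec[1], reverse=True)

print(by_dept)
print(by_dept_desc)
```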
- Sorting a simple list using `sorted`
In [3]:
l = [1, 3, 2, 6, 4]
In [4]:
sorted(l)
Out[4]:
[1, 2, 3, 4, 6]
In [5]:
type(sorted(l))
Out[5]:
list
In [6]:
# l did not change
l
Out[6]:
[1, 3, 2, 6, 4]
In [7]:
# Typical usage for further processing
l_sorted = sorted(l)
In [8]:
type(l_sorted)
Out[8]:
list
In [9]:
l_sorted
Out[9]:
[1, 2, 3, 4, 6]
In [10]:
sorted(l, reverse=True)
Out[10]:
[6, 4, 3, 2, 1]
- Sorting a simple list using `sort`
In [11]:
l = [1, 3, 2, 6, 4]
In [12]:
l
Out[12]:
[1, 3, 2, 6, 4]
In [13]:
# We typically don't assign to another variable.
l.sort()
In [14]:
type(l.sort())
Out[14]:
NoneType
In [15]:
l
Out[15]:
[1, 2, 3, 4, 6]
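Because `sort` returns `None`, assigning its result to a variable is a common mistake. The sketch below shows the pitfall next to the `sorted` alternative; the names `wrong`, `original` and `right` are used only for this example.

```python
l = [1, 3, 2, 6, 4]

# Pitfall: list.sort sorts l in place and returns None
wrong = l.sort()

# Use sorted when the sorted data should go into a new variable
original = [1, 3, 2, 6, 4]
right = sorted(original)

print(wrong)     # the return value of sort, not the sorted list
print(l)         # l itself is now sorted
print(original)  # untouched by sorted
print(right)
```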
- Sorting a set. `sort` is not available on a set, so we use `sorted`.
In [16]:
s = {1, 4, 2}
In [17]:
s
Out[17]:
{1, 2, 4}
In [18]:
# This will fail as sort is available only on top of list but not set
s.sort()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Input In [18], in <cell line: 2>()
      1 # This will fail as sort is available only on top of list but not set
----> 2 s.sort()

AttributeError: 'set' object has no attribute 'sort'
In [19]:
sorted(s)
Out[19]:
[1, 2, 4]
In [20]:
sorted(s, reverse=True)
Out[20]:
[4, 2, 1]
In [21]:
type(sorted(s))
Out[21]:
list
- Reverse sorting of a list or a set using `sorted`. A similar process can be followed for `list.sort` as well.
In [22]:
l = [1, 3, 2, 6, 4]
In [23]:
sorted(l, reverse=True)
Out[23]:
[6, 4, 3, 2, 1]
In [24]:
l.sort(reverse=True)
In [25]:
l
Out[25]:
[6, 4, 3, 2, 1]
In [26]:
employees = [
    '1,ktrett0@independent.co.uk,6998.95',
    '2,khaddock1@deviantart.com,10572.4',
    '3,ecraft2@dell.com,3967.35',
    '4,drussam3@t-online.de,17672.44',
    '5,graigatt4@github.io,11660.67',
    '6,bjaxon5@salon.com,18614.93',
    '7,araulston6@list-manage.com,11550.75',
    '8,mcobb7@mozilla.com,17016.15',
    '9,grobardley8@unesco.org,14141.25',
    '10,bbuye9@vkontakte.ru,12193.2'
]
- We need to sort the data by comparing salaries between employees.
- We can define custom comparison logic using the `key` argument.
- Each element or record in the list is comma separated.
- We need to extract the salary as a float for correct comparison.
- Here is how we can extract the salary.
In [27]:
# Reading first element
employees[0]
Out[27]:
'1,ktrett0@independent.co.uk,6998.95'
In [28]:
emp = employees[0]
In [29]:
type(emp)
Out[29]:
str
In [30]:
# We can use split with ',' as delimiter.
# It will create a list of strings.
# The list contains 3 elements - id, email and salary
# All 3 will be of type string
emp.split(',')
Out[30]:
['1', 'ktrett0@independent.co.uk', '6998.95']
In [31]:
emp_list = emp.split(',')
In [32]:
type(emp_list)
Out[32]:
list
In [33]:
for e in emp_list:
    print(f'Data type of {e} is {type(e)}')
Data type of 1 is <class 'str'>
Data type of ktrett0@independent.co.uk is <class 'str'>
Data type of 6998.95 is <class 'str'>
In [34]:
# Getting salary
emp_list[2]
Out[34]:
'6998.95'
In [35]:
# We can also use -1 to read the last element
emp_list[-1]
Out[35]:
'6998.95'
In [36]:
# We need to change the data type to float or decimal for correct numeric comparison.
float(emp_list[-1])
Out[36]:
6998.95
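To see why the conversion matters, the sketch below compares two of the salaries first as strings and then as numbers. `Decimal` comes from the standard library `decimal` module; the variable names are chosen only for this example.

```python
from decimal import Decimal

low = '6998.95'
high = '10572.4'

# As strings, the comparison is character by character: '1' sorts before '6'
string_says_high_is_smaller = high < low

# As numbers, the comparison is by value
float_says_high_is_smaller = float(high) < float(low)
decimal_says_high_is_smaller = Decimal(high) < Decimal(low)

print(string_says_high_is_smaller)
print(float_says_high_is_smaller)
print(decimal_says_high_is_smaller)
```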
In [37]:
# Complete logic
float(emp.split(',')[-1])
Out[37]:
6998.95
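The same logic can also be wrapped in a named function, which may read better than an inline expression when it is reused. The function name `get_salary` is a hypothetical helper introduced here, not something defined in the original material.

```python
def get_salary(record):
    """Return the salary (the last comma-separated field) as a float."""
    return float(record.split(',')[-1])


emp = '1,ktrett0@independent.co.uk,6998.95'
print(get_salary(emp))
```

A function like this can be passed directly as `key=get_salary` wherever a key function is expected.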
In [38]:
# We can pass the comparison logic as the key argument of sorted
# The output is sorted in ascending order by salary.
sorted(employees, key=lambda emp: float(emp.split(',')[-1]))
Out[38]:
['3,ecraft2@dell.com,3967.35',
 '1,ktrett0@independent.co.uk,6998.95',
 '2,khaddock1@deviantart.com,10572.4',
 '7,araulston6@list-manage.com,11550.75',
 '5,graigatt4@github.io,11660.67',
 '10,bbuye9@vkontakte.ru,12193.2',
 '9,grobardley8@unesco.org,14141.25',
 '8,mcobb7@mozilla.com,17016.15',
 '4,drussam3@t-online.de,17672.44',
 '6,bjaxon5@salon.com,18614.93']
In [39]:
# You can reverse the order by using the reverse keyword argument
# reverse is applied to the values returned by the key function
sorted(employees, key=lambda emp: float(emp.split(',')[-1]), reverse=True)
Out[39]:
['6,bjaxon5@salon.com,18614.93',
 '4,drussam3@t-online.de,17672.44',
 '8,mcobb7@mozilla.com,17016.15',
 '9,grobardley8@unesco.org,14141.25',
 '10,bbuye9@vkontakte.ru,12193.2',
 '5,graigatt4@github.io,11660.67',
 '7,araulston6@list-manage.com,11550.75',
 '2,khaddock1@deviantart.com,10572.4',
 '1,ktrett0@independent.co.uk,6998.95',
 '3,ecraft2@dell.com,3967.35']
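Since `list.sort` takes the same `key` and `reverse` arguments, the same sort can be done in place. The sketch below uses a three-record subset of the data, named `employees_sample` here only for illustration, and sorts a copy so the original order stays available.

```python
employees_sample = [
    '1,ktrett0@independent.co.uk,6998.95',
    '2,khaddock1@deviantart.com,10572.4',
    '3,ecraft2@dell.com,3967.35',
]

# Copy first so that employees_sample keeps its original order
in_place = list(employees_sample)

# Same key and reverse arguments as sorted, but the list is modified in place
in_place.sort(key=lambda emp: float(emp.split(',')[-1]), reverse=True)

print(in_place)
```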