Simple Python Tricks #9

For many, fall means back to school, so for this blog I thought I’d return to Simple Tricks in Python. Fall also means Halloween for many, so hopefully these tricks will be treats, even if they do involve some very basic Python.

In this post, I explore some of Python’s more interesting and useful built-in functions, such as enumerate, sorted, reversed, map, and filter.

Python has a variety of useful built-in functions that may be new to programmers who’ve only used C-family languages (including Java and JavaScript, which, despite their names, are two completely different languages).

Let’s start with the one I use most often:

enumerate

As a rule, in Python we want to avoid writing loops like this:

 words = ['apple', 'orange', 'grape', 'lemon', 'banana']
 
 for ix in range(len(words)):
     value = words[ix]
     print(f'{ix+1:2d}: {value}')
 print()



Here we need to iterate over a list of items. We don’t know the length, so we use the built-in len function to get the length of the words list and pass that length to the built-in range function to generate the sequence of indexes used to access individual members of the list (all on line #3). More to the point, in line #4 we use the loop index value, ix, to access each item in words.

This, in one form or another, is an extremely common pattern in many languages. Python’s built-in enumerate function offers a cleaner way to write these loops:

 words = ['apple', 'orange', 'grape', 'lemon', 'banana']
 
 for ix,value in enumerate(words):
     print(f'{ix+1:2d}: {value}')
 print()



When run, both examples print the same thing:

 1: apple
 2: orange
 3: grape
 4: lemon
 5: banana

The enumerate function takes an iterable and returns an enumerate object, which is itself an iterator. When used in a loop, the object yields each item from the iterable along with its index number (by default starting with zero). Effectively, the enumerate function adds index numbers to a list of items.

A general rule in Python is that, whenever possible, you iterate over list items directly rather than using an index value. That is, if we don’t need the index values, rather than generate them anyway, just iterate over the list items themselves:

 words = ['apple', 'orange', 'grape', 'lemon', 'banana']
 
 for value in words:
     print(f'value is: {value}')
 print()



If an index is required, use enumerate to provide it. And since I almost always want to number the lists I print, I almost always do need an index number.

Each item returned by enumerate is a two-element tuple where the first element is the index number, and the second is the item from the list. Here’s an interactive example that illustrates how an enumerate object works:

>>> e = enumerate([1,2,3])
>>> type(e)
    <class 'enumerate'>
>>> repr(e)
    '<enumerate object at 0x000002236A843840>'
>>>
>>> next(e)
    (0, 1)
>>> next(e)
    (1, 2)
>>> next(e)
    (2, 3)
>>> next(e)
    Traceback (most recent call last):
      File "<pyshell#18>", line 1, in <module>
        next(e)
    StopIteration

Note how each value returned by next is a tuple. When the enumerate object reaches the end of the list, it raises StopIteration. When the object is driving a for loop, Python catches this exception and ends the loop.
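To make that last point concrete, here is a rough sketch of what a for loop does behind the scenes, written as a while loop that calls next and handles StopIteration itself. The names numbered and collected are mine, for illustration only; the real loop machinery lives inside the interpreter.

```python
words = ['apple', 'orange', 'grape']

# Roughly what "for ix, value in enumerate(words):" does for us.
numbered = enumerate(words)
collected = []
while True:
    try:
        ix, value = next(numbered)   # each item is an (index, item) tuple
    except StopIteration:
        break                        # the iterator is exhausted, so stop
    collected.append(f'{ix+1:2d}: {value}')

print('\n'.join(collected))
```

In everyday code the plain for loop is the better choice; the long form is only useful for seeing the moving parts.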

Even more common is the pattern for iterating over two-dimensional arrays:

 tic_tac_toe = [[0,0,0], [0,0,0], [0,0,0]]
 
 for x in range(3):
     for y in range(3):
         value = tic_tac_toe[x][y]
         print(f'[{x+1},{y+1}] = {value}')
     print()
 print()



The Python approach is to iterate the items themselves, so a better approach is:

 tic_tac_toe = [[0,0,0], [0,0,0], [0,0,0]]
 
 for x,row in enumerate(tic_tac_toe):
     for y,value in enumerate(row):
         print(f'[{x+1},{y+1}] = {value}')
     print()
 print()



When run, both examples print the same thing:

[1,1] = 0
[1,2] = 0
[1,3] = 0

[2,1] = 0
[2,2] = 0
[2,3] = 0

[3,1] = 0
[3,2] = 0
[3,3] = 0

Note how we’ve been adding one to the index so that the numbering starts at 1 rather than 0. This also means the last index number equals the number of items in the list rather than one less. I tend to do this for most listed output so that the nth item has a matching index number. As it turns out, enumerate gives us a way to do that without that bit of math:

 words = ['apple', 'orange', 'grape', 'lemon', 'banana']
 
 for ix,value in enumerate(words, start=1):
     print(f'{ix:2d}: {value}')
 print()



Which prints the same thing as the first two examples. The tic-tac-toe example could also be updated this way (an exercise for the reader).
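For readers who would rather skip the exercise, one possible version is sketched here. It gathers the output in a list called lines (my own addition, not part of the original fragments) so the result is easy to inspect before printing.

```python
tic_tac_toe = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

lines = []
# start=1 on both loops removes the need for x+1 and y+1.
for x, row in enumerate(tic_tac_toe, start=1):
    for y, value in enumerate(row, start=1):
        lines.append(f'[{x},{y}] = {value}')

print('\n'.join(lines))
```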

sorted

The sorted function does just what its name implies: it returns a new, sorted list built from the items of an iterable, leaving the original untouched. Note that the name is past tense: sorted. This is in contrast to the present tense of the enumerate function.

Here’s a simple example:

 words = ['apple', 'orange', 'grape', 'lemon', 'banana']
 
 for value in sorted(words):
     print(value)
 print()



We can add the enumerate function if we want numbering:

 words = ['apple', 'orange', 'grape', 'lemon', 'banana']
 
 for ix,value in enumerate(sorted(words), start=1):
     print(f'{ix:2d}: {value}')
 print()



When run, this prints:

 1: apple
 2: banana
 3: grape
 4: lemon
 5: orange

The optional keyword-only reverse parameter makes sorted return a list in reverse (descending) order:

 words = ['Drew', 'Blair', 'Alex', 'Em', 'Chris', 'Fran']
 
 for value in sorted(words, reverse=True):
     print(value)
 print()



When run, this prints:

Fran
Em
Drew
Chris
Blair
Alex

And, of course, we can use the enumerate function if we want to number the list.

The other optional keyword-only parameter, key, allows controlling what sorted uses to determine list order. The expected value is a function that takes a single argument — the current list item — and returns a sort key. It’s common to use a lambda function here, but you can pass any callable.

One place this is helpful is in sorting multi-field records:

 records = [
     ('Gene' , 'R', 'Smith', 'grsmith@vmail.not', 87),
     ('Drew' , 'T', 'Smith', 'dtsmith@vmail.not', 120),
     ('Blair', 'D', 'Jones', 'jonsey@aol.not', 173),
     ('Blair', 'M', 'Green', 'bmgreen53@notacorp.not', 152),
     ('Alex' , 'G', 'Johnson', 'alex.g@mobile.not', 93),
     ('Em'   , 'A', 'Nother' , 'emmy@highermaths.not', 248),
     ('Chris', 'J', 'Green', 'cjgreen47@mickysoft.not', 137),
     ('Fran' , 'C', 'Xavier', 'profx@xmenhq.not', 203),
 ]
 fmt = '%-6s %s %-8s %-24s %3d'
 
 for value in sorted(records):
     print(fmt % (*value,))
 print()
 
 for value in sorted(records, key=lambda t:t[2]):
     print(fmt % (*value,))
 print()
 
 for value in sorted(records, key=lambda t:(t[2],t[0])):
     print(fmt % (*value,))
 print()



When run, this prints:

Alex   G Johnson  alex.g@mobile.not         93
Blair  D Jones    jonsey@aol.not           173
Blair  M Green    bmgreen53@notacorp.not   152
Chris  J Green    cjgreen47@mickysoft.not  137
Drew   T Smith    dtsmith@vmail.not        120
Em     A Nother   emmy@highermaths.not     248
Fran   C Xavier   profx@xmenhq.not         203
Gene   R Smith    grsmith@vmail.not         87

Blair  M Green    bmgreen53@notacorp.not   152
Chris  J Green    cjgreen47@mickysoft.not  137
Alex   G Johnson  alex.g@mobile.not         93
Blair  D Jones    jonsey@aol.not           173
Em     A Nother   emmy@highermaths.not     248
Gene   R Smith    grsmith@vmail.not         87
Drew   T Smith    dtsmith@vmail.not        120
Fran   C Xavier   profx@xmenhq.not         203

Blair  M Green    bmgreen53@notacorp.not   152
Chris  J Green    cjgreen47@mickysoft.not  137
Alex   G Johnson  alex.g@mobile.not         93
Blair  D Jones    jonsey@aol.not           173
Em     A Nother   emmy@highermaths.not     248
Drew   T Smith    dtsmith@vmail.not        120
Gene   R Smith    grsmith@vmail.not         87
Fran   C Xavier   profx@xmenhq.not         203

On the first one, sorted uses its default for tuples, which is to sort based on the member elements starting with the first and only using later ones to sub-sort matches. So, in this case, it sorts by the first name. Note how the two occurrences of Blair are ordered by the middle initial, the second field.

The second one uses the key parameter to pass a lambda function that returns the third element in the record — the last name. So, now the list is sorted by last name. But note the two Smith names aren’t sorted by first name. Given only the last name as the sort key, sorted uses the incoming list order for matches (it is a stable sort).

The third example passes a lambda function that returns a tuple consisting of the last name and then the first name. If we wanted to, we could add the middle name as the third element. Now the list is sorted by last name and then first name.
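As an aside, the standard library’s operator.itemgetter can stand in for these small lambda functions. The sketch below uses a trimmed-down copy of the records (three fields only, an assumption made to keep it short) and checks that itemgetter(2, 0) produces the same ordering as the lambda in the third example.

```python
from operator import itemgetter

records = [
    ('Gene', 'R', 'Smith'),
    ('Drew', 'T', 'Smith'),
    ('Blair', 'D', 'Jones'),
    ('Blair', 'M', 'Green'),
]

# itemgetter(2, 0) builds a callable that returns (t[2], t[0]),
# the same sort key as lambda t: (t[2], t[0]).
by_last_first = sorted(records, key=itemgetter(2, 0))

for rcd in by_last_first:
    print(rcd)
```

Whether this reads better than the lambda is a matter of taste; both give sorted the same key.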

Here’s another example that uses the key parameter to provide a function that sorts strings based on segments within the string:

 names = """
 1: Gene   R Smith    grsmith@vmail.not         87
 2: Drew   T Smith    dtsmith@vmail.not        120
 3: Blair  D Jones    jonsey@aol.not           173
 4: Blair  M Green    bmgreen53@notacorp.not   152
 5: Alex   G Johnson  alex.g@mobile.not         93
 6: Em     A Nother   emmy@highermaths.not     248
 7: Chris  J Green    cjgreen47@mickysoft.not  137
 8: Fran   C Xavier   profx@xmenhq.not         203
 """
 namelist = names.splitlines()
 
 def sortfunc(line):
     field1 = line[12:21].rstrip()
     field2 = line[3:10].rstrip()
     return f'{field1} {field2}'
 
 for name in sorted(namelist, key=sortfunc):
     print(name)
 print()



When run, this prints:

4: Blair  M Green    bmgreen53@notacorp.not   152
7: Chris  J Green    cjgreen47@mickysoft.not  137
5: Alex   G Johnson  alex.g@mobile.not         93
3: Blair  D Jones    jonsey@aol.not           173
6: Em     A Nother   emmy@highermaths.not     248
2: Drew   T Smith    dtsmith@vmail.not        120
1: Gene   R Smith    grsmith@vmail.not         87
8: Fran   C Xavier   profx@xmenhq.not         203

This example uses a separate sort key function (sortfunc) that extracts the last name and first name from fixed column positions in the string and provides them as the sort key.

Lastly, you can make instances of your own classes sortable with the sorted function. It’s a topic I’ll explore in detail in future posts. Suffice for now to say that using sorted on class instances requires that the class define the __lt__ method (the less-than method). See Sorting for more details.
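Until then, here is a minimal sketch of the idea. The Player class and its fields are hypothetical, invented only for this illustration; the one part that matters to sorted is the __lt__ method.

```python
class Player:
    """Hypothetical record class used only for this illustration."""

    def __init__(self, name, score):
        self.name = name
        self.score = score

    def __lt__(self, other):
        # sorted() only needs "less than" to put instances in order.
        if not isinstance(other, Player):
            return NotImplemented
        return self.score < other.score

    def __repr__(self):
        return f'Player({self.name!r}, {self.score})'


players = [Player('Drew', 120), Player('Gene', 87), Player('Em', 248)]
ranked = sorted(players)
print(ranked)
```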

reversed

The reversed function, as its name implies, returns the items of a sequence in reverse order. (Again, the name is a past tense description of what the function returns.) Unlike sorted, it needs a sequence (or an object that defines __reversed__) rather than just any iterable. Note that, in contrast to setting reverse=True in the sorted function, this function only reverses the existing order; it doesn’t sort the items.

Two simple examples suffice to demonstrate the reversed function:

 basechr = ord('A')
 letters = [chr(basechr+cx) for cx in range(12)]
 
 rletters = list(reversed(letters))
 
 print(letters)
 print(rletters)
 print()
 
 rhello = list(reversed('Hello, World!'))
 
 print(rhello)
 print(''.join(rhello))
 print()



When run, this prints:

['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L']
['L', 'K', 'J', 'I', 'H', 'G', 'F', 'E', 'D', 'C', 'B', 'A']

['!', 'd', 'l', 'r', 'o', 'W', ' ', ',', 'o', 'l', 'l', 'e', 'H']
!dlroW ,olleH

The first example (lines #1 to #8) creates a list of capital letters and reverses that list (line #4). Note that the reversed function (like most of the other functions here; sorted is the exception) does not return a list. It returns a reverse iterator. If you want a list, you must pass this to something that iterates over it (for example, list, tuple, a for loop, or a list comprehension).

The second example (lines #10 to #14) reverses the string “Hello, World!”. Note that reversed treats the string as a sequence of characters (as do nearly all iteration functions). To get back a string, we must use the str.join method on the list of characters.
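For strings in particular, a slice with a step of -1 is a common shortcut that gives the same result in one step. This is an alternative to reversed, not something the examples above use; the variable names are mine.

```python
text = 'Hello, World!'

via_reversed = ''.join(reversed(text))  # iterate backwards, then rejoin
via_slice = text[::-1]                  # slice with a step of -1

print(via_reversed)
print(via_slice)
```

The slice builds a new string (or list) immediately, whereas reversed hands back an iterator that produces items only as they are requested.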

map

The map function is defined:

map(function, iterable, ...)

With a single iterable, the map function is essentially the same as:

[function(x) for x in iterable]

The key difference being, similar to other functions here, that map returns a map object whereas the list comprehension returns a list object. Here’s a simple example:

 records = [
     ('Gene' , 'R', 'Smith', 'grsmith@vmail.not', 87),
     ('Drew' , 'T', 'Smith', 'dtsmith@vmail.not', 120),
     ('Blair', 'D', 'Jones', 'jonsey@aol.not', 173),
     ('Blair', 'M', 'Green', 'bmgreen53@notacorp.not', 152),
     ('Alex' , 'G', 'Johnson', 'alex.g@mobile.not', 93),
     ('Em'   , 'A', 'Nother' , 'emmy@highermaths.not', 248),
     ('Chris', 'J', 'Green', 'cjgreen47@mickysoft.not', 137),
     ('Fran' , 'C', 'Xavier', 'profx@xmenhq.not', 203),
 ]
 fmt = '%-6s %s %-8s %-24s %3d'
 
 def fix_middle_initial(r):
     mi = f'{r[1]}.'
     return (r[0], mi, *r[2:])
 
 records2 = map(fix_middle_initial, records)
 
 for rcd in records2:
     print(fmt % (*rcd,))
 print()



If you provide more than one iterable, the map function is essentially the same as:

[function(x1, x2, ...) for x1, x2, ... in zip(iterable1, iterable2, ...)]

The function receives one argument from each iterable. As with the zip function, if the iterables have different lengths, the iteration ends at the shortest length.
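A small sketch of the multi-iterable form, using made-up price and quantity lists (the data and the line_total helper are assumptions for illustration). The second list is deliberately one item longer to show the iteration stopping at the shorter one.

```python
prices = [2.50, 4.00, 1.25]
quantities = [4, 1, 8, 100]     # one extra item, which map never reaches

def line_total(price, qty):
    return price * qty

# map pulls one argument from each iterable per call.
totals = list(map(line_total, prices, quantities))

# The equivalent comprehension built on zip.
same_totals = [line_total(p, q) for p, q in zip(prices, quantities)]

print(totals)
```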

It’s a handy function, but I find I generally use a list comprehension instead. For one thing, map requires a function whereas a list comprehension just needs an expression. With map you need either a separately defined function or a lambda expression in the map statement. I find list comprehensions ultimately cleaner, but that’s just my preference.

filter

The filter function looks a bit like the map function but works differently. Instead of allowing manipulation of the list items, it uses the user-provided function to filter the list. If the function returns False (or any other false value), the item is excluded from the output.

The filter function is defined:

filter(function, iterable)

Note that it takes only one iterable. The function is essentially the same as:

[item for item in iterable if function(item)]

With the same caveats about filter not returning a list but a filter object that must be iterated over.
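One detail worth a quick sketch: per the Python documentation, passing None in place of the function makes filter keep only the items that are themselves true. The values list here is made up for illustration.

```python
values = [0, 1, '', 'text', None, [], [0]]

# With None as the function, filter tests each item's own truth value.
truthy = list(filter(None, values))

# The comprehension that does the same job.
same = [item for item in values if item]

print(truthy)
```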

Here’s a simple example where we want to filter for records with a hit count higher than 150:

 records = [
     ('Gene' , 'R', 'Smith', 'grsmith@vmail.not', 87),
     ('Drew' , 'T', 'Smith', 'dtsmith@vmail.not', 120),
     ('Blair', 'D', 'Jones', 'jonsey@aol.not', 173),
     ('Blair', 'M', 'Green', 'bmgreen53@notacorp.not', 152),
     ('Alex' , 'G', 'Johnson', 'alex.g@mobile.not', 93),
     ('Em'   , 'A', 'Nother' , 'emmy@highermaths.not', 248),
     ('Chris', 'J', 'Green', 'cjgreen47@mickysoft.not', 137),
     ('Fran' , 'C', 'Xavier', 'profx@xmenhq.not', 203),
 ]
 fmt = '%-6s %s %-8s %-24s %3d'
 
 def exclude_low_values(r):
     return (150 < r[4])
 
 records2 = filter(exclude_low_values, records)
 
 for rcd in records2:
     print(fmt % (*rcd,))
 print()



When run, this prints:

Blair  D Jones    jonsey@aol.not           173
Blair  M Green    bmgreen53@notacorp.not   152
Em     A Nother   emmy@highermaths.not     248
Fran   C Xavier   profx@xmenhq.not         203

Again, a handy function, but not one I use often for the same reasons I mentioned above about the map function.
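To illustrate that preference, here is a sketch that does the same job twice: once by chaining filter and map with lambda functions, and once with a single list comprehension. The shortened records (four fields instead of five) are an assumption made to keep the example compact.

```python
records = [
    ('Gene', 'R', 'Smith', 87),
    ('Blair', 'D', 'Jones', 173),
    ('Em', 'A', 'Nother', 248),
]

# filter picks the records, then map transforms them ...
chained = list(map(lambda r: r[2].upper(),
                   filter(lambda r: r[3] > 150, records)))

# ... versus one comprehension that selects and transforms together.
comprehended = [r[2].upper() for r in records if r[3] > 150]

print(comprehended)
```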

I’ll end with the zip function, which combines lists:

 a = [1, 2, 3, 4, 5, 6]
 b = ['A', 'B', 'C', 'D', 'E']
 c = ['z', 'y', 'x', 'w', 'v', 'u']
 
 for t in zip(a,b,c):
     print(t)
 print()



When run, this prints:

(1, 'A', 'z')
(2, 'B', 'y')
(3, 'C', 'x')
(4, 'D', 'w')
(5, 'E', 'v')

Note that the zip function’s default is to silently ignore parts of any iterable longer than the shortest iterable. In the example above, the list of capital letters has a length of five, while the other two lists have a length of six. Because of the shorter list, the output only has five items. To detect when lists don’t match, use the strict parameter (added in Python 3.10):

 a = [1, 2, 3, 4, 5, 6]
 b = ['A', 'B', 'C', 'D', 'E']
 c = ['z', 'y', 'x', 'w', 'v', 'u']
 
 for t in zip(a,b,c, strict=True):
     print(t)
 print()



Now when run, this prints:

(1, 'A', 'z')
(2, 'B', 'y')
(3, 'C', 'x')
(4, 'D', 'w')
(5, 'E', 'v')
Traceback (most recent call last):
  File "C:\CJS\prj\Python\blog\hcc\source\fragment.py", line 5, in <module>
    for t in zip(a,b,c, strict=True):
ValueError: zip() argument 2 is shorter than argument 1
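If neither truncating nor raising an exception is what you want, the standard library offers a third option: itertools.zip_longest pads the shorter iterables instead. A brief sketch, where the '?' fill value is an arbitrary choice for illustration:

```python
from itertools import zip_longest

a = [1, 2, 3, 4, 5, 6]
b = ['A', 'B', 'C', 'D', 'E']

# Missing items from the shorter list are replaced by fillvalue.
padded = list(zip_longest(a, b, fillvalue='?'))

print(padded[-1])
```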

As with most of these functions, zip returns a zip object you must iterate over. (Essentially, these functions return lazy iterators, which produce items on demand and so save memory; sorted, which returns a list, is the exception.)
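One consequence worth seeing once: these objects can only be iterated a single time. In this sketch (variable names are mine), the second pass over the same zip object comes back empty.

```python
pairs = zip([1, 2, 3], 'ABC')

first_pass = list(pairs)    # consumes the zip object
second_pass = list(pairs)   # nothing left to yield

print(first_pass)
print(second_pass)
```

If the items are needed more than once, convert the object to a list first and reuse the list.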

The zip function has a couple of cute tricks up its sleeve. For instance, you can use it to unzip a zipped list:

 a = [1, 2, 3]
 b = 'ABC'
 c = 'zyx'
 
 z = zip(a,b,c)
 
 for t in zip(*z):
     print(t)
 print()



When run, this prints:

(1, 2, 3)
('A', 'B', 'C')
('z', 'y', 'x')
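In practice, the unzipped result is often unpacked straight into separate names. A small sketch, assuming we start from an already-zipped list of tuples (the names nums, uppers, and lowers are hypothetical):

```python
pairs = [(1, 'A', 'z'), (2, 'B', 'y'), (3, 'C', 'x')]

# The * unpacks the list so each tuple becomes a separate argument to zip,
# which then regroups the first items, the second items, and so on.
nums, uppers, lowers = zip(*pairs)

print(nums)
print(uppers)
print(lowers)
```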

The other trick lets you create a list with duplicated items:

 nums = [1, 2, 3, 4, 5, 6]
 reps = 4
 
 for t in zip(*[nums]*reps, strict=True):
     print(t)
 print()



When run, this prints:

(1, 1, 1, 1)
(2, 2, 2, 2)
(3, 3, 3, 3)
(4, 4, 4, 4)
(5, 5, 5, 5)
(6, 6, 6, 6)

Which I suppose is handy if you want something like that.

If you weren’t familiar with some of these, I hope you’ll find them helpful and add them to your Python toolkit.

Link: Zip file containing all code fragments used in this post.

∅

3 thoughts on “Simple Python Tricks #9”

Wyrd Smythe said:

October 14, 2024 at 9:38 am

ATTENTION: The WordPress Reader strips the style information from posts, which can destroy certain important formatting elements. If you’re reading this in the Reader, I highly recommend (and urge) you to [A] stop using the Reader and [B] always read blog posts on their website.

This post is: Simple Python Tricks #9

Wyrd Smythe said:

October 14, 2024 at 9:42 am
Test your Python knowledge. Explain the output from:

 t3a = [[0,0,0], [0,0,0], [0,0,0]]
 t3b = [[0]*3, [0]*3, [0]*3]
 t3c = [[0]*3]*3
 t3d = [[0 for y in range(3)] for x in range(3)]
 
 print(f't3a = {t3a}')
 print(f't3b = {t3b}')
 print(f't3c = {t3c}')
 print(f't3d = {t3d}')
 print()
 
 t3a[1][1] = 1
 t3b[1][1] = 1
 t3c[1][1] = 1
 t3d[1][1] = 1
 
 print(f't3a = {t3a}')
 print(f't3b = {t3b}')
 print(f't3c = {t3c}')
 print(f't3d = {t3d}')
 print()

Which is:
```
t3a = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
t3b = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
t3c = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
t3d = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

t3a = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
t3b = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
t3c = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
t3d = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
```
Note the second output of t3c. What happened there?
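One way to investigate is to compare the rows with the is operator. The sketch below uses two hypothetical names, shared and separate, that mirror t3c and t3d: multiplying the outer list repeats a reference to the same inner list, whereas the comprehension builds a new inner list on each pass.

```python
shared = [[0] * 3] * 3                    # three references to ONE inner list
separate = [[0] * 3 for _ in range(3)]    # three distinct inner lists

shared[1][1] = 1
separate[1][1] = 1

print(shared)
print(separate)
print(shared[0] is shared[1])       # same object, so every "row" changed
print(separate[0] is separate[1])   # different objects
```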

Pingback: Simple Python Tricks #10 | The Hard-Core Coder

The Hard-Core Coder

~ I can't stop writing code!
