For many, fall means back to school, so for this blog I thought I’d return to Simple Tricks in Python. Fall also means Halloween for many, so hopefully these tricks will be treats, even if they do involve some very basic Python.
In this post, I explore some of Python’s more interesting and useful built-in functions, such as enumerate, sorted, reversed, map, and filter.
Python has a variety of useful built-in functions that may be new to programmers who’ve only used C-family languages (including Java and JavaScript, which, despite their names, are two completely different languages).
Let’s start with the one I use most often:
enumerate
As a rule, in Python we want to avoid writing loops like this:
002|
003| for ix in range(len(words)):
004|     value = words[ix]
005|     print(f'{ix+1:2d}: {value}')
006| print()
007|
Here we need to iterate over a list of items. We don’t know the length, so we use the built-in len function to get the length of the words list and pass that length to the built-in range function to generate the indexes used to access individual members of the list (all on line #3). More to the point, in line #4 we use the loop index value, ix, to access each item in words.
This, in one form or another, is an extremely common pattern in many languages. Python’s built-in enumerate function offers a cleaner way to write these loops:
002|
003| for ix,value in enumerate(words):
004|     print(f'{ix+1:2d}: {value}')
005| print()
006|
When run, both examples print the same thing:
 1: apple
 2: orange
 3: grape
 4: lemon
 5: banana
The enumerate function takes an iterable and returns an enumerate iterable object. When used in a loop, the object returns the items from the iterable along with their index number (by default starting with zero). Effectively, the enumerate function adds index numbers to a list of items.
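To make that concrete, here is a rough pure-Python sketch of what enumerate does. The name my_enumerate is hypothetical, invented for this illustration; the real built-in is not written this way, but it behaves much like this generator. The start parameter mirrors the optional argument the built-in accepts.

```python
def my_enumerate(iterable, start=0):
    """Yield (index, item) pairs, roughly like the built-in enumerate."""
    ix = start
    for item in iterable:
        yield ix, item
        ix += 1

words = ['apple', 'orange', 'grape']   # sample data for this sketch

# Both calls produce the same (index, item) pairs.
print(list(my_enumerate(words)))
print(list(enumerate(words)))
```

The sketch is only there to show the idea of pairing each item with a running count; in real code, just use enumerate.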
A general rule in Python is that whenever possible, iterate list items directly rather than using an index value. That is, if we don’t need the index values, rather than generate them anyway, just iterate the list items themselves:
002|
003| for value in words:
004|     print(f'value is: {value}')
005| print()
006|
If an index is required, use enumerate to provide it. And since I almost always want to number the lists I print, I almost always do need an index number.
Each item returned by enumerate is a two-element tuple where the first item is the index number, and the second is the item from the list. Here’s an interactive example that illustrates how an enumerate object works:
>>> e = enumerate([1,2,3])
>>> type(e)
<class 'enumerate'>
>>> repr(e)
'<enumerate object at 0x000002236A843840>'
>>>
>>> next(e)
(0, 1)
>>> next(e)
(1, 2)
>>> next(e)
(2, 3)
>>> next(e)
Traceback (most recent call last):
  File "<pyshell#18>", line 1, in <module>
    next(e)
StopIteration
Note how each value returned by next is a tuple. When the enumerate object reaches the end of the list, it raises StopIteration. If the code is in a for loop, Python catches this exception and ends the loop.
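As a rough illustration of that last point, here is approximately what a for loop over an enumerate object does behind the scenes. This is a simplified sketch, not how the interpreter is actually implemented, and the words list and the collected list are just sample names for the demonstration.

```python
words = ['apple', 'orange', 'grape']   # sample data for this sketch

# Roughly what "for ix, value in enumerate(words):" does for us.
it = iter(enumerate(words))
collected = []
while True:
    try:
        ix, value = next(it)
    except StopIteration:
        break                # the iterator is exhausted, so the loop ends
    collected.append((ix, value))
    print(ix, value)
```

The for statement hides the next calls and the exception handling, which is a big part of why it reads so cleanly.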
Even more common is the pattern for iterating over two-dimensional arrays:
002|
003| for x in range(3):
004|     for y in range(3):
005|         value = tic_tac_toe[x][y]
006|         print(f'[{x+1},{y+1}] = {value}')
007|     print()
008| print()
009|
The Python approach is to iterate the items themselves, so a better approach is:
002|
003| for x,row in enumerate(tic_tac_toe):
004|     for y,value in enumerate(row):
005|         print(f'[{x+1},{y+1}] = {value}')
006|     print()
007| print()
008|
When run, both examples print the same thing:
[1,1] = 0
[1,2] = 0
[1,3] = 0

[2,1] = 0
[2,2] = 0
[2,3] = 0

[3,1] = 0
[3,2] = 0
[3,3] = 0
Note how we’ve been adding a one to the index so that we’re numbering from 1 rather than 0. This also means the last index number is the number of items in the list rather than one less. I tend to do this for most listed output so that the nth item has a matching index number rather than one less. As it turns out, enumerate gives us a way to do that without that bit of math:
002|
003| for ix,value in enumerate(words, start=1):
004|     print(f'{ix:2d}: {value}')
005| print()
006|
Which prints the same thing as the first two examples. The tic-tac-toe example could also be updated this way (an exercise for the reader).
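For readers who want to check their work on that exercise, here is one possible version. It is only a sketch: the board definition is my assumption (a 3x3 grid of zeros, matching the output above), and I collect the lines in a list named lines purely so they are easy to inspect.

```python
# Assumed sample board for this sketch: a 3x3 grid of zeros.
tic_tac_toe = [[0, 0, 0] for _ in range(3)]

lines = []
for x, row in enumerate(tic_tac_toe, start=1):
    for y, value in enumerate(row, start=1):
        # No "+1" needed; both counters already start at 1.
        lines.append(f'[{x},{y}] = {value}')

print('\n'.join(lines))
```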
sorted
The sorted function does just what its name implies: it returns a new list containing the items of an iterable in sorted order. Note that the name is past tense: sorted. This is in contrast to the present tense of the enumerate function.
Here’s a simple example:
002|
003| for value in sorted(words):
004|     print(value)
005| print()
006|
We can add the enumerate function if we want numbering:
002|
003| for ix,value in enumerate(sorted(words), start=1):
004|     print(f'{ix:2d}: {value}')
005| print()
006|
When run, this prints:
 1: apple
 2: banana
 3: grape
 4: lemon
 5: orange
The optional keyword-only reverse parameter makes sorted return a list that is in reverse order:
002|
003| for value in sorted(words, reverse=True):
004|     print(value)
005| print()
006|
When run, this prints:
Fran
Em
Drew
Chris
Blair
Alex
And, of course, we can use the enumerate function if we want to number the list.
The other optional keyword-only parameter, key, allows controlling what sorted uses to determine list order. The expected value is a function that takes a single argument — the current list item — and returns a sort key. It’s common to use a lambda function here, but you can pass any callable function.
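A few quick examples may help before moving on to records. The word list below is sample data I made up for this sketch; the key functions used (str.lower and len) are standard Python.

```python
words = ['banana', 'Cherry', 'apple', 'fig']   # sample data for this sketch

by_default = sorted(words)                  # uppercase letters sort first
by_lower   = sorted(words, key=str.lower)   # case-insensitive order
by_length  = sorted(words, key=len)         # shortest word first

print(by_default)
print(by_lower)
print(by_length)
```

Note that the key function only decides the ordering. The returned list still contains the original items, not the keys.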
One place this is helpful is in sorting multi-field records:
001| records = [
002|     ('Gene',  'R', 'Smith',   'grsmith@vmail.not', 87),
003|     ('Drew',  'T', 'Smith',   'dtsmith@vmail.not', 120),
004|     ('Blair', 'D', 'Jones',   'jonsey@aol.not', 173),
005|     ('Blair', 'M', 'Green',   'bmgreen53@notacorp.not', 152),
006|     ('Alex',  'G', 'Johnson', 'alex.g@mobile.not', 93),
007|     ('Em',    'A', 'Nother',  'emmy@highermaths.not', 248),
008|     ('Chris', 'J', 'Green',   'cjgreen47@mickysoft.not', 137),
009|     ('Fran',  'C', 'Xavier',  'profx@xmenhq.not', 203),
010| ]
011| fmt = '%-6s %s %-8s %-24s %3d'
012|
013| for value in sorted(records):
014|     print(fmt % (*value,))
015| print()
016|
017| for value in sorted(records, key=lambda t:t[2]):
018|     print(fmt % (*value,))
019| print()
020|
021| for value in sorted(records, key=lambda t:(t[2],t[0])):
022|     print(fmt % (*value,))
023| print()
024|
When run, this prints:
Alex   G Johnson  alex.g@mobile.not         93
Blair  D Jones    jonsey@aol.not           173
Blair  M Green    bmgreen53@notacorp.not   152
Chris  J Green    cjgreen47@mickysoft.not  137
Drew   T Smith    dtsmith@vmail.not        120
Em     A Nother   emmy@highermaths.not     248
Fran   C Xavier   profx@xmenhq.not         203
Gene   R Smith    grsmith@vmail.not         87

Blair  M Green    bmgreen53@notacorp.not   152
Chris  J Green    cjgreen47@mickysoft.not  137
Alex   G Johnson  alex.g@mobile.not         93
Blair  D Jones    jonsey@aol.not           173
Em     A Nother   emmy@highermaths.not     248
Gene   R Smith    grsmith@vmail.not         87
Drew   T Smith    dtsmith@vmail.not        120
Fran   C Xavier   profx@xmenhq.not         203

Blair  M Green    bmgreen53@notacorp.not   152
Chris  J Green    cjgreen47@mickysoft.not  137
Alex   G Johnson  alex.g@mobile.not         93
Blair  D Jones    jonsey@aol.not           173
Em     A Nother   emmy@highermaths.not     248
Drew   T Smith    dtsmith@vmail.not        120
Gene   R Smith    grsmith@vmail.not         87
Fran   C Xavier   profx@xmenhq.not         203
In the first one, sorted uses its default for tuples, which is to sort based on the member elements starting with the first and only using later ones to sub-sort matches. So, in this case, it sorts by the first name. Note how the two occurrences of Blair are ordered by the middle initial, the second field.
The second one uses the key parameter to pass a lambda function that returns the third element in the record — the last name. So, now the list is sorted by last name. But note the two Smith names aren’t sorted by first name. Given only the last name as the sort key, sorted uses the incoming list order for matches (it is a stable sort).
The third example passes a lambda function that returns a tuple consisting of the last name and then the first name. If we wanted to, we could add the middle name as the third element. Now the list is sorted by last name and then first name.
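As an aside, the standard library’s operator.itemgetter can stand in for a lambda that just picks out tuple fields. Here is a small sketch comparing the two; the short people list is sample data for the illustration, not the full record set used above.

```python
from operator import itemgetter

# A few (first, middle, last) records; sample data for this sketch.
people = [
    ('Gene',  'R', 'Smith'),
    ('Drew',  'T', 'Smith'),
    ('Blair', 'M', 'Green'),
]

# Both keys return the tuple (last name, first name) for each record.
by_lambda = sorted(people, key=lambda t: (t[2], t[0]))
by_getter = sorted(people, key=itemgetter(2, 0))

print(by_getter)
```

Which one to use is mostly a matter of taste. The lambda is more flexible, while itemgetter states the intent compactly when the key is just a selection of fields.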
Here’s another example that uses the key parameter to provide a function that sorts strings based on segments within the string:
001| names = """\
001| 1: Gene   R Smith    grsmith@vmail.not         87
001| 2: Drew   T Smith    dtsmith@vmail.not        120
001| 3: Blair  D Jones    jonsey@aol.not           173
001| 4: Blair  M Green    bmgreen53@notacorp.not   152
001| 5: Alex   G Johnson  alex.g@mobile.not         93
001| 6: Em     A Nother   emmy@highermaths.not     248
001| 7: Chris  J Green    cjgreen47@mickysoft.not  137
001| 8: Fran   C Xavier   profx@xmenhq.not         203
001| """
002| namelist = names.splitlines()
003|
004| def sortfunc (line):
005|     field1 = line[12:21].rstrip()
006|     field2 = line[3:10].rstrip()
007|     return f'{field1} {field2}'
008|
009| for name in sorted(namelist, key=sortfunc):
010|     print(name)
011| print()
012|
When run, this prints:
4: Blair  M Green    bmgreen53@notacorp.not   152
7: Chris  J Green    cjgreen47@mickysoft.not  137
5: Alex   G Johnson  alex.g@mobile.not         93
3: Blair  D Jones    jonsey@aol.not           173
6: Em     A Nother   emmy@highermaths.not     248
2: Drew   T Smith    dtsmith@vmail.not        120
1: Gene   R Smith    grsmith@vmail.not         87
8: Fran   C Xavier   profx@xmenhq.not         203
This example uses a separate sort key function (lines #4 to #7) that extracts the last name and first name from the string and provides them as the sort key.
Lastly, you can make instances of your classes sortable through the sorted function. It’s a topic I’ll explore in detail in future posts. Suffice for now to say that using sorted on class instances requires that the class define the __lt__ method (the less-than method). See Sorting for more details.
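As a small preview, here is a minimal sketch of a class whose instances can be sorted. The Player class and its fields are hypothetical, made up for this illustration; the only part that matters for sorting is the __lt__ method.

```python
class Player:
    """Hypothetical class for this sketch; instances order by score."""

    def __init__(self, name, score):
        self.name = name
        self.score = score

    def __lt__(self, other):
        # Sorting only needs "less than" to put instances in order.
        return self.score < other.score

    def __repr__(self):
        return f'Player({self.name!r}, {self.score})'

players = [Player('Em', 248), Player('Gene', 87), Player('Drew', 120)]
ranked = sorted(players)
print(ranked)
```

Without __lt__, the call would fail with a TypeError because Python would have no way to compare two Player objects.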
reversed
The reversed function, as its name implies, returns the items of a sequence in reverse order. (Again, the name is a past tense description of what the function returns.) Note that, in contrast to setting reverse=True in the sorted function, this function only reverses the existing order; it doesn’t sort the items.
Two simple examples suffice to demonstrate the reversed function:
002| letters = [chr(basechr+cx) for cx in range(12)]
003|
004| rletters = list(reversed(letters))
005|
006| print(letters)
007| print(rletters)
008| print()
009|
010| rhello = list(reversed('Hello, World!'))
011|
012| print(rhello)
013| print(''.join(rhello))
014| print()
015|
When run, this prints:
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L']
['L', 'K', 'J', 'I', 'H', 'G', 'F', 'E', 'D', 'C', 'B', 'A']

['!', 'd', 'l', 'r', 'o', 'W', ' ', ',', 'o', 'l', 'l', 'e', 'H']
!dlroW ,olleH
The first example (lines #1 to #8) creates a list of capital letters and reverses that list (line #4). Note that the reversed function (like the other functions here, except sorted) does not return a list; it returns a reverse iterator object. If you want the items, you must pass this to something that iterates over it (for example, list, tuple, a for loop, or a list comprehension).
The second example (lines #10 to #14) reverses the string “Hello, World!”. Note that reversed treats the string as a sequence of characters (as do nearly all iteration functions). To get back a string, we must use the str.join method on the list of characters.
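Two related points are worth a quick sketch. First, for strings (and other sequences) slicing with a step of -1 gives the same result as reversed plus join. Second, reversed needs a sequence, or an object that defines __reversed__; as far as I know it will not accept an arbitrary iterator such as a generator. The variable names here are just for the demonstration.

```python
text = 'Hello, World!'

via_reversed = ''.join(reversed(text))
via_slice    = text[::-1]            # slicing with a step of -1

# A generator has no length or indexing, so reversed rejects it.
try:
    reversed(c for c in text)
    accepted = True
except TypeError:
    accepted = False

print(via_reversed, via_slice, accepted)
```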
map
The map function is defined:
map(function, iterable, ...)
With a single iterable, the map function is essentially the same as:
[function(x) for x in iterable]
The key difference being, similar to other functions here, that map returns a map object whereas the list comprehension returns a list object. Here’s a simple example:
001| records = [
002|     ('Gene',  'R', 'Smith',   'grsmith@vmail.not', 87),
003|     ('Drew',  'T', 'Smith',   'dtsmith@vmail.not', 120),
004|     ('Blair', 'D', 'Jones',   'jonsey@aol.not', 173),
005|     ('Blair', 'M', 'Green',   'bmgreen53@notacorp.not', 152),
006|     ('Alex',  'G', 'Johnson', 'alex.g@mobile.not', 93),
007|     ('Em',    'A', 'Nother',  'emmy@highermaths.not', 248),
008|     ('Chris', 'J', 'Green',   'cjgreen47@mickysoft.not', 137),
009|     ('Fran',  'C', 'Xavier',  'profx@xmenhq.not', 203),
010| ]
011| fmt = '%-6s %s %-8s %-24s %3d'
012|
013| def fix_middle_initial (r):
014|     mi = f'{r[1]}.'
015|     return (r[0], mi, *r[2:])
016|
017| records2 = map(fix_middle_initial, records)
018|
019| for rcd in records2:
020|     print(fmt % (*rcd,))
021| print()
022|
If you provide more than one iterable, the map function is essentially the same as:
[function(x1, x2, ...) for x1, x2, ... in zip(iterable1, iterable2, ...)]
The function receives one argument from each iterable. As with the zip function, if the iterables have different lengths, the iteration ends at the shortest length.
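Here is a small sketch of map with two iterables. The prices and quantities lists are made-up sample data, and the second list is deliberately one item shorter to show where the iteration stops.

```python
prices     = [2.50, 1.25, 4.00, 9.99]   # sample data for this sketch
quantities = [4, 8, 1]                  # one item shorter on purpose

# The lambda receives one argument from each iterable.
totals = list(map(lambda p, q: p * q, prices, quantities))

# The equivalent list comprehension using zip.
same = [p * q for p, q in zip(prices, quantities)]

print(totals)
print(same)
```

The last price is silently ignored in both versions because quantities runs out first.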
It’s a handy function, but I find I generally use a list comprehension instead. For one thing, map requires a function whereas a list comprehension just needs an expression. With map you need either a separately defined function or a lambda expression in the map statement. I find list comprehensions ultimately cleaner, but that’s just my preference.
filter
The filter function looks a bit like the map function but works differently. Instead of allowing manipulation of the list items, it uses the user-provided function to filter the list. If the function returns False (or any falsy value), the item is excluded from the output.
The filter function is defined:
filter(function, iterable)
Note that it takes only one iterable. The function is essentially the same as:
[item for item in iterable if function(item)]
With the same caveats about filter not returning a list but a filter object that must be iterated over.
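A quick sketch of that equivalence, plus one extra detail: passing None instead of a function makes filter keep only the truthy items. The values list and the is_big predicate are hypothetical, made up for this illustration.

```python
values = [0, 3, '', 'spam', None, 7, [], 12]   # sample data for this sketch

def is_big(n):
    """Hypothetical predicate: keep integers greater than 5."""
    return isinstance(n, int) and n > 5

big_filter = list(filter(is_big, values))
big_comp   = [v for v in values if is_big(v)]   # same result

# With None as the function, only truthy items survive.
truthy = list(filter(None, values))

print(big_filter, big_comp, truthy)
```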
Here’s a simple example where we want to filter for records with a hit count higher than 150:
001| records = [
002|     ('Gene',  'R', 'Smith',   'grsmith@vmail.not', 87),
003|     ('Drew',  'T', 'Smith',   'dtsmith@vmail.not', 120),
004|     ('Blair', 'D', 'Jones',   'jonsey@aol.not', 173),
005|     ('Blair', 'M', 'Green',   'bmgreen53@notacorp.not', 152),
006|     ('Alex',  'G', 'Johnson', 'alex.g@mobile.not', 93),
007|     ('Em',    'A', 'Nother',  'emmy@highermaths.not', 248),
008|     ('Chris', 'J', 'Green',   'cjgreen47@mickysoft.not', 137),
009|     ('Fran',  'C', 'Xavier',  'profx@xmenhq.not', 203),
010| ]
011| fmt = '%-6s %s %-8s %-24s %3d'
012|
013| def exclude_low_values (r):
014|     return (150 < r[4])
015|
016| records2 = filter(exclude_low_values, records)
017|
018| for rcd in records2:
019|     print(fmt % (*rcd,))
020| print()
021|
When run, this prints:
Blair  D Jones    jonsey@aol.not           173
Blair  M Green    bmgreen53@notacorp.not   152
Em     A Nother   emmy@highermaths.not     248
Fran   C Xavier   profx@xmenhq.not         203
Again, a handy function, but not one I use often for the same reasons I mentioned above about the map function.
I’ll end with the zip function, which combines lists:
002| b = ['A', 'B', 'C', 'D', 'E']
003| c = ['z', 'y', 'x', 'w', 'v', 'u']
004|
005| for t in zip(a,b,c):
006|     print(t)
007| print()
008|
When run, this prints:
(1, 'A', 'z')
(2, 'B', 'y')
(3, 'C', 'x')
(4, 'D', 'w')
(5, 'E', 'v')
Note that the zip function’s default is to silently ignore parts of any iterable longer than the shortest iterable. In the example above, the list of capital letters has a length of five, while the other two lists have a length of six. Because of the shorter list, the output only has five items. To detect when lists don’t match, use the strict parameter:
002| b = ['A', 'B', 'C', 'D', 'E']
003| c = ['z', 'y', 'x', 'w', 'v', 'u']
004|
005| for t in zip(a,b,c, strict=True):
006|     print(t)
007| print()
008|
Now when run, this prints:
(1, 'A', 'z')
(2, 'B', 'y')
(3, 'C', 'x')
(4, 'D', 'w')
(5, 'E', 'v')
Traceback (most recent call last):
  File "C:\CJS\prj\Python\blog\hcc\source\fragment.py", line 5, in <module>
    for t in zip(a,b,c, strict=True):
ValueError: zip() argument 2 is shorter than argument 1
As with most of these functions, zip returns a zip object you must iterate over. (Essentially, these functions return lazy iterators that behave much like generators, which saves memory.)
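One practical consequence is that these objects can only be iterated once. Here is a minimal sketch with made-up sample data; the same behavior applies to enumerate, reversed, map, and filter objects.

```python
a = [1, 2, 3]   # sample data for this sketch
b = 'ABC'

z = zip(a, b)
first_pass  = list(z)    # consumes the zip object
second_pass = list(z)    # nothing left to yield

# Build a list up front if the result is needed more than once.
pairs = list(zip(a, b))

print(first_pass)
print(second_pass)
```

If a second loop over one of these objects mysteriously does nothing, this is usually the reason.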
The zip function has a couple of cute tricks up its sleeve. For instance, you can use it to unzip a zipped list:
002| b = 'ABC'
003| c = 'zyx'
004|
005| z = zip(a,b,c)
006|
007| for t in zip(*z):
008|     print(t)
009| print()
010|
When run, this prints:
(1, 2, 3)
('A', 'B', 'C')
('z', 'y', 'x')
The other trick lets you create a list with duplicated items:
002| reps = 4
003|
004| for t in zip(*[nums]*reps, strict=True):
005|     print(t)
006| print()
007|
When run, this prints:
(1, 1, 1, 1)
(2, 2, 2, 2)
(3, 3, 3, 3)
(4, 4, 4, 4)
(5, 5, 5, 5)
(6, 6, 6, 6)
Which I suppose is handy if you want something like that.
If you weren’t familiar with some of these, I hope you’ll find them helpful and add them to your Python toolkit.
Link: Zip file containing all code fragments used in this post.
∅
This post is: Simple Python Tricks #9
Test your Python knowledge. Explain the output from:
002| t3b = [[0]*3, [0]*3, [0]*3]
003| t3c = [[0]*3]*3
004| t3d = [[0 for y in range(3)] for x in range(3)]
005|
006| print(f't3a = {t3a}')
007| print(f't3b = {t3b}')
008| print(f't3c = {t3c}')
009| print(f't3d = {t3d}')
010| print()
011|
012| t3a[1][1] = 1
013| t3b[1][1] = 1
014| t3c[1][1] = 1
015| t3d[1][1] = 1
016|
017| print(f't3a = {t3a}')
018| print(f't3b = {t3b}')
019| print(f't3c = {t3c}')
020| print(f't3d = {t3d}')
021| print()
022|
Which is:
t3a = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
t3b = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
t3c = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
t3d = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

t3a = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
t3b = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
t3c = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
t3d = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]

Note the second output of t3c. What happened there?