Tags

, , ,

The last few posts in this Simple Tricks series were perhaps a bit less than simple, so this month, here at the end of the year, I’m going to take it easy and enjoy the season. (I hope you are doing so as well!)

I’ve put together a random grab bag of little bits and pieces of Python. Nothing too complicated. And in all honesty, probably nothing terribly interesting or that useful, either. But you may find some of the approaches helpful.

First up, just some simple functions for vetting an object as a number of the desired type (converting if possible). Perhaps a set of functions or methods expects a certain type of number. Rather than writing the same vetting code for each one, write a general function they all call. Suppose our functions expect integers:

001| def SafeInt (x):
002|     ”’Return an integer no matter what.”’
003| 
004|     # If it’s an integer, just return it…
005|     if isinstance(x,int):
006|         return x
007| 
008|     # If it’s a float, convert it…
009|     if isinstance(x,float):
010|         return round(x)
011| 
012|     # If it’s a string, convert it…
013|     if isinstance(x,str) and x.isnumeric():
014|         return int(x, base=0)
015| 
016|     # Not known, return zero…
017|     return 0
018| 
019| 
020| if __name__ == ‘__main__’:
021|     def demo (x):
022|         v = SafeInt(x)
023|         sx = f'{x}:{type(x).__name__}’
024|         sv = f'{v}:{type(v).__name__}’
025|         print(f'{sx:<10s} -> {sv}’)
026|         return v
027| 
028|     demo(4)
029|     demo(4.4)
030|     demo(4.5)
031|     demo(4.6)
032|     demo(‘4’)
033|     demo(‘4.5’)
034|     demo(‘-4’)
035|     demo(‘+4’)
036| 

The SafeInt function (lines #1 to #17) takes an object and checks to see if it’s an integer (lines #5 & #6) or a float (line #9 & #10). It returns the object as is if it’s an integer or converts it to an integer if it’s a float. Note the use of round rather than int to convert a float to the nearest integer.

Next (lines #13 & #14), we check to see if the object is a numeric string and, if so, use the int function to convert it. Finally, if we can’t identify the object, we return zero (line #17).

When run, this prints:

4:int      -> 4:int
4.4:float  -> 4:int
4.5:float  -> 4:int
4.6:float  -> 5:int
4:str      -> 4:int
4.5:str    -> 0:int
-4:str     -> 0:int
+4:str     -> 0:int

The function is simplistic and fails on otherwise legit cases due to a leading plus or minus. And it doesn’t return the same thing for the string “4.5” as it does for the float 4.5. The problem is that the str.isnumeric test fails on a leading plus or minus. We can improve on the first problem like this:

001| def SafeInt (x):
002|     ”’Return an integer no matter what.”’
003| 
004|     # If it’s an integer, just return it…
005|     if isinstance(x,int):
006|         return x
007| 
008|     # If it’s a float, convert it…
009|     if isinstance(x,float):
010|         return round(x)
011| 
012|     # If it’s a string, convert it…
013|     if isinstance(x,str):
014|         try:
015|             dp = x.find(‘.’)
016|             xx = x if dp < 0 else x[:dp]
017|             return int(xx, base=0)
018|         except:
019|             pass
020| 
021|     # Not known, return zero…
022|     return 0
023| 
024| 
025| if __name__ == ‘__main__’:
026|     def demo (x):
027|         v = SafeInt(x)
028|         sx = f'{x}:{type(x).__name__}’
029|         sv = f'{v}:{type(v).__name__}’
030|         print(f'{sx:<10s} -> {sv}’)
031|         return v
032| 
033|     demo(4)
034|     demo(4.4)
035|     demo(4.5)
036|     demo(4.6)
037|     demo(‘4’)
038|     demo(‘4.5’)
039|     demo(‘4.6’)
040|     demo(‘-4’)
041|     demo(‘+4’)
042| 

Now, if it’s a string, we try to convert it inside a try block and simply continue if the attempt fails (lines #14 to #19). Otherwise, we return the integer value (line #17). To avoid an error from converting a number with a decimal, we chop off one if found (lines #15 & #16).

When run, this prints:

4:int      -> 4:int
4.4:float  -> 4:int
4.5:float  -> 4:int
4.6:float  -> 5:int
4:str      -> 4:int
4.5:str    -> 4:int
4.6:str    -> 4:int
-4:str     -> -4:int
+4:str     -> 4:int

Note that we still have a difference between the float 4.6 and the string “4.6”. It’s possible to fix this, but it makes the string-handling part a lot messier. Basically, if the string has a decimal point, convert using the float function and then use the round function to return an integer. Otherwise, use the int function as above.

Using the same pattern, we could also write a SafeFloat or a SafeComplex function. In fact, the basic pattern, a small function with a series of checks with multiple return points is one I find very useful, but which is viewed negatively from the purist point of view that functions must have only one return point. To the contrary, I find the ability to jump out of a function useful and clear. Far more readable than the gyrations often required to get all control flows to the one and only final return.

The general pattern looks either like this:

function check_good (value)
begin
    if condition1(value) then return TRUE
    if condition2(value) then return TRUE
    if condition3(value) then return TRUE
    return FALSE
end function

Or like this:

function check_bad (value)
begin
    if condition1(value) then return FALSE
    if condition2(value) then return FALSE
    if condition3(value) then return FALSE
    return TRUE
end function

In the first example, we’re looking for a special value that matches some condition. For example, looking for certain words. In the second, we’re looking for conditions that falsify the value. For example, we might want to reject unprintable characters. The SafeInt function is an example of the former.


There’s a math puzzle that goes: Given the numbers 1 to 1000, does any digit appear the most times?

At first glance, one might think all digits appear an equal number of times. Which turns out to not be unreasonable. But in this case, the digit “1” appears 301 time compared to the digits “2” through “9”, which appear 300 times. Since no number begins with “0”, there are always fewer zeros. In this case, 192 occurrences.

However, if the question uses the numbers from 1 to 999, then the digits “1” to “9” all appear 300 times. The “0” still lags behind with 189 occurrences.

More to the point, if one is a Python programmer, this is an easy problem to solve:

001| def number_of_digits (highest_digit=1000):
002|     ”’Histogram of digit frequencies in the range 1 to {highest_digit}.”’
003|     histo  = [0]*10
004| 
005|     for nmbr in range(1, highest_digit+1):
006|         for digit in str(nmbr):
007|             histo[int(digit)] += 1
008| 
009|     print(f’Numbers: 1 to {highest_digit}’)
010|     for ix,n in enumerate(histo):
011|         print(f'{ix}: {n:3d}’)
012|     print()
013| 
014|     count = max(histo)
015|     digit = histo.index(count)
016|     winners = [n for n in histo if n==count]
017|     if len(winners) == 1:
018|         print(f’Winner: The digit {digit} with {count} occurrences’)
019|     else:
020|         print(f’No winner: {len(winners)} digits with {count} occurrences’)
021|     print()
022| 
023|     return (digit, count)
024| 
025| 
026| if __name__ == ‘__main__’:
027|     number_of_digits(999)
028|     number_of_digits()
029| 

The code should be largely self-explanatory. The number_of_digits function starts by initializing a histogram vector to keep track of ten digits (line #3). Then it just iterates through the number range, converts each number to a string, and scans the string to count the digits (lines #5 to #7). Then it prints the histogram results (lines #9 to #12) and figures out if any digit won (lines #14 to #21).

When run, this prints:

Numbers: 1 to 999
0: 189
1: 300
2: 300
3: 300
4: 300
5: 300
6: 300
7: 300
8: 300
9: 300

No winner: 9 digits with 300 occurrences

Numbers: 1 to 1000
0: 192
1: 301
2: 300
3: 300
4: 300
5: 300
6: 300
7: 300
8: 300
9: 300

Winner: The digit 1 with 301 occurrences

If you’re the sort of person who runs into a little math puzzle like this, and one of the first thoughts you have is how you could write code to solve it … then you’re probably thinking like a programmer.


Speaking of math puzzles, a famous one is the Monty Hall Three-Door Problem. The problem got notoriety when it appeared in the Parade magazine “Ask Marilyn” column in 1990. There was a bruhaha over the answer Marilyn vos Savant gave — many well-trained readers disagreed with her, often disdainfully to their later shame.

Once again, thinking like programmers, we can simulate the problem:

001| from random import randint, sample
002| 
003| def three_doors (switch_doors=True, n_times=100_000, verbose=False):
004|     ”’The Monty Hall 3 Doors Problem.”’
005| 
006|     wins = 0
007|     goat = ‘goat’
008|     car = ‘car’
009|     tries = []
010| 
011|     # Generate a set of random three-door problems…
012|     for n in range(n_times):
013|         # Start with goats behind all doors…
014|         doors = [goat, goat, goat]
015| 
016|         # Change a random one to a car…
017|         doors[randint(0,2)] = ‘car’
018| 
019|         # Add to the list…
020|         tries.append(doors)
021| 
022|     # Generate some distribution stats to see if we’re random…
023|     fx = lambda x: (1 if x==‘car’ else 0)
024|     n0 = sum(fx(t[0]) for t in tries)
025|     n1 = sum(fx(t[1]) for t in tries)
026|     n2 = sum(fx(t[2]) for t in tries)
027| 
028|     print(f’Doors: [1]={n0:,}, [2]={n1:,}, [3]={n2:,}; tries={len(tries):,}’)
029|     if verbose:
030|         print()
031| 
032|     # Go through each try…
033|     for tx,t in enumerate(tries):
034|         # Guess at a door…
035|         dpick = randint(0,2)
036| 
037|         # Host considers other doors with goats…
038|         others = [(dx,d) for dx,d in enumerate(t) if d != ‘car’ if dx != dpick]
039| 
040|         # Host randomly opens a door with a goat…
041|         dopen = sample(others, k=1)[0]
042| 
043|         # If we’re switching doors…
044|         if switch_doors:
045|             # Pick the remaining door…
046|             dswap = 3  (dpick + dopen[0])
047| 
048|             # If the car was behind it, we won!…
049|             if t[dswap] == ‘car’:
050|                 wins += 1
051| 
052|             if verbose:
053|                 spick = f'[{dpick}]({t[dpick]:4s})’
054|                 sopen = f'[{dopen[0]}]{dopen[1]}’
055|                 sswap = f'[{dswap}]{t[dswap]:4s}’
056|                 print(f'{spick} vs {sopen} => {sswap}’)
057| 
058|         # If we’re staying with the door we picked…
059|         else:
060|             # If the car was behind it, we won!…
061|             if t[dpick] == ‘car’:
062|                 wins += 1
063| 
064|             if verbose:
065|                 spick = f'[{dpick}]{t[dpick]:4s}’
066|                 sopen = f'[{dopen[0]}]{dopen[1]}’
067|                 print(f'{spick} vs {sopen}’)
068| 
069|     if verbose:
070|         print()
071|     print(f’tries: {tx+1:,}, wins: {wins:,} ({100*(wins/(tx+1)):2f} %)’)
072|     print()
073| 

At the top (lines #6 to #9) we initialize some variables. We’ll use the strings ‘goat’ and ‘car’ and put them in a three-item list to represent the three doors. In lines #12 to #20, we generate n_times sets of door lists. In line #14, we initialize doors to have three goats, but in line #17 we randomly change one to a car. (Doing it this way is just a bit easier than trying to generate a random list of two goats and one car.) Line #20 appends each list of three doors to the list of tries.

In lines #23 to #30 we go over all the tries to see how many times a car is behind each door. If our generating routine was good, it should be one-third of the time.

Lines #33 to #67 comprise a large for loop that iterates over the tries. In it, we start by picking a door (line #35). Then Monty Hall (who knows what’s behind each door), opens a door other than the one we picked, and this door has a goat behind it. Line #38 generates a list of all doors in the current try that do not have a car and are not the door we picked. Line #41 picks one of those randomly.

If our strategy is to switch doors (lines #44 to#56), then we calculate the other door number (line #46) and see if it has a car behind it (line #49). If it does, we win.

If our strategy is to keep the door we first picked (lines #59 to #67), then we see if a car is behind that door (line #61).

The function takes three parameters: switch_doors, a Boolean that determines our strategy; n_times, the number of tries to make; and verbose, a Boolean flag that allows monitoring the function.

We run our simulation twice, once keeping the picked door and once swapping:

001| from examples import three_doors
002| 
003| N = 200_000
004| 
005| if __name__ == ‘__main__’:
006|     print(‘Run {N} tests and KEEP the door we pick.’)
007|     three_doors(switch_doors=False, n_times=N)
008| 
009|     print(‘Run {N} tests and CHANGE the door we pick.’)
010|     three_doors(switch_doors=True, n_times=N)
011| 

When run, this prints:

Run {N} tests and KEEP the door we pick.
Doors: [1]=66,823, [2]=66,929, [3]=66,248; tries=200,000
tries: 200,000, wins: 66,699 (33.349500 %)

Run {N} tests and CHANGE the door we pick.
Doors: [1]=66,783, [2]=66,640, [3]=66,577; tries=200,000
tries: 200,000, wins: 133,238 (66.619000 %)

The results are clear. We see firstly that the car is evenly distributed between all three doors. More importantly, we see that when we stay with the door we picked, we win only 33% of the time, but when we swap, we win 66% of the time.

Despite the initially counter-intuitive sense that swapping can’t possibly help, in fact it doubles your chances.

If we use larger values for N, the tests take a bit of time to run (depending on the speed of system, as always). If we’re curious how long these tests take, there’s a handy Python library function you can use, timeit:

001| from timeit import timeit
002| from examples import three_doors
003| 
004| N = 1_000_000
005| 
006| if __name__ == ‘__main__’:
007|     print(‘Run {N} tests and KEEP the door we pick.’)
008|     ts0 = timeit(‘three_doors(False,N)’, globals=globals(), number=1)
009| 
010|     print(‘Run {N} tests and CHANGE the door we pick.’)
011|     ts1 = timeit(‘three_doors(True,N)’ , globals=globals(), number=1)
012| 
013|     print(f'{ts0=:.6f}, {ts1=:.6f}’)
014| 

The timeit function is a little weird in that it expects text, not code. The simplest way around this is, as done here, to put the code to test in a function, and then just call that function in the string passed to timeit.

When run, this prints:

Run {N} tests and KEEP the door we pick.
Doors: [1]=333,547, [2]=332,738, [3]=333,715; tries=1,000,000
tries: 1,000,000, wins: 333,284 (33.328400 %)

Run {N} tests and CHANGE the door we pick.
Doors: [1]=333,411, [2]=333,306, [3]=333,283; tries=1,000,000
tries: 1,000,000, wins: 666,328 (66.632800 %)

ts0=2.594643, ts1=2.584221

Running the test with one million tries takes about two-and-a-half seconds. And, of course, we got the same results. The larger value for N makes them even closer to the predicted values.


I’ll leave you with a little useless cuteness that tests the limits of recursion:

001| def tail_add (numbers):
002|     if len(numbers) == 1:
003|         return numbers[0]
004|     return numbers[0] + tail_add(numbers[1:])
005| 
006| def tail_mul (numbers):
007|     if len(numbers) == 1:
008|         return numbers[0]
009|     return numbers[0] * tail_mul(numbers[1:])
010| 
011| def tail_xor (numbers):
012|     if len(numbers) == 1:
013|         return numbers[0]
014|     return numbers[0] ^ tail_xor(numbers[1:])
015| 
016| 
017| if __name__ == ‘__main__’:
018|     N = 12
019|     for n in range(1, N+1):
020|         print(‘%2d: %2d’ % (n,tail_add(range(1,n+1))))
021|     print()
022| 
023|     for n in range(1, N+1):
024|         print(‘%2d: %2d’ % (n,tail_mul(range(1,n+1))))
025|     print()
026| 
027|     for n in range(1, N+1):
028|         print(‘%2d: %2d’ % (n,tail_xor(range(1,n+1))))
029|     print()
030| 

These are three examples of tail recursion. In all three cases, if the list of numbers passed to the function is of length one, then that single list element is returned. Otherwise (list has more than one element), the return value is the first item in the list operating on the remainder of the list.

Take the time to be sure you understand how these functions work; they are the soul of recursion and functional programming. Note that, in the “main” code, we’re invoking the functions with a list of numbers that’s longer each time the function is called.

Note that the limit on recursion limits how large a list these functions can handle.


Link: Zip file containing all code fragments used in this post.