Tags

, ,

This post was meant to be the final edition of Simple Tricks in 2024. My every-other-Monday schedule had it slotted for December 23rd. But I came down with a respiratory virus on the 22nd. The post wasn’t complete at that point, so I pushed publication to the following Monday, the 30th, but I was sick until well into the new year.

I ended up taking January off from the web (and computers in general), and it’s not until now that I’ve caught up with myself. In any event, I’m not sure how many more of these Simple Tricks posts I’ll do, but here’s one more.

We’ll start with a simple application that we’ll write in three different ways: procedural (imperative), functional, and object-oriented. As I’ve mentioned often, I tend to favor the last approach, and the examples here may help illustrate exactly why [see OOP versus Imp].

The app requirement is simple. Given a list of numbers, compute their squares and then the sum of those squares. The minimum requirement is that sum of squares. This might be part of calculating the standard deviation (and we’ll explore doing that next).

The simplest and most straight-forward (but not necessarily the best) form of programming is procedural (imperative). Procedural code is noted for its procedures (aka sub-routines or functions):

001| from math import sqrt
002| from random import randint
003| 
004| N = 12
005| 
006| def sum_of_squares (numbers):
007|     ”’Generate a sum of squares from a list of numbers.”’
008| 
009|     # Python does most of the work for us…
010|     return sum(pow(num,2) for num in numbers)
011| 
012| 
013| if __name__ == ‘__main__’:
014| 
015|     # Generate a list of numbers…
016|     numbers = [randint(1,100) for _ in range(N)]
017| 
018|     result = sum_of_squares(numbers)
019|     print(f”Numbers: {numbers}”)
020|     print(f”Count: {len(numbers)} numbers”)
021|     print(f’Minimum: {min(numbers)}’)
022|     print(f’Maximum: {max(numbers)}’)
023|     print(f”Sum of squares: {result}”)
024|     print(f”Standard Deviation: {sqrt(result/len(numbers)):.3f}”)
025|     print()
026| 

It turns out that built-in Python sum function does some of the heavy lifting for us. The code above has only the one function, sum_of_squares (lines #6 to line #10), which takes a list of numbers and returns the sum of their squares. For simplicity, I’m not checking the list of numbers for validity. If the list contains invalid numbers, the function blows up in the built-in pow function.

In line #16 we generate a list of random numbers and pass it to the sum_of_squares function (line #18). Lines #19 to #25 print the results.

When run, this prints:

Numbers: [82, 54, 1, 77, 64, 64, 42, 76, 64, 18, 93, 80]
Count: 12 numbers
Minimum: 1
Maximum: 93
Sum of squares: 50771
Standard Deviation: 65.045

Of course, because of the randint function, the result will be different each time.

A more verbose and less Pythonic version might look like this:

001| from sys import argv
002| from math import sqrt
003| from random import randint
004| 
005| def validate_list_of_numbers (numbers):
006|     ”’Check list to ensure it consists of numbers.”’
007| 
008|     for num in numbers:
009|         # A valid type just keeps the list going…
010|         if isinstance(num,int) or isinstance(num,float):
011|             continue
012| 
013|         # If we get here, it isn’t a legit number…
014|         raise ValueError(f’Invalid list element #{ix}: {num}’)
015| 
016|     # Made it through the whole list without an exception…
017|     return True
018| 
019| def sum_of_numbers (numbers):
020|     ”’Generate a sum from a list of numbers.”’
021|     validate_list_of_numbers(numbers)
022| 
023|     accum = 0
024|     for num in numbers:
025|         accum += num
026| 
027|     return accum
028| 
029| def sum_of_squares (numbers):
030|     ”’Generate a sum of squares from a list of numbers.”’
031|     validate_list_of_numbers(numbers)
032| 
033|     squares = [num*num for num in numbers]
034| 
035|     # Python does most of the work for us…
036|     return sum_of_numbers(squares)
037| 
038| 
039| if __name__ == ‘__main__’:
040|     print(argv[0])
041| 
042|     n = int(argv[1]) if 1 < len(argv) else 12
043| 
044|     # Generate a list of numbers…
045|     numbers = [randint(1,100) for _ in range(n)]
046| 
047|     result = sum_of_squares(numbers)
048|     print(f”Numbers: {numbers}”)
049|     print(f”Count: {len(numbers)} numbers”)
050|     print(f’Minimum: {min(numbers)}’)
051|     print(f’Maximum: {max(numbers)}’)
052|     print(f”Sum of squares: {result}”)
053|     print(f”Standard Deviation: {sqrt(result/len(numbers)):.3f}”)
054|     print()
055| 

In fact, it’s a bit of overkill, but I’m pretending Python doesn’t have the sum or pow functions here. I’ve added the validate_list_of_numbers function (lines #5 to #17) to provide better error detection and added the sum_of_numbers function (lines #19 to #27) to replace the built-in sum function.

Which shows how elegant a Pythonic solution really is. In that first example, even the sum_of_squares function is overkill, since it’s just one line of code that could just as easily be where the function call is with no need for a separate function.

When run, this prints essentially the same thing as the first example (except for having different random numbers).


The next example uses an object-oriented solution with a class that embodies all the processing (and a bit more):

001| from sys import argv
002| from math import sqrt
003| from random import randint
004| 
005| class NumberList:
006|     def __init__(self, lst):
007|         self.lst = tuple(lst)
008| 
009|     def __iter__ (self):
010|         return iter(self.lst)
011| 
012|     def __len__ (self):
013|         return len(self.lst)
014| 
015|     @property
016|     def numbers (self): return self.lst
017| 
018|     @property
019|     def low (self): return min(self.lst)
020| 
021|     @property
022|     def high (self): return max(self.lst)
023| 
024|     @property
025|     def squares (self): return [n*n for n in self.lst]
026| 
027|     @property
028|     def sumsquares (self): return sum(self.squares)
029| 
030|     @property
031|     def sumsqrt (self): return sqrt(self.sumsquares)
032| 
033|     @property
034|     def stddev (self): return sqrt(self.sumsquares/len(self))
035| 
036| 
037| if __name__ == ‘__main__’:
038|     print(argv[0])
039| 
040|     n  = int(argv[1]) if 1 < len(argv) else 12
041|     lo = int(argv[2]) if 2 < len(argv) else 1
042|     hi = int(argv[3]) if 3 < len(argv) else 100
043| 
044|     # Generate a list of numbers…
045|     numbers = [randint(lo,hi) for _ in range(n)]
046| 
047|     numlst = NumberList(numbers)
048|     print(f”Numbers: {numbers}”)
049|     print(f”Count: {len(numlst)} numbers”)
050|     print(f’Minimum: {numlst.low}’)
051|     print(f’Maximum: {numlst.high}’)
052|     print(f”Sum of squares: {numlst.sumsquares}”)
053|     print(f”Standard Deviation: {numlst.stddev:.3f}”)
054|     print()
055| 

In this example, we define the class NumberList (lines #5 to #34) whose new instances expect an iterable — anything that tuple can accept and turn into a tuple. We also define the __iter__ and __len__ built-in methods (lines #9 to #13) to enable iteration and the built-in len function. For brevity, I left off the validity check for all list elements being numbers.

The class also has four read-only property methods: numbers (line #16), which returns the original list of numbers; low (line #19), which returns the smallest number; high (line #22), which returns the biggest number; squares (line #25), which returns the list of squares; sumsquares (line #28), which returns the sum of the squares, sumsqrt (line #31), the square root of the sum of squares, and stddev, which returns the standard deviation.

Note that the only information the class stores is the original list of numbers. The sum of squares, the square root of the sum, or the standard deviation are all calculated on the fly each time. This respects of principle of “no duplication”. Never store parallel results without good reason.

That said, we’ve made this class immutable by storing the list of numbers as a tuple, so there’s no reason the read-only properties couldn’t cache their respective values. If we left the list mutable, even adding the __getitem__ and __setitem__ methods for indexing, then calculating the sum and square root values on the fly is necessary.

When run, this prints:

C:\CJS\prj\Python\blog\hcc\source\fragment.py

Numbers: [5, 77, 61, 59, 72, 77, 86, 66, 13, 30, 75, 64]
Count: 12 numbers
Minimum: 5
Maximum: 86
Sum of squares: 46811
Standard Deviation: 62.457

Lastly, the functional approach where everything is done with functions and no variables:

001| from sys import argv
002| from math import sqrt
003| from random import randint
004| 
005| square = lambda n:n*n
006| 
007| sumlst = lambda lst:lst[0] if len(lst)==1 else lst[0]+sumlst(lst[1:])
008| 
009| mullst = lambda lst:lst[0] if len(lst)==1 else lst[0]*mullst(lst[1:])
010| 
011| sumsqrs = lambda lst: sumlst([square(n) for n in lst])
012| 
013| 
014| if __name__ == ‘__main__’:
015|     print(argv[0])
016| 
017|     n  = int(argv[1]) if 1 < len(argv) else 12
018|     lo = int(argv[2]) if 2 < len(argv) else 1
019|     hi = int(argv[3]) if 3 < len(argv) else 100
020| 
021|     # Generate a list of numbers…
022|     numbers = [randint(lo,hi) for _ in range(n)]
023| 
024|     result = sumsqrs(numbers)
025|     print(f’Numbers: {numbers}’)
026|     print(f”Count: {len(numbers)} numbers”)
027|     print(f’Minimum: {min(numbers)}’)
028|     print(f’Maximum: {max(numbers)}’)
029|     print(f’Sum of squares: {result}’)
030|     print(f”Stnd Deviation: {sqrt(result/len(numbers))}”)
031|     print()
032| 

In this example we define four lambda functions: square, which takes a number and squares it; sumlst, which takes a list of numbers and returns their sum; mullist, which takes a list of numbers and returns their product (unused, included for illustration); and sumsqrs, which takes a list of numbers and returns the sum of their squares (using the square and sumlst functions).

Note the use of tail recursion on the sumlst and mullst functions (see the end of the previous post for more on tail recursion).


A math paper [arXiv:2401.14346v2] by Angelini, Branicky, Resta, Sloane, and Wilson, The Comma Sequence: A Simple Sequence with Bizarre Properties, investigates the number sequence beginning at 1 and consecutive numbers after are separated by the value formed by the right-hand digit of the first number and the left-hand digit of the second number. The two digits are concatenated to form a number whose value is the gap between the first and second numbers.

The sequence begins: 1, 12, 35, 94, 135, …

The gap between 1 and 12 is 11, the gap between 12 and 35 is 23, the gap between 35 and 94 is 59, the gap between 94 and 135 is 41, and so on. The pairs of digits used to form the gap number are color coded to help illustrate the protocol.

Generating the sequence requires using the right-hand digit of the last number so far and trying to find a second number whose left-hand digit forms a valid gap number. For example, above the last number is 135, and its right-hand digit is “5”. So, we need a two-digit number starting with five, whose value added to 135 is that number.

There are only nine possibilities (because the new number can’t start with zero):

  1. 135 + 51 = 186 (works)
  2. 135 + 52 = 187 (fail)
  3. 135 + 53 = 188 (fail)
  4. 135 + 54 = 189 (fail)
  5. 135 + 55 = 190 (fail)
  6. 135 + 56 = 191 (fail)
  7. 135 + 57 = 192 (fail)
  8. 135 + 58 = 193 (fail)
  9. 135 + 59 = 193 (fail)

So, the next number is 186. If there are multiple valid choices that work, we take the lowest one. If none of the possibilities works, the sequence ends. (According to the paper, it does at the 2,137,453rd term, 999,99,945.

This is exactly the sort of simple-but-sufficiently-chewy problem that gets my gears turning and results in a bit of code. I found I needed two functions: a sequence checker that examines an existing sequence for validity, and (of course) a sequence generator to create the sequence. The code turns out to be extremely simple.

Here’s the sequence checker, test_sequence (lines #11 to #21):

001| from sys import argv
002| from itertools import count
003| 
004| demo = (1, 12, 35, 94, 135, 186, 248, 331, 344, 387, 461,
005|         475, 530, 535, 590, 595, 651, 667, 744, 791, 809,
006|         908, 997, 1068, 1149, 1240, 1241, 1252, 1273, 1304,
007|         1345, 1396, 1457, 1528, 1609, 1700, 1701, 1712,
008|         1733, 1764, 1805, 1856, 1917, 1988, 2070, 2072,
009|         2094, 2136, 2198, 2280)
010| 
011| def test_sequence (seq):
012|     ”’Test SEQ as a Comma Sequence.”’
013|     print(seq)
014|     print()
015|     css = [str(n) for n in seq]
016| 
017|     for ix,a,b in zip(count(1),css,css[1:]):
018|         dif = int(f'{a[-1]}{b[0]}’)
019|         assert (int(a)+dif) == int(b), ValueError(f’Bad sequence! {a},{b}’)
020|         print(f'{ix:04d}: {a} + {dif:02d} = {b}’)
021|     print()
022| 
023| if __name__ == ‘__main__’:
024|     print(argv[0])
025|     test_sequence(demo)
026|     print()
027| 

Note the use of the count function from itertools. It provides an index number beginning with 1. We could have wrapped the zip function with an enumerate wrapper (and set start=1), but then the for loop doesn’t give us the three-part tuple that’s so easy to unpack into the ix, a and b variables.

The demo tuple holds the first 50 comma sequence numbers. This is provided as test data for the function and for your experimentation. See what the function does if you modify one of the numbers.

Next is the generation function, with is comprised of find_next_number (lines #3 to #32) and generate_sequence (lines #34 to #48):

001| from sys import argv
002| 
003| def find_next_number (n):
004|     ”’Find next number in Comma Sequence given N.”’
005| 
006|     last_digit = str(n)[1]
007|     print(f'{n=}’)
008|     print(f'{last_digit=}’)
009| 
010|     jumps = [int(f'{last_digit}{d}’) for d in range(1,10)]
011|     targs = [n+p for p in jumps]
012|     spans = [int(f'{last_digit}{str(t)[0]}’) for t in targs]
013|     check = [js for j,s in zip(jumps,spans)]
014| 
015|     print(f'{jumps=}’)
016|     print(f'{spans=}’)
017|     print(f'{check=}’)
018|     print(f'{targs=}’)
019| 
020|     #TODO: Check for multiple possiblities.
021|     if 0 in check:
022|         ix = check.index(0)
023|         plus = jumps[ix]
024|         targ = targs[ix]
025|         print(f’found: {n} + {plus:02d} = {targ}’)
026|         print()
027|         return (plus, targ)
028| 
029|     print(f’nadda: {n} {check}’)
030|     print()
031| 
032|     return (0,n)
033| 
034| def generate_sequence (seq_length, start=1):
035|     ”’Generate a Comma Sequence COUNT long.”’
036|     nxt = start
037|     seq = [nxt]
038| 
039|     for n in range(seq_length):
040|         add,nxt = find_next_number(nxt)
041|         if add == 0:
042|             print(f’Sequence breaks at {nxt}’)
043|             print()
044|             break
045|         seq.append(nxt)
046|     print()
047| 
048|     return seq
049| 
050| if __name__ == ‘__main__’:
051|     print(argv[0])
052|     seq = generate_sequence(5)
053|     print()
054| 

The generate_sequence function takes two parameters, seq_length and start, which control how long the sequence is (assuming valid gaps can be found for each interval, otherwise the sequence will end and be shorter) and where it starts (by default, at 1). Note how it always returns a sequence of at least the start number (because of lines #36 and #37).

The for loop (lines #39 to #45) builds the sequence by calling find_next_number (and terminating early if it can’t find one).

The find_next_number function takes a single parameter, the current number. It uses that to determine the last digit (line #6). It builds a list of the nine possible gap numbers (line #10) and the targets those gaps result in (line #11). Next it builds a list of intervals from the targets (line #12). Then it creates a list of differences — if a jumps number matches the spans number, we’ve found a valid gap number. We just need to see if there is at least one zero in the check list (lines #21 to #27) — that will be the valid gap. Note that it will automatically pick the first (lowest) possibility when there are multiple options (which is what we want), but it doesn’t make any note of the fact.

If there are no zero entries in the check list, we failed to find a valid next number and return a zero as the first member of the returned tuple (line #32).

When run (with the input 5, asking for a five-number sequence), this prints:

n=1
last_digit='1'
jumps=[11, 12, 13, 14, 15, 16, 17, 18, 19]
spans=[11, 11, 11, 11, 11, 11, 11, 11, 12]
check=[0, 1, 2, 3, 4, 5, 6, 7, 7]
targs=[12, 13, 14, 15, 16, 17, 18, 19, 20]
found: 1 + 11 = 12

n=12
last_digit='2'
jumps=[21, 22, 23, 24, 25, 26, 27, 28, 29]
spans=[23, 23, 23, 23, 23, 23, 23, 24, 24]
check=[-2, -1, 0, 1, 2, 3, 4, 4, 5]
targs=[33, 34, 35, 36, 37, 38, 39, 40, 41]
found: 12 + 23 = 35

n=35
last_digit='5'
jumps=[51, 52, 53, 54, 55, 56, 57, 58, 59]
spans=[58, 58, 58, 58, 59, 59, 59, 59, 59]
check=[-7, -6, -5, -4, -4, -3, -2, -1, 0]
targs=[86, 87, 88, 89, 90, 91, 92, 93, 94]
found: 35 + 59 = 94

n=94
last_digit='4'
jumps=[41, 42, 43, 44, 45, 46, 47, 48, 49]
spans=[41, 41, 41, 41, 41, 41, 41, 41, 41]
check=[0, 1, 2, 3, 4, 5, 6, 7, 8]
targs=[135, 136, 137, 138, 139, 140, 141, 142, 143]
found: 94 + 41 = 135

n=135
last_digit='5'
jumps=[51, 52, 53, 54, 55, 56, 57, 58, 59]
spans=[51, 51, 51, 51, 51, 51, 51, 51, 51]
check=[0, 1, 2, 3, 4, 5, 6, 7, 8]
targs=[186, 187, 188, 189, 190, 191, 192, 193, 194]
found: 135 + 51 = 186

I made the function extra verbose to show how it’s working.

That’s all for now.


Link: Zip file containing all code fragments used in this post.