The last two posts in this Simple Tricks series (Tricks #4 and Tricks #5) explored the basics of file handling. The two before that (Tricks #2 and Tricks #3) explored Python list comprehensions.
This time we’ll explore something extremely basic: passing parameters to functions. Python has interesting native capabilities that give programmers options in how they deal with function parameters.
To begin at the very beginning: A function is a unit of code that takes zero or more data inputs (called formal parameters) and outputs a single datum (called a return value). Presumably the function does something interesting with the inputs in order to generate the output.
Sometimes a function takes zero inputs and returns a value it generates on its own (for example, a function that just returns the current time). Most functions need inputs, and that’s what this post focuses on.
Some consider the terms parameter and argument synonymous, but here I’ll use parameter to mean the declaration of the input variable in the function definition and argument to mean the actual value passed at run-time.
001| def add(a, b):
002|     c = a + b
003|     return c
004|
005| x = add(21, 42)
006| y = add(63, 84)
007|
The simple add function above (lines #1-#3) has two formal parameters, a and b. Line #5 calls the function passing two arguments, 21 and 42. Line #6 calls it again but passes the arguments 63 and 84.
The term arity refers to the number of input arguments. In some cases, the arity is fixed and known by the compiler. For instance, the add function above has a fixed arity of two.
In other cases, arity is variable. A good example is a print function that prints whatever number of items are passed to it (or, typically, a blank line if none are passed). Python’s built-in print function acts this way:
001| from math import pi, e
002|
003| print()
004| print("Hello,", "World!", "How", "are", "you?")
005| print("Some numbers:", 21, 42, 63, 84, pi, e)
006| print("That's", "all", "for", "now.")
007| print("Bye!")
008| print()
009|
When run, this prints:
Hello, World! How are you?
Some numbers: 21 42 63 84 3.141592653589793 2.718281828459045
That's all for now.
Bye!
When print gets multiple arguments, it prints the items separated by spaces. It would be nice to change that and perhaps other behaviors, as well. This implies, on top of variable arity, optional parameters. If supplied, their values change default behaviors. If not supplied, the function uses a default value.
But this implies a problem. How can the function tell an optional parameter was supplied? The solution to this is keyword parameters. A formal parameter with a default value attached is optional, and callers can set it by name with a keyword argument. The definition of the print function mentioned above probably looks something like this:
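001| def print(*args, sep=' ', end='\n', file=None, flush=False):
002|     '''Print function.'''
003|     ...
004|
That first parameter, *args, and one similar to it are the stars of the show in this post, and I’ll get to them below. Here I’ll just say that *args collects the input arguments for the function to print. Every parameter declared after *args is automatically keyword-only, so sep, end, file, and flush can only be supplied by name.
Python also has two special markers, a slash and a bare asterisk, that can appear in a parameter list without being parameters at all. The slash means the parameters to its left are positional-only (they cannot have keywords). The bare asterisk means the parameters to its right are keyword-only (they must have keywords). Neither is needed in the definition above because *args already does the job of the asterisk, and Python rejects a signature that follows *args with a slash or a second asterisk. The effect is that input arguments are just a sequence of terms to be printed and that, not only are control inputs recognized as such, but unknown keyword arguments are rejected. This provides a form of argument checking.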
Each of the keyword arguments controls an aspect of the print function. The sep parameter sets the separator string placed between print items. The end parameter sets the string printed after the last item (a newline by default). The file parameter allows print output to be redirected to a file. The flush parameter forces a buffer flush after printing. Each has a default value callers can override if desired.
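As a small sketch of those four keyword arguments in action, the fragment below sends output to an in-memory buffer rather than the screen. The use of io.StringIO as the target and the variable names are my own choices for illustration.

```python
import io

# Capture output in an in-memory text buffer instead of the screen.
buffer = io.StringIO()

# sep replaces the default single space between items.
print(21, 42, 63, sep=", ", file=buffer)

# end replaces the default newline, so the next print continues the line.
print("Total:", end=" ", file=buffer)

# flush=True asks print to flush the target after writing.
print(126, file=buffer, flush=True)

captured = buffer.getvalue()
print(captured, end="")
```

The buffer ends up holding two lines: the comma-separated numbers, then "Total: 126".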
A vector object illustrates a common use for default arguments. We’ll assume a 3D vector, (x, y, z), implemented as a Python tuple:
001| def vector_object(x=0, y=0, z=0):
002|     t = tuple([float(x), float(y), float(z)])
003|     return t
004|
005| v0 = vector_object()
006| v1 = vector_object(21)
007| v2 = vector_object(21, 42)
008| v3 = vector_object(21, 42, 63)
009| v4 = vector_object(z=63)
010|
011| print(v0)
012| print(v1)
013| print(v2)
014| print(v3)
015| print(v4)
016| print()
017|
When run, this prints:
(0.0, 0.0, 0.0)
(21.0, 0.0, 0.0)
(21.0, 42.0, 0.0)
(21.0, 42.0, 63.0)
(0.0, 0.0, 63.0)
This gives users the option of providing no arguments (line #5) and getting a reasonable default (all zeros) or providing whatever initial values they want (lines #6-#9). Note that arguments can be provided by position or by keyword. The latter allows providing only the values different from the default (as in line #9).
If we require all three arguments to have keywords attached, we define it like this:
001| def vector_object(*, x=0, y=0, z=0):
002|     t = tuple([float(x), float(y), float(z)])
003|     return t
004|
005| v0 = vector_object()
006| v1 = vector_object(z=63)
007| v2 = vector_object(x=21, y=42, z=63)
008| print(v0)
009| print(v1)
010| print(v2)
011| print()
012|
If we want to disallow keywords, we define it like this:
001| def vector_object(x=0, y=0, z=0, /):
002|     t = tuple([float(x), float(y), float(z)])
003|     return t
004|
005| v0 = vector_object()
006| v1 = vector_object(21)
007| v2 = vector_object(21, 42)
008| v3 = vector_object(21, 42, 63)
009| print(v0)
010| print(v1)
011| print(v2)
012| print(v3)
013| print()
014|
Note this prevents setting y without setting x, or setting z without setting x and y.
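The two markers can also appear together in one signature. The sketch below assumes Python 3.8 or later (where the slash marker is available). The make_vector function, its label parameter, and the call_fails helper are hypothetical names invented for this illustration.

```python
def make_vector(x=0, y=0, /, z=0, *, label="v"):
    """x and y are positional-only, z may go either way, label is keyword-only."""
    return (label, float(x), float(y), float(z))

a = make_vector(21, 42, 63)                 # z by position
b = make_vector(21, 42, z=63, label="w")    # z and label by keyword

def call_fails(*args, **kwargs):
    """Return True if make_vector rejects the call with a TypeError."""
    try:
        make_vector(*args, **kwargs)
    except TypeError:
        return True
    return False

# x sits left of the slash, so it cannot be passed by keyword.
x_by_keyword_rejected = call_fails(x=21)

# label sits right of the asterisk, so it cannot be passed by position.
label_by_position_rejected = call_fails(21, 42, 63, "w")

print(a, b, x_by_keyword_rejected, label_by_position_rejected)
```

Parameters to the left of the slash are positional-only, parameters to the right of the bare asterisk are keyword-only, and anything in between accepts either style.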
Providing default values for parameters makes the arity appear variable from the point of view of the user, but the function does still have a fixed set of parameters even if it isn’t always called with all of them. These are variable argument functions, not variable arity functions.
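One way to see that fixed set of parameters is to ask Python for the function signature. This is only a sketch: it repeats the first vector_object definition from above and uses the standard inspect module, which the post itself does not otherwise use.

```python
import inspect

def vector_object(x=0, y=0, z=0):
    return (float(x), float(y), float(z))

# The signature always lists exactly three parameters,
# however many arguments a particular call supplies.
param_names = list(inspect.signature(vector_object).parameters)

# A fourth positional argument has no parameter to land in.
try:
    vector_object(1, 2, 3, 4)
    too_many_accepted = True
except TypeError:
    too_many_accepted = False

print(param_names, too_many_accepted)
```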
Declaring a parameter adds it to the arity of the function, so Python provides a way to specify an unknown number of arguments (such as needed by the print function).
Many languages (including Python) feature an array or list of command line arguments provided by the user at runtime. This array or list usually goes by the name argv (for arguments vector), and its exact implementation depends on the language. Python provides it as a list in the sys module:
001| from sys import argv
002|
003| for ix,arg in enumerate(argv):
004|     print(f'{ix:2d}: {arg}')
005| print()
006|
When run it prints any runtime arguments supplied on the command line:
 0: C:\users\wyrd\blog\hcc\source\example.py
 1: spam
 2: spam
 3: spam
 4: spam
 5: eggs
In this run, I supplied the five arguments listed. Note how the first item (index 0) is the filename of the Python module being run. This is always present even if there were no command line arguments provided. This is typical and provided by the Python runtime.
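Because the first item is the script name, code that only cares about what the user typed usually skips it. A minimal sketch follows. The split_argv helper is a hypothetical name of my own, not something from the sys module.

```python
import sys

def split_argv(argv):
    """Return (script_name, user_arguments) from an argv-style list."""
    script_name = argv[0] if argv else ""
    return script_name, list(argv[1:])

script, user_args = split_argv(sys.argv)
print(f"script: {script}")
print(f"{len(user_args)} user argument(s): {user_args}")
```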
Python functions support a variable arity that’s similar to the command line (zero or more arguments provided). Because of that similarity, the variable name canonically used is args, but you can use any name. (Using canonical names and styles makes your code easier for other Python programmers to read — they signal your intent.)
Here’s how we can use it to re-implement the vector object with variable arguments:
001| def vector_object(*args):
002|     x = float(args[0]) if 0 < len(args) else 0.0
003|     y = float(args[1]) if 1 < len(args) else 0.0
004|     z = float(args[2]) if 2 < len(args) else 0.0
005|
006|     return (x,y,z)
007|
008| v0 = vector_object()
009| v1 = vector_object(21)
010| v2 = vector_object(21, 42)
011| v3 = vector_object(21, 42, 63)
012| print(v0)
013| print(v1)
014| print(v2)
015| print(v3)
016| print()
017|
Note the canonical name args. Regardless of the name, the leading asterisk is important. That’s how Python knows args should collect zero or more input arguments into a single sequence (literally a tuple object, with a length of zero or greater). This is functionally identical to the third vector_object example above, except that any arguments beyond the third are silently ignored rather than rejected.
Lines #2-#4 use the length of args to determine whether the input value comes from a supplied argument or, if not supplied, from the default value hardcoded into the function. It’s a good idea to cast input arguments to the expected type unless you know they will always be the expected types.
This technique works fine in cases where the function has a fixed arity with known parameters but some or all arguments are optional.
The print function, or any function with truly variable arity, must treat the input as the sequence it actually is. An imaginary print function might work something like this:
002|
003| def print(*printargs, sep=' ', end='\n'):
004|     '''OS PrintQueue does the heavy lifting.'''
005|     PrintQueue(START)
006|
007|     for ix,arg in enumerate(printargs):
008|         if (0 < ix) and (sep is not None):
009|             PrintQueue(ADD, sep)
010|         PrintQueue(ADD, arg)
011|
012|     if end is not None:
013|         PrintQueue(ADD, end)
014|
015|     PrintQueue(PRINT)
016|
017| print("Hello,", "World!", "How", "are", "you?")
018| print()
019| print("Some numbers: ", end=None)
020| print(21, 42, 63, 84, 3.1415, 2.7, sep=', ')
021| print()
022| print("That's", "all", "for", "now.")
023| print("Bye!")
024|
Note the use of the end and sep keyword arguments to tailor the behavior of the function: first to join two lines, then to separate numbers with commas. Note also the name *printargs rather than the canonical *args.
When run, this prints:
Hello, World! How are you?
Some numbers: 21, 42, 63, 84, 3.1415, 2.7
That's all for now.
Bye!
This assumes that PrintQueue is an operating system function that actually sends the output text to the screen (or wherever). The intent here is demonstrating how to handle *args as an arbitrary list rather than as a set of known positional arguments. (Not too differently from the example above that printed the sys.argv list.)
Note that neither the parameters nor the arguments in the *args mechanism can have names or defaults. The print function defined above takes zero or more unnamed positional arguments followed by neither, either, or both keyword arguments.
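A runnable variation on the same idea builds the output text and returns it instead of handing it to an imaginary print queue. The join_items function is a hypothetical stand-in of my own, meant only to show the shape of the signature: unnamed positional items first, then optional keyword controls.

```python
def join_items(*items, sep=" ", end="\n"):
    """Build the text a print-like function would emit."""
    return sep.join(str(item) for item in items) + end

line1 = join_items("Hello,", "World!")            # both keywords defaulted
line2 = join_items(21, 42, 63, sep=", ", end="")  # both keywords supplied
line3 = join_items()                              # no items at all

# A positional argument can never land in sep or end; it is just another item.
line4 = join_items("a", "b", "-")

# A keyword the function does not declare is rejected.
try:
    join_items("a", "b", separator="-")
    unknown_keyword_accepted = True
except TypeError:
    unknown_keyword_accepted = False

print(repr(line1), repr(line2), repr(line3), repr(line4), unknown_keyword_accepted)
```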
Python supports arbitrary keyword arguments similarly to how it supports arbitrary positional arguments. In this case the canonical name is kwargs (for keyword args) and two leading asterisks indicate the parameter is a Python dictionary (dict) of keyword arguments.
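A quick way to see both mechanisms side by side is a function that simply hands back what it received. The show_arguments name is hypothetical. Positional arguments arrive packed in a tuple and keyword arguments arrive in a dict keyed by name.

```python
def show_arguments(*args, **kwargs):
    """Return the collected positional and keyword arguments unchanged."""
    return args, kwargs

args, kwargs = show_arguments(1, 2, 3, title="My Data", size=15)

print(type(args).__name__, args)      # positional arguments, in call order
print(type(kwargs).__name__, kwargs)  # keyword arguments, by name
```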
To illustrate using both arbitrary positional arguments and arbitrary keywords, suppose there is a plot_data function that takes zero or more lists of data points and plots them on a graph. The function also takes a number of keyword arguments tailoring how the graph looks.
Here’s a code fragment:
001| def plot_data(*data, **kwargs):
002|     title = str(kwargs['title']) if 'title' in kwargs else 'A Chart'
003|     size = int(kwargs['size']) if 'size' in kwargs else 8
004|     ymax = float(kwargs['ymax']) if 'ymax' in kwargs else 10.0
005|     yticks = float(kwargs['yticks']) if 'yticks' in kwargs else 1.0
006|     ...
007|     for datum in data:
008|         # Add plot line to graph...
009|         ...
010|     ...
011|
012| data1 = [1, 2, 3, 4, 5]
013| data2 = [21, 42, 63, 84, 99]
014| data3 = [9, 8, 7, 6, 5]
015| plot_data(data1, data2, data3, title='My Data', size=15, ymax=100)
016|
The above is just a sketch for illustration. In a real function, there might be dozens (or more) potential keyword arguments for the function. (In fact, the plotting methods in the matplotlib library work much this way.)
The technique above is intended for functions that do take lots of keyword arguments. Functions taking only a handful are better off declaring them as formal parameters. For one thing, this documents the function better. It also provides some argument-checking. The above silently allows unknown keyword arguments.
That can be a feature if the function should be blind to unknown keywords. An example of this might be a function that passes the kwargs parameter to a sub-function that may use keywords the parent function doesn’t. Another example is future/past compatibility where users of the function may pass obsolete or updated keywords.
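The pass-through case might look something like the sketch below. The plot_chart, draw_title, and draw_axes functions are hypothetical, as is the colour keyword, which none of them uses and which is silently ignored.

```python
def draw_title(**kwargs):
    """Hypothetical helper that only looks at 'title'."""
    title = str(kwargs['title']) if 'title' in kwargs else 'A Chart'
    return f"title: {title}"

def draw_axes(**kwargs):
    """Hypothetical helper that only looks at 'ymax'."""
    ymax = float(kwargs['ymax']) if 'ymax' in kwargs else 10.0
    return f"axes up to {ymax}"

def plot_chart(**kwargs):
    # The parent never inspects kwargs; it forwards the whole dict,
    # unpacked again with ** so each helper receives keyword arguments.
    return [draw_title(**kwargs), draw_axes(**kwargs)]

parts = plot_chart(title='My Data', ymax=100, colour='red')
print(parts)
```

Each helper picks out the keywords it knows and ignores the rest, so the parent can stay unaware of what its sub-functions accept.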
If all passed keyword arguments must be valid, the technique above requires they be explicitly checked. I’ll leave that for a future post.
The code above does a typecast on its arguments to ensure they’re the right data type. This is optional if callers are certain to pass the right data types but can be necessary if all passed arguments are strings (say from the command line or a configuration text file).
One problem with how the above is written is the double-use of both kwargs and the parameter name. Writing the same name twice in one expression means the two copies have to be kept in sync. The double-use of the parameter name is especially problematic because it’s inside a string, so no error of any kind occurs if one copy is misspelled; the function just quietly falls back to the default or fails with a KeyError at run time (whereas misspelling kwargs raises a NameError as soon as the line runs).
There’s a handy way to avoid that. That’s where I’ll pick up next time.
Link: Zip file containing all code fragments used in this post.
This post is: Simple Python Tricks #6