The previous post in this Simple Tricks series began an exploration into function parameters and arguments. (Parameters are what a function declares it takes. Arguments are values passed at runtime.)
We covered optional arguments with defaults and variable arity functions. We left off with variable arity keyword parameters and a synchronization problem.
The last example looked something like this:
001| def plot_data (*data, **kwargs):
002|     title = str(kwargs['title']) if 'title' in kwargs else 'A Chart'
003|     …
004|
005| plot_data(title='A Chart', size=12)
006|
To focus on specifics, I’ve removed all but one of the kwargs-handling statements (line #2). The rest all follow the same pattern.
Note how the caller (line #5) passes no arguments for the *data parameter. All arguments are optional when using *args or **kwargs parameters.
Additionally, without added code, neither parameter limits input. The *args parameter accepts zero to any number of positional arguments, and the **kwargs parameter does likewise with keyword arguments. For instance, the vector_object(*args) function defined last time accepts (but ignores) more than three arguments.
I pointed out last time how line #2 above has a synchronization problem. The kwargs variable is used twice, as is the parameter name. The latter is especially an issue because it’s inside a string, so the compiler won’t complain about a misspelling as it would with kwargs.
The solution is to define a function that encapsulates the repetition. We’re replacing a single line of code, so a lambda function is a natural choice to implement this:
002| KWARG = lambda nam,dflt,ty,va: ty(va[nam]) if nam in va else dflt
003|
The KWARG function has four positional parameters, all required. First, the name of the keyword parameter, always a string. Second, a default value for the variable in case the keyword doesn’t exist (the default is returned as-is, without conversion). Third, the data type (a callable such as str, int, or float) used to convert a presumed string argument. Fourth, the dictionary object containing the keyword arguments (typically kwargs).
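For readers who prefer a named function over a lambda, here is a minimal sketch of the same logic written with def. The name kwarg_value, its parameter names, and the sample options dictionary are my own for illustration, not part of the original code; the behavior is intended to match KWARG.

```python
def kwarg_value(name, default, cast, mapping):
    """Return cast(mapping[name]) if name is present, else the default as-is."""
    if name in mapping:
        return cast(mapping[name])
    return default

# Hypothetical keyword arguments, as a function would receive them in kwargs.
options = {'title': 'My Data', 'size': '12'}

title = kwarg_value('title', 'A Chart', str, options)   # present: converted with str
size = kwarg_value('size', 8, int, options)             # present: '12' becomes 12
ymax = kwarg_value('ymax', 10.0, float, options)        # absent: default returned

print(title, size, ymax)
```

The only practical difference is that a def gives the function a real name in tracebacks and room for a docstring.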
Now we can rewrite our function in a less problematic way:
002|
003| def plot_data (*args, **kwargs):
004|     '''Data plotting function.'''
005|
006|     title  = KWARG('title' , 'A Chart', str, kwargs)
007|     size   = KWARG('size'  , 8   , int, kwargs)
008|     ymax   = KWARG('ymax'  , 10.0, float, kwargs)
009|     yticks = KWARG('yticks', 1.0, float, kwargs)
010|     …
011|     print(f'{title = }')
012|     print(f'{size = }')
013|     print(f'{ymax = }')
014|     print(f'{yticks = }')
015|     print()
016|
017| data1 = [1, 2, 3, 4, 5]
018| data2 = [9, 8, 7, 6, 5]
019| plot_data(data1, data2, title='My Data', size=12, ymax=100)
020|
When run, this prints:
title = 'My Data'
size = 12
ymax = 100.0
yticks = 1.0
Note the use of Python “f-strings” to print the variable values. If these are unfamiliar to you, stay tuned for the next post. I’ll be exploring formatted output including these.
The KWARG function makes using **kwargs much less error-prone and easier to read. It does assume the input arguments are strings, which makes the technique especially useful for parsing command line arguments or other commands given entirely as text. For functions called with known data types, though, it introduces an extraneous typecast. Because that cast adds a form of type checking and coercion, it can be seen as a feature or a bug.
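To make that trade-off concrete, the sketch below shows what the cast does with a few inputs. The KWARG definition is repeated so the snippet runs on its own; the sample dictionaries are made up for illustration.

```python
KWARG = lambda nam, dflt, ty, va: ty(va[nam]) if nam in va else dflt

# A string argument is converted, which is the intended use.
from_text = KWARG('size', 8, int, {'size': '12'})

# A float passed for an int parameter is silently truncated by int().
truncated = KWARG('size', 8, int, {'size': 12.7})

# A string that doesn't parse raises ValueError from the cast itself.
try:
    KWARG('size', 8, int, {'size': 'large'})
    rejected = False
except ValueError:
    rejected = True

# The default is returned as-is; it is NOT passed through the cast.
untouched = KWARG('size', '8', int, {})

print(from_text, truncated, rejected, repr(untouched))
```

Whether the silent truncation in the second case is welcome depends on the caller, which is the "feature or bug" question in a nutshell.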
As pointed out above, using the *args or **kwargs parameter means no built-in default values and no automatic checking for missing or unexpected arguments. The KWARG function provides default values and enables optional arguments. But it can’t check for required arguments, and it doesn’t notice unexpected arguments.
There are many ways to handle this with some extra code. One is to define a class that automatically processes keyword arguments based on a user-supplied list. Instances of the class will have — as attributes — all the input properties specified by the function. We can include code to deal with required and unexpected arguments.
Here’s one way to implement it:
001| class ParamsBase:
002|     '''Function parameters object base class.'''
003|     KeyWords = {} # subclass must override!
004|
005|     def __init__ (self, **kwargs):
006|         assert 0 < len(self.KeyWords)
007|         self._params = kwargs
008|
009|         # Turn valid arguments into attributes…
010|         # For each keyword in kwargs…
011|         for kw in kwargs:
012|             # If it's not a known keyword…
013|             if kw not in self.KeyWords:
014|                 # Raise an exception…
015|                 raise ValueError(f'Argument unknown: "{kw}"')
016|
017|             # Get the keyword data type…
018|             datatype,_,_ = self.KeyWords[kw]
019|
020|             # Create an attribute for the keyword and argument…
021|             setattr(self, kw, datatype(kwargs[kw]))
022|
023|         # Turn missing parameters into attributes (w/ defaults)…
024|         # For each keyword in KeyWords…
025|         for kw in self.KeyWords:
026|             # If we haven't created an attribute yet…
027|             if not hasattr(self, kw):
028|
029|                 # Get the keyword default value and required flag…
030|                 _,default,req = self.KeyWords[kw]
031|                 if req:
032|                     raise ValueError(f'Argument missing: "{kw}"')
033|
034|                 # Create an attribute for the argument…
035|                 setattr(self, kw, default)
036|
The ParamsBase class constructor (the __init__ method) uses two loops to process the keyword arguments. Note this is an abstract base class not meant to be instantiated directly. (The assert statement on line #6 fails while KeyWords is empty, which prevents creating instances of the base class itself.)
The first loop (lines #9 through #21) checks each provided keyword against those named in KeyWords (which a subclass must populate; see below). Keywords found are turned into instance attributes. Keywords not found raise a ValueError.
The second loop (lines #23 through #35) goes through the official list to see if any inputs were not supplied. If a missing input is flagged as required, the loop raises a ValueError. Otherwise, the input is made into an attribute using the supplied default.
Users of the abstract base class must create a subclass defining the list of desired parameters along with their data type, default value, and required flag:
002|
003| class MyParams (ParamsBase):
004|     KeyWords = {
005|         'title': (str,'<untitled>', True),
006|         'size': (int, 5, False),
007|         'xmin': (float, 0.0, False),
008|         'xmax': (float, 100.0, True),
009|         'ymin': (float, 0.0, False),
010|         'ymax': (float, 100.0, True),
011|         'xticks': (int, 10, False),
012|         'yticks': (int, 10, False),
013|     }
014|
015| def plot_data (*args, **kwargs):
016|     '''Data plotting function.'''
017|
018|     ps = MyParams(**kwargs)
019|     …
020|     print(f'Chart: "{ps.title}" (size {ps.size})')
021|     print(f'X-Axis: {ps.xmin:+.2f} x {ps.xmax:+.2f} (ticks={ps.xticks})')
022|     print(f'Y-Axis: {ps.ymin:+.2f} x {ps.ymax:+.2f} (ticks={ps.yticks})')
023|     print()
024|     …
025|
026| plot_data(title='My Data',size=9, xmin=-1,xmax=1, ymin=-1,ymax=1)
027|
The MyParams subclass of ParamsBase only needs to redefine the KeyWords class attribute (lines #3 through #13). Functions can use the subclass to parse the supplied keyword arguments (line #18). The returned object, ps, has attributes named after each anticipated keyword argument. (These are referenced in the print statements on lines #20 through #22.)
Note that, in the KeyWords dictionary, each keyword defined attaches to a tuple containing, respectively, the datatype of the attribute, a default value, and a boolean flag indicating whether the argument is required. As implemented, ParamsBase raises a ValueError on missing required and unexpected arguments.
When run, this prints:
Chart: "My Data" (size 9)
X-Axis: -1.00 x +1.00 (ticks=10)
Y-Axis: -1.00 x +1.00 (ticks=10)
The ps object in plot_data (line #18) has as attributes all items listed in the KeyWords dictionary (with the attribute name being the dictionary keyword name). Note that each keyword is associated with a tuple of data type, default value, and required flag.
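To see both checks in action, here is a condensed sketch. ParamsBase is repeated (without its comments and without the _params attribute) so the snippet runs on its own; DemoParams, its shortened keyword list, and the messages list are invented for this demonstration.

```python
class ParamsBase:
    '''Condensed copy of the ParamsBase class described above.'''
    KeyWords = {}  # subclass must override

    def __init__(self, **kwargs):
        assert 0 < len(self.KeyWords)
        # Known keywords become attributes; unknown ones raise.
        for kw in kwargs:
            if kw not in self.KeyWords:
                raise ValueError(f'Argument unknown: "{kw}"')
            datatype, _, _ = self.KeyWords[kw]
            setattr(self, kw, datatype(kwargs[kw]))
        # Missing keywords get their default, unless flagged as required.
        for kw in self.KeyWords:
            if not hasattr(self, kw):
                _, default, req = self.KeyWords[kw]
                if req:
                    raise ValueError(f'Argument missing: "{kw}"')
                setattr(self, kw, default)

class DemoParams(ParamsBase):
    # Hypothetical, shortened parameter list for this demonstration.
    KeyWords = {
        'title': (str, '<untitled>', True),
        'size':  (int, 5, False),
    }

ok = DemoParams(title='My Data')
print(ok.title, ok.size)    # size falls back to its default

# First call omits the required title; second passes an unknown keyword.
messages = []
for bad_kwargs in ({'size': 9}, {'title': 'My Data', 'colour': 'red'}):
    try:
        DemoParams(**bad_kwargs)
    except ValueError as err:
        messages.append(str(err))

print(messages)
```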
If we want to allow unexpected keyword arguments, then the base class can stash them in an extras dictionary rather than raising an exception:
001| class ParamsBase:
002|     '''Function parameters object base class.'''
003|     KeyWords = {} # subclass must override!
004|
005|     def __init__ (self, **kwargs):
006|         assert 0 < len(self.KeyWords)
007|         self._params = kwargs
008|         self.extras = {}
009|
010|         # Turn valid arguments into attributes…
011|         # For each keyword in kwargs…
012|         for kw in kwargs:
013|             # If it's not a known keyword…
014|             if kw not in self.KeyWords:
015|                 # Stash unexpected arguments in extras…
016|                 self.extras[kw] = kwargs[kw]
017|                 continue
018|
019|             # Get the keyword data type…
020|             datatype,_,_ = self.KeyWords[kw]
021|
022|             # Create an attribute for the keyword and argument…
023|             setattr(self, kw, datatype(kwargs[kw]))
024|
025|         # Turn missing parameters into attributes (w/ defaults)…
026|         # For each keyword in KeyWords…
027|         for kw in self.KeyWords:
028|             # If we haven't created an attribute yet…
029|             if not hasattr(self, kw):
030|
031|                 # Get the keyword default value and required flag…
032|                 _,default,req = self.KeyWords[kw]
033|                 if req:
034|                     raise ValueError(f'Argument missing: "{kw}"')
035|
036|                 # Create an attribute for the argument…
037|                 setattr(self, kw, default)
038|
This version adds the extras attribute (line #8) and uses it to store any unexpected input arguments. Compare lines #14 through #17 here with lines #13 through #15 in the first version. A client wanting to disallow unexpected arguments can raise its own exception if the dictionary has a length greater than zero.
This version only raises ValueError for missing required arguments. (That check can be effectively disabled by setting the boolean value in the KeyWords parameter definition to False.)
Using this new version, we can re-implement our plot_data function with one minor change (in line #19, the check of the extras attribute):
002|
003| class MyParams (ParamsBase):
004|     KeyWords = {
005|         'title': (str,'<untitled>', True),
006|         'size': (int, 5, False),
007|         'xmin': (float, 0.0, False),
008|         'xmax': (float, 100.0, True),
009|         'ymin': (float, 0.0, False),
010|         'ymax': (float, 100.0, True),
011|         'xticks': (int, 10, False),
012|         'yticks': (int, 10, False),
013|     }
014|
015| def plot_data (*args, **kwargs):
016|     '''Data plotting function.'''
017|
018|     ps = MyParams(**kwargs)
019|     assert len(ps.extras)==0, ValueError('Unexpected arguments!')
020|     …
021|     print(f'Chart: "{ps.title}" (size {ps.size})')
022|     print(f'X-Axis: {ps.xmin:+.2f} x {ps.xmax:+.2f} (ticks={ps.xticks})')
023|     print(f'Y-Axis: {ps.ymin:+.2f} x {ps.ymax:+.2f} (ticks={ps.yticks})')
024|     print()
025|     …
026|
Now when we call plot_data, it raises an error for unexpected arguments as well as missing required arguments. This catches, for instance, argument names the user misspelled (which turns them into unexpected arguments).
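One caveat about the assert in the example above: a failed check raises AssertionError (with the ValueError instance as its message), and Python skips assert statements entirely when run with the -O option. If a genuine ValueError is wanted, an explicit test is one alternative. A minimal sketch, where reject_extras is a hypothetical helper name and ps is a stand-in object rather than a real MyParams instance:

```python
from types import SimpleNamespace

def reject_extras(ps):
    """Raise ValueError naming any unexpected keyword arguments."""
    if ps.extras:
        names = ', '.join(sorted(ps.extras))
        raise ValueError(f'Unexpected arguments: {names}')

# Stand-in for a params object; a real one would come from MyParams(**kwargs).
ps = SimpleNamespace(extras={'titel': 'My Data'})   # note the misspelled keyword

try:
    reject_extras(ps)
except ValueError as err:
    message = str(err)
    print(message)
```

Naming the offending keywords in the message makes a misspelling easy to spot.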
You can define the MyParams class at the module level if multiple functions can use it. Otherwise, you can define it inside the function that uses it:
002|
003| def plot_data (*args, **kwargs):
004|     '''Data plotting function.'''
005|
006|     class params (ParamsBase):
007|         KeyWords = {
008|             'title': (str,'<untitled>', True),
009|             'size': (int, 5, False),
010|             'xmin': (float, 0.0, False),
011|             'xmax': (float, 100.0, True),
012|             'ymin': (float, 0.0, False),
013|             'ymax': (float, 100.0, True),
014|             'xticks': (int, 10, False),
015|             'yticks': (int, 10, False),
016|         }
017|
018|     ps = params(**kwargs)
019|     assert len(ps.extras)==0, ValueError('Unexpected arguments!')
020|     …
021|     print(f'Chart: "{ps.title}" (size {ps.size})')
022|     print(f'X-Axis: {ps.xmin:+.2f} x {ps.xmax:+.2f} (ticks={ps.xticks})')
023|     print(f'Y-Axis: {ps.ymin:+.2f} x {ps.ymax:+.2f} (ticks={ps.yticks})')
024|     print()
025|     …
026|
Either way works depending on whether you’re reusing the subclass. The key point is that, with this technique, your function can proceed knowing that it has all inputs needed, either provided by the caller or from default values. Where necessary, your function can also assume unexpected arguments have been detected and/or that missing required arguments have been detected.
Because the KWARG function expects string arguments (which it casts to the desired data type), it’s especially helpful for processing command line arguments or string arguments read from a text file. The caveat is that these need to be name=value pairs of some kind to fit the keyword argument mold. This is often a natural fit, especially when the arguments are a set of diverse parameters.
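As a rough illustration of that name=value idea, the sketch below splits a list of text tokens into a dictionary and then hands it to KWARG. The parse_pairs helper and the sample tokens are hypothetical; in practice the tokens might come from sys.argv[1:] or from the lines of a text file.

```python
KWARG = lambda nam, dflt, ty, va: ty(va[nam]) if nam in va else dflt

def parse_pairs(tokens):
    """Split 'name=value' strings into a dict of string values."""
    pairs = {}
    for token in tokens:
        name, sep, value = token.partition('=')
        if not sep:
            raise ValueError(f'Expected name=value, got: "{token}"')
        pairs[name.strip()] = value.strip()
    return pairs

# Hypothetical input, all of it plain text.
tokens = ['title=My Data', 'size=12', 'ymax=100']
settings = parse_pairs(tokens)

title = KWARG('title', 'A Chart', str, settings)
size = KWARG('size', 8, int, settings)
ymax = KWARG('ymax', 10.0, float, settings)
yticks = KWARG('yticks', 1.0, float, settings)   # not supplied: default used

print(title, size, ymax, yticks)
```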
The name=value approach doesn’t fit positional command line arguments, though (for example, a list of filenames). The last vector_object definition in the previous post showed how to handle positional arguments. Here are the relevant lines again for reference:
001| def vector_object (*args):
002|     x = float(args[0]) if 0 < len(args) else 0.0
003|     y = float(args[1]) if 1 < len(args) else 0.0
004|     z = float(args[2]) if 2 < len(args) else 0.0
005|
006|     return (x,y,z)
007|
As with **kwargs handling, the formulation here also repeats both the argument index and the name of the argument list object. We can again hide this away in a function:
002| ARG = lambda ix,dflt,ty,va: ty(va[ix]) if ix < len(va) else dflt
003|
In both cases, we only need to get it right one time — here in their definitions — and then we never have to worry about synchronization again. Now we can rewrite functions like vector_object in a less error-prone, much clearer way:
002|
003| def vector_object (*args):
004|     x = ARG(0, 0.0, float, args)
005|     y = ARG(1, 0.0, float, args)
006|     z = ARG(2, 0.0, float, args)
007|
008|     return (x,y,z)
009|
010| v0 = vector_object()
011| v1 = vector_object(21)
012| v2 = vector_object(21, 42)
013| v3 = vector_object(21, 42, 63)
014| print(f'{v0 = }')
015| print(f'{v1 = }')
016| print(f'{v2 = }')
017| print(f'{v3 = }')
018| print()
019|
When run, this prints:
v0 = (0.0, 0.0, 0.0)
v1 = (21.0, 0.0, 0.0)
v2 = (21.0, 42.0, 0.0)
v3 = (21.0, 42.0, 63.0)
Thus, callers can supply anywhere from zero to all three arguments.
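As noted earlier, *args by itself doesn't limit how many values a caller passes, so a fourth or fifth argument is silently ignored. If that's not wanted, a length check is one option. A minimal sketch, assuming that more than three values should be treated as an error (that policy, and the choice of TypeError, are mine):

```python
ARG = lambda ix, dflt, ty, va: ty(va[ix]) if ix < len(va) else dflt

def vector_object(*args):
    # Assumed policy for this sketch: more than three values is an error.
    if 3 < len(args):
        raise TypeError(f'Expected at most 3 arguments, got {len(args)}')
    return tuple(ARG(ix, 0.0, float, args) for ix in range(3))

v2 = vector_object(21, 42)
print(v2)

try:
    vector_object(1, 2, 3, 4)
    too_many = None
except TypeError as err:
    too_many = str(err)
    print(too_many)
```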
Lastly, let’s look at the actual command line for Python scripts. (The same ideas apply to arguments read from a configuration text file.)
Command line arguments may be positional or have keywords. There are many strategies depending on the developer’s requirements, background, and imagination. The simplest form is positional.
Here’s one way to handle positional command line run-time arguments:
001| from sys import argv
002| from examples import ARG
003|
004| Frequency = float()
005| Amplitude = float()
006| Wavephase = float()
007|
008| …
009|
010| def main ():
011|     '''Module entry function.'''
012|     print(f'{Frequency = }')
013|     print(f'{Amplitude = }')
014|     print(f'{Wavephase = }')
015|     …
016|
017| if __name__ == '__main__':
018|     # First argument is name of this script…
019|     print(argv[0])
020|     print()
021|
022|     # Handle provided command line arguments…
023|     Frequency = ARG(1, 5.0, float, argv)
024|     Amplitude = ARG(2, 1.0, float, argv)
025|     Wavephase = ARG(3, 0.0, float, argv)
026|
027|     # Call main function…
028|     main()
029|
This assumes the command line has from zero to three positional arguments that set the frequency, amplitude, and phase for some imagined application. There are default values for missing arguments.
Firstly, your Python modules should (almost) always have an if block like the one in line #17. Doing this ensures that if some other module imports this module, it doesn’t act as if it had been run, which is almost always the desired behavior.
When you import a module, Python runs it, so everything at the top level gets invoked. If something at that top level causes output or file access or whatever, it’ll happen when the file is imported as well as when actually run.
Since that’s generally not what you want, the test for __name__ being equal to ‘__main__’ — which is only true if this module is explicitly run — allows you to control what happens when the module is imported. If you’re certain the module will never be imported, then this isn’t necessary, but it’s a very good habit to get into.
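To see the effect of the guard without creating two files by hand, here is a small sketch that uses the standard runpy module to execute the same throwaway file twice under different names. The file name demo_module.py and the variable names are made up for this illustration; running under a non-main name only approximates what an import does.

```python
import os
import runpy
import tempfile

# A tiny throwaway module whose guarded block only sets a flag.
SOURCE = '''
GREETING = 'defined at top level'

if __name__ == '__main__':
    RAN_AS_SCRIPT = True
'''

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'demo_module.py')
    with open(path, 'w') as handle:
        handle.write(SOURCE)

    # Executed under another name, roughly what an import does.
    as_import = runpy.run_path(path, run_name='demo_module')
    # Executed as the main script, like "python demo_module.py".
    as_script = runpy.run_path(path, run_name='__main__')

# Top-level code ran both times; the guarded block ran only as a script.
print(as_import['GREETING'], as_script['GREETING'])
print('RAN_AS_SCRIPT' in as_import)
print('RAN_AS_SCRIPT' in as_script)
```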
Secondly, it’s a good idea to have a main function (line #10). This allows another module to import and call the function. (The name “main” is canonical but you can use any name you prefer.)
When this code is run with nothing on the command line, it prints:
C:\users\wyrd\blog\python\script.py

Frequency = 5.0
Amplitude = 1.0
Wavephase = 0.0
Which are, of course, the hard-coded default values.
When run with all three supplied parameters like this:
PYTHON.EXE script.py 12.2 0.75 -30
It prints:
C:\users\wyrd\blog\python\script.py

Frequency = 12.2
Amplitude = 0.75
Wavephase = -30.0
That’s all for this time. Next time, Simple Tricks will look at formatted output (including the new “f-strings”). Until then, happy coding!
Link: Zip file containing all code fragments used in this post.
∅
This post is: Simple Python Tricks #7
At the end of the post, I mentioned how another module could call the main function shown in the last example. Here’s how:
001| import script
002|
003| script.Frequency = 8.25
004| script.Amplitude = 1/3
005| script.Wavephase = 180
006|
007| script.main()
008|