This is the third post in a series for those who have never used Python but have used a programming language before. These posts are meant as an introduction to this delightful and popular programming language.
The first post introduced Python’s most basic data types; the second post introduced its more interesting list-like data types. In this post, we’ll start digging deeper into those list-like data types and what they can do.
In Python, list and list-like objects are as central and common as sand on a beach. We call such an object an iterable. Remember that Python uses “duck typing” — if something quacks, waddles, swims, and likes breadcrumbs, it’s a duck. More precisely, something we can treat as a duck.
For an object to be an iterable, it must support iteration. It can do this in two ways: by providing an iterator object or by providing a numeric indexing mechanism (starting with zero). If an object provides either or both, Python considers it a list-like object: an iterable.
An actual list object is the epitome of iterable objects — the ultimate list-like object.
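As a quick, hedged illustration (the variable names are mine, just for this sketch), the built-in iter function hands back an iterator for anything Python considers iterable, and raises a TypeError for anything it doesn't:

```python
# Each of these built-in types is iterable: iter() hands back an iterator.
first_items = {}
for candidate in ([1, 2, 3], "abc", (4, 5), {"x": 1}):
    iterator = iter(candidate)
    first_items[type(candidate).__name__] = next(iterator)
print(first_items)

# An int is not iterable, so iter() raises a TypeError.
try:
    iter(42)
    int_is_iterable = True
except TypeError:
    int_is_iterable = False
print(int_is_iterable)
```

Note that iterating over a dict yields its keys, which is why the first item for the dict is the key rather than the value.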
There is a lot to unpack here, and it’s the main topic of this post. We begin with the basics: Python objects. What exactly is a Python “object” or “data type”? I’ve used the terms frequently in the last two posts. It’s time to define them more precisely.
The first one is fairly easy because every object in Python is a Python object. That sounds redundant, but the word “object” has two meanings in software design. Firstly (as with “every object”), it can mean what most people mean when they use the word: any old object. A car, a toy, a cloud, a thought, a musical note, a math equation; these are all objects. A code object in this sense is any distinct thing — a function, a variable, etc. — something with a name.
Secondly, and more relevant here, a software object (in the object-oriented design sense) is a self-contained unit of data along with code that knows how to process that data. Generally, such objects share common code but encapsulate specific instance data.
In Python, all code objects are of this second type.
They are defined by data types or classes. All integer values are of class int; all floating-point values are of class float; all string values are of class str; and so on. In each case, the data type is a class that can create instances of its type. A common metaphor sees a class as a cookie cutter stamping out instances (cookies).
In the last two posts, we saw some examples:
002|
003| n = int() # int class creates an int instance
004|
005| f = float() # float class creates a float instance
006|
007| s = str() # str class creates a str instance
008|
009| l = list() # list class creates a list instance
010|
011| t = tuple() # tuple class creates a tuple instance
012|
013| d = dict() # dict class creates a dict instance
014|
Each uses the class to create instances. Since no arguments are provided (in the parentheses), the resulting instances have class-dependent default values (zero for numbers and empty containers for the list-like objects).
This code works exactly the same:
002|
003| n = 0 # int literal creates an int instance
004|
005| f = 0.0 # float literal creates a float instance
006|
007| s = "" # str literal creates a str instance
008|
009| l = [] # list literal creates a list instance
010|
011| t = () # tuple literal creates a tuple instance
012|
013| d = {} # dict literal creates a dict instance
014|
And is probably what most coders would write (less typing).
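If you want to convince yourself the two forms really are equivalent, here is a small sketch (my own variable names) that compares each no-argument constructor call with its literal, then uses the built-in type and isinstance functions to ask an object about its class:

```python
# The constructor with no arguments and the literal give equal values.
pairs = [
    (int(), 0),
    (float(), 0.0),
    (str(), ""),
    (list(), []),
    (tuple(), ()),
    (dict(), {}),
]
for made, literal in pairs:
    print(type(made).__name__, made == literal)

# type() reports an object's class; isinstance() tests membership.
n = 0
print(type(n))
print(isinstance(n, int))
```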
Regardless of how created, the Python objects (which are always instances of some class) are bound to variable names. They have methods (aka functions) that manage their instance data. These methods are invoked with the dot operator.
Here are some examples of invoking (aka calling) methods, in this case on a string object bound to the variable named hello:
002|
003| hello = "Hello, World!"
004|
005| print(hello)
006| print(hello.upper())
007| print(hello.lower())
008| print()
009|
010| print(hello.startswith("He"))
011| print(hello.startswith("Hi"))
012| print()
013|
014| print(hello.endswith("world."))
015| print(hello.endswith("World!"))
016| print()
017|
018| print(hello.split())
019| print(hello.split(","))
020| print()
021|
022| print(hello.find("ll"))
023| print(hello.find("oo"))
024| print()
025|
026| print(hello.count("l"))
027| print(hello.count("o"))
028| print()
029|
030| print(hello.isalpha())
031| print(hello.isalnum())
032| print(hello.isdigit())
033| print()
034|
035| print(hello.replace(' ', '+'))
036| print()
037|
When run, this prints:
Hello, World!
HELLO, WORLD!
hello, world!
True
False
False
True
['Hello,', 'World!']
['Hello', ' World!']
2
-1
3
2
False
False
False
Hello,+World!
We met the first two string methods (lines #6 & #7) in the first post.
The startswith method (lines #10 and #11) and endswith method (lines #14 and #15) test the respective ends of the string for a match and return True or False. The split method (lines #18 and #19) by default splits on whitespace to break the string into a list of words. An optional parameter (line #19) allows specifying a different delimiter.
The find method (lines #22 and #23) returns the index of the first occurrence of the provided substring (or -1 if the substring isn’t found). The count method (lines #26 and #27) counts the number of occurrences of the provided substring.
Three of the various string test methods are illustrated in lines #30 to #32. Lastly, the replace method allows replacing a given substring in the string with a different substring.
There are many more methods supported by strings.
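If you are curious how many more, one way to peek (a sketch, not the only way) is the built-in dir function, which lists the names an object or class supports. Here I filter out the names starting with an underscore to leave just the everyday methods:

```python
# List the public method names defined on the str class.
public_methods = [name for name in dir(str) if not name.startswith("_")]

print(len(public_methods))
print(public_methods[:5])

# The methods used in the example above are all in there.
print("startswith" in public_methods)
print("replace" in public_methods)
```

The exact count depends on your Python version, so I haven't shown it here.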
Every instance of data has a class behind it with methods that act on the data. This is true even of integer and floating-point objects:
002|
003| n = 12345678901234567890
004|
005| print(n)
006| print(n.bit_length())
007| print(n.bit_count())
008| print(*n.to_bytes(8, byteorder='little'))
009| print()
010|
011| f = 2.718281828
012|
013| print(f)
014| print(f.hex())
015| print(f.as_integer_ratio())
016| print(f.is_integer())
017| print()
018|
When run, this prints:
12345678901234567890
64
32
210 10 31 235 140 169 84 171
2.718281828
0x1.5bf0a8b04919bp+1
(6121026513834395, 2251799813685248)
False
Don’t worry if this seems a lot to take in. Remember, these posts are just a familiarization tour. You don’t need to memorize any of this. For one thing, Python documentation is excellent. For another, as you work with Python (if you work with Python), you’ll learn as you go. (And if you don’t work with Python, there’s no need to remember anything anyway.)
The dot operator is generally used on variables, as shown in the last two examples. While it’s not common, we can use the dot operator on some literals:
002|
003| print("This is a test")
004| print("This is a test".count("s"))
005| print("This is a test".count("is"))
006| print("This is a test".count("xyz"))
007| print()
008|
009| print([1, 2, 3, 1, 2, 3])
010| print([1, 2, 3, 1, 2, 3].count(3))
011| print([1, 2, 3, 1, 2, 3].count(4))
012| print()
013|
014| print({"x":2.1, "y":4.2})
015| print({"x":2.1, "y":4.2}.get("x", -1))
016| print({"x":2.1, "y":4.2}.get("z", -1))
017| print()
018|
When run, this prints:
This is a test
3
2
0
[1, 2, 3, 1, 2, 3]
2
0
{'x': 2.1, 'y': 4.2}
2.1
-1
We cannot use the dot operator directly on an integer literal because Python sees the dot as a decimal point in that context. (Wrapping the literal in parentheses avoids the ambiguity.)
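For the curious, here is a small sketch of the usual workarounds for numbers (variable names are mine). Parentheses, or a space before the dot, stop Python from reading the dot as a decimal point. A float literal already contains its decimal point, so a second dot is unambiguous:

```python
# Parentheses make it clear the dot is not a decimal point.
bits = (255).bit_length()

# A space before the dot works too, though it looks odd.
also_bits = 255 .bit_length()

# A float literal already has its decimal point, so this parses fine.
whole = 2.0.is_integer()

print(bits, also_bits, whole)
```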
The takeaway here is that all objects in Python are instances of some class. Even the classes themselves are instances of a built-in generic type class. In all cases they support a set of methods defined for their instances.
We’ll explore this in detail down the road (with a quick peek at the end of this post). For now, the key takeaways are:
• Firstly, that everything in Python is an object that is an instance of some class (aka data type). That class defines methods (aka functions) for interacting with the instance data.
• Secondly, given some class, the general syntax is:
-- Create a new instance object. --
instance = class(arguments)

-- Invoke a method on the instance object. --
instance.method(arguments)

-- Invoke a method on the class object. --
class.method(arguments)
In all three cases, depending on class or instance, arguments may be required, optional, or non-existent (i.e. none defined). Note that all three examples above are indeed what they look like: function calls with possible arguments.
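Here is a hedged sketch of those three forms using only built-in classes (the variable names are mine). The last two lines both call a method on the class object itself: the first passes the instance explicitly, the second builds a new instance:

```python
# instance = class(arguments)
text = str(3.5)

# instance.method(arguments)
padded = text.zfill(6)

# class.method(arguments), passing the instance explicitly
shouted = str.upper("abc")

# class.method(arguments), where the method builds a new instance
keys = dict.fromkeys(["x", "y"], 0)

print(text, padded, shouted, keys)
```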
As in many languages, the syntax function(arguments) invokes the named function and passes the provided arguments. If a function doesn’t take arguments, the syntax reduces to function() — the key syntactical indicator being a variable name followed by parentheses.
In Python, we can call any object that is callable:
002| # Note: Generates an error because class isn’t defined!
003|
004| # Use the class to create a new instance object…
005| thing = MyFunctionClass()
006|
007| # Call the object like a function (no args)…
008| thing()
009|
010| # Call the object like a function (w/ args)…
011| thing(21, 42, 63)
012|
This code fragment raises an Exception at line #5 because MyFunctionClass is not defined, but we’ll assume it creates a callable object — one we can treat as a function. In lines #8 and #11, we invoke (aka call) the thing object, the first time with no arguments, the second time with three.
Note that the class defines what happens when one of its instances is called. (Below we’ll explore what this means and how it’s accomplished.)
Remember: Using the class name as a function (as in the very first example above or as in line #5 directly above) is a special case where the class object acts as a constructor for making new instance objects of that class. As seen in previous examples, these often take arguments that contribute to the construction of the instance object.
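If you are ever unsure whether something can be called, the built-in callable function will tell you. A minimal sketch (the dictionary and its keys are just my way of labeling the results):

```python
# Functions, classes, and bound methods are callable; plain data is not.
checks = {
    "len": callable(len),
    "int": callable(int),
    "'abc'.upper": callable("abc".upper),
    "42": callable(42),
    "[]": callable([]),
}
for label, answer in checks.items():
    print(label, answer)
```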
Above I said Python objects can be list-like in two ways: by providing an iterator object or by providing numerical indexing. Both are illustrated in this example:
002| # Note: Generates an error because class isn’t defined!
003|
004| # Create a list-like object…
005| obj = MyListLikeClass()
006|
007| # Find out how many items are in object…
008| n = len(obj)
009|
010| # Get the first item in object…
011| x = obj[0]
012|
013| # Get an iterator for the object…
014| i = iter(obj)
015| x = next(i)
016|
This code fragment also raises an Exception at line #5 because we have not defined MyListLikeClass, but we’ll assume it creates a list-like object.
This means it can have a length obtainable with the built-in len function. Line #8 uses this function to get the object’s length (an integer) and bind it to the variable named n.
Access to individual items can be available through numerical indexing. Line #11 indexes the first item and binds it to variable x. [See the previous post for more on numerical indexing.]
Access to the list of items can be available through an iterator provided by the object via the built-in iter function. Line #14 uses this function to get an iterator from the object and bind it to the variable i. Line #15 uses the built-in next function to get the next item in the list — in this case, the first — and bind it to variable x.
I stressed the word can in the above three paragraphs because whether the object supports these operations depends on the class design. When we create our own classes, we can support any, all, or none of the above operations. When we provide an iterator and/or support numerical indexing, Python sees the object as list-like.
In the same vein, we can support making an object callable — or not. When we do, Python lets us treat the object as a callable function.
In all cases, this is done by providing the class with methods that respond to the operations above. For example, a “call” method or a “length” method. Below is a Python class definition to show what this entails. This is just an illustration. We’ll return to class design in detail down the road.
002|
003| class MyListLikeCallableClass:
004|     '''Example list-like and callable class.'''
005|
006|     def __init__(self, x=0.0, y=0.0, z=0.0):
007|         '''Create new instance object.'''
008|         self.x = x
009|         self.y = y
010|         self.z = z
011|
012|     def __call__(self):
013|         '''Call the object like a function.'''
014|         return sum(list(self))
015|
016|     def __iter__(self):
017|         '''Return an iterator over the object.'''
018|         return iter([self.x, self.y, self.z])
019|
020|     def __len__(self):
021|         '''Return the length (always 3).'''
022|         return 3
023|
024|     def __getitem__(self, ix):
025|         '''Return an indexed item.'''
026|         if ix == 1: return self.x
027|         if ix == 2: return self.y
028|         if ix == 3: return self.z
029|         raise IndexError(f'Invalid index: {ix}')
030|
031|     def __setitem__(self, ix, value):
032|         '''Set the value of an indexed item.'''
033|         if ix == 1: self.x = value; return
034|         if ix == 2: self.y = value; return
035|         if ix == 3: self.z = value; return
036|         raise IndexError(f'Invalid index: {ix}')
037|
038|     def __delitem__(self, ix):
039|         '''Delete an indexed item. (Not Allowed!)'''
040|         raise NotImplementedError('Sorry, unable to delete items.')
041|
042|     def __str__(self):
043|         '''Return a string version.'''
044|         return f'[{self.x}, {self.y}, {self.z}]'
045|
046|
047| obj0 = MyListLikeCallableClass()
048| obj1 = MyListLikeCallableClass(2.1, 4.2, 6.3)
049| print(f'{obj0}')
050| print(f'{obj1}')
051| print()
052|
053| print(f'{obj0() = }')
054| print(f'{obj1() = }')
055| print()
056|
057| it0 = iter(obj0)
058| it1 = iter(obj1)
059| print(f'{next(it0) = }, {next(it1) = }')
060| print(f'{next(it0) = }, {next(it1) = }')
061| print(f'{next(it0) = }, {next(it1) = }')
062| print()
063|
064| print(f'{list(obj0) = }')
065| print(f'{list(obj1) = }')
066| print()
067|
068| print(f'{len(obj0) = }')
069| print(f'{len(obj1) = }')
070| print()
071|
072| print(f'{obj0[1] = }, {obj1[1] = }')
073| print(f'{obj1[2] = }, {obj1[2] = }')
074| print(f'{obj1[3] = }, {obj1[3] = }')
075| print()
076|
077| obj0[1] = 1.616
078| obj0[2] = 2.718
079| obj0[3] = 3.141
080| print(f'{obj0}')
081|
Lines #3 through #44 define a new class named MyListLikeCallableClass. Lines #47 to #80 exercise the class using two instance objects, obj0 and obj1. For the former we get a default object (line #47); for the latter we provide initialization values (line #48).
When run, this prints:
[0.0, 0.0, 0.0]
[2.1, 4.2, 6.3]
obj0() = 0.0
obj1() = 12.6
next(it0) = 0.0, next(it1) = 2.1
next(it0) = 0.0, next(it1) = 4.2
next(it0) = 0.0, next(it1) = 6.3
list(obj0) = [0.0, 0.0, 0.0]
list(obj1) = [2.1, 4.2, 6.3]
len(obj0) = 3
len(obj1) = 3
obj0[1] = 0.0, obj1[1] = 2.1
obj1[2] = 4.2, obj1[2] = 4.2
obj1[3] = 6.3, obj1[3] = 6.3
[1.616, 2.718, 3.141]
We’re jumping far ahead here to things we explore in more detail later, but there are a number of new things here I want to point out.
A key one is the “f” in front of many of the strings used for output. This leading “f” makes the string a format string (or “f-string”). In such strings, text enclosed in curly braces is processed as code.
Here’s a simple example:
001| two_pie = 6.28318
002|
003| ultimate = 42
004|
005| the_text = "Don't Panic!"
006|
007| print(f'The answer is {ultimate}.')
008| print(f'The circumference is {two_pie * 12.3}.')
009| print(f'Remember: {the_text}')
010| print()
011|
When run, this prints:
The answer is 42.
The circumference is 77.283114.
Remember: Don't Panic!
For debugging, we can include an equals sign:
002| print(f"{code = }")
003|
Which is where the equals-signs come from in the output above. It’s handy because the code in the curly braces becomes part of the output and automatically identifies what’s printed.
Here’s a simple example:
001| foo = 21
002| bar = 42
003|
004| print(f'{foo=}, {bar=}')
005| print(f'{foo =}, {bar =}')
006| print(f'{foo= }, {bar= }')
007| print(f'{foo = }, {bar = }')
008| print()
009|
010| print(f"{21+42+63=}")
011| print(f"{21 + 42 + 63 = }")
012| print(f"{21 * 42 = }")
013| print(f"{42 / 63 = }")
014| print()
015|
When run, this prints:
foo=21, bar=42
foo =21, bar =42
foo= 21, bar= 42
foo = 21, bar = 42
21+42+63=126
21 + 42 + 63 = 126
21 * 42 = 882
42 / 63 = 0.6666666666666666
Up to now, I’ve generally printed raw values, which had to be matched line-by-line to the print statements in the code fragment above it. Using f-strings makes the output much friendlier. [If you want to jump ahead, see Simple Tricks #8 for more on f-strings.]
Getting back to the class example (lines #3 to #44), note that everything after line #3 is indented at least one level, therefore belongs to the class definition. Within this we define eight methods (using the def keyword). In each case, the method name starts and ends with double underbars. Such names are special in Python, and you’ll get to know them as you dive deeper into the language.
What matters here is that these method names give classes abilities accessible through built-in Python functions or operations. Because they begin and end with double underbars, they are colloquially referred to as “dunder” names.
The first method (lines #6 to #10) is dunder init — the initialization method automatically called when we create a new instance. The one defined here has four parameters: self, x, y, & z. The last three have default values (0.0 in all three cases). These three arguments are assigned to same-name instance variables (aka attributes) on the object.
Note that all eight methods have a self parameter. This is a reference to the object itself, and all instance methods automatically have this parameter. The name “self” is customary but can be any valid variable name. Note that dunder init takes arguments named x, y, & z and also creates attributes on the object with the same names (and values).
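To see what the self parameter is doing, here is a minimal sketch with a hypothetical Greeter class (the class and its names are mine, not part of the example above). Calling a method through the instance is shorthand for calling it through the class and passing the instance as the first argument:

```python
class Greeter:
    """Hypothetical demo class, for illustration only."""

    def __init__(self, name):
        # Store the argument as a same-name attribute on the object.
        self.name = name

    def greet(self):
        return f"Hello, {self.name}!"


g = Greeter("World")

# Python passes g as self automatically...
via_instance = g.greet()

# ...which is equivalent to passing it by hand.
via_class = Greeter.greet(g)

print(via_instance)
print(via_class)
```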
The dunder call method (lines #12 to #14), if implemented, makes objects of this class callable — objects that can be treated as functions (as in lines #53 & #54). As defined here, with only the self parameter, these function objects take no arguments. This demo example uses the built-in sum function to return the sum of the three elements.
The dunder iter method (lines #16 to #18), if implemented, is one way to make an object list-like. More on iterator objects later. For now, suffice to say the method returns an iterator over a list of the three elements. Lines #57 to #61 demonstrate first using the iter function to get an iterator (lines #57 & #58) and then using the next function to iterate through the three members in parallel.
Lines #64 & #65 show how being iterable allows us to use a list context to get a list.
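One thing the example doesn't show is what happens when an iterator runs out. Here is a small sketch using a plain list (variable names are mine): once the items are used up, next raises a StopIteration exception, unless we give it a default value to return instead:

```python
items = ["a", "b"]
it = iter(items)

first = next(it)
second = next(it)

# The iterator is now exhausted, so next() raises StopIteration.
try:
    next(it)
    exhausted = False
except StopIteration:
    exhausted = True

# With a second argument, next() returns that default instead.
fallback = next(it, None)

print(first, second, exhausted, fallback)
```

This is the signal a list context or a loop relies on to know when to stop.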
The dunder len method (lines #20 to #22), if implemented, gives the object a length obtainable with the len function (lines #68 & #69). Generally, when we implement dunder len, we should also implement at least the first (if not all three) of the dunder methods below.
The dunder getitem, dunder setitem, and dunder delitem methods (lines #24 to #40), if implemented, provide the object’s numerical indexing interface. They can also implement an associative list where indexes can be anything desired. To make an object list-like, however, they need to be numeric and begin indexing at zero. (Note that the demo class above numbers its items 1 through 3, so its iteration relies on dunder iter rather than on indexing.)
As their names imply, the first returns a list item, the second sets a list item to a new value, and the third deletes a list item. Lines #72 to #74 illustrate using the dunder getitem method. Lines #77 to #79 illustrate using the dunder setitem method. We don’t want to delete elements, so the dunder delitem method raises an Exception.
Which is another thing to point out. In a number of places, the code raises an Exception (lines #29, #36, and #40). Python has a rich set of built-in Exceptions, plus we can make our own if needed.
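As a hedged illustration of the zero-based indexing route, here is a small sketch of a hypothetical class (ZeroBased is my name, not part of the example above) that defines only dunder len and dunder getitem. Assuming indexes start at zero and an IndexError marks the end, Python can iterate over it with no dunder iter at all:

```python
class ZeroBased:
    """Hypothetical demo class: indexing only, no __iter__."""

    def __init__(self, *items):
        self._items = items

    def __len__(self):
        return len(self._items)

    def __getitem__(self, ix):
        # Only indexes 0 .. len-1 are valid in this sketch.
        if 0 <= ix < len(self._items):
            return self._items[ix]
        raise IndexError(f'Invalid index: {ix}')


obj = ZeroBased(2.1, 4.2, 6.3)

# Python falls back to calling __getitem__ with 0, 1, 2, ...
# and stops when IndexError is raised.
as_list = list(obj)
print(as_list)

# We can also catch the exception ourselves.
try:
    obj[3]
except IndexError as err:
    caught = str(err)
print(caught)
```

So the IndexError here is doing double duty: it reports a bad index to us, and it tells Python where the items end.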
Lastly, the dunder str method (lines #42 to #44), if implemented, lets us control what is displayed when an object is printed or is in a string context:
002|
003| class MyObject:
004|     '''Example class.'''
005|
006|     def __repr__(self):
007|         '''Return a debug string.'''
008|         return f'<{type(self).__name__} @{id(self):012x}>'
009|
010|     def __str__(self):
011|         '''Return a string version of object.'''
012|         return "Hello!"
013|
014|
015| # Create a new MyObject instance…
016| obj = MyObject()
017|
018| # Print the object itself…
019| print(obj)
020|
021| # Use an f-string to print string version…
022| print(f'{obj}')
023| print(f'{obj!s}')
024|
025| # Get and print the object’s string value…
026| s = str(obj)
027| print(s)
028| print()
029|
030| # Use an f-string to print debug version…
031| print(f'{obj = }')
032| print(f'{obj!r}')
033|
034| # Get and print the object’s debug value…
035| r = repr(obj)
036| print(r)
037| print()
038|
The MyObject class (lines #3 to #12) defines only two methods: dunder str (lines #10 to #12) and dunder repr (lines #6 to #8). The latter returns a debug string version of the object accessible via the built-in repr function (line #35).
The built-in str function (line #26) gets the string value of the object. All classes inherit defaults of both string methods, but we should always define our own. (At least define one of them.)
Note how the f-strings can explicitly access the string (line #23) or debug (line #32) versions. When used in an f-string without an equals-sign (line #22), we get the string version; when used with one (line #31), we get the debug version.
When run, this prints:
Hello!
Hello!
Hello!
Hello!
obj = <MyObject @015a8f462900>
<MyObject @01b57b792900>
<MyObject @01b57b792900>
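To show what those inherited defaults look like, here is a minimal sketch with two hypothetical classes (OnlyRepr and Neither are my names). When a class defines only dunder repr, the string version falls back to it. When a class defines neither, we get a generic description that includes the class name and a memory address:

```python
class OnlyRepr:
    """Hypothetical class defining only __repr__."""

    def __repr__(self):
        return '<OnlyRepr>'


class Neither:
    """Hypothetical class defining neither string method."""


only = OnlyRepr()

# With no __str__ defined, str() and f-strings fall back to __repr__.
str_of_only = str(only)
fstring_of_only = f'{only}'
print(str_of_only, fstring_of_only)

# With neither defined, we get the generic inherited version.
default_repr = repr(Neither())
print(default_repr)
```

That fallback is one reason that, if you define only one of the two, dunder repr is a reasonable choice.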
We took a bit of a deep dive here. It was intended only to provide an introduction to how we create classes and their instances in Python.
Key takeaways:
- A data type is a “class” (technically a “class object”) and an “instance object” of that class (or data type) is casually called an “object” (or more formally, an “instance”).
- In Python, everything is an object, but we reserve the word for instances of some data type (class). So, we have “classes” and their “objects” with “methods”.
- Python comes with a rich set of built-in classes, and many more in its standard library, but we can (and often will) define our own.
- Python uses “duck typing”: if an object implements list-like methods, it’s a list-like object. If an object implements dunder call, it’s a callable (aka function) object.
Lists and list-like objects are so central to Python that we’ll revisit them time and again. Next time, though, we’ll get into Python’s flow-control syntax, the if-else construct as well as the for and while loops. We’ll see that Python has some neat tricks up its sleeve when it comes to loops.
∅
This post is: This is Python! (part 3)