Python has the useful notion of descriptor objects, along with the built-in property() function, which makes the most common uses (read-only and calculated instance attributes) quite easy.

In this post I’ll explore Python descriptors with lots of examples demonstrating how to use them. Descriptors are an important part of understanding Python and using it effectively.

Python treats any object with a __get__, __set__, or __delete__ method as a descriptor. When such an object is a class attribute, Python treats it differently than it would a non-descriptor object. For example:

001| class MyClass:
002|     x = MyObject()
003|     y = MyObject()
004| 

Because x and y are class attributes, if MyObject defines any of the three methods mentioned above, Python treats the x and y objects as descriptors and handles access to them differently. Most importantly, it treats access from the class versus from instances of the class differently.

Important: These objects must be class attributes, not instance attributes. They must be created as in the above code fragment, not as in this one:

001| class MyClass:
002| 
003|     def __init__ (self):
004|         self.x = MyObject()
005|         self.y = MyObject()
006| 

If you create instance attributes as in the above, Python won’t treat them specially, and they’ll just be normal objects with whatever attributes and methods you gave them. The __get__, __set__, or __delete__ methods, if defined, would have to be called explicitly, and the object may, or more likely may not, act as expected.
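A minimal sketch of the difference, assuming a hypothetical descriptor class named Loud (invented for this illustration) whose __get__ method just returns a fixed string:

```python
class Loud:
    '''Hypothetical descriptor used only for this illustration.'''
    def __get__(self, obj, cls):
        return 'descriptor called'

class AsClassAttr:
    x = Loud()              # class attribute: descriptor protocol applies

class AsInstanceAttr:
    def __init__(self):
        self.x = Loud()     # instance attribute: just an ordinary object

a = AsClassAttr()
b = AsInstanceAttr()

print(a.x)                      # descriptor called
print(isinstance(b.x, Loud))    # True: we get the Loud object itself back
print(b.x.__get__(b, type(b)))  # descriptor called (only when called explicitly)
```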

Besides creating them in the first place, the attributes of any object (including class objects) allow three operations: reading (get), writing (set), and destroying (delete). Because Python lacks any notion of truly private data, users normally can create, read, set, and delete the attributes of any object (or class). Descriptors provide a means to change that relationship a little. (But only a little. Data in Python is hardly ever truly private.)

The __get__, __set__, and __delete__ methods correspond to the three possible operations on an existing object. When Python sees these methods in a class attribute, it calls them instead of invoking the normal behavior. For example, in the most common case, the object defines (perhaps only) __get__, so Python calls that and returns whatever value it returns rather than returning the attribute itself.

Likewise, if a class attribute defines __set__ or __delete__ then Python calls those methods rather than, respectively, setting the attribute to a new value or deleting it. However, in all three cases, it matters whether the access to the descriptors is on the class containing them or on an instance of the class.

There are two ways to access the x and y attributes defined above in MyClass:

001| print(MyClass.x, MyClass.y)
002| 
003| my_instance = MyClass()
004| print(my_instance.x, my_instance.y)
005| 

The first way (line #1) is through the class containing the attribute. The second way (lines #3 and #4) is through an instance of that class. A descriptor method can tell them apart because of how Python invokes the descriptor (which we’ll get to below). This allows the __get__ method to return different results depending on how it’s accessed.

Set and delete operations also work differently between the class and its instances. On the class they work as usual: they set or delete the attribute itself (usually not desired, so avoid it). On instances, Python invokes the __set__ and __delete__ methods if they exist. There are some nuances to what happens when they don’t that I’ll get to below.

Consider this simplified example:

001| class myObject:
002|     '''Descriptor class.'''
003|     def __get__ (self, obj, cls):
004|         return '(object)' if obj is not None else '(class)'
005| 
006|     def __set__ (self, obj, val):
007|         print(type(obj).__name__, val)
008| 
009| class myClass:
010|     data = myObject()
011| 
012| 
013| print(myClass.data)
014| 
015| my_instance = myClass()
016| print(my_instance.data)
017| 
018| my_instance.data = 42
019| print(my_instance.data)
020| 

First is a simple definition of a descriptor class, myObject (lines #1-#7). It defines both __get__ and __set__ methods. Note that the __get__ method takes, besides the self parameter, two others: obj (object) and cls (class). When called on the class, the obj parameter is set to None. The __set__ method takes obj (object) and val (value to set) parameters.

Rather than the x and y attributes from above, myClass just defines a single (descriptor) attribute, data (lines #9 and #10).

When run, it prints:

(class)
(object)
myClass 42
(object)

The first line shows the result of getting the attribute’s value from the class. The obj parameter is None, so the __get__ method returns the string “(class)”. The second and last lines show the result of getting the attribute’s value from the instance. Now the obj parameter is set to the my_instance object, so the method returns the string “(object)”.
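Roughly speaking, and glossing over the details of lookup order, an attribute access that finds a descriptor behaves like fetching the raw object from the class dictionary and calling its __get__ method by hand. The sketch below re-declares the small descriptor so that it runs on its own; the variable name raw is just for illustration:

```python
class myObject:
    '''Descriptor class (same idea as in the example above).'''
    def __get__(self, obj, cls):
        return '(object)' if obj is not None else '(class)'

class myClass:
    data = myObject()

my_instance = myClass()

# vars() reads the class dictionary directly, bypassing the descriptor protocol.
raw = vars(myClass)['data']

# Instance access passes the instance and its class...
print(raw.__get__(my_instance, myClass) == my_instance.data)   # True
# ...while class access passes None in place of the instance.
print(raw.__get__(None, myClass) == myClass.data)              # True
```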

The third line comes from setting the value of the my_instance.data attribute. The __set__ method just prints the class name and value passed. Note how this doesn’t change the attribute in any way. Conversely, setting the value of the attribute on the class will destroy the existing attribute and replace it with a new one:

001| class myObject:
002|     '''Descriptor class.'''
003|     def __get__ (self, obj, cls):
004|         return '(object)' if obj is not None else '(class)'
005| 
006|     def __set__ (self, obj, val):
007|         print(type(obj).__name__, val)
008| 
009| class myClass:
010|     data = myObject()
011| 
012| my_instance = myClass()
013| 
014| print(myClass.data)
015| print(my_instance.data)
016| 
017| myClass.data = 42
018| 
019| print(myClass.data)
020| print(my_instance.data)
021| 

Now the data attribute is an integer object with the value 42. It’s no longer a descriptor, just a regular class attribute. When run, this code prints:

(class)
(object)
42
42

Which shows how the descriptor behavior has vanished (because so has the descriptor object). Generally speaking, you’ll want to access descriptor objects only through instances of the class containing them. The way the __get__ method distinguishes between access through the class versus instance does allow a trick we’ll see later.
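Deletion follows the same split. As a hedged sketch, here is a hypothetical descriptor named Guarded (invented for this illustration) that defines __delete__: using del on an instance calls the method, while using del on the class removes the descriptor itself.

```python
class Guarded:
    '''Hypothetical descriptor that records delete attempts.'''
    def __init__(self):
        self.deleted_from = []

    def __get__(self, obj, cls):
        return 'guarded'

    def __delete__(self, obj):
        # Called for "del instance.data"; nothing is actually removed.
        self.deleted_from.append(type(obj).__name__)

class Holder:
    data = Guarded()

desc = vars(Holder)['data']     # keep a handle on the raw descriptor
h = Holder()

del h.data                      # invokes Guarded.__delete__
print(desc.deleted_from)        # ['Holder']
print(h.data)                   # guarded (descriptor still in place)

del Holder.data                 # removes the descriptor from the class
print(hasattr(h, 'data'))       # False
```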

§

Below is a simple xy-point class we’ll use to illustrate several ways to use descriptors to hide the data attributes:

001| typename = lambda obj: type(obj).__name__
002| 
003| class PointBaseClass:
004|     '''Very simple XY-point base class.'''
005| 
006|     def __init__ (self, x=0.0, y=0.0):
007|         self._x = x
008|         self._y = y
009| 
010|     def __str__ (self):
011|         return '(%f, %f)' % (self._x, self._y)
012| 
013|     def __repr__ (self):
014|         return '{%s @%012x}' % (typename(self), id(self))
015| 
016| 
017| def demo_PointBaseClass ():
018|     # Create new point instance…
019|     pt = PointBaseClass(21, 42)
020|     print(str(pt), repr(pt))
021|     print('x=%s, y=%s\n' % (pt._x, pt._y))
022| 
023|     # Change the X and Y values…
024|     pt._x = 86
025|     pt._y = 99
026|     print('point=%s\n' % str(pt))
027| 
028|     # Delete the X and Y attributes…
029|     del pt._x
030|     del pt._y
031|     print(('_x=%s' % pt._x) if hasattr(pt,'_x') else 'No _x!')
032|     print(('_y=%s' % pt._y) if hasattr(pt,'_y') else 'No _y!')
033| 
034| demo_PointBaseClass()
035| 

In this example, which does not use descriptors, the _x and _y attributes are fully exposed as data attributes. The leading underbar is meant to tell users of the class that _x and _y are private data that should not be accessed except through methods provided by the class.

The class is meant as immutable XY point data that can be accessed only by taking its string value. But users have full access to the _x and _y attributes and can easily change the point values. Or delete the attributes entirely. (To reiterate, it’s difficult to truly hide data in Python. It goes against the grain of the language.)

When run, this prints:

(21.000000, 42.000000) {PointBaseClass @023df4f73e50}
x=21, y=42

point=(86.000000, 99.000000)

No _x!
No _y!

Note, by the way, the definition of typename, which just compresses a much used pattern into a simple function. This helpful function appears in a number of later code fragments.

§

To protect the data attributes from being changed, we’ll subclass the PointBaseClass and add two descriptors, x and y:

001| class PointSubClass1 (PointBaseClass):
002|     '''Read-only Point class using the @property decorator.'''
003| 
004|     @property
005|     def x (self):
006|         '''Getter function for X.'''
007|         return self._x
008| 
009|     @property
010|     def y (self):
011|         '''Getter function for Y.'''
012|         return self._y
013| 
014| 
015| def demo_PointSubClass1 ():
016|     # Create new point instance…
017|     pt = PointSubClass1(21, 42)
018|     print(str(pt), repr(pt))
019|     print('x=%s, y=%s\n' % (pt.x, pt.y))
020| 
021|     # Change the X and Y values…
022|     try: pt.x = 86
023|     except Exception as e:
024|         print('ERR: %s' % e)
025| 
026|     try: pt.y = 99
027|     except Exception as e:
028|         print('ERR: %s' % e)
029| 
030|     print('point=%s\n' % str(pt))
031| 
032|     # Delete the X and Y attributes…
033|     try: del pt.x
034|     except Exception as e:
035|         print('ERR: %s' % e)
036| 
037|     try: del pt.y
038|     except Exception as e:
039|         print('ERR: %s' % e)
040| 
041|     print(('point.x=%s' % pt.x) if hasattr(pt, 'x') else 'No x Property!')
042|     print(('point.y=%s' % pt.y) if hasattr(pt, 'y') else 'No y Property!')
043| 
044| demo_PointSubClass1()
045| 

In this first example we use the built-in property function as a decorator to turn the otherwise normally defined x and y methods into descriptors (lines #4-#7 and #9-#12). These functions access the “private” data attributes and return their values. This allows read-only access to _x and _y.

Trying to set their value or delete them causes an error. (Which is why the attempts had to be protected by try/except.) For many applications, this is all that’s needed, and it works great. When run, this prints:

(21.000000, 42.000000) {PointSubClass1 @026f86a9e620}
x=21, y=42

ERR: can't set attribute 'x'
ERR: can't set attribute 'y'
point=(21.000000, 42.000000)

ERR: can't delete attribute 'x'
ERR: can't delete attribute 'y'
point.x=21
point.y=42

The _x and _y attributes retain their values throughout and cannot be modified (except if the user cheats and accesses the “private” data explicitly).

Note that using a subclass isn’t required; you can define private instance attributes in a class that also defines the descriptors accessing those instance attributes:

001| class BasicPointClass:
002|     '''Read-only Point class using the @property decorator.'''
003| 
004|     @property
005|     def x (self):
006|         '''Getter function for X.'''
007|         return self._x
008| 
009|     @property
010|     def y (self):
011|         '''Getter function for Y.'''
012|         return self._y
013| 
014|     def __init__ (self, x=0.0, y=0.0):
015|         '''New point instance.'''
016|         self._x = x
017|         self._y = y
018| 
019|     def __str__ (self):
020|         return '(%f, %f)' % (self.x, self.y)
021| 
022| pt = BasicPointClass(42,21)
023| print(pt)
024| print(pt.x, pt.y)
025| 

Which is essentially the same as the example above but in one class.

Used this way, descriptors act a bit like methods: they live in the class object’s dictionary and don’t appear in the instance object’s dictionary. Access through an instance therefore finds them on the class, and Python passes that instance to the descriptor’s __get__ method. Access through the class passes None instead, which is how __get__ can tell the difference between instance and class access.
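To check where a property actually lives, here is a small sketch using a cut-down, hypothetical TinyPoint class (invented for this illustration) rather than the full point classes above:

```python
class TinyPoint:
    '''Hypothetical one-coordinate point, for illustration only.'''
    def __init__(self, x=0.0):
        self._x = x

    @property
    def x(self):
        return self._x

pt = TinyPoint(42)

print(vars(pt))                           # {'_x': 42}  (no 'x' on the instance)
print('x' in vars(TinyPoint))             # True: the property lives on the class
print(pt.x)                               # 42: instance access runs the getter
print(isinstance(TinyPoint.x, property))  # True: class access returns the property
```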

Another common use for this pattern is an attribute that is calculated every time it is accessed. For example, you could add a total attribute to your lists like this:

001| class numberlist (list):
002|     '''Python list with a .total attribute.'''
003| 
004|     @property
005|     def total (self):
006|         return sum(self)
007| 
008| lst = numberlist(range(12))
009| print(lst)
010| print(lst.total)
011| 

A numberlist is a Python list but has a total data attribute that sums all the numbers of the list. (This obviously requires the list to contain only numbers. A cleverer method might test and ignore any list members that aren’t numbers.)
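One possible version of that cleverer method, as a sketch only. It assumes that "number" means any instance of numbers.Number other than bool, which is a choice made for this illustration:

```python
from numbers import Number

class numberlist(list):
    '''Python list with a .total attribute that skips non-numbers.'''

    @property
    def total(self):
        # Assumption: count any numbers.Number except bool; ignore the rest.
        return sum(v for v in self
                   if isinstance(v, Number) and not isinstance(v, bool))

lst = numberlist([1, 2.5, 'three', None, 4])
print(lst.total)    # 7.5
```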

§

If we want control over all three operations, get, set, and delete, we must define functions to handle all three. This next example is also a subclass of PointBaseClass, but this time we’ll define getter, setter, and deleter methods for both the _x and _y attributes (for a total of six methods):

001| class PointSubClass2 (PointBaseClass):
002|     '''Using the property() function with Get/Set/Del methods.'''
003| 
004|     def __init__ (self, x=0.0, y=0.0, x_max=100, y_max=100):
005|         super().__init__(x=x, y=y)
006|         self.x_max = x_max
007|         self.y_max = y_max
008| 
009|     # Getters…
010|     def _getx (self):
011|         '''Get function for X.'''
012|         return self._x
013| 
014|     def _gety (self):
015|         '''Get function for Y.'''
016|         return self._y
017| 
018|     # Setters…
019|     def _setx (self, value):
020|         '''Set function for X.'''
021|         if not (0.0 <= value <= self.x_max):
022|             raise ValueError('Value must be from 0 to %s' % self.x_max)
023|         self._x = value
024| 
025|     def _sety (self, value):
026|         '''Set function for Y.'''
027|         if not (0.0 <= value <= self.y_max):
028|             raise ValueError('Value must be from 0 to %s' % self.y_max)
029|         self._y = value
030| 
031|     # Deleters…
032|     def _delx (self):
033|         '''Delete function for X.'''
034|         self._x = 0.0
035| 
036|     def _dely (self):
037|         '''Delete function for Y.'''
038|         self._y = 0.0
039| 
040|     # Actual X and Y properties…
041|     x = property(_getx, _setx, _delx, 'X property')
042|     y = property(_gety, _sety, _dely, 'Y property')
043| 
044| 
045| def demo_PointSubClass2():
046|     # Create new point instance…
047|     pt = PointSubClass2(21, 42)
048|     print(str(pt), repr(pt))
049|     print('x=%s, y=%s\n' % (pt.x, pt.y))
050| 
051|     # Change the X and Y values…
052|     pt.x = 86
053|     pt.y = 99
054|     print('point=%s\n' % str(pt))
055| 
056|     # Delete the X and Y attributes…
057|     del pt.x
058|     del pt.y
059| 
060|     print(('x=%s' % pt.x) if hasattr(pt, 'x') else 'No x property!')
061|     print(('y=%s' % pt.y) if hasattr(pt, 'y') else 'No y property!')
062| 
063| demo_PointSubClass2()
064| 

The premise (the reason we want control of get, set, and delete) is that we wish to limit the range of the x and y coordinates. Attempts to set either value must be constrained to a specified range, and values outside the range must raise an exception. The setter methods (lines #18-#29) implement this.

We also don’t want the attributes deleted (too much other code depends on them). Instead, attempts to delete them should just reset them to zero. The deleter methods (lines #31-#38) handle this. (We could just as easily have attempts to delete raise an exception.)

To implement the range constraint, a PointSubClass2 instance has x_max and y_max attributes that determine the maximum allowed value for x and y. (In this example, we’ll assume the minimum value is always zero, but it would be easy enough to add x_min and y_min attributes if we want to also control the minimum.)

Lines #40-#42 define the actual attributes using the built-in property function, but in this case as a function taking four arguments: a get method, a set method, a delete method, and a doc-string. The property function can be used as a decorator or as a function. (A decorator is a function, so it’s not that difficult a trick.)
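Since a decorator is just a function applied to the function defined beneath it, the two spellings build the same kind of object. A minimal sketch with two hypothetical classes (names invented for this illustration), each exposing a read-only x:

```python
class WithDecorator:
    def __init__(self):
        self._x = 21

    @property
    def x(self):
        '''X property'''
        return self._x

class WithFunctionCall:
    def __init__(self):
        self._x = 21

    def _getx(self):
        return self._x

    # Only a getter and a doc-string are given, so this is read-only too.
    x = property(_getx, doc='X property')

for cls in (WithDecorator, WithFunctionCall):
    prop = vars(cls)['x']
    print(type(prop).__name__, prop.fset, prop.__doc__, cls().x)
```

Both lines of output should read the same: a property object with no setter, the same doc-string, and the value 21.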

When run, this code prints:

(21.000000, 42.000000) {PointSubClass2 @026f86a9d6c0}
x=21, y=42

point=(86.000000, 99.000000)

x=0.0
y=0.0

We can set the values (if we stay within range) and attempts to delete the attributes just reset them to zero.
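The demo never steps outside the allowed range. As a sketch of the failure path, here is a hypothetical single-coordinate class named Bounded (invented for this illustration) with the same style of setter, plus the alternative policy of a deleter that raises instead of resetting to zero:

```python
class Bounded:
    '''Hypothetical class with one range-checked attribute.'''

    def __init__(self, x=0.0, x_max=100):
        self._x = x
        self.x_max = x_max

    def _getx(self):
        return self._x

    def _setx(self, value):
        if not (0.0 <= value <= self.x_max):
            raise ValueError('Value must be from 0 to %s' % self.x_max)
        self._x = value

    def _delx(self):
        # Alternative policy: refuse deletion rather than reset to zero.
        raise AttributeError('x cannot be deleted')

    x = property(_getx, _setx, _delx, 'X property')

b = Bounded(21)
try:
    b.x = 101
except ValueError as e:
    print('ERR: %s' % e)    # ERR: Value must be from 0 to 100

try:
    del b.x
except AttributeError as e:
    print('ERR: %s' % e)    # ERR: x cannot be deleted

print(b.x)                  # 21 (unchanged by both failed attempts)
```

Note that __init__ here writes _x directly, so the initial value bypasses the range check; only later assignments through x are validated.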

§

The Python property function is cleverly constructed so that it can be used to decorate all three descriptor methods:

001| class PointSubClass3 (PointBaseClass):
002|     '''Using the @property decorator with Get/Set/Del methods.'''
003| 
004|     def __init__ (self, x=0.0, y=0.0, x_max=100.0, y_max=100.0):
005|         super().__init__(x=x, y=y)
006|         self.x_max = x_max
007|         self.y_max = y_max
008| 
009|     # Getters…
010|     @property
011|     def x (self):
012|         '''Get function for X.'''
013|         return self._x
014| 
015|     @property
016|     def y (self):
017|         '''Get function for Y.'''
018|         return self._y
019| 
020|     # Setters…
021|     @x.setter
022|     def x (self, value):
023|         '''Set function for X.'''
024|         if not (0.0 <= value <= self.x_max):
025|             raise ValueError('Value must be from 0 to %s' % self.x_max)
026|         self._x = value
027| 
028|     @y.setter
029|     def y (self, value):
030|         '''Set function for Y.'''
031|         if not (0.0 <= value <= self.y_max):
032|             raise ValueError('Value must be from 0 to %s' % self.y_max)
033|         self._y = value
034| 
035|     # Deleters…
036|     @x.deleter
037|     def x (self):
038|         '''Delete function for X.'''
039|         self._x = 0.0
040| 
041|     @y.deleter
042|     def y (self):
043|         '''Delete function for Y.'''
044|         self._y = 0.0
045| 
046| 
047| def demo_PointSubClass3():
048|     # Create new point instance…
049|     pt = PointSubClass3(21, 42)
050|     print(str(pt), repr(pt))
051|     print('x=%s, y=%s\n' % (pt.x, pt.y))
052| 
053|     # Change the X and Y values…
054|     pt.x = 86
055|     pt.y = 99
056|     print('point=%s\n' % str(pt))
057| 
058|     # Delete the X and Y attributes…
059|     del pt.x
060|     del pt.y
061| 
062|     print(('x=%s' % pt.x) if hasattr(pt, 'x') else 'No x Property!')
063|     print(('y=%s' % pt.y) if hasattr(pt, 'y') else 'No y Property!')
064| 
065| demo_PointSubClass3()
066| 

The first occurrence of the property decorator creates the x and y attributes, but the created objects also have setter and deleter functions that can act as decorators for the set and delete functions.

Be sure to use the same names for the get, set, and delete methods! If you don’t, you end up creating a separate property under whatever name you actually used, and the original property never gets that setter or deleter.
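A short sketch of that pitfall, using a hypothetical class named Mismatch (invented for this illustration) whose setter is deliberately given the wrong name:

```python
class Mismatch:
    '''Hypothetical class showing the naming pitfall.'''
    def __init__(self):
        self._x = 0.0

    @property
    def x(self):
        return self._x

    @x.setter
    def set_x(self, value):     # wrong: should also be named x
        self._x = value

m = Mismatch()
m.set_x = 5.0       # works: set_x is a full read/write property
print(m.set_x)      # 5.0
print(m.x)          # 5.0 (same underlying _x)

try:
    m.x = 9.0       # fails: x itself never got a setter
except AttributeError:
    print('x is still read-only')
```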

When run, this code prints:

(21.000000, 42.000000) {PointSubClass3 @026f86a9d5a0}
x=21, y=42

point=(86.000000, 99.000000)

x=0.0
y=0.0

The same as the previous example.

§

That’s plenty for this time. Next time I’ll dig deeper into descriptors with some more involved examples.

Ø