Amador Pahim

Multiple Inheritance and Linearization

What does super() do? Really, answer that in your mind, we will get back to it.

It is a well-known problem: a class A implements a method foo(). Two classes B and C both inherit from A and both override foo(). A new class D inherits from both B and C. When an instance of D is created and D().foo() is called, what happens depends on the language.

             A
           /   \
          B     C
           \   /
             D

It’s called “the diamond problem”, given the shape created by the inheritance. Each language solves it (or not) in a different way. C++ solves it via virtual inheritance. Some languages define the precedence according to the order in which the classes are declared. Go will complain at compile time. Some languages force you to redefine foo() in the class D.

The point is: there’s no standard way of dealing with that problem. But, we are here to talk about Python, right?

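Python gives us a method resolution order (MRO) implementing the so-called “C3 superclass linearization”. It is not my intention to describe here how the C3 algorithm works, but there are some important takeaways: the precedence order can be different depending on which node of the inheritance tree we instantiate, and the precedence can be rearranged by rearranging the shape of the inheritance.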

Too much information? No worries, let’s see some practical examples. Starting with our classes A and B:

>>> class A:
...     def foo(self):
...         print("I'm A!")
... 
>>> 
>>> class B(A):
...     def foo(self):
...         print("I'm B!")
... 
>>> 
>>> B().foo()
I'm B!

Calling help(B), we see:

Help on class B in module __main__:

class B(A)
 |  Method resolution order:
 |      B
 |      A
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  foo(self)
 |  
 |  ----------------------------------------------------

So, the “Method resolution order” section tells us that Python will first look for methods in B, then in A, and finally in object. So far, nothing new. Let’s introduce our classes C and D:

>>> class A:
...     def foo(self):
...         print("I'm A!")
... 
>>> 
>>> class B(A):
...     def foo(self):
...         print("I'm B!")
... 
>>> 
>>> class C(A):
...     def foo(self):
...         print("I'm C!")
... 
>>> 
>>> class D(B, C):
...     pass
... 
>>> 
>>> D().foo()
I'm B!

Now we have the fearsome diamond shape, but Python does not seem to be afraid. Let’s have a look at the help(D):

Help on class D in module __main__:

class D(B, C)
 |  Method resolution order:
 |      D
 |      B
 |      C
 |      A
 |      builtins.object
 |  
 |  Methods inherited from B:
 |  
 |  foo(self)
 |  
 |  ----------------------------------------------------

The Python linearization gives us the precedence order “D, B, C, A”, which is what our execution showed. As I said before, the precedence order can be different depending on which node of the inheritance tree we instantiate. What I mean is: for an instance of D, the method resolution order is “D, B, C, A”, but an instance of B will still have the resolution order “B, A”, no matter which subclasses exist.
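The same order can also be read programmatically instead of through help(). A minimal sketch, using bare stand-ins for the classes A, B, C and D above so the snippet runs on its own; `__mro__` is a standard attribute of every Python class:

```python
class A:
    pass

class B(A):
    pass

class C(A):
    pass

class D(B, C):
    pass

# __mro__ is a tuple of classes, in the order Python searches them.
d_order = [cls.__name__ for cls in D.__mro__]
b_order = [cls.__name__ for cls in B.__mro__]

print(d_order)  # ['D', 'B', 'C', 'A', 'object']
print(b_order)  # ['B', 'A', 'object']
```

The order belongs to the class being instantiated: B keeps its own “B, A” order even though D exists.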

When I said the precedence can be rearranged by rearranging the shape of the inheritance, this is what I meant:

>>> class A:
...     def foo(self):
...         print("I'm A!")
... 
>>> 
>>> class B(A):
...     def foo(self):
...         print("I'm B!")
... 
>>> 
>>> class C(A):
...     def foo(self):
...         print("I'm C!")
... 
>>> 
>>> class D(C, B):
...     pass
... 
>>> 
>>> D().foo()
I'm C!

See? Swapping B and C in our D changed the precedence order.
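To see both shapes side by side, here is a small sketch. The names DBC, DCB and mro_names are hypothetical, made up for this illustration so that both variants of D can live in the same session:

```python
class A:
    pass

class B(A):
    pass

class C(A):
    pass

class DBC(B, C):  # hypothetical name: our D declared as D(B, C)
    pass

class DCB(C, B):  # hypothetical name: our D declared as D(C, B)
    pass

def mro_names(cls):
    """Return the class names of cls.__mro__, in lookup order."""
    return [c.__name__ for c in cls.__mro__]

print(mro_names(DBC))  # ['DBC', 'B', 'C', 'A', 'object']
print(mro_names(DCB))  # ['DCB', 'C', 'B', 'A', 'object']
```

Only the relative position of B and C changes; A still comes after both of its subclasses.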

Now, back to our question: what does super() do? If your answer was something like “it calls the parent” or “it delegates method calls to the parent”, I have news for you: super() is not always about the parent. It respects the method resolution order, meaning it can delegate method calls to a sibling. In our diamond shape, for example, an instance of D will make the super() in B call the method in C, not in A:

>>> class A:
...     def foo(self):
...         print("I'm A!")
... 
>>> 
>>> class B(A):
...     def foo(self):
...         super().foo()
... 
>>> 
>>> class C(A):
...     def foo(self):
...         print("I'm C!")
... 
>>> 
>>> class D(B, C):
...     pass
... 
>>> 
>>> D().foo()
I'm C!
>>> B().foo()
I'm A!
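One way to picture this: inside B, super() starts searching at the class that follows B in the MRO of the instance's type, not at B's own parent. The sketch below is only an illustration of that idea, not how super() is actually implemented. The helper next_in_mro is hypothetical, and the methods return their message instead of printing it so the results are easy to check:

```python
class A:
    def foo(self):
        return "I'm A!"

class B(A):
    def foo(self):
        return super().foo()

class C(A):
    def foo(self):
        return "I'm C!"

class D(B, C):
    pass

def next_in_mro(instance, current):
    """Hypothetical helper: the class right after `current` in the
    MRO of the instance's type, i.e. where super() starts looking."""
    mro = type(instance).__mro__
    return mro[mro.index(current) + 1]

print(next_in_mro(D(), B).__name__)  # C
print(next_in_mro(B(), B).__name__)  # A

print(D().foo())  # I'm C!
print(B().foo())  # I'm A!

# The explicit two-argument form makes the starting point visible:
# "search after B, in the MRO of this D instance".
print(super(B, D()).foo())  # I'm C!
```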

And why is that important? Well, knowing that, you can avoid issues in multiple inheritance scenarios. You can also take advantage of it to inject dependencies for testing purposes, or use it in creative implementations. For example, in Python up to 3.5, dictionaries did not remember the insertion order (the tuple-style output in the sessions below suggests they were captured on Python 2, where print(k, v) prints a tuple):

>>> var = {'a': 'foo', 'b': 'foo', 'c': 'foo'}
>>> for k, v in var.items():
...     print(k, v)
... 
('a', 'foo')
('c', 'foo')
('b', 'foo')

As such, collections.Counter, which inherits from dict, was also not ordered:

>>> from collections import Counter
>>> c = Counter('aaabbbcccdddeeefff')
>>> for k, v in c.items():
...     print(k, v)
... 
('a', 3)
('c', 3)
('b', 3)
('e', 3)
('d', 3)
('f', 3)

But we could easily have an OrderedCounter by injecting collections.OrderedDict, which also inherits from dict, into the method resolution order, making a diamond shape:

           dict()
          /      \
   Counter()    OrderedDict()
          \      /
      OrderedCounter()

Let’s see it:

>>> from collections import Counter
>>> from collections import OrderedDict
>>> 
>>> 
>>> class OrderedCounter(Counter, OrderedDict):
...     """
...     Same as collections.Counter, but ordered.
...     """
... 
>>> 
>>> c = OrderedCounter('aaabbbcccdddeeefff')
>>> for k, v in c.items():
...     print(k, v)
... 
('a', 3)
('b', 3)
('c', 3)
('d', 3)
('e', 3)
('f', 3)
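Why does an empty class body do the job? Looking at the MRO gives a hint: OrderedDict sits between Counter and dict, so the dict operations Counter relies on are resolved in OrderedDict first. A minimal Python 3 sketch of that check:

```python
from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    """Same as collections.Counter, but ordered."""

order = [cls.__name__ for cls in OrderedCounter.__mro__]
print(order)
# ['OrderedCounter', 'Counter', 'OrderedDict', 'dict', 'object']

c = OrderedCounter('aaabbbccc')
print(list(c.items()))  # [('a', 3), ('b', 3), ('c', 3)]
```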

Since Python 3.6, dictionaries keep the insertion order (as a CPython implementation detail in 3.6, guaranteed by the language from 3.7 on) and so does collections.Counter, making this implementation no longer relevant. But it still serves as a good example of why it matters to know how Python decides which class to look up next when resolving methods.
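The same trick covers the dependency injection for testing mentioned earlier. The sketch below is an assumption about how that could look, not code from a real project: Backend, Client, FakeBackend and TestableClient are hypothetical names. Because FakeBackend also inherits from Backend, it lands between Client and Backend in the MRO, so the super() call in Client reaches the fake first:

```python
class Backend:
    def fetch(self):
        # Stands in for something slow or external (hypothetical).
        raise RuntimeError("would hit the network")

class Client(Backend):
    def describe(self):
        # Delegates to whatever comes after Client in the MRO.
        return "got " + super().fetch()

class FakeBackend(Backend):
    def fetch(self):
        return "canned data"

class TestableClient(Client, FakeBackend):
    """Client with FakeBackend injected into the MRO; no code changed."""

print([cls.__name__ for cls in TestableClient.__mro__])
# ['TestableClient', 'Client', 'FakeBackend', 'Backend', 'object']

print(TestableClient().describe())  # got canned data
```

A plain Client() still goes to the real Backend, so the production class is left untouched.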

:wq!
