Short description
When implementing the observer pattern in a language with automatic garbage collection,
don't use a static mapping from subjects to observers. Doing so will
cause memory leaks.
Such an implementation involves a hashtable whose keys are subjects and whose values
are lists of observers. To notify about a change in a subject, you get the list of its observers
from the mapping and notify them one by one.
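As an illustration, the mapping described above might look roughly like the following Java sketch. The class and method names (StaticObserverRegistry, addObserver, notifyObservers) are hypothetical and are not taken from any of the libraries discussed later.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical registry: a static hashtable from subjects to their observers.
final class StaticObserverRegistry {
    // The map is static, so it is reachable from a root for as long as the
    // class is loaded. Every key and value stored in it stays alive too.
    private static final Map<Object, List<Consumer<Object>>> OBSERVERS = new HashMap<>();

    private StaticObserverRegistry() {}

    static void addObserver(Object subject, Consumer<Object> observer) {
        OBSERVERS.computeIfAbsent(subject, k -> new ArrayList<>()).add(observer);
    }

    static void notifyObservers(Object subject) {
        // Get the subject's observers from the mapping and notify them one by one.
        for (Consumer<Object> observer : OBSERVERS.getOrDefault(subject, Collections.emptyList())) {
            observer.accept(subject);
        }
    }
}

class StaticObserverRegistryDemo {
    public static void main(String[] args) {
        Object subject = new Object();
        List<String> log = new ArrayList<>();
        StaticObserverRegistry.addObserver(subject, s -> log.add("first"));
        StaticObserverRegistry.addObserver(subject, s -> log.add("second"));
        StaticObserverRegistry.notifyObservers(subject);
        System.out.println(log);
    }
}
```

Nothing in this sketch ever removes an entry, which is exactly why subjects and observers registered this way are never collected.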
Glossary
GC - Garbage Collector.
Dead object - an object is dead when it is eligible for garbage collection
(it can't be reached by strong references from any root). Conversely, an object that can be reached this way is called alive.
Strong reference - a reference that keeps an object alive (normal reference), as opposed to a weak reference.
Weak reference - a reference that is not taken into account by GC.
In .NET the WeakReference class is used to define weak references. As far as I know this is the only way
to define a weak reference.
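To make the distinction between strong and weak references concrete, here is a small sketch. It uses Java's java.lang.ref.WeakReference, which plays a similar role to the .NET class mentioned above. The class WeakReferenceSketch and its isAlive helper are made up for this illustration.

```java
import java.lang.ref.WeakReference;

class WeakReferenceSketch {
    // True while the referent has not been cleared by the GC (or by clear()).
    static boolean isAlive(WeakReference<?> ref) {
        return ref.get() != null;
    }

    public static void main(String[] args) {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);

        // 'strong' is still in use below, so the referent cannot be collected yet.
        System.out.println(isAlive(weak) && weak.get() == strong);

        // Drop the only strong reference. From now on the object is dead and
        // the GC may clear the weak reference, but System.gc() is only a hint,
        // so the value printed here is not guaranteed.
        strong = null;
        System.gc();
        System.out.println(isAlive(weak));
    }
}
```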
What is wrong with a static subject-to-observer mapping
As a warm-up I will discuss the traditional implementation of the observer pattern.
The nodes on the diagram represent objects. An arc from one object to
another means that the second can be reached from the first by following strong
references.
In the first diagram, observer can be reached from subject and vice versa.
The fact that observer can be reached from subject follows from subject keeping
a list of observers to notify.
Note that the arrow from observer to subject on this diagram indicates that
there is a path of strong references, possibly indirect, from observer to subject.
The observer pattern itself does not require this.
Although it is optional, in production code observers quite often
keep references to their subjects.
What happens in this case when neither subject nor observer is referenced
from anywhere else (apart from their own cycle of references)? They can't be reached from
any root, so they are dead and the memory they occupy can be reclaimed by the next
garbage collection.
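The traditional layout from the first diagram can be sketched as follows. The names Counter and CounterObserver are hypothetical. The point is only to show the two strong references that form the cycle.

```java
import java.util.ArrayList;
import java.util.List;

// The subject: keeps a strong list of the observers it has to notify.
class Counter {
    private final List<CounterObserver> observers = new ArrayList<>();
    private int value;

    void addObserver(CounterObserver observer) { observers.add(observer); }

    int value() { return value; }

    void increment() {
        value++;
        for (CounterObserver observer : observers) {
            observer.changed();
        }
    }
}

// The observer: keeps an (optional) strong reference back to its subject.
class CounterObserver {
    private final Counter subject;
    private int lastSeen;

    CounterObserver(Counter subject) { this.subject = subject; }

    void changed() { lastSeen = subject.value(); }

    int lastSeen() { return lastSeen; }
}

class TraditionalObserverDemo {
    public static void main(String[] args) {
        Counter counter = new Counter();
        CounterObserver observer = new CounterObserver(counter);
        counter.addObserver(observer);
        counter.increment();
        System.out.println(observer.lastSeen());

        // After this, only the subject <-> observer cycle remains. Neither
        // object can be reached from a root, so both are dead and collectable.
        counter = null;
        observer = null;
    }
}
```

Because the list of observers lives inside the subject, the cycle is self-contained and a tracing collector has no trouble reclaiming it.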
The second diagram shows a situation where observers are kept in a weak hashtable.
The dashed arc from entry to subject indicates a weak reference.
This one is not that bad. If there are no references to subject and observers
apart from those on the diagram, the subject can be garbage collected, because
only a weak reference points to it. The observer will be kept alive by the strong reference
from the WeakHashMap, but WeakHashMaps are usually implemented so that, when a new
entry is added, they remove entries whose keys have been garbage collected. This prevents entries
with dead keys from piling up.
The third and last diagram depicts a case where there is a reference from observer
to subject. This is the situation that results in memory leaks.
In this configuration the WeakHashMap will keep alive both subject and observer,
along with every other object that can be reached from either of them:
the entry holds the observer strongly and the observer holds the subject strongly,
so the subject remains strongly reachable and its weak key is never cleared.
The reference from entry to value must be strong, because
otherwise an observer might get garbage collected while its subject is still alive.
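The leak can be reproduced with Java's java.util.WeakHashMap in a few lines. The classes TrackedSubject, BackRefObserver and WeakMapLeakSketch below are invented for this sketch. The only assumption about the map is its documented behavior that values are held by ordinary strong references.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

class TrackedSubject {}

class BackRefObserver {
    // Strong reference from observer back to subject (third diagram).
    final TrackedSubject subject;

    BackRefObserver(TrackedSubject subject) { this.subject = subject; }
}

class WeakMapLeakSketch {
    // Static map: weak keys (subjects), strong values (lists of observers).
    static final Map<TrackedSubject, List<BackRefObserver>> OBSERVERS = new WeakHashMap<>();

    // Registers a fresh subject/observer pair and drops all local references.
    static void registerAndForget() {
        TrackedSubject subject = new TrackedSubject();
        OBSERVERS.computeIfAbsent(subject, k -> new ArrayList<>())
                 .add(new BackRefObserver(subject));
    }

    public static void main(String[] args) {
        registerAndForget();
        System.gc();
        // The subject is still strongly reachable through
        // map -> entry -> observer list -> observer -> subject,
        // so the entry can never be expunged and this prints 1.
        System.out.println(OBSERVERS.size());
    }
}
```

Had BackRefObserver not kept its subject field, the key would be only weakly reachable after registerAndForget returns, and the map would be free to drop the entry after a collection.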
Known occurrences
I first came across this anti-pattern in the paper about implementing GoF design
patterns in AspectJ by Jan Hannemann and Gregor Kiczales. Gregor Kiczales is renowned for
leading the team at Xerox PARC that invented aspect-oriented programming.
In this paper they define an aspect, ObserverProtocol, that keeps a WeakHashMap
from subjects to lists of observers. Although this mapping is not static per se, it is kept alive
throughout the lifespan of the program.
The second use I have seen was in IronPython's implementation of events.
IronPython has its own implementation of events because the one
in the .NET runtime is not sufficient for IronPython's requirements.
For IronPython programmers this means that hooking up to events will cause memory
leaks if the source of the event can be reached from the handler (which is very common).
Summary
Both of these cases show how easy it is to make this mistake. I have therefore documented
it as an anti-pattern for the benefit of mankind ;).