Programmer's Weblog: January 2008

Thursday, January 31, 2008

Arc

I have just read the Arc tutorial. Here are some comments.

Since functions of one argument are so often used in Lisp programs, Arc has a special notation for them. [... _ ...] is an abbreviation for (fn (_) (... _ ...)). So our first map example could have been written

arc> (map [+ _ 10] '(1 2 3))
(11 12 13 . nil)

I know this feature from Nemerle, and miss it when programming in other languages.

Another nice one seems to be ~ operator for negating a function, though would not be nearly as useful as the last one.

arc> (map ~odd '(1 2 3 4 5)) 
(nil t nil t nil)

Code for adding an element to a hashtable does not look very natural:

arc> (= (airports "Boston") 'bos)
bos

In Python it would be:

airports["Boston"] = 'bos

Wednesday, January 30, 2008

Resolver One for Open Source

Resolver One, a new MS Excel's alternative, is currently available in two versions. First is a vanilla commercial one, whereas the second, Resolver One - Non-Commercial, requires a bit of explanation.

Resolver One - Non-Commercial is free (as in beer). Internally at Resolver we used to call it Resolver One for Open Source. Not because it comes with sources, because it doesn't, but because it could only be used to produce open source. Since then we have further relaxed the license to allow personal use.

What a user gets after installing Resolver One Non-Commercial is feature-wise the same application as in the commercial one, but with a different license. The license requires the user to make his spreadsheets open source or mark them as personal -- not for redistribution and profit.

Technically, the Non-Commercial version will not save the spreadsheet file until user decides on a license for his spreadsheet. When he tries to save the file for the first time, he will be presented with seven well known open source licenses to choose from (MIT, GPL, Creative Commons, ...), but he can also mark it as personal. If he selects one of the open-source licenses, it is included on top of the code part of the spreadsheet, making his spreadsheet free (as in speech).

Sunday, January 6, 2008

Anti-pattern: static subject to observer mapping

Short description

When implementing observer pattern in a language with automatic garbage collection don't use static mapping from subject to observer. Doing this will cause memory leaks.

Such an implementation involves having a hashtable which keys are subjects and values are lists of observers. Then to notify about a change in a subject, get the list of its observers from the mapping and notify them one by one.

Glossary

GC - Garbage Collector.
Dead object - an object is dead when it is eligible for garbage collection (can't be reached by strong references from any root). By symmetry, an object can be called alive.
Strong reference - a reference that keeps an object alive (normal reference), as opposed to a weak reference.
Weak reference - a reference that is not taken into account by GC. In .NET there is WeakReference class to define weak references. As far as I know this is the only way to define a weak reference.

What is wrong with static subject to observer mapping

For a warm-up I will discuss the traditional implementation of observer pattern. The nodes on the diagram represent objects. An arc from one object to another means that it is possible to get from one to the another following strong references.

In the first diagram, observer can be reached from subject and vice versa. The fact that observer can be reached from subject follows from subject keeping a list of observers to notify.

I would like to point out that the arrow from observer to subject on this diagram indicates that there is, maybe indirect, way by strong references from observer to subject, this does not follow from observer pattern. However this being optional, in production code it is quite often that observers keep references to the subjects.

What happens in this case when there are no more references to both subject and observer (apart from their own cycle of references)? They can't be reached from any root, so they are dead and the memory they occupy will be released during next garbage collection.

The second diagram shows a situation where observers are kept in a weak hashtable. The dashed arc from entry to subject indicates a weak reference. This one is not that bad. If there are no references to subject and observers apart from those on the diagram, the subject can be garbage collected, there being only a weak reference to it. The observer will be kept alive by the strong reference from weakhashmap, but WeakHashMaps are usually implemented so that on adding a new entry they will remove entries which keys have been garbage collected. This prevents entries with dead keys to pile up.

The third and last diagram depicts a case where there is a reference from observer to subject. This is the situation which results in memory leaks. In this configuration the weakHashMap will keep both subject and observer alive and every other object that can be reach from either of them.

This happens because the reference from entry to the value must be strong, otherwise an observer might get garbage collected when its subject is still alive.

Known occurrences

I have first come across this anti-pattern in the paper about implementing GOF design patterns in AspectJ by Jan Hannemann and Gregor Kiczales. Gregor Kiczales is renowned for leading the team in XEROX PARC that invented aspect-oriented programming.

In this paper they have a definition of an aspect, ObserverProtocol, that keeps a WeakHashMap from subjects to list of observers. This mapping not being static per se, is kept alive throughout the lifespan of a program execution.

The second use I have seen was in IronPython's implementation of events. IronPython has it's own implementation of events because the one in .NET runtime is not sufficient for IronPython's requirements.

For IronPython programmers it means that hooking to events will cause memory leaks if source of the event can be reached from the handler (very common).

Summary

Both of these cases show how easy it is to make this mistake, therefore I have documented it as an anti-pattern for the benefit of mankind ;).

Thursday, January 3, 2008

All possible FontStyles

That was funny. To instantiate Font in .NET 2.0 you call one of many Font's constructors. Basically you provide a font family (either an object of the class FontFamily or a name of one as a string), size and FontStyle. There is a constructor that doesn't take a FontStyle, it assumes the FontStyle.Regular as default.

A FontFamily is a collection of fonts. Arial family will have Arial Regular, Arial Bold, Arial Bold Italic...

Now, in theory there might be a family with only Bold Italic. The requirement is in case there is no style the user requested, give back a font from the family in any available style.

Unfortunatelly in FontFamily object there is no property returning available members of the family, but there is a method for checking if given FontStyle is available.

FontStyle is an enumeration with the following members.

FontStyle.Regular = 0
FontStyle.Bold = 1
FontStyle.Italic = 2
FontStyle.Underline = 4
FontStyle.Strikeout = 8

A font style can be created by bitwise or-ing the values. For example Bold Italic can be defined by FontStyle.Bold | FontStyle.Italic wihich makes 3.

How to generate all possible combinations?

for i in xrange(16):
    fontStyle = Enum.ToObject(FontStyle, i)
    if fontFamily.IsStyleAvailable(fontStyle):
        return Font(fontFamily, fontSize, fontStyle)

Integers from 0 to 15 form all valid combinations of the FontStyle. This sixteen numbers can be represented with 4 bits, 15's binary representation having them all lit (8 + 4 + 2 + 1).