ThornyDev: February 2012

I have to admit to being confused at times between the various meanings of the word "Factory" in object oriented design. For example, recently I was reviewing an Item in Josh Bloch's Effective Java where he said that static factory methods are not an implementation of the Gang of Four Factory pattern. In the next Item he then says that the Builder "pattern" (idiom?) he displays can be used as part of the Gang of Four Abstract Factory pattern. But then the Gang of Four have a Builder pattern as well - how does that relate?

So that's five somewhat related concepts:

Static Factory Methods
The Builder idiom recommended by Josh Bloch in Effective Java
The GoF Builder Pattern
The GoF Factory Pattern
The GoF Abstract Factory Pattern

In this blog entry I review these five ideas and how they relate.

`/* ---[ Static Factory Methods ]--- */`

The Static Factory is a simple idiom for encapsulating creational details into a single place accessible to other classes. For example:

// imports from the JDK and Google Guava shown
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import com.google.common.collect.ImmutableList;
import com.google.common.collect.Lists;

// the old-fashioned, hardcoded way
List<Integer> lint1 = new ArrayList<Integer>();
lint1.add(1);
lint1.add(2);
lint1.add(3);

// the first two are from Guava's static 
// factory methods
List<Integer> lint2 = Lists.newArrayList(1, 2, 3);
List<Integer> lint3 = ImmutableList.of(1, 2, 3);
List<Integer> lint4 = Collections.unmodifiableList(lint1);

In the last three statements, we use static factory methods to create Lists. There are at least two benefits to this:

We don't have to repeat the generic parameter on the right hand side
We allow the static factory to determine the best type of concrete List class to create

In fact, in this case the three concrete classes created by the static factory methods all differ:

final PrintStream stdout = System.out;
stdout.println(lint1.getClass()); // java.util.ArrayList
stdout.println(lint2.getClass()); // java.util.ArrayList
stdout.println(lint3.getClass()); // com.google.common.collect.RegularImmutableList
stdout.println(lint4.getClass()); // java.util.Collections$UnmodifiableRandomAccessList

I won't spend much time on why using static factories is Good Design - Josh Bloch's very first entry in Effective Java -- which every Java developer has read, right? -- spends 6 pages on it. I will mention one of my favorite benefits, though (quoting): unlike constructors, they are not required to create a new object each time they are invoked. Caching is good.
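To make the caching point concrete, here is a minimal sketch of a static factory that hands back a shared instance instead of allocating a new one on every call. The CountryCode class and its of method are hypothetical names I made up for illustration; they are not from Effective Java or Guava.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical value class (an assumption for illustration only):
// instances are cached by their code, so equal codes share one object.
final class CountryCode {
    private static final Map<String, CountryCode> CACHE = new ConcurrentHashMap<>();

    private final String code;

    // Private constructor: callers must go through the static factory.
    private CountryCode(String code) {
        this.code = code;
    }

    // Static factory: returns the cached instance if one exists,
    // otherwise creates it once and remembers it.
    static CountryCode of(String code) {
        return CACHE.computeIfAbsent(code, CountryCode::new);
    }

    String code() {
        return code;
    }
}

class CountryCodeDemo {
    public static void main(String[] args) {
        CountryCode a = CountryCode.of("NZ");
        CountryCode b = CountryCode.of("NZ");
        System.out.println(a == b); // true: both names refer to the cached instance
        // The JDK does the same thing for Boolean:
        System.out.println(Boolean.valueOf(true) == Boolean.TRUE); // true
    }
}
```

A constructor could never do this, since new always allocates a fresh object.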

`/* ---[ The Builder Idiom ]--- */`

In the Getting Started documentation for DBUnit, there is a curious bit of openness about not knowing how to complete their API design. In talking about doing row ordering with their SortedTable object, they have this code snippet:

SortedTable sortedTable1 = new SortedTable(table1, 
                                           new String[]{"COLUMN1"});
// must be invoked immediately 
// after the constructor
sortedTable1.setUseComparable(true);

It is followed by this statement in italics:

The reason why the parameter is currently not in the constructor is that the number of constructors needed for SortedTable would increase from 4 to 8 which is a lot. Discussion should go on about this feature on how to implement it the best way in the future.

This is the problem of non-atomic object creation where you have to build up the state of the object piece by piece. When there are a lot of pieces, creating all possible constructors to cover the permutations makes it a nightmare for both the library writer and API user.

So one solution is to have an empty constructor and then set the necessary state via a series of setters, which is the JavaBeans spec. But that is very problematic. The object is in an incomplete state until all relevant adders and setters have been called and if the object escapes before the user constructs everything correctly, then you have a source of bugs. With the JavaBeans model, there is no clearly defined point at which to check whether the object's invariants have been violated before it is sent off to the world.

So you get comments in the API like "you must invoke this setter immediately after the constructor in order for the object to work correctly", as with the DBUnit case.

The Builder idiom is a nice solution to this problem, including cases where you want to create immutable objects. It allows objects to be constructed with any number of variable attributes and settings. Some may be required, others are optional. And you can construct the object in a multi-step, self-documenting fashion, which is one of the nice features of the JavaBeans model, but without the possibility of leaving an object in an inconsistent state.

A typical way to use the builder idiom in Java these days is to combine it with a fluent interface. One starts by getting a reference to a builder, which is often an inner class of the class that the builder creates. Then one calls a series of methods on the builder to tell it how to set up the object and finally invokes a build() method to return the desired object.

For the DBUnit example, the builder idiom could be used this way:

SortedTable sortedTable1 = SortedTable.
  newBuilder(table1, new String[]{"COLUMN1"}).
  setUseComparable(true).
  build();

// alternative way without having required fields in 
// the builder constructor
SortedTable sortedTable1 = SortedTable.newBuilder().
  setTable(table1).
  setColumns(new String[]{"COLUMN1"}).
  setUseComparable(true).
  build();

I prefer to put required fields in the builder constructor, since it emphasizes that they are required, but it is not absolutely necessary, especially if there are a lot of required fields. One of the beautiful things about the builder idiom is that the build method is the perfect place for the Builder to enforce any invariants that must be met in order to create a valid object.

A Builder could be used rather than a variety of different static factories. For example, here is a fictitious example of how Guava could have used a Builder pattern to create a List with various attributes:

// fake API - not runnable code !
List<Integer> lint1 =
  Lists.newBuilder().unmodifiable().
    add(1).add(2).add(3).
    build();

List<Integer> lint2 =
  Lists.newBuilder().immutable().
    add(1).add(2).add(3).
    build();

List<Integer> lint3 =
  Lists.newBuilder().synchronizedColl().
    maxSize(5).add(1).add(2).add(3).
    build();

But for collections, it is more typical to use the static factory pattern, as we saw in the first section.

Where the builder idiom particularly shines is when you are constructing objects with a variable and potentially complex set of attributes. A good example: Guava uses the builder idiom to create a Cache object:

Cache<Key, Graph> graphs = CacheBuilder.newBuilder()
     .concurrencyLevel(4)
     .weakKeys()
     .maximumSize(10000)
     .expireAfterWrite(10, TimeUnit.MINUTES)
     .build(
         new CacheLoader<Key, Graph>() {
           public Graph load(Key key) throws AnyException {
             return createExpensiveGraph(key);
           }
         });

Another aspect of the builder idiom that is particularly satisfying is that it is a great way to create immutable objects that have more than a couple of optional attributes. For example, suppose you are modeling an inventory item that only directly requires custodian and location, but can take many different optional attributes, such as status, quantity, container-id, RFID-tag, etc. If you want to create immutable inventory objects, the builder pattern is a pleasant way to manage this:

// custodian (thornydev) and location are required, 
// so go in the builder constructor
InventoryItem item = InventoryItem.
  newBuilder("thornydev", "location 123").
  status("Available").
  quantity(35).quantityUnit(InventoryItem.KILOGRAMS).
  build();

In the build method, the Builder can check that key business rules have been met. For example, quantity and quantityUnit must either both be set or neither set, otherwise build will throw an IllegalStateException.
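Here is a rough sketch of what such a builder might look like on the inside. Everything in it is an assumption on my part: the field types, the accessor names and the String-valued KILOGRAMS constant are invented to match the usage snippet above, not taken from a real library.

```java
import java.util.Objects;

// Hypothetical immutable InventoryItem; all fields are final and
// can only be populated through the nested Builder.
final class InventoryItem {
    static final String KILOGRAMS = "kg";

    private final String custodian;     // required
    private final String location;      // required
    private final String status;        // optional, may be null
    private final Integer quantity;     // optional, may be null
    private final String quantityUnit;  // optional, may be null

    private InventoryItem(Builder b) {
        this.custodian = b.custodian;
        this.location = b.location;
        this.status = b.status;
        this.quantity = b.quantity;
        this.quantityUnit = b.quantityUnit;
    }

    // Required fields go in the builder's constructor.
    static Builder newBuilder(String custodian, String location) {
        return new Builder(custodian, location);
    }

    String custodian() { return custodian; }
    String location() { return location; }
    String status() { return status; }
    Integer quantity() { return quantity; }
    String quantityUnit() { return quantityUnit; }

    static final class Builder {
        private final String custodian;
        private final String location;
        private String status;
        private Integer quantity;
        private String quantityUnit;

        private Builder(String custodian, String location) {
            this.custodian = Objects.requireNonNull(custodian, "custodian");
            this.location = Objects.requireNonNull(location, "location");
        }

        Builder status(String status) { this.status = status; return this; }
        Builder quantity(int quantity) { this.quantity = quantity; return this; }
        Builder quantityUnit(String unit) { this.quantityUnit = unit; return this; }

        // The single checkpoint for invariants: quantity and quantityUnit
        // must either both be set or both be absent.
        InventoryItem build() {
            if ((quantity == null) != (quantityUnit == null)) {
                throw new IllegalStateException(
                    "quantity and quantityUnit must be set together");
            }
            return new InventoryItem(this);
        }
    }
}
```

Because the InventoryItem constructor is private and every field is final, no half-built item can ever escape; the only way to get one is through build(), after the checks have passed.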

`/* ---[ The Builder Pattern ]--- */`

OK, now that we've reviewed the idioms, let's analyze the formal GoF patterns and see how they compare.

The Builder Pattern is actually quite similar to the builder idiom. In the GoF version, the Builder is an interface. A concrete version is created to create a product. In the idiom, the Builder doesn't have an interface, since it is tightly coupled to creating a particular product (and in Java is often implemented as an inner class of that object).

The advantage of using an abstraction, in this case an interface, is, as usual, to be able to transparently swap out a different Builder implementation, or to allow multiple different types of entities to use a common interface. An example of the latter is found in the JDK's java.lang.Appendable.

Appendable is an interface that has three flavors of the same method, append. BufferedWriter, PrintStream and StringBuilder, to name a few, implement its interface. StringBuilder is a nice, though simple, example of the Builder pattern - you build up the string bit by bit, can do some morphs, reversals, substrings, or other sorts of changes and then produce an immutable String, threadsafe and ready for production use. In this case toString is the "build" method:

String s = "devil";
StringBuilder sb = new StringBuilder();
String result = sb.append(s).append("ish").  // now is "devilish"
  replace(5, 7, " e").       // now is "devil eh"
  reverse().toString();      // => "he lived"
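The benefit of coding against the Appendable abstraction, rather than against StringBuilder directly, can be sketched like this. The writeGreeting helper is a hypothetical method of my own; only Appendable, StringBuilder, StringWriter and System.out come from the JDK.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;

class AppendableDemo {
    // Accepts any Appendable, so the caller decides what is being built:
    // an in-memory string, a Writer, standard out, and so on.
    static void writeGreeting(Appendable out, String name) {
        try {
            out.append("Hello, ").append(name).append('!');
        } catch (IOException e) {
            // Appendable.append declares IOException; rethrow unchecked
            // to keep the calling code in this sketch simple.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        writeGreeting(sb, "GoF");
        System.out.println(sb);           // Hello, GoF!

        StringWriter sw = new StringWriter();
        writeGreeting(sw, "GoF");
        System.out.println(sw);           // Hello, GoF!

        writeGreeting(System.out, "GoF"); // PrintStream is an Appendable too
        System.out.println();
    }
}
```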

The GoF book illustrates a more sophisticated use of the Builder pattern. They create an RTFParser that will convert RTF text to other formats. I've updated their example a bit and redrawn it:

This kind of looks like the Abstract Factory pattern (see below). So why is this a Builder? Because it will be used to construct one document (say in HTML format) by calling its builder methods multiple times in some arbitrary order in a multi-step process to create that document. For example:

HTMLConverter c = new HTMLConverter();
Document d = c.
  convertBold(string1).
  convertParagraph(line1).
  convertHeader1(string2).
  convertParagraph(line2).
  // ... etc.
  build();

Note: what I'm calling the builder idiom in this article has also been referred to as the "revised builder pattern". As noted here, the two variants use the same approach with different emphasis. The builder idiom (revised builder pattern) usually tightly couples a builder to a specific concrete class, with the intent of simplifying the construction of an object with a complex set of attributes. The GoF Builder pattern is more about providing an abstraction to build various types of entities, akin to Abstract Factory in that sense.
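To show the difference in emphasis, here is a small sketch of the GoF flavor, where the builder is an interface and the code driving the build steps does not know which concrete builder it has. The names (TextConverter, HtmlConverter, PlainTextConverter, render) are my own simplifications, loosely modeled on the converter example above, and the products are plain Strings rather than Document objects.

```java
import java.util.Locale;

// The abstract Builder: a set of build steps plus a method to get the result.
interface TextConverter {
    TextConverter convertHeader1(String text);
    TextConverter convertParagraph(String text);
    String build();
}

// Concrete builder producing HTML.
final class HtmlConverter implements TextConverter {
    private final StringBuilder out = new StringBuilder();

    public TextConverter convertHeader1(String text) {
        out.append("<h1>").append(text).append("</h1>\n");
        return this;
    }

    public TextConverter convertParagraph(String text) {
        out.append("<p>").append(text).append("</p>\n");
        return this;
    }

    public String build() {
        return out.toString();
    }
}

// Concrete builder producing plain text.
final class PlainTextConverter implements TextConverter {
    private final StringBuilder out = new StringBuilder();

    public TextConverter convertHeader1(String text) {
        out.append(text.toUpperCase(Locale.ROOT)).append("\n\n");
        return this;
    }

    public TextConverter convertParagraph(String text) {
        out.append(text).append("\n");
        return this;
    }

    public String build() {
        return out.toString();
    }
}

class ConverterDemo {
    // The "director": it knows the sequence of steps, not the output format.
    static String render(TextConverter converter) {
        return converter
            .convertHeader1("Builders")
            .convertParagraph("One step at a time.")
            .build();
    }

    public static void main(String[] args) {
        System.out.print(render(new HtmlConverter()));
        System.out.print(render(new PlainTextConverter()));
    }
}
```

Swapping the output format means passing a different converter to render; the step-by-step construction code stays untouched, which is exactly what the idiom version gives up by coupling the builder to one class.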

`/* ---[ The Factory Pattern ]--- */`

The Factory Pattern uses classic Object Oriented (OO) reuse through inheritance, along with all the shortcomings and foibles of inheritance-based design.

The essence of the Factory Pattern is to delegate to a subclass the creation of an entity that can vary based on application needs. It typically uses either an abstract or concrete class, not a pure interface, to provide a default implementation and handle logic that will be common to all subclasses.

A trivial example from the JDK is Object#toString. Object, a concrete class, provides a basic toString method that subclasses can accept as-is or tailor to their needs.

A slightly less trivial example is java.lang.Number#intValue (and floatValue etc.). These methods are abstract and have to be implemented in a way appropriate to the subclasses. Number provides an implementation of a few common methods and the rest are delegated to subclasses, which include Integer, Double, BigInteger and AtomicLong.

The above examples are not really "factories" as I normally think of them, though one could argue they are factories of Strings and primitives (but it's a weak argument, which is why I think of Factory as just OO inheritance-based design).

Here is the GoF UML class diagram of the Factory pattern.

In many cases, the Factory pattern starts to blur into the Template pattern for me. Both use OO inheritance-based reuse of methods defined in the parent class. As a side note, the Template Pattern may be a more valid and robust version of inheritance than is often practiced.

The Template Pattern can be implemented via the Factory Pattern, as the GoF book states: Factory methods are usually called within Template Methods (p. 116).

The excellent Head First Design Patterns book does exactly this in their implementation of a Factory method (see my comments in the code below):

package headfirst.factory.pizzaaf;

public abstract class PizzaStore {

  // the abstract method to be implemented in subclasses
  // such as NYPizzaStore or ChicagoPizzaStore
  protected abstract Pizza createPizza(String item);

  // this is the (unacknowledged) Template pattern here
  // delegating the factory method createPizza to a
  // concrete subclass
  public Pizza orderPizza(String type) {
    Pizza pizza = createPizza(type);
    System.out.println("Making a " + pizza.getName());
    pizza.prepare();
    pizza.bake();
    pizza.cut();
    pizza.box();
    return pizza;
  }
}
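To round out the picture, here is a self-contained sketch of how concrete subclasses plug into that template. The Pizza class below is a drastically simplified stand-in I wrote for illustration (it just records the steps performed on it), and the subclass bodies are my assumptions rather than the book's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-in for the book's Pizza class (an assumption):
// it only records which steps were performed on it.
class Pizza {
    private final String name;
    private final List<String> steps = new ArrayList<>();

    Pizza(String name) { this.name = name; }

    String getName() { return name; }
    void prepare() { steps.add("prepare"); }
    void bake() { steps.add("bake"); }
    void cut() { steps.add("cut"); }
    void box() { steps.add("box"); }
    List<String> steps() { return steps; }
}

abstract class PizzaStore {
    // The factory method: each subclass decides what to create.
    protected abstract Pizza createPizza(String item);

    // The template method: fixed steps around the factory method.
    public Pizza orderPizza(String type) {
        Pizza pizza = createPizza(type);
        pizza.prepare();
        pizza.bake();
        pizza.cut();
        pizza.box();
        return pizza;
    }
}

class NYPizzaStore extends PizzaStore {
    @Override
    protected Pizza createPizza(String item) {
        return new Pizza("NY style " + item + " pizza");
    }
}

class ChicagoPizzaStore extends PizzaStore {
    @Override
    protected Pizza createPizza(String item) {
        return new Pizza("Chicago deep dish " + item + " pizza");
    }
}

class PizzaDemo {
    public static void main(String[] args) {
        PizzaStore store = new NYPizzaStore();
        Pizza pizza = store.orderPizza("cheese");
        System.out.println(pizza.getName()); // NY style cheese pizza
        System.out.println(pizza.steps());   // [prepare, bake, cut, box]
    }
}
```

The caller only ever talks to PizzaStore.orderPizza; which kind of Pizza comes back is decided by whichever subclass was instantiated.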

`/* ---[ The Abstract Factory Pattern ]--- */`

The Abstract Factory Pattern is intended for situations where you need to create families of related "products". For example, a widget library needs to create multiple related widgets - buttons, labels, frames, pick lists, text fields, scrollbars, etc. If you want to offer different "skins" or look-and-feels (is that an outdated term now?), you could use the Abstract Factory Pattern.

The canonical example provided by the Gang of Four is a WidgetFactory interface that can be implemented by any number of concrete Factories to produce all the various widgets with a defined Look-and-Feel.
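A stripped-down sketch of that idea might look like the following. The factory and product names loosely follow the GoF example, but the render method, the two-product family and the describeWindow helper are assumptions I added to keep the sketch small and runnable.

```java
// Two product types in the widget "family".
interface Button {
    String render();
}

interface ScrollBar {
    String render();
}

// The abstract factory: one creation method per product in the family.
interface WidgetFactory {
    Button createButton();
    ScrollBar createScrollBar();
}

// One concrete factory per look-and-feel. Each creation method is a
// "little factory"; here they simply return lambdas as the products.
final class MotifWidgetFactory implements WidgetFactory {
    public Button createButton() { return () -> "[Motif button]"; }
    public ScrollBar createScrollBar() { return () -> "[Motif scroll bar]"; }
}

final class PMWidgetFactory implements WidgetFactory {
    public Button createButton() { return () -> "[PM button]"; }
    public ScrollBar createScrollBar() { return () -> "[PM scroll bar]"; }
}

class WidgetDemo {
    // Client code depends only on the interfaces, so every widget it
    // creates is guaranteed to come from the same family.
    static String describeWindow(WidgetFactory factory) {
        return factory.createButton().render() + " "
            + factory.createScrollBar().render();
    }

    public static void main(String[] args) {
        System.out.println(describeWindow(new MotifWidgetFactory()));
        System.out.println(describeWindow(new PMWidgetFactory()));
    }
}
```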

From this quick look we can immediately see two differences between Factory and Abstract Factory:

Abstract Factory uses a pure interface, while Factory has concrete methods and may or may not have any abstract methods.
Abstract Factory is useful when you need to create multiple different categories of things (like widgets) that are related (say by look-and-feel). Factory produces one thing (a Pizza).

So you could think of an Abstract Factory as creating a factory of little factories. And here's where we can start to tie together some of the threads of this investigation. What patterns or idioms can the "little factories" of the Abstract Factory use?

Well, they can use the static factory idiom, if the thing they are creating is easily bound to one method call with no or few parameters. Or they could use a Builder (either form) to construct products that need to be constructed in a multi-step fashion or with lots of variable attributes. Or they could use the Factory method to be able to swap different concrete implementations as needed.

Abstract Factory Example from the JDK

In JDBC, one obtains a connection to a datastore by calling DriverManager.getConnection. This is a static factory method that returns an implementation of java.sql.Connection. Connection is a pure interface that gets implemented by JDBC library implementers, so there is one for each flavor of database. So Connection, in this case, is an Abstract Factory interface and the specific implementations by database vendors are the concrete classes. For example, with PostgreSQL's JDBC driver, the concrete Connection class is org.postgresql.jdbc4.Jdbc4Connection (if you are using the JDBC4 version).

So what "products" are created by the "little factories" in the Connection class? Many: Statement, PreparedStatement, CallableStatement, Savepoint, Blob, Clob and a few others, each one, of course, differing from those in other JDBC implementations.

`/* ---[ Summary ]--- */`

So, to finish, a few summarizing points:

The static factory idiom is very common in Java. Using it precludes the possibility of delegating responsibility to a subclass for creating specific types of objects. Therefore, use Static Factory when you are sure that you only need this one implementation of the factory.
The GoF Builder Pattern and "revised builder" (aka builder idiom) are basically the same, except in whether they are tightly coupled to the object they are creating. In either case, use a builder when you want to be able to flexibly create objects that need multi-step set up, particularly when you need to handle multiple optional attributes.
The Factory Pattern uses inheritance to allow different implementations of a specific method intended to be overridden by subclasses.
The Abstract Factory Pattern creates an interface whose implementations use composition to create a series of little factories to produce related items or products. Those little factories can use any of the previous patterns to do their object creation.

`/*---[ Thesis and Antithesis ]---*/`

Not long ago, there was a posting on slashdot from a self-proclaimed hacker (in the good sense) asking for advice on how to become a professional software engineer to target jobs in the corporate IS world.

"Learn to use a debugger" was one piece of advice he got. Sounds good to me, I thought, although I'm not sure hackers don't know how to use debuggers - gdb, the emacs Grand Unified Debugger and the DDD debugger were all arguably written by hackers. But the "learn to use a debugger" guy went on to ridicule a co-worker of his that had never grokked how to become one with his debugger and preferred print statements. This was such a deep lapse in his skillset and mentality that in their view he couldn't keep up with the awesomeness of the debugger crew and they were glad when he finally left the company.

At my job, I overheard a bathroom conversation (a great place to eavesdrop) where a similar view was taken: "I don't think he knows what he's doing - he put println statements everywhere".

I was recently rereading parts of The Practice of Programming by Brian W. Kernighan and Rob Pike. To provide a little counterpoint, I will quote something they had to say about this debate:

As a personal choice, we tend not to use debuggers beyond getting a stack trace or the value of a variable or two. One reason is that it is easy to get lost in details of complicated data structures and control flow; we find stepping through a program less productive than thinking harder and adding output statements and self-checking code at critical places. Clicking over statements takes longer than scanning the output of judiciously-placed displays. It takes less time to decide where to put print statements than to single-step to the critical section of code, even assuming we know where that is. More important, debugging statements stay with the program; debugging sessions are transient.

Perhaps Kernighan and Pike are just hackers.

And just as I was writing this blog, Uncle Bob Martin (re)tweeted this note: Using a debugger is a code smell.

`/*---[ Synthesis? ]---*/`

That being said, the world is full of wonderful things, and I recently ran across something new to me on the excellent InfoQ website: recording debuggers.

The InfoQ article points to two companies that offer recording debuggers for the JVM:

Chronon: http://www.chrononsystems.com/video-exceptions/
Replay Solutions: http://www.replaysolutions.com/products/demo-video

I did a little investigation into Chronon's debugger. It instruments the Java bytecode of your code to record everything that happens in the JVM over the life of an application, in all threads. No source code changes are required. They use an analogy of a “flight data recorder” for a Java program.

They claim the instrumentation is lightweight enough to be able to run it in production and certainly during formal QA testing. It basically creates a dump file of application state that can be opened in the recording debugger and played backwards and forwards. For example, you can pick a variable and see all state changes it ever had during the life of the program and then jump to any of those timepoints. The stack traces will show you all current threads and you can jump between them to see what state was there.

No breakpoints are required (or even possible), since you are just moving around in a program that has already run and looking at snapshots of time. You can do things like “show me all the times method X was called and from where” and then jump to any of those to see program state and execution at that point.

The other big selling point is that you do not have to recreate the full environment in which the bug occurred – you are just watching the state changes in the JVM so you don’t need to hook up to the database, the network, the message queues or whatever else to reproduce the issue.

This could be especially powerful for debugging multi-threaded apps, which is quite difficult using traditional debuggers.

Perhaps this is the synthesis of the divide expressed in the opening section of this essay. I haven't tried one yet, but it is definitely on my todo list for the future. I'd love to hear from anyone who has tried using a recording debugger.

Sunday, February 19, 2012

Factories and Builders, Idioms and Patterns

`/* ---[ Static Factory Methods ]--- */`

`/* ---[ The Builder Idiom ]--- */`

`/* ---[ The Builder Pattern ]--- */`

`/* ---[ The Factory Pattern ]--- */`

`/* ---[ The Abstract Factory Pattern ]--- */`

Abstract Factory Example from the JDK

`/* ---[ Summary ]--- */`

Friday, February 10, 2012

Thesis, Antithesis, Synthesis: Thoughts on Debuggers

`/*---[ Thesis and Antithesis ]---*/`

`/*---[ Synthesis? ]---*/`

Thursday, February 2, 2012

Readable Text Please
