Ben Biddington

Whatever it is, it's not about "coding"

Refactoring Example — Fat Controllers


We have an MVC controller that takes some input from a user, finds a file in a particular MP3 flavour, and then calculates the hash of the MP3 frames in that file.

Here’s what it looked like to begin with (an arrow means depends on):

Mp3HashController |---------------> TrackFileFinder
                  |---------------> StreamHasher
                  |---------------> Log


  • Take a track id and a format id (both numbers) as input
  • Open a stream on the matching file
  • Supply that stream to the StreamHasher
  • Return the hash as text to the user
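
The original code isn't shown in this post, so here is a rough Python sketch of the starting point. Everything other than the class names is an assumption: the method names (open, hash, info, get), the in-memory file store, and the use of SHA-1 over the whole stream (the real StreamHasher hashes only the MP3 frames).

```python
import hashlib
import io


class TrackFileFinder:
    """Hypothetical finder: maps (track_id, format_id) to a readable stream."""

    def __init__(self, files):
        self._files = files  # in-memory stand-in for the real file store

    def open(self, track_id, format_id):
        return io.BytesIO(self._files[(track_id, format_id)])


class StreamHasher:
    """Hypothetical hasher: hashes the whole stream, not just MP3 frames."""

    def hash(self, stream):
        return hashlib.sha1(stream.read()).hexdigest()


class Log:
    def __init__(self):
        self.messages = []

    def info(self, message):
        self.messages.append(message)


class Mp3HashController:
    """Three collaborators, one job: find the file, hash it, return text."""

    def __init__(self, finder, hasher, log):
        self._finder = finder
        self._hasher = hasher
        self._log = log

    def get(self, track_id, format_id):
        self._log.info(f"Hashing track {track_id}, format {format_id}")
        with self._finder.open(track_id, format_id) as stream:
            return self._hasher.hash(stream)
```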

Next requirement — learning file hashes

This works as required but we can improve performance by reducing the number of times we calculate a file’s hash. Assuming each file is static — i.e., its MP3 frames will never change — then we only need to calculate its hash once.

Rather than do this every time, we could save the result somewhere, perhaps in a database. That means introducing new behaviour:

  • When producing a hash, first check whether it already exists
  • If it does exist, then return it
  • Otherwise generate it and store it for next time

We could naively add a new HashRepository collaborator:

Mp3HashController |---------------> TrackFileFinder
                  |---------------> StreamHasher
                  |---------------> Log
                  |---------------> HashRepository

But now our interface is expanding. This has a habit of getting out of control, and you can end up with ten constructor arguments. This is commonly known as the “Too Many Dicks on the Dance Floor” problem.
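
In code, the naive version might look something like this Python sketch. The repository's find and save methods, the in-memory store, and the collaborator method names (open, hash, info) are all assumptions made for illustration.

```python
class InMemoryHashRepository:
    """Hypothetical stand-in for a database-backed store of known hashes."""

    def __init__(self):
        self._hashes = {}

    def find(self, track_id, format_id):
        # Returns None when the hash has not been learned yet.
        return self._hashes.get((track_id, format_id))

    def save(self, track_id, format_id, value):
        self._hashes[(track_id, format_id)] = value


class Mp3HashController:
    """Four collaborators now, plus a new conditional path."""

    def __init__(self, finder, hasher, log, hashes):
        self._finder = finder
        self._hasher = hasher
        self._log = log
        self._hashes = hashes

    def get(self, track_id, format_id):
        known = self._hashes.find(track_id, format_id)
        if known is not None:
            return known
        self._log.info(f"Hashing track {track_id}, format {format_id}")
        with self._finder.open(track_id, format_id) as stream:
            value = self._hasher.hash(stream)
        self._hashes.save(track_id, format_id, value)
        return value
```

The controller now knows about lookups, misses and saves as well as files and streams, which is the mixing of abstraction levels discussed next.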

So what’s wrong with it?

  • Mp3HashController is no longer composing its behaviour from one layer of abstraction. This notion of learning return values has changed that. It is now exposed to details it shouldn’t have any knowledge of, let alone a dependency on.
  • Mp3HashController is now more difficult to test: there are more paths that have to be exercised through the same interface
  • There is new conditional behaviour here that clearly belongs on its own — clients should not even know this is happening
  • It is much easier to test the learning behaviour on its own — I shouldn’t have to probe a controller
  • I shouldn’t have to describe an object’s behaviour in terms of another object’s interface; I should be able to use that object directly
  • I would likely have to suppress some behaviour(s) while testing others. This will manifest itself as complicated stubbing
  • Lots of dependent stubbing. Each of those collaborators has collaborators of its own. I think this is a smell. I shouldn’t have to consider this when unit testing Mp3HashController.

There is an opportunity to introduce an abstraction here. If we consider that all Mp3HashController requires is something to get a hash, then we can actually reduce it to:

Mp3HashController |---------------> Log
                  |---------------> HashRepository

And then we have a HashRepository implementation like this:

LearningHashRepository |---------------> TrackFileFinder
                       |---------------> StreamHasher
                       |---------------> HashRepository

The LearningHashRepository has the responsibility of storing any hashes that don’t exist.
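
A minimal Python sketch of that split, assuming a repository interface with find and save, and collaborator methods named open, hash and info. None of those names come from the real code.

```python
class LearningHashRepository:
    """Looks like any other hash repository, but learns missing hashes."""

    def __init__(self, finder, hasher, store):
        self._finder = finder
        self._hasher = hasher
        self._store = store  # the inner HashRepository (assumed find/save)

    def find(self, track_id, format_id):
        known = self._store.find(track_id, format_id)
        if known is not None:
            return known
        with self._finder.open(track_id, format_id) as stream:
            value = self._hasher.hash(stream)
        self._store.save(track_id, format_id, value)
        return value


class Mp3HashController:
    """Back to one layer of abstraction: ask for a hash, return it."""

    def __init__(self, log, hashes):
        self._log = log
        self._hashes = hashes

    def get(self, track_id, format_id):
        self._log.info(f"Hash requested for track {track_id}, format {format_id}")
        return self._hashes.find(track_id, format_id)
```

The controller can now be tested with a single stub that returns a canned hash, and the learning behaviour can be tested directly against LearningHashRepository without going through a controller at all.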

This could probably be condensed even further. TrackFileFinder and StreamHasher represent the concept of obtaining the hash of a file given a track identifier, so they can be combined. This reduces LearningHashRepository to a sort of write-through cache.
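
As a sketch of that last step in Python: if the finder and hasher are folded into one "hash of this track's file" function, the repository only needs a source and a store. The names file_hash_source, source and store are hypothetical, and the store here is assumed to be dict-like rather than a real database.

```python
def file_hash_source(finder, hasher):
    """Combine finder and hasher into one callable: (track_id, format_id) -> hash."""

    def hash_of(track_id, format_id):
        with finder.open(track_id, format_id) as stream:
            return hasher.hash(stream)

    return hash_of


class LearningHashRepository:
    """Cache in front of a hash source: fill the store on a miss, then reuse it."""

    def __init__(self, source, store):
        self._source = source  # callable producing a hash on demand
        self._store = store    # dict-like; a database in the real thing

    def find(self, track_id, format_id):
        key = (track_id, format_id)
        if key not in self._store:
            self._store[key] = self._source(track_id, format_id)
        return self._store[key]
```

Because the source is just a callable, the learning behaviour can be exercised with a plain function and an empty dict, with no files or streams involved.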

Written by benbiddington

9 August, 2010 at 13:37

Posted in development, oop

