Posts Tagged ‘refactoring’
We have an MVC controller that takes some input from a user, finds a file in a particular MP3 flavour, and then calculates the hash of the MP3 frames in that file.
Here’s what it looked like to begin with (an arrow means depends on):
Mp3HashController |---------------> TrackFileFinder |---------------> StreamHasher |---------------> Log
- Take input of trackid and format id (both numbers)
- Open a stream on the file
- Supply that stream to the StreamHasher
- Return the hash as text to user
Next requirement — learning file hashes
This works as required but we can improve performance by reducing the number of times we calculate a file’s hash. Assuming each file is static — i.e., its MP3 frames will never change — then we only need to calculate its hash once.
Rather than do this every time, we could save the result somewhere — a database perhaps. Which means we are introducing new behaviour.
When producing a hash, first check whether it already exists
If it does exist, then return it
Otherwise generate it and store it for next time
We could naively add a new HashRepository collaborator:
Mp3HashController |---------------> TrackFileFinder |---------------> StreamHasher |---------------> Log |---------------> HashRepository
But now our interface is expanding, this has a habit of getting out of control and you can end up with ten constructor arguments. This is commonly known as the “Too Many Dicks on the Dance Floor” problem.
So what’s wrong with it?
- Mp3HashController is no longer composing its behaviour from one layer of abstraction. This notion of learning return values has changed that. It is now exposed to details it shouldn’t have any knowledge of let alone dependency on.
- Mp3HashController is now more difficult to test, there are more paths that have to exercised though the same interface
- There is new conditional behaviour here that clearly belongs on its own — clients should not even know this is happening
- It is much easier to test the learning behaviour on its own — I shouldn’t have to probe a controller
- I shouldn’t have to describe an object’s behaviour in terms of another objects interface, I should be able to use that object directly
- I would likely have to suppress some behaviour(s) while testing others. This will be manifest itself as complicated stubbing
- Lots of dependent stubbing. Each of those collaborators are actually collaborators themselves. I think this is a smell. I shouldn’t have to consider this when unit testing Mp3HashController.
There is an opportunity to introduce an abstraction here. If we consider that all Mp3HashController requires is something to get a hash, then we can actually reduce it to:
Mp3HashController |---------------> Log |---------------> HashRepository
And then we have a HashRepository implementation like this:
LearningHashRepository |---------------> TrackFileFinder |---------------> StreamHasher |---------------> HashRepository
The LearninghHashRepository has the responsibility of storing any hashes that don’t exist.
This could probably be condensed even further. TrackFileFinder and StreamHasher represent the concept of obtaining the hash of a file given a track identifier, so they can be combined. This reduces LearningHashRepository to a sort of write-through cache.