Ben Biddington

Whatever it is, it's not about "coding"

Posts Tagged ‘oop’

Refactoring Example — Fat Controllers


We have an MVC controller that takes some input from a user, finds a file in a particular MP3 flavour, and then calculates the hash of the MP3 frames in that file.

Here’s what it looked like to begin with (an arrow means depends on):

Mp3HashController |---------------> TrackFileFinder
                  |---------------> StreamHasher
                  |---------------> Log


  • Take input of trackid and format id (both numbers)
  • Open a stream on the file
  • Supply that stream to the StreamHasher
  • Return the hash as text to user

Next requirement — learning file hashes

This works as required but we can improve performance by reducing the number of times we calculate a file’s hash. Assuming each file is static — i.e., its MP3 frames will never change — then we only need to calculate its hash once.

Rather than do this every time, we could save the result somewhere — a database perhaps. Which means we are introducing new behaviour.

  • When producing a hash, first check whether it already exists
  • If it does exist, then return it
  • Otherwise generate it and store it for next time

We could naively add a new HashRepository collaborator:

Mp3HashController |---------------> TrackFileFinder
                  |---------------> StreamHasher
                  |---------------> Log
                  |---------------> HashRepository

But now our interface is expanding. This has a habit of getting out of control, and you can end up with ten constructor arguments. This is commonly known as the “Too Many Dicks on the Dance Floor” problem.

So what’s wrong with it?

  • Mp3HashController is no longer composing its behaviour from one layer of abstraction. This notion of learning return values has changed that. It is now exposed to details it shouldn’t have any knowledge of let alone dependency on.
  • Mp3HashController is now more difficult to test: there are more paths that have to be exercised through the same interface
  • There is new conditional behaviour here that clearly belongs on its own — clients should not even know this is happening
  • It is much easier to test the learning behaviour on its own — I shouldn’t have to probe a controller
  • I shouldn’t have to describe an object’s behaviour in terms of another object’s interface; I should be able to use that object directly
  • I would likely have to suppress some behaviours while testing others. This will manifest itself as complicated stubbing
  • Lots of dependent stubbing. Each of those collaborators are actually collaborators themselves. I think this is a smell. I shouldn’t have to consider this when unit testing Mp3HashController.

There is an opportunity to introduce an abstraction here. If we consider that all Mp3HashController requires is something to get a hash, then we can actually reduce it to:

Mp3HashController |---------------> Log
                  |---------------> HashRepository

And then we have a HashRepository implementation like this:

LearningHashRepository |---------------> TrackFileFinder
                       |---------------> StreamHasher
                       |---------------> HashRepository

The LearningHashRepository has the responsibility of storing any hashes that don’t exist.

This could probably be condensed even further. TrackFileFinder and StreamHasher represent the concept of obtaining the hash of a file given a track identifier, so they can be combined. This reduces LearningHashRepository to a sort of write-through cache.
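Here is a minimal sketch of that write-through idea in Java. It is not the original code: the method names on HashRepository (find, save), the FileHasher interface (standing in for TrackFileFinder and StreamHasher combined) and the in-memory store are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Assumed shape of the abstraction: look a hash up, or store one.
interface HashRepository {
    Optional<String> find(int trackId, int formatId);
    void save(int trackId, int formatId, String hash);
}

// Hypothetical: TrackFileFinder and StreamHasher combined into one concept.
interface FileHasher {
    String hashOf(int trackId, int formatId);
}

// Stands in for the database in this sketch.
final class InMemoryHashRepository implements HashRepository {
    private final Map<String, String> hashes = new HashMap<>();

    @Override
    public Optional<String> find(int trackId, int formatId) {
        return Optional.ofNullable(hashes.get(key(trackId, formatId)));
    }

    @Override
    public void save(int trackId, int formatId, String hash) {
        hashes.put(key(trackId, formatId), hash);
    }

    private static String key(int trackId, int formatId) {
        return trackId + ":" + formatId;
    }
}

// Learns hashes: returns a stored hash when there is one,
// otherwise calculates it once and stores it for next time.
final class LearningHashRepository implements HashRepository {
    private final HashRepository store;
    private final FileHasher hasher;

    LearningHashRepository(HashRepository store, FileHasher hasher) {
        this.store = store;
        this.hasher = hasher;
    }

    @Override
    public Optional<String> find(int trackId, int formatId) {
        Optional<String> known = store.find(trackId, formatId);
        if (known.isPresent()) {
            return known;
        }
        String hash = hasher.hashOf(trackId, formatId);
        store.save(trackId, formatId, hash);
        return Optional.of(hash);
    }

    @Override
    public void save(int trackId, int formatId, String hash) {
        store.save(trackId, formatId, hash);
    }
}
```

The controller only ever sees HashRepository, so the learning behaviour can be tested directly against LearningHashRepository with a counting FileHasher and no controller in sight.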

Written by benbiddington

9 August, 2010 at 13:37

Posted in development, oop


Why can’t I hang an extension method on a type?


My brother asked me this. And while I don’t know, I did discover some interesting things along the way.

An extension method is nothing more than a compiler trick. It is simply a static method that takes an instance of the type being extended as an argument. That’s it.

The sugar part is that to you as a programmer, it appears to read more naturally in some cases.

They do not have any special privileges on private or protected members and they are not analogous to ruby module mixins (because the extended class cannot invoke extension methods).

[TBD: It is interesting that instance methods are supplied “this” as their first argument, see CIL]

[TBD: It is interesting that the compiler emits a callvirt instruction even in cases where call seems more appropriate just because callvirt has a null reference check. See: Why does C# always use callvirt?]

[TBD: Extensions are really a higher level abstraction because they operate only against public interface. An extension method is a client of the object it “extends”]


namespace Examples {
    public class ExampleClass { }

    public static class Extensions {
        public static void ExtensionMethod(this ExampleClass instance) {
            // Body inferred from the IL for this method: it calls ToString() and discards the result.
            instance.ToString();
        }
    }

    public class ThatUsesExampleClass {
        public void RunExample() {
            new ExampleClass().ExtensionMethod();
        }
    }
}

The interesting part is RunExample (because it invokes the extension method):

public void RunExample() {
    new ExampleClass().ExtensionMethod();
}

which compiles to:

.method public hidebysig instance void
    RunExample() cil managed {
    // Code size       13 (0xd)
    .maxstack  8
    IL_0000:  nop
    IL_0001:  newobj     instance void Examples.ExampleClass::.ctor()
    IL_0006:  call       void Examples.Extensions::ExtensionMethod(class Examples.ExampleClass)
    IL_000b:  nop
    IL_000c:  ret
} // end of method ThatUsesExampleClass::RunExample

It is clear that the compiler has done nothing more than redirect to a static method on a static class:

IL_0006:  call       void Examples.Extensions::ExtensionMethod(class Examples.ExampleClass)


The usual static method usage rules apply:

[Clean code chapter 6]
Procedural code (code using data structures) makes it easy to add new functions without changing the existing data structures. OO code, on the other hand, makes it easy to add new classes without changing existing functions.

The complement is also true:
Procedural code makes it hard to add new data structures because all the functions must
change. OO code makes it hard to add new functions because all the classes must change.
So, the things that are hard for OO are easy for procedures, and the things that are
hard for procedures are easy for OO!

In any complex system there are going to be times when we want to add new data
types rather than new functions. For these cases objects and OO are most appropriate. On
the other hand, there will also be times when we’ll want to add new functions as opposed
to data types. In that case procedural code and data structures will be more appropriate.

Mature programmers know that the idea that everything is an object is a myth. Sometimes
you really do want simple data structures with procedures operating on them.

[TBD: Usage — how does it fit with OO design?]

Back to the question

Still no answer.

I can’t see any reason why the C# compiler couldn’t do the same for static constructs, though I wonder how you would express that on the extension method itself. Perhaps that’s where the ExtensionAttribute comes in. Note: it is currently illegal to use the ExtensionAttribute directly.

But if you examine the IL for an extension method itself, you’ll see it has been applied:

.custom instance void [System.Core]System.Runtime.CompilerServices.ExtensionAttribute::.ctor() =
    ( 01 00 00 00 )
.method public hidebysig static void
    ExtensionMethod(class Examples.ExampleClass 'instance') cil managed {

    .custom instance void [System.Core]System.Runtime.CompilerServices.ExtensionAttribute::.ctor() =
        ( 01 00 00 00 ) 

    // Code size       9 (0x9)
    .maxstack  8
    IL_0000:  nop
    IL_0001:  ldarg.0
    IL_0002:  callvirt   instance string [mscorlib]System.Object::ToString()
    IL_0007:  pop
    IL_0008:  ret
} // end of method Extensions::ExtensionMethod

Written by benbiddington

4 August, 2010 at 09:15

Posted in development


Scala introduction — writing an OAuth library


I started out intending to write some scala examples against the twitter API, however I soon discovered I needed OAuth first. Given that I use OAuth all the time at work I figured I could probably do with learning about it first-hand, while learning scala.


I chose to test drive it with JUnit 4.7 and NetBeans.

NetBeans works almost immediately with scala, and has support for project templates etc — even scala JUnit fixtures.

UPDATE (2010-04-27) I have since discovered IntelliJ to be much better, and there is now a free community edition. IntelliJ supports scala without any fiddling around.

JUnit mostly works, though rules don’t and neither do some matchers. Even though rules don’t work, I have included it anyway because I have the t-shirt.

You can find the project on github.

Important abstractions

  1. SignatureBaseString.
    1. Characterized by three ampersand-separated segments: verb, uri, parameters.
    2. URL encoding must conform to RFC 3986, and the following characters are considered unreserved, so they should not be encoded:
      ALPHA, DIGIT, ‘-‘, ‘.’, ‘_’, ‘~’
  2. Signature.
    1. Signature is a keyed-Hash Message Authentication Code (HMAC).
    2. The consumer secret is a required part of the HMAC secret key.
    3. Token secret is optionally included in HMAC secret key:
      (consumer_secret, token_secret) => uri_encoded_consumer_secret&[uri_encoded_token_secret]
  3. OAuthCredential. Represents the secret key(s) used to create the HMAC signature. OAuth requires a consumer credential, and optionally a token credential, representing the end user.
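The three abstractions above can be sketched roughly as follows. This is Java rather than scala, and it is not the library's actual API: the class and method names are made up for illustration, HMAC-SHA1 is assumed as the signature method (the usual choice in OAuth 1.0), and the parameter string is assumed to be sorted and joined already.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Base64;
import java.util.Locale;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper names; only the rules they implement come from the list above.
final class OAuthSketch {
    // RFC 3986 percent-encoding: ALPHA, DIGIT, '-', '.', '_' and '~' pass through,
    // every other UTF-8 byte becomes %XX with uppercase hex digits.
    static String percentEncode(String value) {
        StringBuilder out = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xff;
            boolean unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
                    || (c >= '0' && c <= '9')
                    || c == '-' || c == '.' || c == '_' || c == '~';
            if (unreserved) {
                out.append((char) c);
            } else {
                out.append('%').append(String.format("%02X", c));
            }
        }
        return out.toString();
    }

    // Three ampersand-separated segments: verb, uri, parameters.
    // Assumes normalizedParameters is already sorted and joined as name=value pairs.
    static String signatureBaseString(String verb, String uri, String normalizedParameters) {
        return verb.toUpperCase(Locale.ROOT)
                + "&" + percentEncode(uri)
                + "&" + percentEncode(normalizedParameters);
    }

    // uri_encoded_consumer_secret&[uri_encoded_token_secret]; the token secret is optional.
    static String signingKey(String consumerSecret, String tokenSecret) {
        String token = tokenSecret == null ? "" : percentEncode(tokenSecret);
        return percentEncode(consumerSecret) + "&" + token;
    }

    // Base64-encoded HMAC-SHA1 of the base string, keyed with the signing key.
    static String sign(String baseString, String key) {
        try {
            Mac mac = Mac.getInstance("HmacSHA1");
            mac.init(new SecretKeySpec(key.getBytes(StandardCharsets.UTF_8), "HmacSHA1"));
            byte[] digest = mac.doFinal(baseString.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC-SHA1 is not available", e);
        }
    }
}
```

Note that the key always contains the ampersand, even when there is no token secret.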

Now that these core concepts are complete, I am working on high-level policy, like classes for generating signed URLs and authorization headers.


JUnit — expecting exceptions in scala

Assuming JUnit 4.x, a test can expect an exception using the test annotation:


    @Test(expected = IllegalArgumentException.class)
    public void ExampleThrowsException() {
        throw new IllegalArgumentException();
    }

This needs to be modified for scala:


    @Test { val expected=classOf[IllegalArgumentException] }
    def ExampleThrowsException {
        throw new IllegalArgumentException
    }

The reason for it is outlined here in the Java annotations section on named parameters.

Here is the documentation for scala annotations. See also: the documentation for scala 2.7.3 (includes dbc).

Closures and return

The return statement immediately returns from the current method, even if you’re within a closure. Omit return in this case — return is optional anyway.

When to use semicolon line terminator

Never — apart from:

  • When a method returns Unit (equivalent to void) and you aren’t using return keyword. [TBD: Add example].

How to use blocks

var count = 1
times(2) { println("Printed " + count + " times")}
protected def times(count : Int)(block : => Unit) = { (1 to count).foreach(_ => block) }

See also: some executable examples on github



Written by benbiddington

18 September, 2009 at 13:37

Posted in development


Particle physics, mocks and stubs


Steve Freeman had an interesting analogy in TDD 10 years later (17m30s, slide 26: The origins of mock objects). He describes a mocked unit test as being “rather like particle physics”.

You fire something at a particle, things splinter off and you can detect what happens…


A mock is used to both detect the emissions from the system under test (SUT), and verify expectations. Additionally, a mock object may perform stub duties. This doesn’t quite fit, since fission is one-way.

Testing “by detection” like this is considered behaviour verification: verifying collaborations between the SUT and other objects.

To be testable in such a manner:

  • Requires the ability to isolate the SUT sufficiently, i.e., detach it completely from its context and collaborators. A test fixture should be able to create the SUT easily by itself.
  • Then the SUT should minimize concrete dependencies.
  • Collaborators must be designed in such a way to allow a mock to be generated that can intercept interactions. This means identifying the abstraction(s) for collaborators.
  • A mock is a stub in the sense that it needs to stand in for a real (if inert) object. But a mock is also a “detector” and is used as the means of assertion.
  • Stub queries and mock actions. “we mock when the service changes the external world; we stub when it doesn’t change the external world – stub queries and mock actions”
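To make “stub queries and mock actions” concrete, here is a small hand-rolled sketch in Java with no mocking library. The stock-monitoring domain and every name in it are invented for illustration; only the split between a stubbed query and a mocked action comes from the list above.

```java
import java.util.ArrayList;
import java.util.List;

// Query collaborator: asking does not change the outside world, so a test stubs it.
interface StockLevels {
    int unitsInStock(String sku);
}

// Action collaborator: placing an order changes the outside world, so a test mocks it.
interface Reorders {
    void place(String sku, int quantity);
}

// The system under test. It depends only on abstractions,
// so a fixture can create it easily by itself.
final class StockMonitor {
    private final StockLevels levels;
    private final Reorders reorders;
    private final int minimum;

    StockMonitor(StockLevels levels, Reorders reorders, int minimum) {
        this.levels = levels;
        this.reorders = reorders;
        this.minimum = minimum;
    }

    void check(String sku) {
        if (levels.unitsInStock(sku) < minimum) {
            reorders.place(sku, minimum);
        }
    }
}

// The "detector": records whatever the SUT emits so the test can assert on it.
final class RecordingReorders implements Reorders {
    final List<String> placed = new ArrayList<>();

    @Override
    public void place(String sku, int quantity) {
        placed.add(sku + " x" + quantity);
    }
}
```

A test fires check("widget") at a StockMonitor whose StockLevels is a one-line lambda stub, then inspects RecordingReorders.placed to see what splintered off.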


Written by benbiddington

12 September, 2009 at 15:15

Posted in development


Book review — Clean Code


An excellent book by Bob Martin, with tips on often overlooked fundamentals.

3 — Functions

Functions should:

  • Be small.
  • Do one thing, with no side effects.
  • Do something or answer something, not both (command query separation). A function should either change the state of an object (but not its arguments), or return information about an object. Doing both is confusing.
  • Operate at one level of abstraction.
  • Have as few arguments as possible


Arguments are required for a function to do its job. Arguments are parameters describing how a function should operate. Zero argument functions (niladic) are ideal, from both understandability and testability perspectives.

Arguments should:

  • Be at the same level of abstraction as the function
  • Describe input, not output. We expect information to go in to a function through its arguments, not out (consider mathematical functions: they have no concept of output arguments). Functions should not, therefore, modify their arguments. Passing a list to a function expecting it to be filled when the function returns is incorrect usage. Plus it violates the “do something or answer something” rule. [TBD: What about functions that accept Streams and write to them? Is this considered modifying an argument?]
  • Not contain flag arguments. Flag arguments imply the method does more than one thing, anyway. Consider splitting the method in two in this case.

Monadic functions

Two common reasons for passing a single argument:

  1. To ask a question about it (e.g., File.Exists("path")).
  2. To operate on the argument, transform it and return it (e.g., Stream inStream = File.Open("path")).

[TBD: TW anthology describes trying to limit classes to two instance fields, is this similar?]

Argument objects

If a function expects more than two or three arguments, it’s likely that at least some of those should be wrapped in their own class. For example:

Circle createCircle(Int32 x, Int32 y, Int32 radius);

Could be refactored to:

Circle createCircle(Point point, Int32 radius);

This is not cheating, provided the resultant object actually makes sense. In the first version, x and y are ordered components of a single value (or concept). You wouldn’t do the same thing with:

void WriteField(Stream outStream, String name);

Here, Stream and String are not components of the same concept.

Error handling is “one thing”

Consider extracting error handling to its own function — so the one thing it does is handle errors. A function written in this style will start with try and do nothing after its catch/finally. [TBD: Give this one a try]
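As a rough sketch of that style in Java: the outer method starts with try and does nothing after its catch, while the real work sits in a second method that is free to throw. PageRemover and its method names are hypothetical, and errors are collected in a list rather than logged only to keep the example self-contained.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical class for illustration only.
final class PageRemover {
    // Recorded instead of logged so the sketch has no logging dependency.
    final List<String> errors = new ArrayList<>();

    // Error handling is this method's one thing:
    // it starts with try and does nothing after the catch.
    void delete(Path page) {
        try {
            deletePage(page);
        } catch (IOException e) {
            errors.add("Could not delete " + page + ": " + e);
        }
    }

    // The actual work, with no error handling mixed in.
    private void deletePage(Path page) throws IOException {
        Files.delete(page);
    }
}
```

Deleting a path that does not exist ends up as one entry in errors rather than an exception escaping to the caller.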

Arguments or instance variables?

[TBD: How do I tell whether to pass a variable as an argument or add it as an instance member of the object?]

Currying is a way to simplify a function signature, but where should the line be drawn?

Perhaps it’s worth focusing on the arguments that clients would like to be able to supply.

Should instance members only be used for real object state? If an object uses a variable to perform its functions, surely that qualifies as eligible for instance membership?

6 — Objects and data structures

This was perhaps my favourite section (even though it has that cretinous modern Star Trek character on its title page).

Hiding implementation is about more than defining getters and setters on instance fields — it’s about abstractions.

Consider these interfaces:

// 1
public interface Vehicle {
    double getFuelTankCapacity();
    double getGallonsInTank();
}
// 2
public interface Vehicle {
    double getPercentFuelRemaining();
}

(2) is considered preferable, because it is defining an abstraction, rather than exposing data. [TBD: I am not sure about this, though. Shouldn’t I be able to query for internal state? Shouldn’t I be able to see how much gas my vehicle has?].

The reason (2) is preferred is outlined in the next section, data/object anti-symmetry.

Data/Object anti-symmetry

Objects and data structures are virtual opposites, as described by these anti-symmetry rules:

  • Objects hide their data behind abstractions and expose functions that operate on those abstractions.
  • Data structures expose their data and have no meaningful functions

This section goes on to describe the differences between OO and procedural code, using calculating the area of geometric shapes as an example.

The difference in the two alternatives amounts to where you put your behaviour (functions).

If we followed the antisymmetry rules, we’d add a Geometry class that defined an area function. We would have successfully kept our data structures pure, but we’d have to modify the area function whenever we add a new data structure (which violates the open-closed principle).

Procedural code makes it hard to add data structures

The OO approach forces our shapes to implement a polymorphic area function. This is the way I am most used to, however it has a down side: if we want to add new functions, we have to change all of our data structures.

OO code makes it hard to add functions

Also, we have polluted our data structure with functions — our shapes no longer satisfy the anti-symmetry rules. Our shapes are now hybrids.

This, too, shows that objects and data structures are opposites.

Interesting. The final point in the section is that the idea that everything is an object is a myth — sometimes the procedural approach is applicable.
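The two alternatives look roughly like this in Java. This is my own minimal sketch of the shapes example, not the book's listing, and the class names are chosen just to keep both styles in one file.

```java
// Procedural style: shapes are plain data structures with no behaviour.
final class Square {
    final double side;

    Square(double side) {
        this.side = side;
    }
}

final class Circle {
    final double radius;

    Circle(double radius) {
        this.radius = radius;
    }
}

// All behaviour lives here. Adding perimeter() touches no shape,
// but adding a new shape means editing every function in this class.
final class Geometry {
    static double area(Object shape) {
        if (shape instanceof Square) {
            Square square = (Square) shape;
            return square.side * square.side;
        }
        if (shape instanceof Circle) {
            Circle circle = (Circle) shape;
            return Math.PI * circle.radius * circle.radius;
        }
        throw new IllegalArgumentException("Unknown shape: " + shape);
    }
}

// OO style: each shape hides its data and exposes area().
// Adding a new shape touches nothing else, but adding perimeter()
// means changing the interface and every implementation.
interface Shape {
    double area();
}

final class SquareShape implements Shape {
    private final double side;

    SquareShape(double side) {
        this.side = side;
    }

    @Override
    public double area() {
        return side * side;
    }
}

final class CircleShape implements Shape {
    private final double radius;

    CircleShape(double radius) {
        this.radius = radius;
    }

    @Override
    public double area() {
        return Math.PI * radius * radius;
    }
}
```

Both versions compute the same areas; the only difference is which kind of change each one makes cheap.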

Bob Martin has written more about this in his post about ActiveRecord. Here he makes the case that an object designed as an active record contains both data and behaviour. By definition, a class like this exposes both its innards, and a persistence abstraction.

The Law of Demeter

So, if objects hide data and expose operations, then an object must not expose its internal structure through accessors [TBD: ?].

A module should not know about the innards of the objects it manipulates.

Note: The term object is important, because the law does not apply to data structures. Data structures are supposed to expose their innards — so we’re free to dig as deep into them as we like.

The Law of Demeter:

A function f of class C should only call the methods of:

  • C
  • An object created by f
  • An object supplied as an argument to f
  • An object held as an instance variable of C

Note: f should not invoke methods on the objects returned from these allowed functions either.

Talk to friends not strangers.
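A small Java sketch of the difference, using an invented cashier, customer and wallet (none of these names come from the book). The first charge method reaches through its argument to a stranger; the second talks only to its friend.

```java
// Hypothetical domain for illustration only.
final class Wallet {
    private int cents;

    Wallet(int cents) {
        this.cents = cents;
    }

    int balance() {
        return cents;
    }

    void withdraw(int amount) {
        if (amount > cents) {
            throw new IllegalStateException("Insufficient funds");
        }
        cents -= amount;
    }
}

final class Customer {
    private final Wallet wallet;

    Customer(Wallet wallet) {
        this.wallet = wallet;
    }

    // Exposes internal structure through an accessor, which invites violations.
    Wallet wallet() {
        return wallet;
    }

    // Exposes an operation instead; the wallet stays the customer's business.
    void pay(int amount) {
        wallet.withdraw(amount);
    }
}

final class Cashier {
    private int till;

    // Violates the law: invokes a method on an object returned by its argument.
    void chargeReachingThrough(Customer customer, int amount) {
        customer.wallet().withdraw(amount);
        till += amount;
    }

    // Conforms: only calls a method on the object supplied as an argument.
    void charge(Customer customer, int amount) {
        customer.pay(amount);
        till += amount;
    }

    int till() {
        return till;
    }
}
```

Both methods leave the same money in the till, but only the second survives a change to how a customer stores their cash.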

11 — Systems

[TBD: Returned the book already]

Written by benbiddington

22 June, 2009 at 17:49

Posted in development, oop
