Ben Biddington

Whatever it is, it's not about "coding"

Serialization rules for Adobe Content Server

with 31 comments

Working with Adobe Content Server can be a truly depressing experience. The recommendation is to use a jar file — UploadTestJar — written by Adobe to perform HTTP RPC operations against the Content Server.

Problem is that UploadTestJar only does uploads, but we need full control, like deletes for example. Porting the java is possible, but it’s some of the most poorly written crap I have ever seen, and finding a specification is resisting web search.

Finally we managed to get a description from the support staff which’ll be helpful if you’re intending to port that awful UploadTestJar mess.

  1. All adjacent text nodes are collapsed and their leading and trailing whitespace is removed.
  2. Zero-length text nodes are removed.
  3. Signature elements in Adept namespace are removed.
  4. Attributes are sorted first by their namespaces and then by their names; sorting is done byte wise on UTF-8 representations.
    1. If attributes have no namespace insert a 0 length string (i.e. 2 bytes of 0) for the namespace
  5. Strings are serialized by writing two-byte length (in big endian order) of the UTF-8 representation and then UTF-8 representation itself
  6. Long strings (longer than 0x7FFF) are broken into chunks: first as many strings of the maximum length 0x7FFF as needed, then the remaining string. This is done on the byte level, irrespective of the UTF-8 boundary.
  7. Text nodes (text and CDATA) are serialized by writing TEXT_NODE byte and then text node value.
  8. Attributes are serialized by writing ATTRIBUTE byte, then attribute namespace (empty string if no namespace), attribute name, and attribute value.
  9. Elements are serialized by writing BEGIN_ELEMENT byte, then element namespace, element name, all attributes END_ATTRIBUTES byte, all children, END_ELEMENT byte.

This list is in actually the javadocs for the XmlUtil class. Why it’s all lumped in there is anybody’s guess. The serialization as described above is mostly implemented by one very long method in (1000+ line) XmlUtil.java: Eater.eatNode.

Note: The values of the constants BEGIN_ELEMENT etc are listed in the XMLUtil class.

Why I consider UploadTestJar poorly written

Here are some things I’ve noticed:

  • Nothing reads like a narrative, i.e. , methods call other methods that occur before it in the file — makes files very hard to follow.
  • Too many comments. I know this is a java idiom, but it make reading the stuff that matter more difficult
  • Idiotic comments: inline comments that state the obvious and are just noise. e.g.:// retrieve HMAC key and run a raw SHA1 HASH on it.
    byte[] hmacKeyBytesSHA1 = XMLUtil.SHA1(getHmacKey());
  • XMLUtil.java contains several classes
  • XMLUtil class does more than one thing:
    • Parses XML
    • Normalizes XML
    • Creates XML documents
    • Serializes XML, dates, bytes and strings
    • Checks signatures
    • Signs XML documents
    • Hashes things
  • Class UploadTest does everything in ctor: reads a file from disk, validates it, makes some xml, signs it and then posts it to the server.
  • UploadTest the main entry point for executable, and it contains all the behaviour — it’s 1600 lines long
  • Cannot use UploadTest without a real epub file
  • UploadTest does too many things:
    • Ctor does too many things
      • Handles command line input
      • Displays help/usage
      • Asserts a file on disk has been supplied
      • “Makes” content
        • makeContent requires a file an epub on disk
        • makeContent loads xml
        • makeContent assembles xml files
        • makeContent hashes things
        • makeContent swallows errors and writes to stdout
    • “Sends” content via HTTP
    • Methods that do too many things, e.g., if/else branches based on the verboseDisplay flag

Written by benbiddington

16 February, 2010 at 10:39

31 Responses

Subscribe to comments with RSS.

  1. Hi Ben,

    Interesting post. I’ve not had that many problems with the UploadTestJar. Why do you think it’s so badly written?

    Jim

    17 February, 2010 at 10:24

    • Don’t know, but perhaps there’s a clue in the name: it’s a *test* jar.

      I think it’s probably something hacked together as a proof of concept that has then been released and never touched again.

      An all too common scenario…

      benbiddington

      17 February, 2010 at 10:37

    • The fact that its a 1681 line class, accompanied by a 1100 file called “XMLUtil” might have something to do with it…

      Matt

      17 February, 2010 at 12:05

  2. Sounds pretty comprehensive to me. I managed to get the delete functionality working as well. Maybe you just need to read through the code more

    Jim

    17 February, 2010 at 12:59

    • It works with the jar does it? That’s handy.

      You’re right, we probably need to read the source better, but they’ve sure made it hard to do so.

      We’ve managed to port it c# now thankfully, so we’ve ditched the jar entirely.

      benbiddington

      17 February, 2010 at 13:19

  3. Hey Ben,

    I must say I find this post a little terse. I’ve been using Adobe Content Server for a number of months now and have never taken issue with any of the things that you list, in fact I think it’s been a joy to work with.

    Your comment that the code should read like a narrative seems to contradict one of your other points, that there are too many comments? When I dug in to the code it sure did read like a book to me, may be you just need to spend a little more time with it?

    Why you’ve ported it to c sharp is beyond me to. Surely this was a waste of time?

    Cheers, Nigel

    Nigel

    17 February, 2010 at 18:12

    • Hi Nigel,

      By narrative I mean you can read through an execution path top-to-bottom, rather than having to jump around the file.

      We’ve ported it to C# to avoid having to use the jar at all, originally it was because we could not get deletes to work, though perhaps this is actually possible.

      Any tips on this appreciated.

      My point really is that I am frankly surprised and dismayed that this is deemed acceptable quality by Adobe.

      benbiddington

      17 February, 2010 at 18:23

  4. I think what people determine is acceptable quality is subjective. Personally I feel the quality of the library is exceptional.

    As Nigel suggests, it does read like a book. And a good one.

    Jim

    17 February, 2010 at 18:28

    • Well, there’re enough javadocs for a book…

      benbiddington

      17 February, 2010 at 18:30

  5. Jim, do you not consider the fact that a multi-thousand line single class is bad software design? I think it violates just about every tenet of OO design I can think of.

    Plus the fact that this could come out of an enterprise size company, accompanied with little to no tests surrounding it is rather surprising.

    Matt

    17 February, 2010 at 18:32

    • @Matt,

      I think the fact that it comes out of an enterprise size company is enough for me and particularly in the case of there PDF reader, they do some really fantastic products so there code must be good.

      Besides, what does OO design matter when the code works.

      @Ben,

      If there’s enough javadocs for a book and you’ve got the time you clearly have what with rewriting the sample code and all, why don’t you just print it all out and read it that way?

      Cheers, Nigel

      Nigel

      17 February, 2010 at 19:50

  6. Nigel, why should a car company manfacture a high quality vehicle when it succeeds in getting from A to B?

    I can think of a thousand arguments for good software design, not least of which are:
    a) its maintainable (I don’t need to read a thousand lines of javadocs to understand whats going on)
    b) its testable (it works when it should, and I understand why it fails when it does)

    Matt

    17 February, 2010 at 22:08

  7. Matt,

    To manufacture drugs you have to read a lot of books that are easily in excess of 1000 lines. You and Ben seem to have a problem with expending the effort necessary to grok this product.

    Nigel

    Nigel

    18 February, 2010 at 09:12

    • Hi Nigel,

      Fair point — I am definitely used to test-driven software composed of small objects that are easy to understand:

      * Classes that are described by tests. By reading the tests I can tell what the an object’s behaviour should be. By having the tests I can refactor with the gayest of abandon

      * Classes that are easy to read and understand because they only do one thing. Classes like this will tend to contain fewer lines.

      Apologies if my comments have offended you, feel free to criticize my work in return:
      http://github.com/ben-biddington

      benbiddington

      18 February, 2010 at 09:55

    • Nigel, you seem to confuse reading a book with writing software. Just because you are used to reading thousands of lines to understand a piece of code doesn’t mean that there isn’t a better way.

      Matt

      18 February, 2010 at 11:01

  8. @Nigel, I agree their are plenty of excellent books you must read that are more than 1000 lines long.

    It seems unfair to the writers if you criticize something that you have not red.

    I highly recommend this book to start with

    http://www.amazon.co.uk/Visual-Basic-Programmers-Reference-Programmer/dp/0470182628/

    Jim

    18 February, 2010 at 10:26

  9. @Ben,

    I’m not out criticise your work, man. What I would say though, is that in your post you come across as being very dogmatic about the quality of code, your github projects would seem to reflect this as well, mentioning “clean code” on a couple of occasions at least. I tend to take a bit more of a pragmatic view about this sort of thing; if the code is documented well which, in this case it is through thorough use of comments and the comprehensive documentation and it provides all the functionality it needs to which as Jim and I have established it does, I’m not sure what the problem is. I certainly don’t see what the point in having tests is anyway, for me if I have to fix a bug in the code then I’ll fix it when it occurs. Furthermore, your comment in relation to refactoring “with gayest of abandon” is just confusing; why on earth change something for no apparent reason, maybe I don’t understand refactoring enough though? Either way I sure hope your employer gets off on you wasting time like you clearly are.

    While were on the subject of book recommendations I think this book would be worth reading: http://www.amazon.co.uk/gp/aw/d.html/ref=redir_mdp_mobile/278-1774217-9688920?a=020161622X

    Nigel

    18 February, 2010 at 14:46

    • Hi Nigel,

      Yep, I am gun-shy and refactor happy, I have to learn to balance quality against practicality.

      But my point still stands: UploadTestJar is a surprising lump of mud.

      benbiddington

      18 February, 2010 at 15:03

    • I have read that book thanks Nige’. I especially enjoyed the parts where Hunt and Thomas talk very much in favour of refactoring and ruthless testing to ensure good software design and correctness, and very much against coding by luck (ie. “it just happens to work”).

      But then again, having read it, you already knew that, right?

      Also, I’ll pass on your thoughts on time-wasting to our Development Manager. I’m sure he’ll be pleased to hear that we can be even further ahead of our project plan than the 3 weeks we currently are. Cheers

      Matt

      19 February, 2010 at 10:51

  10. I have to say, you’re all talking nonsense here; both Ben and Matt are clearly people that you can rely on in a gun fight and, as a manager I want those sorts of people.

    What’s your problem Nigel, because if you if you’ve got a problem “Yo. I’ll solve it.” u get me?

    Dan Rough

    20 February, 2010 at 02:23

  11. Hey Ben / Matt,

    It’s been a while. I just watched this video filmed by Uncle Bob who I bet is one of your heroes.

    You thought you had it bad.

    Man up.

    Nigel

    8 March, 2010 at 11:26

  12. Hello Nigel,

    Thanks for your concern. I had forgotten all about you since we had ported your code over to something nice and testable weeks ago, but its nice to see you’re still thinking of us. You’ll be pleased to know that we can now send all sorts of signed requests to ACS not just what you had spewed out in the “UploadTestJar”. We are confident that it works because it is nicely tested (we are not “coding by cooincidence”), and that it is readable and maintainable by developers other than us.

    All of this is due to the fact that we have read the book you recommended to us, amongst others (including ones by Uncle Bob). In future, you should actually have read the books that you recommend us, instead of just googling the words “pragmatic” and “programming”. Maybe if you had, you wouldn’t be churning out such garbage that was responsible for us wasting many hours on.

    PS. We are ahead of our project plan by 3 weeks, but I’ll pass on your concerns about our team’s priorities to our development manager. Cheers.

    Matt

    8 March, 2010 at 15:29

  13. Hey Matt,

    Seems I struck a nerve; unintentional of course.

    Couple of things though: Firstly, I read Pragmattic Programmer years ago, presumably whilst you were still in diapers. It talks a lot of sense the stuff about broken window theory for example. You’ve got to remember though, the title of the book is “Pragmattic (the key part) Programmer). In our case, when the code works we don’t attempt to break it by attempting to fix it – I happen to working on something not out of coincidence, mainly when somebody needs something doing – like fixing a bug or something.

    Secondly, doesn’t your Development Manager deserve capital letters? 3 weeks ahead of plan eh, impressive. You’ve clearly never _actually_ read the Mythical Man Month – go get that and read it soon it will change your life, that’s where reality’s at.

    Speak to you soon I hope, Nigel

    Nigel

    8 March, 2010 at 16:53

  14. Nigel, sarcasm aside, you seem to be missing the point. You say your code isn’t broken, but how do you know this, and more importantly, how do WE know this? To us, your java code appeared to work in the first instance but we had no way of knowing whether the results were repeatable, what your testing procedures were (or if they even went beyond a few manual ingestions), whether they are even correct. There wasn’t even documentation about how you signed authentication codes beyond a few comments in your code samples. Unfortunately, our requirements are for something a little more bulletproof than “the code just works”. The name UploadTestJar doesn’t exactly inspire confidence, and your smug responses further up didn’t exactly alleviate our concerns about your coding standards.

    At our workplace, we test drive our code so we can be confident that it works and that expected behaviours are documented for others coming after us. I’m assuming by your standpoint that you have no such practices and this is a shame, especially considering the experience you claim and the literature you have read.

    Matt

    8 March, 2010 at 18:49

  15. I should note here that I don’t work for Adobe, just in case anyone takes the above comments the wrong way 😉

    One final thought though Matt:

    “At our workplace, we test drive our code so we can be confident that it works and that expected behaviours are documented for others coming after us”

    Whatever, dude. I’d rather spend my time writing production code that will actually earn me money than writing code that will never make it to production.

    I’d be interested in talking to your Development Manager at some point to see how he can justify to his boss having people like you working in the way that your clearly are.

    Nigel

    10 March, 2010 at 13:12

  16. Hi Ben,
    I am myself trying to port those jars in c#.
    I am having so much trouble doing so.
    Any chance I can get my hands on your work ?
    Or you can tell me what I am doing wrong with mine.

    Thanks

    Bill

    3 August, 2010 at 14:44

  17. Here’s a basic python implementation:

    http://github.com/internetarchive/acs4_py

    It’s incomplete, but it works, and we use it in production.

    Mike McCabe

    29 January, 2011 at 09:01


Leave a reply to Matt Cancel reply