Serialization rules for Adobe Content Server
Working with Adobe Content Server can be a truly depressing experience. The recommendation is to use a jar file — UploadTestJar — written by Adobe to perform HTTP RPC operations against the Content Server.
The problem is that UploadTestJar only does uploads, and we need full control, including deletes. Porting the Java is possible, but it’s some of the most poorly written crap I have ever seen, and a web search turns up no specification.
Finally we managed to get a description of the serialization rules from the support staff, which’ll be helpful if you’re intending to port that awful UploadTestJar mess:
- All adjacent text nodes are collapsed and their leading and trailing whitespace is removed.
- Zero-length text nodes are removed.
- Signature elements in the Adept namespace are removed.
- Attributes are sorted first by their namespaces and then by their names; sorting is done bytewise on the UTF-8 representations.
- If an attribute has no namespace, a zero-length string (i.e. two zero bytes, the length prefix on its own) is written for the namespace.
- Strings are serialized by writing the two-byte length (in big-endian order) of the UTF-8 representation and then the UTF-8 representation itself.
- Long strings (longer than 0x7FFF) are broken into chunks: first as many strings of the maximum length 0x7FFF as needed, then the remaining string. This is done on the byte level, irrespective of UTF-8 character boundaries.
- Text nodes (text and CDATA) are serialized by writing the TEXT_NODE byte and then the text node value.
- Attributes are serialized by writing the ATTRIBUTE byte, then the attribute namespace (empty string if no namespace), attribute name, and attribute value.
- Elements are serialized by writing the BEGIN_ELEMENT byte, then the element namespace, element name, all attributes, the END_ATTRIBUTES byte, all children, and the END_ELEMENT byte.
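The string rules are the easiest part to get subtly wrong in a port, so here is a minimal Java sketch of how I read them. The class and method names (AdeptStringSketch, serializeString, writeChunk) are made up for illustration and are not Adobe's. I am also assuming that a string of exactly 0x7FFF bytes goes out as a single chunk, since the rule only talks about strings longer than that.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of UploadTestJar.
final class AdeptStringSketch {
    // Maximum number of UTF-8 bytes in one chunk, per the rules above.
    static final int MAX_CHUNK = 0x7FFF;

    // Serializes one string: zero or more full 0x7FFF-byte chunks, then the remainder.
    static byte[] serializeString(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int offset = 0;
        // Split on raw bytes; a multi-byte character may straddle two chunks.
        while (utf8.length - offset > MAX_CHUNK) {
            writeChunk(out, utf8, offset, MAX_CHUNK);
            offset += MAX_CHUNK;
        }
        // The remainder; for an empty string this is just the two zero length bytes.
        writeChunk(out, utf8, offset, utf8.length - offset);
        return out.toByteArray();
    }

    // Writes a two-byte big-endian length followed by that many bytes.
    private static void writeChunk(ByteArrayOutputStream out, byte[] data, int offset, int length) {
        out.write((length >> 8) & 0xFF);
        out.write(length & 0xFF);
        out.write(data, offset, length);
    }
}
```

With this sketch, serializeString("\u00e9") gives the four bytes 00 02 C3 A9: the length counts UTF-8 bytes, not characters. The empty string gives 00 00, which matches the "no namespace" rule.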
The list of rules above actually comes from the javadocs for the XMLUtil class. Why it’s all lumped in there is anybody’s guess. The serialization as described above is mostly implemented by one very long method in the 1000+ line XMLUtil.java: Eater.eatNode.
Note: The values of the constants BEGIN_ELEMENT etc. are listed in the XMLUtil class.
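To show how the marker bytes, the attribute ordering, and the string encoding fit together, here is a rough Java sketch for a single element with one optional text child. Everything named in it is hypothetical, and it rests on several assumptions:

- The marker values are placeholders; take the real ones from XMLUtil.
- The bytewise sort compares bytes as unsigned values; the description does not say which.
- "Whitespace" is whatever String.trim() strips.

Chunking of long strings, nested children, text-node collapsing, and Signature removal are all left out.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not Adobe's code.
final class AdeptElementSketch {
    // PLACEHOLDER marker values (assumption): replace with the constants from XMLUtil.
    static final int BEGIN_ELEMENT = 1;
    static final int END_ATTRIBUTES = 2;
    static final int END_ELEMENT = 3;
    static final int TEXT_NODE = 4;
    static final int ATTRIBUTE = 5;

    // Minimal attribute model; ns is "" when the attribute has no namespace.
    static final class Attr {
        final String ns, name, value;
        Attr(String ns, String name, String value) {
            this.ns = ns; this.name = name; this.value = value;
        }
    }

    // Compares two strings by their UTF-8 bytes, treating bytes as unsigned (assumption).
    static int compareUtf8(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int d = (x[i] & 0xFF) - (y[i] & 0xFF);
            if (d != 0) return d;
        }
        return x.length - y.length;
    }

    // Namespace first, then name.
    static final Comparator<Attr> ATTR_ORDER = (p, q) -> {
        int c = compareUtf8(p.ns, q.ns);
        return c != 0 ? c : compareUtf8(p.name, q.name);
    };

    // Two-byte big-endian length plus UTF-8 bytes; chunking is omitted in this sketch.
    static void writeString(ByteArrayOutputStream out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        if (utf8.length > 0x7FFF) throw new IllegalArgumentException("chunking omitted in this sketch");
        out.write((utf8.length >> 8) & 0xFF);
        out.write(utf8.length & 0xFF);
        out.write(utf8, 0, utf8.length);
    }

    // Serializes one element whose only child is an optional text node.
    static byte[] serializeLeaf(String ns, String name, List<Attr> attrs, String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(BEGIN_ELEMENT);
        writeString(out, ns);
        writeString(out, name);
        List<Attr> sorted = new ArrayList<>(attrs);
        sorted.sort(ATTR_ORDER);
        for (Attr a : sorted) {
            out.write(ATTRIBUTE);
            writeString(out, a.ns);
            writeString(out, a.name);
            writeString(out, a.value);
        }
        out.write(END_ATTRIBUTES);
        String trimmed = text.trim();
        if (!trimmed.isEmpty()) {  // zero-length text nodes are dropped
            out.write(TEXT_NODE);
            writeString(out, trimmed);
        }
        out.write(END_ELEMENT);
        return out.toByteArray();
    }
}
```

A real port would recurse over child elements between END_ATTRIBUTES and END_ELEMENT, and would need checking against bytes produced by UploadTestJar itself before trusting any of these assumptions.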
Why I consider UploadTestJar poorly written
Here are some things I’ve noticed:
- Nothing reads like a narrative, i.e., methods call other methods that occur before them in the file, which makes files very hard to follow.
- Too many comments. I know this is a Java idiom, but it makes reading the stuff that matters more difficult.
- Idiotic comments: inline comments that state the obvious and are just noise, e.g.:
    // retrieve HMAC key and run a raw SHA1 HASH on it.
    byte[] hmacKeyBytesSHA1 = XMLUtil.SHA1(getHmacKey());
- XMLUtil.java contains several classes
- XMLUtil class does more than one thing:
  - Parses XML
  - Normalizes XML
  - Creates XML documents
  - Serializes XML, dates, bytes and strings
  - Checks signatures
  - Signs XML documents
  - Hashes things
- Class UploadTest does everything in its ctor: reads a file from disk, validates it, makes some XML, signs it and then posts it to the server.
- UploadTest is the main entry point for the executable, and it contains all the behaviour; it’s 1600 lines long.
- Cannot use UploadTest without a real epub file
- UploadTest does too many things:
  - Ctor does too many things
  - Handles command line input
  - Displays help/usage
  - Asserts a file on disk has been supplied
  - “Makes” content
    - makeContent requires an epub file on disk
    - makeContent loads XML
    - makeContent assembles XML files
    - makeContent hashes things
    - makeContent swallows errors and writes to stdout
  - “Sends” content via HTTP
- Methods that do too many things, e.g., if/else branches based on the verboseDisplay flag