Ben Biddington

Whatever it is, it's not about "coding"

Adobe Content Server — packaging large files is painful

with 5 comments

Ordinarily, working with Adobe Content Server (ACS) is more or less tolerable, but recently we have encountered what may be another indication of the quality of this product.

Packaging

Packaging an ebook amounts to posting a signed request to the server, describing the file you’d like to ingest.

You can use UploadTestJar (which may have its own issues), or (if you’re lucky) you can rewrite some of the sample code in your chosen language and use that.

You’d think then, that once you’ve got it working you can summarily be on your way, forget about ACS and finish your application.

And you can.

Until the OutOfMemoryErrors start

While test-driving our application, we naturally wanted to describe what happens with different sizes of files, so we tried some large ones. These would fail with errors about being out of heap space, errors like:

21-Apr-2010 13:37:26 org.apache.catalina.core.StandardWrapperValve invoke
SEVERE: Servlet.service() for servlet Package threw exception
java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Unknown Source)
at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source)
at java.lang.AbstractStringBuilder.append(Unknown Source)
at java.lang.StringBuffer.append(Unknown Source)
at com.adobe.adept.xml.XMLAbstractDigestSink.characters(XMLAbstractDigestSink.java:133)
at com.adobe.adept.xml.XMLSink.characters(XMLSink.java:261)
at com.adobe.adept.xml.XMLFieldReader.characters(XMLFieldReader.java:447)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.characters(Unknown Source)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
at com.adobe.adept.servlet.AdeptServlet.doPost(AdeptServlet.java:180)

We tried adjusting the available memory for “sags” (as Dan Rough would put it), and this did work to a certain degree, but it is not a satisfactory solution.

To me it looks a bit like an attempt to load the entire file into memory at once. Surely this can’t be right, can it?
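A rough back-of-envelope estimate suggests why raising the heap only helps up to a point. The sketch below assumes the file travels as base64 text inside the request (an assumption, the trace does not show it), that the buffer is backed by a char array at 2 bytes per character, and that it grows by roughly doubling, as the expandCapacity and Arrays.copyOf frames in the trace suggest. The class and method names are made up for illustration.

```java
final class BufferingEstimate {
    /** Number of base64 characters needed to encode n bytes (padded, no line breaks). */
    static long base64Chars(long n) {
        return 4 * ((n + 2) / 3);
    }

    /**
     * Rough worst-case bytes held at the moment a doubling char buffer is copied:
     * the old array and the new, roughly twice as large, array exist together.
     * Assumes 2 bytes per char and growth of about (length + 1) * 2.
     */
    static long worstCaseBufferBytes(long fileBytes) {
        long chars = base64Chars(fileBytes);
        long oldArray = chars;         // array being copied from
        long newArray = 2 * chars + 2; // array being copied into
        return 2 * (oldArray + newArray);
    }

    public static void main(String[] args) {
        long fileBytes = 100L * 1024 * 1024; // hypothetical 100 MB ebook
        System.out.println("file bytes:         " + fileBytes);
        System.out.println("base64 characters:  " + base64Chars(fileBytes));
        System.out.println("worst-case buffer:  " + worstCaseBufferBytes(fileBytes) + " bytes");
    }
}
```

Under those assumptions a file can briefly need around eight times its own size in heap, and that is before anything decodes the text back into a byte array.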

Examining AdeptServlet

In an effort to understand the nature of the problem before solving it, we decided to have a look at that servlet. We decompiled the application and tracked down the class behind the Package servlet named in the log. It is:

com.adobe.adept.packaging.servlet.Package

In the doPost method, there are these lines:

if (paramParsedRequest.data != null) {
    localObject1 = new PDFPackager(paramParsedRequest.data);
} else {
    localObject1 = new PDFPackager(new File(paramParsedRequest.dataPath));
}

This shows the two ways of constructing a PDFPackager.

Examining PDFPackager, we can see the ctor has four overloads including these two:

public PDFPackager(byte[] paramArrayOfByte) throws Exception {
    this(new ByteBufferByteReader(paramArrayOfByte));
}

public PDFPackager(File paramFile) throws Exception {
    this(new FileInputStream(paramFile));
}

So, it appears the problem may result from usage of the first version.

That paramParsedRequest argument to doPost is of type ParsedRequest, and its data property is a byte array.

This could be a problem: when submitting a package request with a data node instead of a dataPath node, we’re using the byte array overload.

Where is the error actually coming from?

From the stacktrace it looks as though it is coming from whatever is creating the arguments to supply to Package.doPost.

This is the responsibility of Package’s supertype: AdeptServlet<RequestParser>. It is this class that is responsible for parsing the HTTP request into one of those ParsedRequest objects, and then supplying that to Package.doPost.

The problem starts here at the top level request handling:

// AdeptServlet<RequestParser>
doPost(HttpServletRequest paramHttpServletRequest, HttpServletResponse paramHttpServletResponse)

This is where the request parsing happens, and then — as the stack trace shows — an error ends up resulting from XMLAbstractDigestSink.characters.

XMLAbstractDigestSink.characters attempts to append data to an internal StringBuffer.
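The pattern the trace points at can be reproduced in a few lines. This is an illustrative sketch, not the decompiled Adobe code: AccumulatingSink and parseAndCount are hypothetical names, and the package and data element names in the usage are assumed for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class AccumulatingSink extends DefaultHandler {
    // Every character the parser reports is retained here until parsing ends.
    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may call this many times per element; each call grows the buffer.
        text.append(ch, start, length);
    }

    int bufferedChars() {
        return text.length();
    }

    /** Parses the given XML and reports how many characters ended up buffered. */
    static int parseAndCount(String xml) throws Exception {
        AccumulatingSink sink = new AccumulatingSink();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), sink);
        return sink.bufferedChars();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<package><data>QUJD</data></package>";
        System.out.println("buffered characters: " + parseAndCount(xml));
    }
}
```

A handler written this way holds the whole text content in memory, so the buffer grows in step with the size of the posted file.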

Summary

This mechanism has not been designed in any kind of scalable manner — buffering files in memory is utterly nuts.

Why not just write the posted data to a temp file and use the other PDFPackager ctor?
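As a sketch of what that could look like (not Adobe’s code, and not something we have run against ACS), here is a SAX handler that decodes the payload in small chunks and writes it straight to a temp file. It assumes the payload is base64 text inside a data element; SpoolingDataHandler and spoolToTempFile are hypothetical names, and java.util.Base64 needs Java 8 or later.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Base64;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

final class SpoolingDataHandler extends DefaultHandler {
    private final OutputStream out;
    // Length is a multiple of 4, so every full chunk is a whole number of base64 quads.
    private final byte[] chunk = new byte[4096];
    private int used;
    private boolean inData;

    SpoolingDataHandler(OutputStream out) {
        this.out = out;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if ("data".equals(qName)) {
            inData = true;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if ("data".equals(qName)) {
            flush(); // decode whatever is left, including any padding
            inData = false;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (!inData) {
            return;
        }
        for (int i = start; i < start + length; i++) {
            char c = ch[i];
            if (Character.isWhitespace(c)) {
                continue; // base64 text may be line-wrapped
            }
            chunk[used++] = (byte) c;
            if (used == chunk.length) {
                flush();
            }
        }
    }

    // Decodes the buffered characters and writes the bytes out; memory use stays constant.
    private void flush() throws SAXException {
        try {
            out.write(Base64.getDecoder().decode(Arrays.copyOf(chunk, used)));
            used = 0;
        } catch (IOException | IllegalArgumentException e) {
            throw new SAXException(e);
        }
    }

    /** Parses the request and returns a temp file holding the decoded payload. */
    static Path spoolToTempFile(InputStream request) throws Exception {
        Path target = Files.createTempFile("acs-data-", ".bin");
        try (OutputStream fileOut = new BufferedOutputStream(Files.newOutputStream(target))) {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(request, new SpoolingDataHandler(fileOut));
        }
        return target; // the caller is responsible for deleting it
    }
}
```

The resulting file could then be handed to the File overload of the PDFPackager ctor via target.toFile(), so memory use no longer depends on the size of the ebook.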

The solution

Well, one suggestion is to not post the files at all, but make a slightly different packaging request that supplies a path to a file on disk rather than the file itself.

To do so requires — as described in ContentServer_Technical_Reference.pdf — supplying a dataPath node in your request instead of a data node.

The downside for us is that now we need to manage this shared file location — a non-trivial task when working with Windows services.

Another (unlikely) solution

Modify the application, i.e., AdeptServlet<RequestParser> so it first copies the posted file to disk, and then proceeds as though it received a dataPath request.

Pretty hard without the source, and it’s probably against the law, isn’t it?

Written by benbiddington

27 April, 2010 at 14:00

5 Responses

  1. Adobe is great! kthxby.

    Nigel

    28 April, 2010 at 11:12

  2. Cheer up Nigel. It must be hard to have others poke holes in your work all the time, especially MS developers…

    Matt

    28 April, 2010 at 17:46

  3. I’ve had the same problem when packaging large books into ACS4. Using the dataPath method and importing books already on disk is helpful.

    If the error still occurs after those steps, go to the Tomcat configuration and set the maximum memory pool to a value high enough, say 1024MB. Also don’t forget to use -Xmx1024M in your command.

    Hope this helps.

    Sérgio

    7 May, 2010 at 15:36

  4. ACS4 has to be one of Adobe’s biggest mess of an application they have ever created. SO many things wrong with it. Everything from the tech writing in the manuals, to the user interfaces or lack thereof, the poor coding and processes involved in protecting a file, all the way to the terrible stock digital editions reader. Truly this is a hot pile of you know what. But there really is no alternative out there for many of us, so we just have to keep fighting thru.

    Dave

    28 July, 2011 at 02:25

    • Adobe should open source it to get it fixed, perhaps.

      benbiddington

      28 July, 2011 at 12:40

