Comics, graphics and file sizes

A quick look at the issue of file-size in digital comics distribution, storage, and the pricing model.


While working on potential PDF support on Libreture, I've had an opportunity to take a closer look at the digital comics field. And a few things are worrying me about how e-comics are currently done.

Graphics-Heavy titles

While many e-book format file-sizes have remained consistent, the same cannot be said of digital comics.

The variety of formats, with varying quality, file size and reader compatibility, may be having an impact on some areas of the market. The most common digital comic formats now include:

  • CBZ/CBR
    ComicBookZip/ComicBookRar - These are zipped files containing the individual pages as JPEG files. That's it. The final letter in the abbreviation (Z/R) refers to the compression format: Zip or Rar. These files usually contain no metadata, only the images.
  • PDF
    Portable Document Format seems the perfect format for comics. Each file is a linear, page-by-page document describing visual and text elements. Again, there is usually no metadata to describe the file's contents.
  • ePub
    This common e-book format works just as well for comics. Each page contains the same JPEG that would be included in the CBZ/R file, but with the added benefit of metadata to describe the title, author and other information.

The CBZ/R and ePub files seem to share their source material, the JPEG images. The same comic or graphics-rich book in either of these formats will usually be the same or similar file size, since they're both simply zip files containing the images.
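Since both CBZ and ePub files are zip archives, you can peek inside either one with nothing more than Python's standard `zipfile` module. The sketch below is only an illustration: the function names and the list of image extensions are my own assumptions, and it won't open CBR files, which use Rar rather than Zip.

```python
import io
import zipfile

# Assumed set of extensions to count as comic pages.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")


def summarise_images(archive):
    """Return (image_count, total_uncompressed_bytes) for a zip-based comic.

    `archive` can be a path to a .cbz or .epub file, or a file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        images = [
            info for info in zf.infolist()
            if info.filename.lower().endswith(IMAGE_EXTENSIONS)
        ]
        return len(images), sum(info.file_size for info in images)


def make_demo_cbz(page_count=3, page_bytes=1000):
    """Build a tiny in-memory stand-in for a CBZ (placeholder bytes, not real JPEGs)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        for number in range(1, page_count + 1):
            zf.writestr(f"page_{number:03d}.jpg", b"\x00" * page_bytes)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    count, size = summarise_images(make_demo_cbz())
    print(f"{count} page images, {size} bytes in total")
```

Pointing `summarise_images` at the CBZ and ePub editions of the same comic should give similar totals if they really do share the same source images.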

PDFs, on the other hand, are getting much larger than their counterpart formats, since they're now used as the preferred 'High Quality' version by many comic publishers and retailers, such as the wonderful Image Comics.

This leads us to a bit of a problem.

Size isn't everything

In supporting both e-book readers and retailers, it's my intention that Libreture addresses the fragmentation of digital libraries.

Buying e-books can end up being a messy task, especially when you're determined to only buy DRM-free titles. Buying only DRM-free books, both novels and comics, means buying from many different shops. You regularly end up losing track of your purchases.

Libreture addresses this by creating a central library for all your digital reading material, but there's a cost involved: storage.

Text-based e-books, in the common ePub or Mobi formats, tend to be around 2MB in size. If they're much larger than that, it's usually down to the book being absolutely huge or because it contains lots of inset images or a large cover image. Text is text and doesn't take up much space.

Comics, on the other hand, like any graphics-rich title, are much larger. The JPEG format, used by most comic book publishers, doesn't have a way to describe text, other than as an image - effectively a drawing of the text. In a digital format, a drawing of text takes up more storage space than the text itself would.

Caveat: It's been a while since I studied image formats as part of my degree, so details may have changed, but the underlying principles are still sound.

Usually comics are drawn as vector images, as in line-art with clearly delineated areas of flat colour (not always, I know, but stick with me here). And JPEG is simply not the right format for that kind of image.

JPEG stands for Joint Photographic Experts Group, and the clue to its intended use is in the name. It's an image format designed for photographs and other representations of reality, with gradual changes in colour and contrast.

JPEG is not well suited for line drawings and other textual or iconic graphics, where the sharp contrasts between adjacent pixels can cause noticeable artifacts. Such images are better saved in a lossless graphics format such as TIFF, GIF, PNG, or a raw image format.
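To get a feel for why flat colour suits lossless formats, here's a rough sketch using DEFLATE, the lossless compression method that PNG is built on. It's not a PNG encoder and it doesn't touch JPEG at all. The two "pages" are synthetic assumptions of mine: one is two flat areas with a hard edge, the other is random noise standing in (as an extreme case) for photo-like detail.

```python
import random
import zlib

WIDTH, HEIGHT = 256, 256


def flat_colour_page():
    """Raw 8-bit greyscale pixels: two flat areas split by a hard vertical edge."""
    row = bytes([255] * (WIDTH // 2) + [0] * (WIDTH // 2))
    return row * HEIGHT


def noisy_page(seed=0):
    """Raw 8-bit greyscale pixels with random values (an extreme stand-in for photo detail)."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(WIDTH * HEIGHT))


def deflate_size(pixels):
    """Size in bytes after lossless DEFLATE compression at the highest level."""
    return len(zlib.compress(pixels, 9))


if __name__ == "__main__":
    for name, pixels in (("flat colour", flat_colour_page()), ("noise", noisy_page())):
        print(f"{name}: {len(pixels)} raw bytes -> {deflate_size(pixels)} compressed")
```

The flat page shrinks to a tiny fraction of its raw size with no loss at all, while the noisy one barely shrinks. Real comic pages and real photos sit somewhere between those two extremes.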

JPEGs that contain line art have to use a low compression setting to look acceptable. So we're already wasting the benefits of JPEGs by not using the correct format for the job. Our lecturer used to get very angry at students using the wrong graphics package for the job...

Avoiding high compression when creating comics means the files are larger than they really need to be. And you can see that comic publishers are aware of the issue, since they've started increasing the compression for CBZ/R and ePub files to maintain lower file sizes, while at the same time reducing the compression of their PDF files, and labelling them 'High Quality'.

We're gonna need a bigger server...

Comics aficionados don't pay for low quality.

Comic publishers know that their readers won't stand for not receiving the highest-possible quality file when buying digital comics. If the switch from paper to digital means 'less' in any way, that's a big ol' nope.

Let's look at a particular comic and see the difference in image quality across formats.

I have Trees #1 by Warren Ellis and Jason Howard to hand. Published by Image Comics, it's available in CBZ/R, ePub and 'High Quality' PDF formats.

The book is 35 pages long.

ePub & CBZ: 34MB
PDF: 121MB
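Broken down per page, those figures look like this. It's simple arithmetic on the sizes quoted above; the variable and function names are just for illustration.

```python
PAGES = 35
SIZES_MB = {"ePub/CBZ": 34, "PDF": 121}


def mb_per_page(total_mb, pages=PAGES):
    """Average storage per page, in MB."""
    return total_mb / pages


# Average page size for each format, rounded to two decimal places.
per_page = {fmt: round(mb_per_page(size), 2) for fmt, size in SIZES_MB.items()}

# How many times larger the PDF is than the ePub/CBZ edition.
ratio = round(SIZES_MB["PDF"] / SIZES_MB["ePub/CBZ"], 1)

if __name__ == "__main__":
    print(per_page)  # {'ePub/CBZ': 0.97, 'PDF': 3.46}
    print(ratio)     # 3.6
```

So each page costs roughly 1MB in the ePub or CBZ edition and about 3.5MB in the PDF.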

I zoomed into the same area of a particular page in both files. The difference was apparent. A lot of the bad image quality comes down to artefacts in JPEG files.

Not only has the PDF been created with less compression of the awesome original artwork, but the format has its own compression methods that come into play at very large sizes. This means the very largest PDF files are not much bigger than their ePub or CBZ counterparts.

You would want the very best quality version to hand, right? But downloading all your PDF editions and storing them separately is a big ask, when the files can be up to 600 or 700MB for a single book!

And many portable devices simply won't open those files. I have two separate Android tablets of different ages that are becoming increasingly useless as both apps and files increase in size and memory usage.

Comic book readers are being asked to download one format for fast access and reading on older or lower-capacity devices, while downloading another for reading on high-end or newer devices, and for archiving. This leads to fragmentation of the comic collection.

If Libreture is as much for comic book readers as novel readers, which I intend it to be, then I need to address this storage capacity problem. Libreture is in a great position to provide a worthwhile service to comic readers, RPG fans (try getting those tomes in ePub format), and readers of other graphics-rich titles.

While it already supports ePub comics, I'm currently looking at how to best support PDF uploads, and add user-contributed metadata to PDFs that don't already have it. More on that in another post.

The remaining issue is the greater storage cost of supporting such large files. Increasing the cost of the Libreture subscription for everyone isn't fair, so the cost needs to be tied more closely to usage. This is an even bigger problem when considering there's such a large disparity in file size across different formats for the same number of books.

You could store 300 ePub novels in the same space as a single PDF comic.
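That figure comes straight from the sizes mentioned earlier: a text-based ePub at around 2MB against a PDF comic at the 600MB end of the range. Here's the sum as a small sketch; the names and the default values are assumptions taken from those rough figures, not measurements of real libraries.

```python
NOVEL_MB = 2        # typical text-based ePub, as estimated earlier
LARGE_PDF_MB = 600  # lower end of the 600 to 700MB range for a single large PDF


def novels_per_comic(comic_mb, novel_mb=NOVEL_MB):
    """How many whole novels fit in the space taken by one comic."""
    return comic_mb // novel_mb


def library_size_mb(novels, comics, novel_mb=NOVEL_MB, comic_mb=LARGE_PDF_MB):
    """Total storage for a mixed library, in MB."""
    return novels * novel_mb + comics * comic_mb


if __name__ == "__main__":
    print(novels_per_comic(LARGE_PDF_MB))  # 300
    print(novels_per_comic(121))           # 60 (the Trees #1 PDF)
    print(library_size_mb(300, 1))         # 1200
```

Even the 121MB PDF of Trees #1 takes up as much room as about 60 novels, which is why a flat price per library doesn't reflect what each reader actually stores.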

You know what that means:

I'm gonna have to go FREEMIUM!

Tiered pricing is something I was hoping to avoid, but there are definite benefits to ensuring your customers can get exactly what they need from a service.

I'm off to ponder the best price tiers for readers. Libreture is designed from the ground up to be sustainable, and that includes changing to support everyone's different needs.

There seems to be a real problem with comic book storage and online collection management. So let's see what we can do about it.

I would love to hear your thoughts. I'm on both Twitter and Mastodon, so give me a shout to discuss.

