Sunday, April 5, 2009

Is Adobe Hindering eBooks?

Despite agitating for eBooks over print books, I have to confess I hadn't looked much into the sausage-making process of ePub-formatted eBooks until earlier this year, when I began the Sisyphean task of converting Jack London's The People of the Abyss into a free ePub eBook.

At the time I began, I couldn't find any such ePub version of that book.

Yesterday, it came to my attention that someone -- using proper ePub expertise and skills -- has created such an edition.

Looking at the "professional" version was revelatory.

Several formatting problems that were stumping me apparently turn out to be shortcomings of both the underlying code for ePub (XHTML) and the engine driving ePub display in Adobe Digital Editions!

In other words -- it wasn't me!

Let me first establish that this isn't a trick and nothing is rigged. This is the professional Abyss and my version showing up in the Adobe Digital Editions library. The pro one is first:



Let's see what an opening chapter page of Abyss looks like in the authoritative printed version:



Each chapter opens with an extract of some kind: poetry, a quote, an interview. Notice how the above is formatted.

Now see what it looks like in my alpha version:



Remember: this is still alpha. I have yet to go in and really fix that bit. Given my poor knowledge of HTML, getting it right stumped me. So I was very keen on seeing how an expert professional would handle that.

This is how:



Don't go all faint yet. This post has just begun!

InfoGrid Pacific, creator of the professional-level ePub, notes:
People of the Abyss (Jack London) 4.9MB Added 2009-02-12

Notes: Created with IGP:FLIP. Embedded font used on title page. Centered poetry and images display incorrectly in ADE

Emphasis added by me.

ADE = Adobe Digital Editions.

The awful significance of that note will be made clear towards the end of this post.

As I was writing this, I was complaining about ePub creation issues with Moriah Jovan on Twitter, and she gave me the idea of also viewing the professional version in Calibre, which can also display ePub. That too was a revelation.

Calibre:



Notice that the display formatting is perfect. Look particularly at "Goldsmith" -- it's properly rendered in Small Caps in Calibre, but not in Adobe Digital Editions!

I also want to note that my alpha doesn't have justified text yet. That's "easily" fixed via a change in its stylesheet. But it's worth noting that Adobe Digital Editions didn't support justified text until a few months ago!

More poetry is required as further illustration before I turn to something else.

From the printed version:



My alpha version:



Since I was mystified as to how to accomplish that formatting, I just enclosed it in a Blockquote for the time being so it'd stand out for later formatting.

The professional version:



And now Calibre:



But what's going on here? The per-line formatting is OK, but why isn't it all centered as in the printed version?! I don't know what's going on there (nor have I tried to rip the ePub back down to the source files; maybe later).

Now let's turn to the issue of Tables!

From the printed version:



From my alpha version:



Yes, I'm using Courier there as a placeholder. But I want you to notice that I have done my best to adhere to the printed formatting.

From the professional version:



What is that?! It's hardly anything like the printed Table! Why are the items on the left not aligned flush left? I see some additional information in that Table too, which leads me to believe they've used a different edition than my source text (which I've tried to conform to the printed version used in this post). Even so, why are the denominations all out of alignment?

Calibre:



The table looks the same -- but notice, Calibre shows it centered! Calibre, unfortunately, seems to also throw in an additional blank space right before the chart.

Another Table.

The printed version:



My alpha version:



The professional version:



Again, why isn't the list of items aligned flush left?

Calibre:



Again, the table looks the same -- but notice, Calibre shows it centered! Calibre, unfortunately, seems to also throw in yet another blank space right before the chart.

Now let's turn to something that is really damning of the code underlying ePub: Fractions!

From the printed version:



My alpha version:



Yes: I know I have the fraction incorrect in one place. This is an alpha version. It hasn't gone through the final pre-release line-by-line proofing.

The professional version:



They don't even try for the fraction symbol! They cheat with decimals!

Why?

Let's see another example.

From the printed version:



My alpha version:


The professional version:



And now Calibre:



I doubt most readers will catch the issue here. It's this: XHTML, the code that is the basis for ePub creation, doesn't have a built-in code for every fraction.

Fractions outside that built-in set have to be created with hand-made code!

Look closely:



The one-half fraction looks clean. That's because there's code for it. But for the six-sevenths? Go scratch! Create some handmade code!
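
For the curious, here is a minimal sketch in Python of what "there's code for it" means in practice. It assumes the built-in codes are the HTML 4 named entities listed in Python's standard `html.entities.name2codepoint` table, and that the hand-made fallback is superscript and subscript tags around an ordinary slash. The function names are made up for illustration; this is not how any particular ePub tool does it.

```python
from html.entities import name2codepoint  # table of HTML 4 named entities


def named_fraction_entity(numerator, denominator):
    """Return a named entity such as '&frac12;' if one exists, else None."""
    name = f"frac{numerator}{denominator}"
    return f"&{name};" if name in name2codepoint else None


def handmade_fraction(numerator, denominator):
    """Fallback markup: superscript numerator, plain slash, subscript denominator."""
    return f"<sup>{numerator}</sup>/<sub>{denominator}</sub>"


def fraction_markup(numerator, denominator):
    """Prefer the built-in entity; fall back to hand-made markup."""
    entity = named_fraction_entity(numerator, denominator)
    return entity if entity is not None else handmade_fraction(numerator, denominator)


print(fraction_markup(1, 2))  # &frac12;
print(fraction_markup(6, 7))  # <sup>6</sup>/<sub>7</sub>
```

In that table only one-quarter, one-half, and three-quarters have a named entity, so six-sevenths falls straight through to the hand-made version.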

This is inexcusable!

We are supposed to have a technology here to encompass all of the printed word.

I have a design background. I've been educated in design and type. I've made money doing design, so you could say I'm a paraprofessional (but not an expert or dedicated to it). I can see crap because I've been trained to do so.

How can HTML leave us in the lurch when it comes to something as basic as fractions? Why must they look like crap?

And am I mistaken that there isn't code for captions for photos and illustrations too? Will I have to work around that by editing the Abyss photos to include a caption made of a bitmap font?

It makes me wonder what other shortcomings reside in the underlying code that's been chosen.

It was Hadrien of FeedBooks who reminded me in Twitter to separate ePub from the technology used to display it.

Which brings me back to Adobe. And the awful significance of that note about how Adobe Digital Editions renders ePub.

Here it is: the Adobe software also powers the display of ePub eBooks on the Sony Reader. That Adobe software will also be the engine for other eBook readers too.

In other words, this formatting inability will be poisoning a large percentage of the eBook-reading hardware ecosystem!

Hadrien doesn't believe this is much of an issue.

I do.

As Adobe incrementally (and glacially, it seems) improves its ePub rendering engine, that engine is going to become larger.

How do we know that all current hardware eBook readers have enough ROM space to accommodate an upgrade?

And what about rendering speed? Although the new Sony Reader model 700 is peppy, what would happen when a larger block of Adobe code is shoved into the model 505? Would it become unusably slow?

Adding ePub to the Sony Reader made the original model 500 obsolete. Would another Adobe update make the 505 obsolete too?

And how is it, to begin with, that Adobe -- which sells InDesign as an eBook creation tool (among other things) and which sits on the IDPF -- can't create an ePub rendering engine to properly display ePub?

I think this mess provides a plausible explanation for Apple not entering eBooks. Apple's people can't be snowed. They have a heightened radar for crap. Steve Jobs himself must have sampled some ePubs on a Sony Reader -- and maybe even eBook samples on the abominable Kindle.

He must have alternately been disgusted and amused.

Disgusted because it looks like crap and isn't up to Apple's design heritage.

Amused because he knows it'd be easy to exterminate!

Do you really think a company like Apple -- which helped to pioneer desktop publishing, which brought design sensibilities both to dull metal boxes and to operating systems -- would rally around the current state of eBooks?

Apple already has a software tool that can be expanded to create a writer-friendly eBook creation tool: Pages.

If you haven't seen Pages and don't own a Mac, go to an Apple Store and play with it. It's filled with template after template so that even near-blind design illiterates can produce a flyer or booklet or resume without revealing their suckiness at design.

It's not too big of a stretch to see Pages extended to encompass eBook publishing -- but eBook publishing based on an Apple standard that won't throw away several hundred years of design progress.

Why should Apple bother to go with ePub? Stop to consider the millions of iPhone and iPod Touch devices out there. Their numbers crush the current population of all current eBook reading devices.

And if the rumors of an Apple tablet with a ten-inch screen are true, then perhaps Apple won't even have to bother with eBooks at all. Because a ten-inch screen could make the point moot by adequately displaying all existing PDF eBooks.

Apple doesn't have to play the current eBook game. It can create its own game. Just as it did with the original iPhone. And iPod. And Macintosh.

From what I have seen, it seems to me that Adobe is a big clog in the eBook pipeline. Due to its DRM scheme for ePub, it enjoys an advantage of near-Microsoft proportions. And it also seems to me this very lack of competition has caused it to act just as lazy as any monopoly.

After all, who else is there to turn to?

11 comments:

Unknown said...

Too much to talk about here. I'll just note that you shouldn't confuse the rendering engine with the format either. Adobe DE is not a requirement for rendering ePub. Stanza will continue to use WebKit on the iPhone.

The bigger issue is that as we see wider ePub implementation we could see an ebook equivalent of the browser wars. Not all rendering engines will support the spec in quite the same way, leading to a range of challenging formatting issues across devices and applications.

The alternative is a tightly controlled non-standard spec with rendering software owned by a single company. We have a good example of that now. It's called Kindle. And even Kindle has problems. I've heard a fair amount of grumbling regarding Kindle formatting issues - poor support for tables, failure to support fixed width fonts, etc.

Mike Cane said...

>>>The alternative is a tightly controlled non-standard spec with rendering software owned by a single company.

And that could be Apple.

Unknown said...

Yes, that could be Apple. Or Microsoft, or Google for that matter.

I'm still of the opinion that Apple has no interest in the book business, and it's not because of ePub or any other technical constraint. The issues surrounding licensing and the associated difficulties of dealing with publishers are unnecessarily complex for what would ultimately be a small potential payoff. Books are a tiny business relative to the other types of media Apple sells.

What Apple has done with the next version of the iPhone OS is to essentially open the platform and the marketplace enough for others to build an ebook business on top of. I believe that's as close as Apple will ever get to the book publishing business.

MoJo said...

I've heard a fair amount of grumbling regarding Kindle formatting issues - poor support for tables, failure to support fixed width fonts, etc.

Kindle sucks, at least from my perspective.

I put my 740-page doorstopper in Kindle in straight HTML. That was agonizing. It took me DAYS to get that halfway right and I still am not happy with it.

I put this project (illustrations and poetry) into Kindle uploader as a PRC file and it came out without a hitch--but it still looked...not great. I had to tweak the HTML for it (why? I used Mobipocket Creator to grind out the PRC file, natch), but I got it in three tries. Took maybe an hour.

That was apropos of nothing in this post, really. Just blowing off some Kindle steam.

Mike Cane said...

>>>Yes, that could be Apple. Or Microsoft, or Google for that matter.

I don't see Microsoft getting back in. There has to be a story there about why their eBooks effort just fizzled out. Too bad too -- says the rampant MS hater! -- because, as you can see from the two other posts, I really think highly of MS Reader.

Now Google is an interesting idea! Yes, with all the backlog they've stolen, they now have an interest in eBooks big-time. Perhaps even more so than Amazon. Plus they have Android. I wonder if Google would gobble up an existing bit of software -- such as Stanza -- or do its own?

Unknown said...

Can you post the actual EPUB file so that you can see if this is a rendering issue in ADE or an issue with your markup?

Mike Cane said...

>>>Can you post the actual EPUB file so that you can see if this is a rendering issue in ADE or an issue with your markup?

There's a link to the pro version of Abyss in the post. And it can be UnZipped, because I've just done so.
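
For anyone who wants to peek inside without an UnZip utility, here is a minimal sketch using Python's standard `zipfile` module, on the assumption that the ePub is an ordinary, unencrypted ZIP archive. The function name and the `abyss.epub` filename in the comment are hypothetical.

```python
import zipfile


def list_epub_contents(epub_file):
    """Return the names of the files stored inside an ePub (a ZIP archive).

    epub_file may be a path or an open binary file object.
    """
    with zipfile.ZipFile(epub_file) as archive:
        return archive.namelist()


# Hypothetical usage, assuming a file called abyss.epub sits in the current folder:
#     for name in list_epub_contents("abyss.epub"):
#         print(name)
```

The XHTML chapters and the stylesheet show up in that listing as ordinary files, which is where the markup questions can be checked.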

MoJo said...

Just for show'n'tell, here's the CSS I did for the EPUB of the project I've been working on that involves poetry and illustrations.

Here's the EPUB file output.

RichardIGP said...

Mike,

I only just came across your post today after a bit of a track-back on TeleRead. I made the Abyss in my spare time as a result of your earlier blog about giving it a go. It was an interesting book, with a few interesting structures so I used IGP:FLIP and put it together "in my spare time" - so I have to take exception to the "professional" label - I was running in enthusiastic ePub supporter mode at the time.

Regrettably, it didn't go through our normal QC processes, and I didn't have access to the scan images (where are they from?), so the fraction thing took me a bit by surprise. I took some eBook interpretive liberties with the book. I decided to drop dot-leaders on tables as I felt they would hinder reflow, and the space above and below the table is a 1em margin on the table block. The table is elastic with the width of the reader viewport. This is another issue with ADE I don't particularly like (Stanza has the same problem): they don't like vertical margin statements and strip them off. I can see why, so larger images go to the top and bottom.

AZARDI, Calibre and Stanza all use the WebKit rendering engine. So they are effectively a browser in a box. Because of WebKit's high level of compliance with CSS, pushing of CSS-3 and superb SVG rendering, they fulfill most of the core engine requirements as long as you don't overdo the CSS.

This was always intended as a free example, so no surprise about the unzipping. We will be putting up about 50 more on the site in the next week or so, and encourage people to use AZARDI R2 to play with the internals and modify things interactively. I would like to have a go at the fractions with SVG; if these are the only ones, I will update the onsite version. That should be interesting!

Anyway. Thanks for an amazing deconstruction, and comments. Of course I have a lot more to say, but it's late and I gotta go home!

Mike Cane said...

>>>I made the Abyss in my spare time as a result of your earlier blog about giving it a go.

Good God. This is getting recursive. And I'm flattered.

The tinted images were just placeholders. I intend to use the Google ones. Those aren't the best, but they're larger, untinted, and I can always upgrade them later.

Since it was free, I had no qualms UnZipping it. I figured it was put there to look at for the curious. If it had a password (can it have one?), I wouldn't have gone further due to lack of skillz.

Ian said...

You don't have to do something ugly like HTML super- and subscripts with a normal slash between them. You can instead build a fraction with Unicode super- and subscripts with a fraction slash between them. Compare a Unicode built-in fraction to a Unicode constructed one: ⅔ ⁶⁄₇

This looks better in some rendering engines than others, so they probably won't look similar in your browser.
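
A constructed fraction of the kind described above can be sketched in a few lines of Python. This is only an illustration: the function name is made up, and it assumes non-negative whole numbers. It maps each digit to its Unicode superscript or subscript form and joins the two halves with the fraction slash (U+2044).

```python
# Unicode superscript and subscript digits 0-9, written as escapes for clarity.
SUPERSCRIPT_DIGITS = "\u2070\u00b9\u00b2\u00b3\u2074\u2075\u2076\u2077\u2078\u2079"
SUBSCRIPT_DIGITS = "\u2080\u2081\u2082\u2083\u2084\u2085\u2086\u2087\u2088\u2089"
FRACTION_SLASH = "\u2044"

TO_SUPERSCRIPT = str.maketrans("0123456789", SUPERSCRIPT_DIGITS)
TO_SUBSCRIPT = str.maketrans("0123456789", SUBSCRIPT_DIGITS)


def constructed_fraction(numerator, denominator):
    """Build a fraction from superscript digits, a fraction slash, and subscript digits."""
    top = str(numerator).translate(TO_SUPERSCRIPT)
    bottom = str(denominator).translate(TO_SUBSCRIPT)
    return top + FRACTION_SLASH + bottom


print(constructed_fraction(6, 7))  # superscript 6, fraction slash, subscript 7
```

How good the result looks still depends on the font and the rendering engine, as noted above.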