Jeff Duntemann's Contrapositive Diary

10,000 Pirated Ebooks

Ebook-related items have been gathering in my notefile lately, and this is a good time to begin spilling them out where we can all see them. The triggering incident was a note from the Jolly Pirate, telling me that one of my SF stories was present in a zipfile pirate ebook anthology that he had downloaded via BitTorrent. That people are passing around pirated versions of my stories is old news. “Drumlin Boiler” was posted on the P2P networks a few months after it was published in Asimov’s in 2002, and my better-known shorts have popped up regularly since then. No, what induced a double-take was the name of the pirate anthology: “10,000 SciFi and Fantasy Ebooks.”

10,000? You gotta be kidding!

But I’m not. Jolly sent me the 550K TOC text file, which is 9,700 lines long, with one title per line. Not all are book length, and many, in fact, are short stories. Still, the majority of all book-length SF titles I’ve read in the last thirty years are in there, and so was “Borovsky’s Hollow Woman,” albeit not under my byline. (I wrote the story with Nancy Kress, who is listed as sole author.) The only significant authors I looked for but did not find were George O. Smith and Charles Platt. (One howler: Bored of the Rings is said to be by J. R. R. Tolkien. Urrrrp.)

The collection is 4 GB in size. The Jolly Pirate said that he had downloaded it in just under three hours. He attached the file for “Borovsky’s Hollow Woman,” which was a plain but accurate 57K PDF. Intriguingly, the date given under the title is January 28, 2002. The damned thing has evidently been kicking around for at least seven years, if perhaps not in its full 4 GB glory. This suggests that the anthology is not entirely ebook piracy but mostly print book piracy. (“Borovsky” was never released in ebook form.)
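The figures above lend themselves to a quick back-of-envelope check. The sketch below is only that: it assumes decimal units (1 GB = 10^9 bytes, 1K = 1,000 bytes), takes the 9,700-line TOC as the title count, and rounds "just under three hours" up to a full three, so the download rate it reports is a floor rather than a measurement. The variable names are illustrative.

```python
# Back-of-envelope arithmetic on the numbers quoted in the post.
# Assumptions (not from the source): decimal units, and a download
# time rounded up to exactly three hours.
COLLECTION_BYTES = 4 * 10**9     # "4 GB in size"
TITLE_COUNT = 9_700              # TOC lines, one title per line
DOWNLOAD_SECONDS = 3 * 60 * 60   # "just under three hours", rounded up
TOC_BYTES = 550 * 10**3          # "550K TOC text file"

# Average size of one title in the collection, in KB.
avg_title_kb = COLLECTION_BYTES / TITLE_COUNT / 1000

# Minimum sustained download rate implied by the three-hour figure, in KB/s.
avg_rate_kb_s = COLLECTION_BYTES / DOWNLOAD_SECONDS / 1000

# Average length of one TOC line, in bytes.
avg_toc_line_bytes = TOC_BYTES / TITLE_COUNT

print(f"Average title size:    {avg_title_kb:.0f} KB")
print(f"Implied download rate: {avg_rate_kb_s:.0f} KB/s or better")
print(f"Average TOC line:      {avg_toc_line_bytes:.0f} bytes")
```

Under those assumptions the average title comes out a little over 400 KB and the download rate around 370 KB/s. The 57K "Borovsky" PDF sits well below that average, which is what one would expect from a collection that mixes short stories with book-length titles.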

Some short comments:

  • I verified the existence of the anthology from the Pirate Bay search engine. It really does exist. (So, evidently, does the Pirate Bay, which surprised me a little, considering recent efforts to take them down.)
  • 10,000 ebooks do not take a great deal of space by today’s standards. (Admittedly, better files with cover scans would be larger.) No one will think twice about a 4 GB download for size reasons, when 750 GB drives are going for $69.95.
  • The PDF is ugly. The lines are far too wide for easy readability and (since this is not a tagged PDF) not reflowable. That said, I did not find a single OCR error.
  • The Windows pathname of the text file from which the PDF was generated is shown at the top of every page. The pathname includes the full name of some clueless Dutch guy, from whose Mijn Documenten folder the file came. Ebook piracy clearly belongs to the common people, not some elite cabal skilfully dodging the **AA.
  • I’ve used a scanner to rip a couple of print books (plus ten years of Carl & Jerry print stories) and it is horrible hard work. However, the anthology demonstrates that if print is a form of inadvertent DRM (which I have long thought) it is not a particularly strong one. After all, as Bruce Schneier has said about DRM systems generally, they only have to be broken once.

This last item is key. A printed book is a worst-case challenge for an ebook pirate. Compared to cutting off the binding and making sure the paper pages all feed straight through the scanner ADF and then fixing the inevitable OCR errors, stripping out an ebook’s DRM is trivial. If ebook piracy is not yet a big deal, it isn’t because it’s difficult. It’s still because reading ebooks is borderline painful. I may not be typical, but if I can buy a used copy of a recent hardcover of interest for $10 or less, I’ll choose the hardcover rather than an ebook at any price. Sooner or later the readers will catch up to paper, and by then, well, we may see a 4 TB file called “10,000,000 EBooks About Everything” on the file-sharing networks, and it won’t even take an objectionable chunk of our 80 TB hard drives.

You think I’m kidding? Let’s compare notes in five or six years.

9 Comments

  1. Darrin Chandler says:

    Predicting the future is tricky, not because the assumptions one makes are invalid but because of factors that are not currently known or are known but have unknown future impact.

    Imagine this: p2p evolves a bit so that a 10,000 eBook torrent is equivalent to the concatenation of 10,000 torrents of the individual eBooks. That changes the way people download, because you can select what you want. You can always return later to pick up another batch of eBooks.

    Or… let’s take it a little further. Imagine that you can download and store the entire catalog of human knowledge in reasonable time and for reasonable cost. Now what? The burden becomes finding what you want and consuming it. If download and storage are cheap and practical then you might as well NOT download anything until you are ready to consume it.

    It seems to me that the idea of hoarding/stockpiling things you may want some day or obtaining a huge bundle for the sake of a few desired items is mostly outdated now and will become more so in the future.

  2. Depends. I asked Jolly about this and he told me that he thinks monitoring of P2P networks and Usenet will grow even worse in the future, so he’s bringing down everything in sight right now against a day when some of it and maybe most of it will be unavailable for purely political reasons. Storing it doesn’t cost him much, and it appears to be a sort of hobby, like stamp collecting.

    I think the key question he’s raising is this one: Who controls my searches when I search for content? If the content is on local storage, I do. When it’s out on the Net, somebody else does. I don’t know how important an issue this is, and I don’t have the paranoia gene, but it’s not a completely idiotic question.

  3. Darrin Chandler says:

    There’s an assumption in there that content searched for or obtained through p2p is unlawful. While that may be largely true today it doesn’t have to remain so. Apple *could* do iTunes through p2p if they chose. P2p itself is neutral tech, and some legit companies/organizations use it for distribution. I think a big factor is having a lawful, convenient, reasonably priced means of getting content.

    On privacy: there are (at least) two distinct issues. One is wanting to hide things because you’re doing something unlawful, and the other is not wanting every detail of your private life to end up in company/government databases or put on display on the Internet. Those two things share some concerns but they’re certainly not the same.

    I’ll be interested to see how this plays out over the next decade.

  4. […] and his answer was simple and obvious: “Because it was easy.” Most of you have seen my entry for December 29, 2009. Jolly downloaded 10,000 ebooks in a couple of hours. That scares some authors and publishers a […]

    1. Daniel says:

      People have been downloading music for years now… publishers should have known this was coming… Knowledge should be free!!!!!!!!!!! What next? You’re gonna start charging people for group discussion talks? Lil Wayne, music artist, sold 1.5 million records in 1 week… so what’s this tell you? If you’re worth a S&*T you will sell regardless… so stop crying and actually put books out that people won’t mind buying regardless of the price. Music, movies, books and software should be free… if you create software worth a damn APPLE/Microsoft will offer you jobs!!! DERRRR… Look at all the amateur YouTube hits daily? They get on TV and get contracts for movies with no thought or money down… a simple click and record… so get with the program…

  5. Mike Jolley says:

    Hello Mr. Duntemann. I found your site by searching for Borovsky’s Hollow Woman. I read it in the 3rd Omni Book of Science Fiction when I was in high school, maybe 15 years ago, and haven’t forgotten it. I then lent it to a friend and never got it back.

    I found the reference in your post about pirated ebooks. I purchased the book, so I think I’m somewhat entitled to download the story for free. Also, it’s for good reason: I’d like to make a screenplay out of it. It would make a terrific movie. You wouldn’t be mad at me if I did that, would you?

    1. Not at all. I’m delighted that you still even remember the story, which was first published (after all) in 1983. And if you bought the book, you bought the story. (I’m not a copyright paranoid by any stretch of the imagination.) I’ve republished the story in my 2008 story collection, Souls in Silicon, which can be had here:

      http://www.lulu.com/product/paperback/souls-in-silicon/3408142

      As for a screenplay, that would be very cool. Keep in mind that the story is only half mine: Nancy holds the other half and would have to be consulted and (probably) permission requested.

      If you’re serious and would prefer to talk via email, you can contact me at jeff at duntemann dotcom.

  6. […] I’ve written before about massive ebook collections available on pirate sites. The file I mentioned in that entry was an old one, and crude. Newer and even bigger ones are available now. (No, I won’t tell you where they are.) Specialized collections are turning up as well, as specific as books on programming for Android. […]

  7. Tom Baker says:

    I used to feel the way you do about print books vs ebooks but that has changed. Reading ebooks (with calibre) is very handy. I can recline on my couch or chair and flip through the pages with my wireless mouse, making the print any size I want to. I now have no interest in paper books anymore.
