16 The Curation of Obscurity (Peter Brantley)

Peter Brantley is the Director of the Bookserver Project at the Internet Archive, a San Francisco-based not-for-profit library. He was previously the Director of the Digital Library Federation, a non-profit association of research and national libraries. He has worked in senior information technology management roles at the University of California; the New York University Libraries and Press; Rapt, a startup firm focusing on advertising optimization, acquired by Microsoft; and the mass market division of Random House. You can find Peter on Twitter at: @naypinya.

The problem with reading, when you are trying to think about books, is that you wind up abstracting the act of reading and what is read: the cognition, the understanding of character, story, or explication, and the dreamworld of immersion. This does horrible things to our comprehension.

As we encounter the book’s future, we also face the challenge of not knowing the true state of the thing we want to talk about. To be droll, it has not changed but is in the act of changing, and may yet soon be the thing it will become while preserving certain aspects of what it is. It is as if Schrödinger’s cat were made real: the book exists in a superposition of forms, paper and virtual; yet when we pause to consider it, we must perceive it in the light of one, casting only a vague and translucent shadow on the other, lest it appear as an enigmatic muddle.

Books on the Web

I make the assumption that as books increasingly become digital, they will be presented on the Web. The browser (or, more specifically, the browser’s rendering engine, e.g., WebKit) has been the dominant rendering mechanism for digital content ever since the advent of the Web. The Internet offers a low-barrier distribution mechanism for information, and the browser provides an interaction container for a relatively coherent set of standards for content presentation and behavior. As the Web becomes more ubiquitous, HTML’s cluster of more or less open standards grows ever more sophisticated, adding support for sensor data and geolocation awareness as well as more transparent media inclusion and user feedback.

What the Internet does not provide is a sense of boundedness. When books become unbound experiences on the Web, there is no package to download and preserve, nothing encapsulated to protect. As a librarian, I find that one of the most glaring and problematic ramifications of the networked book is its deleterious impact on cultural preservation. Networked books are inherently less substantial than containerized digital books, and far less substantial than print ones. To preserve a library, or a publisher’s backlist, will increasingly require preservation of the Web.

Web-based culture is at tremendous risk of loss, because we have few standardized means of recording it. Even the Internet Archive’s Wayback Machine[1] can provide no assurance of permanence. Nor does it provide any assurance of completeness. There is nothing today that says of a book on the Web: “I am a book and worth preserving.” Our only recourse is determined planning and the sagacity to ensure that our culture is recorded as many times as possible, in as many ways as possible, and that, should one website blank out, requests for it are redirected to a place that still responds to an HTTP GET.

Web-based content also demonstrates the “Show me the money!” conundrum. It is hard to generate revenue from individual pieces of content on the Web, yet atomic (unit) pricing has been the dominant model for book markets. We can expect subscription or site-based access models, but they will only be compelling to the extent that books self-organize or are aggregated by publishers and retailers into online communities. An aggregation might be content-neutral, such as an expansion of Amazon’s store into a more fully web-native environment, or it might be a topic-focused, curated site for specific areas of interest.

There are technical areas of uncertainty as well. The dominant ebook standard, EPUB 3, is essentially a self-contained website in a file that incorporates a manifest and a specified set of included content. There is no assurance that publishing will seek a future EPUB 4 that would support a manifest of links. Put as a question: will restrictions on application behavior be attractive enough to warrant the production costs and barriers of standards compliance? That is not clear. Already, browser rendering engines are beginning to support EPUB parsing and rendering. It is conceivable that they could extend that support to defined, bounded parameters governing permissible content, interactions, scripting, and privacy. It is also possible that the EPUB development process may become so interwoven with HTML standards that the distinction between book and web will eventually fade entirely.
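To make the idea of a bounded package concrete, here is a minimal sketch of reading the manifest out of an EPUB container. It assumes the usual layout, in which META-INF/container.xml points at a package document that lists every included resource. The function names and the tiny sample archive (which is not a complete, valid EPUB) are hypothetical and exist only for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the EPUB container file and the package document.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def list_manifest(epub_file):
    """Return (href, media-type) pairs from an EPUB's package manifest."""
    with zipfile.ZipFile(epub_file) as archive:
        # META-INF/container.xml names the package document (the .opf file).
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        package = ET.fromstring(archive.read(rootfile.get("full-path")))
        # The manifest enumerates every resource that belongs to the book.
        items = package.findall("opf:manifest/opf:item", OPF_NS)
        return [(item.get("href"), item.get("media-type")) for item in items]


def build_sample_epub():
    """Build a tiny EPUB-like archive in memory (hypothetical, not a valid book)."""
    container_xml = (
        '<container version="1.0" '
        'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">'
        '<rootfiles><rootfile full-path="OEBPS/package.opf" '
        'media-type="application/oebps-package+xml"/></rootfiles>'
        "</container>"
    )
    package_opf = (
        '<package xmlns="http://www.idpf.org/2007/opf" version="3.0">'
        "<manifest>"
        '<item id="nav" href="nav.xhtml" '
        'media-type="application/xhtml+xml" properties="nav"/>'
        '<item id="ch1" href="chapter1.xhtml" '
        'media-type="application/xhtml+xml"/>'
        '<item id="css" href="style.css" media-type="text/css"/>'
        "</manifest>"
        "</package>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr("META-INF/container.xml", container_xml)
        archive.writestr("OEBPS/package.opf", package_opf)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for href, media_type in list_manifest(build_sample_epub()):
        print(href, media_type)
```

Every entry in such a manifest resolves to a file inside the same archive, which is what makes the package something one can download and preserve whole. A manifest of links to the open Web would give up exactly that property.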

Beyond the question of what gets published, there’s the question of who publishes. Publishers fulfilled the task of content preparation for a physical supply chain and generated accompanying uniform pricing control and revenue mechanisms. By dint of a fairly stable set of organizations working in the same industrial process over the course of decades, the historical publishing business harmonized our concept of the book to an easily manufactured and shipped product using an expected set of industrial partners. Although a coffee-table high art book exhibits significant differences from a mass market “pulp” romance, they are both easily recognizable as different species of the same genus. That is not likely to be as true in the future.

Publishing has also impacted who writes, and how. A system of advances for creative output to be executed for delivery at a future date and well-worn machinery around author-agent-publisher relations and contracts have meant that individual production roles were well established and widely considered normative. These relationships were critical when the authors of a book could not readily be an integral part of the distribution and sales process.

Writing books for the Web will require a different set of relationships. The widespread availability and robustness of blogging software have moderated the most difficult aspects of crafting web-compliant code. Although web standards are arguably growing more complex as they accommodate a wider range of browser behaviors, the software ecosystem around blogging will inevitably match this pace. With push-of-a-button distribution, many of the traditional publishing activities are obviated. The ability to incrementally publish and solicit community feedback marries well with community-sourced funding services such as Kickstarter. Community-centric design and funding are likely to create a different kind of literature than author advance-led publishing.

It seems inevitable that these changes in how publishing is executed will touch with a heavy hand the product of the creative process. Books and their successors may have a fungibility that we do not presently encounter outside of the academy’s scholarly communication practice, where any number of pre-prints or a post-print might easily substitute for the worth of the formally published piece. More critically, when one can revise and link to external content trivially, the boundaries of any work become more porous. What evolves out of this process as the dominant social form for cultural education and entertainment may not be something that we refer to as a book, or if the term persists, its meaning may slowly and subtly shift as reader expectations change.

Books as Text

Despite the felicity of media-agnostic machine-mediated information creation and access, text remains an attractive format for idea production and consumption. Via literacy, it offers a low threshold for cognitive processing and conceptual understanding. Although often punctuated with illuminations, such as graphics, pictures, maps, and the placement of text on the virtual or physical page as a canvas, text as a base layer is easily converted into mental imagery and learning. It is an efficient way of telling stories and providing narratives, whether fictional or not. It is also, fortunately, one of the most parsimonious vehicles for cultural preservation possible.

All books that have been migrated to the network offer the possibility of enrichment: linking out to resource articles, online interactive maps, and multi-user environments that add new layers of engagement. Yet a simple textual, and often linear, narrative offers something even simpler: the ability to not fully elucidate, to not share extra layers of detail and information, and to create shadow through parsimony.

Part of story-telling is about choosing artifice. The curation of a certain amount of obscurity enlists our minds in the drafting of a story, a mood, and a dream—all in concert with the work of the author. Great literature is made in the interweaving of self and story.

I am increasingly convinced that a great deal of human story-sharing must persist at the simplest level available. We are not very intelligent creatures; we poison our world, craft intricate designs of power that do violence to hopes and dreams, and treat each other with willful, artful cruelty. These are not necessarily the hallmarks of a long-lived species. I suspect our ability to use the full range of technical tools at our disposal to assist our storytelling has been outstripped by the potential complexity of the stories those tools can tell. It is our storytelling singularity, and one we have yet to master.

We are just beginning to grapple with how we learn from and use complex media. Creative arts will have to acquire an understanding of when and how we can take advantage of presentation technologies now emerging. When should their affordances be made visible, and when should they collapse into transparency? Most importantly, we need to ensure the reader can retain control of the experience they increasingly help to craft, always permitting them to choose when they suspend disbelief. Stories will increasingly become ours at an explicit level through choice and act, rather than simply through our implicit imaginings. Augmented reality and haptics will influence the manner in which we expect information to be presented, but they will not replace narrative.

It is true that we are surrounded by books that, by their nature, should always have been digital and never books at all. We froze them into a physical form because that was all we could do with tabular data and rich information. Atlases made rigid as oversized picture books, compendia of various facts and speculations printed as so many beautifully designed encyclopedias. Cookbooks and phonebooks presented as a series of manually navigable facets: soups, vegetables, meats, and desserts on one hand; an alpha-sort order listing by name, and type-of-business on the other.

It is intriguing to consider how punch cards, and the Hollerith code in which they became uniformly encoded, allowed us to manipulate the simplest of otherwise frozen facts, first one way and then another. Sort by name, street address, or ZIP code. Examine one response cross-tabulated by another. These were the beginnings of databases made fluid in a digital form, through query. They offered a target for a curiosity more nimble than the scrutiny afforded by older finding aids: tables of contents and indices. Unlike the printed book, computer card decks were inflexible in the formats of information they could capture: text and numbers. Computers were hardly residential: the requisite analytical equipment was impractical to acquire for the home environment. One could not compare black-eyed pea recipes in one’s cookbooks by throwing a deck of cards into an analytical engine.

Living in Obscura

The world occupied by the book both grows and shrinks. The genesis of Wikipedia was transformative in the lay understanding of how data could be presented and interacted with online, with the broadest level of access. To conceive of a book on birds, seashore shells, or the plants of Hawai’i without the framing of Wikipedia is impossible. We see a growing use of linked data concepts to embed metadata within online data repositories, allowing them to be interlinked easily, to tell new kinds of stories. User queries against the growing assemblage of information deepen our understanding of the world, and make for it a more mutable framing.

Yet in a fashion, this interlinkage of experience begins to redefine our lives. Role and ritual seem plastic in all but the most fundamental binary aspects of parent/child, awake/asleep. The tools we use subtly interweave business and family, entertainment and education, work and play. Reading is a world-of-its-own activity, but the book as an object was merely a harbinger of the combined real-world isolation and digital integration we craft more generously for ourselves with mobile computers, tablets, and phones. We must not cling to the firmament of the object as it becomes virtualized by a growing permeability of real and virtual life. We tell each other stories, but we tell them in a digital shadow of ourselves. For us those stories become real when we touch and join our stories together.

One of the last winnowing spheres of our separateness is the privacy we previously were able to enrobe ourselves in by being in a specific place, and not any other. As books take residence on the Web, our browsing, acquisition, and reading experiences are intrinsically uplifted to machines for processing, mining, and re-presentation in recommendations and other marketing. Who we are, seen through the lens of what we experience, becomes commoditized. As readers and as users, we will need to fight for the ability to control the distribution of our information, and for mechanisms that let people control how information is shared, and with whom.

It is a difficult design task for the Web to simultaneously deliver privacy education with empowerment, but it is something that we must engineer into the fabric of our virtualized existence, even as we begin to embody it ever closer to ourselves. It is not something to ask of the book; rather, it is something to demand of our technology and our tools. Business prerogatives are independent of the fundamental respect for individual rights; to consider privacy as something that we must manage communally will require the creation of both legal and software codes. Rights are not policies or practices.

Story-telling is a nexus of our technology, intuition, and empathy. We can embrace and celebrate its future as long as we preserve one for ourselves.

