Last month, together with Silvia Stoyanova, I delivered a lecture at “Methodological Intersections”: Trier Digital Humanities Autumn School 2015 (which Silvia co-organised) on the topic of ‘The Multidimensional Scholarly Archive’. Below is Silvia’s fantastic contribution; I will post my part of the lecture tomorrow.

A file clerk at the FBI working with a table covered with files. (Photo by Thomas D. Mcavoy

Power Point Slides

In this lecture, we will address the digital scholarly archive as an agent of mediating the consumption, production and publication of humanities scholarship. In particular, we will discuss

  • The transformation of the traditional archive in the Digital Humanities, namely digital archives produced by scholars for the use of other scholars;
  • Archives scholars create for recollecting and organizing their research material;
  • Digitally-born archives and self-archiving;
  • Methodological, economic and political considerations in creating a digital archive.

Digital archives in the humanities: definitions, functionalities, goals

The digital medium has changed the way scholars gain access to and work with archive material, and offers new platforms and functionalities for its organization and presentation. In the past two decades the emergence of digital projects describing themselves as archives has raised fundamental questions about how the digital paradigm of the archive is informed by and transforms its traditional concept as a repository of material artefacts, especially codices in rare book collections.

SLIDE 6 The Walt Whitman Archive

The first major Digital Humanities projects in the 1990s – the Rossetti Archive, the Walt Whitman Archive, the William Blake Archive, the Willa Cather Archive – call themselves archives in their ambition to offer comprehensive access to the material record left by individual authors. Ed Folsom, editor of the Whitman Archive, describes the objective of the project in terms of accessibility and comprehensiveness:

“Our goal when we began this project in 1996 was to make all of Whitman’s work freely available online: poems, essays, letters, journals, jottings, and images, along with biographies, interviews, reviews, and criticism of Whitman”.

Today, too, the objective of many DH projects is to produce digital archives and editions, where the edition generally focuses on published works and the archive is more comprehensive.

Kate Theimer has recently argued that the distinction between the traditional notion of the archive and its adoption in DH projects is not always appreciated by digital humanists, who see the digital archive as a collection of digitized records, gathered often from different physical sources and for a specific scholarly purpose.

Indeed, Kenneth Price, Folsom’s co-editor of the Whitman Archive, suggests that in the digital environment archive has come to mean “a purposeful collection of surrogates”, which can take a variety of other designations, such as edition, project, database, or thematic research collection (2009). Price proposes a new term, arsenal (in the etymological sense of workshop), which is free of the print-culture history attached to archive and edition and of the technological implications of database.

Price also writes that the digital conflation of an archive and a critical print edition, which characterizes many DH projects, produces a superior hybrid: “archive in a digital context has come to suggest something that blends features of editing and archiving. To meld features of both — to have the care of treatment and annotation of an edition and the inclusiveness of an archive — is one of the tendencies of recent work in electronic editing.” In the case of Whitman, who “is well known as the writer who couldn’t stop writing, revising, and reissuing Leaves of Grass”, another demand for creating a comprehensive digital archive was to be able to reconstruct the creative process in its variants, whereas “print editions tended to falter when dealing with multiplicity, whether of versions or of authorship.” (Price 2009)

Folsom further emphasizes the editorial goal behind the digital archive’s desire for inclusiveness, namely its function of unifying what is dispersed in physical space in order to virtually reconstruct an original whole: “On The Walt Whitman Archive, you can now place next to each other documents that previously could not be seen together. Already, notebooks that were once disbound and ended up in different states or different countries are being rediscovered, and manuscripts are fitting together like the rejoined pieces of a long-scattered jigsaw puzzle. […Scholars are] discovering lost connections (even reassembling notebooks that were long ago dispersed).” (Folsom 2007)

The aspiration for inclusiveness and reconstructive editorial activity of digital archive creators strongly diverge from the curatorial principles of the traditional archive, as Theimer explains.

SLIDE 7 Principles of the traditional archive

The etymology of “Archive” implies “authenticity” and “authority”, and it is this root concept that the traditional archive aims to respect in its principles of preserving original documents in their original context and order. Theimer emphasizes the “organic” relation between the documents in an aggregate, the archive’s objective being to preserve that relation by means of the principles of provenance, collective control and original order. Thus there are four fundamental characteristics of the physical archive:

  • Original or unique (not published) aggregates of materials with an organic relationship
  • Provenance: archivists identify aggregates according to the source of the aggregate, not the subject; records of different origins (provenance) must be kept separate to preserve their context; the source of a record is not necessarily its author (it may be a family, an organization, etc.)
  • Collective control: the aggregate of records may contain records with many different authors; the contents will not be removed and added to other aggregates based on the individual authorship or topic
  • Original order: the original order imposed by the source of records should be preserved

Theimer suggests that digital humanists should use the term “digital collections” rather than “archives” to describe their projects, and is concerned that “there is the potential for a loss of understanding and appreciation of the historical context that archives preserve in their collections”.

SLIDE 8 Shillingsburg, “Development principles for virtual archives and editions”

Peter Shillingsburg’s developmental principles for virtual archives and editions generally support Theimer’s argument for making clear the distinctions between the original record, its digital surrogates, and the layers of interpretation. The material record is primary; the digital surrogate is secondary, as the first layer of interpretation; the critical annotation is a further degree removed, as a second layer of interpretation. Digital imaging is the closest surrogate of the original record and can be more accurate, but it is two-dimensional. A modular design for maintaining the digital archive’s contents allows for its repurposing by multiple users, and the integrity of archival surrogates is maintained by keeping them separate from the layers of analytical markup.
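The separation of surrogate and analytical markup can be made concrete with a minimal sketch. It is only an illustration under stated assumptions, not Shillingsburg’s own model or the design of any existing archive: the class names, the character-offset scheme, the record identifier and the sample notes are all hypothetical. Each editor’s annotations live in their own layer and merely point into the surrogate, so several layers can coexist while the surrogate itself stays untouched.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Surrogate:
    """Digital surrogate of one material record (first layer of interpretation)."""
    record_id: str
    transcription: str


@dataclass(frozen=True)
class Annotation:
    """A critical note that points into a surrogate by character offsets."""
    record_id: str
    start: int
    end: int
    note: str


@dataclass
class AnnotationLayer:
    """One editor's layer of interpretation, stored apart from the surrogate."""
    editor: str
    annotations: list = field(default_factory=list)

    def annotate(self, surrogate: Surrogate, phrase: str, note: str) -> Annotation:
        # Locate the phrase; the surrogate itself is never modified.
        start = surrogate.transcription.find(phrase)
        if start == -1:
            raise ValueError(f"{phrase!r} not found in {surrogate.record_id}")
        annotation = Annotation(surrogate.record_id, start, start + len(phrase), note)
        self.annotations.append(annotation)
        return annotation


def resolve(surrogate: Surrogate, layer: AnnotationLayer) -> list:
    """Pair each annotated passage with its note for one surrogate and one layer."""
    return [
        (surrogate.transcription[a.start:a.end], a.note)
        for a in layer.annotations
        if a.record_id == surrogate.record_id
    ]


# Hypothetical record and notes, for illustration only.
leaf = Surrogate("leaf-03", "a draft line with one cancelled word")
layer_a = AnnotationLayer("editor-a")
layer_b = AnnotationLayer("editor-b")
layer_a.annotate(leaf, "draft line", "editor A: early state of the text")
layer_b.annotate(leaf, "cancelled word", "editor B: deletion in pencil")
```

Calling `resolve(leaf, layer_a)` and `resolve(leaf, layer_b)` yields two independent readings of the same unchanged transcription, which is one way to keep the layers of interpretation distinguishable from the surrogate they comment on.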

At the same time, Shillingsburg notes that we need to appreciate the different appeal the digital archive holds for the scholar: “the appeal of digital archives is never that they are fundamental; instead, it is that they are accessible anywhere, malleable, searchable, and capable of being analyzed and commented upon.” The appeal is no longer that of the original artefact’s materiality and uniqueness, or of its status as a historical witness, but rather its capacity for circulation and interaction. In other words, users appreciate the digital archive’s discursive and performative dimensions.

The inclusiveness to which digital archives aspire is a wish to bring together what is dispersed in physical space in order to recreate a multitude of historical frames in the life of a record – by re-arranging the order of records within the archive and by connecting the records in the archive to other sources of different provenance. The digital archive thereby emancipates the records from the authority of their original context, as well as from the authority of their published versions, in favor of a multiplicity of possible orders, which the digital medium allows us to keep and distinguish simultaneously. This is not to say that the digital archive challenges its status as a surrogate, but rather that its act of surrogacy, of mediation, is generative of agency for the original record. The record’s analog dimensionality of the here and now is dialectically enhanced by its two-dimensional reduction in the digital archive.

In fact, in a recent talk (2014) on the “archive as participatory”, Theimer discusses a new business model for the digital transformation of the archive and suggests the notion of archives as “platform”, in the sense that the digital archive gives people the tools and opportunity to create things. Perhaps this term also more fittingly describes the objective of digital humanities projects which describe themselves as archives.


Jerome McGann, editor of the Rossetti Archive, polemically wrote twenty years ago in his classic piece “The Rationale of Hypertext” that the archives are sinking in a sea of paper, and advocated the relegation of the codex as the tool for storing, consuming and producing scholarship in favor of the hypermedia archive. McGann argues that the tools we can use in digital space to search and organize the material do not simply add another perspective in a series, but raise our general comprehension of the material to a higher order. By contrast, when we edit in codex form, the logical structures of the critical edition function at the same level as the material being analyzed, and so are limited in establishing relations between its various elements, especially when we are dealing with a great amount of material. The editor ends up either replicating parts of the archive in as many places as they need to be referenced, or creating a referential structure that becomes very difficult to navigate on paper.

The rationale for hypertext is nothing new in the history of scholarly texts: “The concepts of interrelating, annotating, and cross-referencing, as well as the abstract and material hyperlink, date back over one thousand years. So-called proto-hypertexts can be traced as far back as the Middle Ages, when glosses appeared in the Jewish Talmud, the Bible, canon law, and medical text.” (Astrid Ensslin, The Johns Hopkins Guide to Digital Media, 2014). It is our taking advantage of the digital medium that is lagging behind in our scholarly platforms for knowledge production and communication.

Hans Gabler’s theory of the digital edition also conceives of its function as producing a relational web of discourse, which remains constrained in print and is instead the prerogative of the digital medium’s dimensionality. “In the Renaissance, when books first became the medium for editions, printers devised breathtaking lay-outs for surrounding texts with commentaries, often in themselves again cross-referenced. In effect, they attempted to construct in print the relationality of what today are called hypertexts. But with books to establish the third, relational, dimension against their material two-dimensionality, has always been a rudimentary gesture, and has always depended on involving and stimulating the reader’s imagination and memory. For editions existing electronically, in contrast, the relational dimension is a given of the medium, and complex relationalities may be encoded for them into the digital infrastructure itself.”


The advantages of digitizing the traditional archive are by now obvious: immediate access, collation and linking of material records dispersed in physical space, targeted retrieval of information, and flexibility and openness to incorporating new resources. These advantages prompted the initiatives for creating digital archives; more recently, projects have also begun to exploit the relational dimension of the digital medium, for instance by linking records into semantic networks and enabling user participation. Beyond these, an important transformation produced by the digital shift is the greater democratization of the consumption and production of scholarship. As Katherine Harris’ recent entry on the “Archive” in The Johns Hopkins Guide to Digital Media (2014) summarizes, the digital archive “Preserves and records multiple metanarratives” (Voss and Werner 1999); it accommodates multiple editorial perspectives and allows users to record and share their experience: “The object continues to acquire meaning based on the users’ organization of the material, on its continuous re-mixing, re-using and re-presentation.” (Harris)

Another, less obvious but fundamental, rationale for creating digital surrogates in the form of archives, editions, or platforms is that the material record itself sometimes has a structural complexity that demands the multi-dimensional re-presentation enabled by the digital archive – not just for facilitating its physical access but for enhancing its semantic accessibility. One of the motivations for creating the digital Whitman Archive, as its editor Ed Folsom writes, was the processual nature of Whitman’s work, which “resists the constraints of single book objects” (2007). In addition, much of Whitman’s writing was dispersed in notes and notebooks and is in itself a collection of observed phenomena. This kind of unpublished fragmentary archival material invites active re-mixing on the part of the user at an even more fundamental level than that of published works. As Gabler contends, “the editing of manuscripts from private transmission such as drafts, diaries or letters, belongs exclusively in the digital medium, as it can only there be exercised comprehensively”, because of their double nature. First, they are both documents of a material nature and documents to be read, so scholars require both a facsimile surrogate of the text’s materiality and its content. Second, they are documents of a process and are by default virtual, in a state of potentiality, and therefore require a medium capable of representing that processual status.

The editor of such virtual text could become an extension of the author, to evoke Novalis’ romanticist notion of the reader who keeps refashioning the ideas in the work, where the work is always processual in the form of dynamic arrangements of fragments. While the task of scholarly criticism is essentially this refashioning of ideas, formally it remains at a respectful distance from the original and rarely rewrites it.

SLIDE 11 Performative scholarship

Barthes’ A Lover’s Discourse is one of my favorite examples of a codex that enacts its theme rather than just describing and analyzing it, although it still relies on an arbitrary alphabetical arrangement of fragments. The romantic reader remains a theorized notion in the context of academic criticism, because, despite our digital platforms, we are still producing scholarship primarily in codex form. McGann rightly says that “we no longer have to use books to analyze and study other books or texts”, but we also no longer have to use books to discuss other books, especially when the contents of those books challenge their codex form in an attempt to go beyond the limitations of the analog medium. However, most scholars in the humanities lack an academic culture of creative remixing of primary sources – one which would employ the technological and professional frameworks to enable the practice of performative criticism envisioned by the early romantics.

SLIDE 12 Walter Benjamin’s Arcades project (Passagenwerk): the agency of the archive

McGann’s statement that the book is no longer a necessary mediator for scholarly work echoes a statement that Walter Benjamin made in the early 1920s – “And today the book is already, as the present mode of scholarly production demonstrates, an outdated mediation between two different filing systems. For everything that matters is to be found in the card box of the researchers who wrote it, and the scholar studying it assimilates it into his own card index.”

Benjamin is commenting here on the dialectical resurgence of the three-dimensionality of script in the culture of advertisement, in the poetry of Mallarmé, and in the writings of avant-garde artistic movements like Dadaism and Futurism. “If centuries ago it [script] began gradually to lie down, passing from the upright inscription to the manuscript resting on sloping desks before finally taking itself to bed in the printed book, it now begins just as slowly to rise again from the ground.” The three-dimensionality of script infiltrating printed text is thus reflected in the three-dimensionality of its form of mediation: the index card box, the personal research archive of the scholar, contains all the information necessary for the production of scholarship without the mediation of the codex.

Benjamin’s comment is undoubtedly disparaging of the quality of scholarly books in his time, meaning that their mediation does not become discourse generative of new knowledge. However, it is also prophetic in a positive sense: “the moment of a qualitative leap [is approaching], when writing, advancing ever more deeply into the graphic regions of its new eccentric figurativeness, will suddenly take possession of an adequate material content. In this picture-writing, poets […] will be able to participate only by mastering the fields in which it is being constructed: statistical and technical diagrams.” The quantitative explosion of media will make a qualitative leap when it gains substance, an adequate subject matter, rather than staying on the surface of advertisement statements. For this to happen, authors will also have to learn to master the tools of what Johanna Drucker has theorized today as graphesis. Drucker argues that scholars are still far from appreciating the knowledge-generating potential of the graphical form, which gains greater possibilities for application in the digital medium. “Graphesis is concerned with the creation of methods of interpretation that are generative and iterative, capable of producing new knowledge through the aesthetic provocation of graphical expressions.”

SLIDE 13 Walter Benjamin’s Arcades project (Passagenwerk): the agency of the archive

Benjamin’s comment is prophetic also with respect to his own scholarship, which remained largely in fragmentary and essay form, and especially the immense scholarly archive that is the Arcades project, occupying him from the late 1920s until the end of his life. Now published in a codex form, the Passagenwerk is not a book but a work in progress: “What we are faced with, the text, is nothing but a collection of fragments—citations from newspapers, advertisement, signs, guide books, literature, poetry, political manifestos, letters, economical, social, and philosophical researches—assorted and arranged according to various more or less thematic, more or less chronological “convolutes” and interspersed not so much with Benjamin’s interpretive remarks as with his subtle interventions.”

Yet, in a sense, the project is a coherent whole at a fundamental level, because its contents have an organic relation established by the curatorial agency of the collector. Whereas every act of arrangement, of framing, of curation, is necessarily invested with subjective intentionality, some acts of taking care, as we know from Heidegger, are more authentic, more generative of agency than others. Benjamin aspired to practice curation that allows for the agency of the observed phenomena to be activated: “Method of this project: literary montage. I needn’t say anything. Merely show. […] But the rags, the refuse – these I will not inventory but allow, in the only way possible, to come into their own: by making use of them.” The disposition of the collected fragments triggers associations, instead of relying on a teleological narrative to explain them. The scholar’s mediation does not require the narrative exposition of a book, but generates discourse in the spatial and graphical organization of the material. “Benjamin’s mode of working is marked by the techniques of archiving, collecting, and constructing. Excerpts, transpositions, cuttings-out, montaging, sticking, cataloguing and sorting appear to him to be true activities of an author.” (Ursula Marx et al.) At the same time, the Passagenwerk records a vision for an order that fell short of its execution. The organizational devices that Benjamin used – alphabetical index, tabular form of important points, diagrams, colored symbols, cutting up the sheet of paper to re-arrange its contents, etc. – are traditional tools for organizing research material on paper, and they proved inadequate to his working method.

Benjamin’s romanticist aspiration towards performative criticism and its mediation challenges have been shared by other prominent authors of research archives who bequeathed the curation of their work projects to posterity, such as Valéry and Wittgenstein in the 20th century. In the preface to his Philosophical Investigations Wittgenstein regrets that the best he could write would never be more than philosophical remarks, because his thoughts were soon crippled if he tried to force them in any single direction against their natural inclination, thus posing the organizational problem as an ethical one. The scholarly intentionality that has generated these research archives attends to the agency of the recollected phenomena with the objective of representing it faithfully. However, this phenomenological performative writing is achieved at the expense of their authors’ ability to give their archives an organizational structure that would fit the codex publishing model of scholarly production typical of their time. What they needed was a working platform that would allow them to mediate the contents of their research archives according to their ethical vision for creating scholarship.

SLIDE 14 The hypertext pioneers’ vision of a universal archive

The creation of at least the basic features of such a platform was the quest of the hypertext pioneers, Bush, Engelbart and Nelson, who were each motivated by a common vision for “the perfect archive, a machine that might extend, through replication, human mental experience and capture the interconnected structure of knowledge itself” (Barnet). Dissatisfied with the artificiality of alphabetical and numerical indexes, and feeling the urgency of finding more efficient ways to retrieve information amid its exponential growth, Vannevar Bush designed a machine which would imitate the associative organization of the mind. In 1945 he introduced the Memex and the concept of trails linking two units of information together. The Memex was supposed to preserve human intellectual history in its logical interconnections. Bush’s design was an analog machine and was never implemented, but the idea was adopted by Engelbart – a computer engineer, who used digital technology to pioneer human-computer interaction. Engelbart translated the Memex design into a system with a computer screen, a mouse, and hyperlinks. Engelbart’s system was called NLS, for oN-Line System, and it allowed for information to be reordered, linked, nested, juxtaposed, revised, deleted, or chained window by window.

Ted Nelson was developing similar ideas at the same time as Engelbart, although he came from the perspective of the humanist scholar rather than the computing community, and is credited with inventing the vocabulary for a hypertext environment. Nelson shares the same premise as Bush and Engelbart that “Human knowledge, and particularly literature, has a networked structure to it which is deeply at odds with conventional forms of indexing” and wanted to build a system for writers to express the complexity of their thought in its development, which could only happen by liberating text from the two-dimensionality of paper. In Literary Machines, 2/4-2/5, Nelson defines the objective of his Xanadu project as the need “to create a general representation and storage system that will permit automatic storage of all structures a user might want to work on, and the faithful accounting of their development.” Nelson believes that hypertext should not only be an ‘archive for the world’ but also a ‘universal instantaneous publishing system’ (ibid).

SLIDE 15 Scholarly hypertext: the structural exigencies of thought

At the core of Nelson’s now 50-year-long research project Xanadu is the ambition to build a technological platform that would free thought of its mediatic and extrinsic constraints and enable its full expressive potential. The typical formal structure of the scholarly publication imposes limitations and exclusions that are counterproductive for some writers, all the more so the greater the mass and scope of their subject matter. Rather than making cuts and suppressing associations that could be followed at a later point, Nelson suggests re-arranging the textual units according to relevance. This demands a universal working and publishing environment that accommodates processual writing not only technologically, but also in terms of the economics and politics of academic knowledge production.

Janneke will speak about some of these challenges and opportunities.

