By Josh Cowls on October 24, 2016
At a time when it’s commonplace to see a movie trailer embedded in a tweet, or photos posted in a message thread, it’s clear that the experience of using the web involves an immersive mix of text, images and video. Of course, underlying what appears to be a seamless combination of media content is a huge amount of technical sophistication. The story is no different for annotation programs, which allow users not only to view but also to comment on various types of content. Integrating different annotation programs is the work of our latest HyperStudio Fellow, Daniel Cebrián Robles, who is spending September visiting us from the University of Málaga, where he is a Professor of Science and Education. I recently sat down with Daniel to talk about his work, and what he hopes to achieve in his time with us.
One of Daniel’s areas of expertise is his development of technology designed to meet the needs of students and researchers. This makes him a natural fit as a visitor to HyperStudio, given our focus on both research and pedagogy in the context of Digital Humanities. Daniel’s focus while at HyperStudio will be on integrating the Open Video Annotation project (OVA), for which he is the Lead Application Developer, with our own online annotation tool, Annotation Studio. OVA, an initiative originally based out of the Center for Hellenic Studies at Harvard, enables the annotation of online video material, allowing students and teachers alike to create, collate and share tags and comments at different points in a video. Given the explosive growth of online video in recent years, the project serves to make watching video online a more interactive and immersive experience.
As noted, here at HyperStudio we have our own online annotation tool, Annotation Studio, which is also designed for students, researchers and others to collaboratively annotate online material. The crucial difference, however, is that Annotation Studio is currently designed for annotating text, but not – as yet – video. This, then, is the basis of Daniel’s work with us – to integrate the video annotation capacities of the Open Video Annotation Project with Annotation Studio. Doing so undoubtedly poses several technical challenges, which will require Daniel’s depth of experience in this area. Daniel explained that his ambition for this month is to develop a first working version of the integrated functionality.
As both a developer and educator, Daniel is perfectly placed to negotiate between what users want and what is technically feasible, allowing him to swiftly fix bugs and incorporate suggestions made by instructors and students. This ground-level engagement thus guides his development efforts, serving a similar function to the workshops and trials we regularly hold with users of Annotation Studio. These engagements are often the most rewarding part of the development process: Daniel mentioned the time that one of his users, a teacher in training, told him that he would use the program with his future high school students.
Looking further ahead, Daniel believes that his work integrating the Open Video Annotation project with Annotation Studio is only the beginning of a much wider process of bringing diverse forms of media together for annotation into one platform. Daniel speculates that beyond just text, photos and videos, maps and even 3D objects might also belong to such a platform in the future. And the impact on user experience could be empowering and inspirational. Giving students, teachers and the general public the ability not only to consume media online, but also to share opinions and perspectives on it through annotation, could revolutionize how we experience the vast catalog of content available online. Daniel’s work marks just the start of this process, but we are excited to have him on board!
By Evan Higgins on December 7, 2015
As HyperStudio’s other new resident Research Assistant, I’m finding it hard to believe that I’m nearly one fourth of the way through my time here. I’ve worked so far on a number of interesting Digital Humanities projects exploring topics as varied as US foreign relations research and methods of collaborative annotation. And while all these assignments have been fascinating in their own way, the one that has commanded the majority of my attention and interest is our new interactive archive that explores the history of African American physicians. This project, tentatively titled Blacks in American Medicine (BAM), has been in development for years but is now beginning to take shape as an interactive online archive thanks to the possibilities provided by Digital Humanities tools and techniques. As with several of the other projects here, BAM makes use of digital tools to tell stories that have been left untold for too long.
A Brief History of the Blacks in American Medicine
BAM has been in development since the mid-1980s, when Pulitzer Prize finalist Kenneth Manning undertook the herculean task of aggregating the biographical records of African Americans in medicine from 1860 to 1980. With the help of his colleague and fellow researcher, Philip Alexander, Ken set out to create a nearly comprehensive list of black American medical practitioners, not only to make research about this community less arduous for scholars but also to test traditional narratives about African American communities in the United States.
Over the years, this team built up an impressive collection of biographical records for over 23,000 African American doctors. These records were collected through the careful combing of academic, medical and professional directories. Once a record was found, it was stored in a digital database with the aim of one day making this content available to a wider audience. Each of these mini-biographies includes personal, geographic, academic, professional and other information about doctors that helps shed light on this unexplored corner of American history.
While searching for these biographical records, Ken and Philip also set about gathering documents associated with these doctors. Correspondence, reports, surveys, diaries, autobiographies, articles and other content collected from years of archival research help flesh out aspects of these doctors’ lives and allow readers to understand the complex situations and challenges that these doctors faced.
In my many hours spent searching through the archive, I’ve come across hundreds of documents that provide a window into the history of the black experience in America. One that continually comes to my mind is a letter written by Dr. S. Clifford Boston to his alma mater, the University of Pennsylvania, in 1919. In this letter, Dr. Boston politely asks the General Alumni Catalogue to “kindly strike out the words ‘first colored graduate in Biology’ [sic], as I find it to be a disadvantage to me professionally, as I am not regarded as a colored in my present location.” This letter is an important artifact not only because it provides evidence of the ways in which blacks “passed,” but because it elucidates some of the complex societal challenges that many African Americans in medicine faced. The formal, detached way in which this doctor asks to be dissociated from his heritage gives a brief glimpse into the systemic racism and segregation that blacks of this era faced. These types of first-hand documents provide a chance to add nuance to traditional histories of the black experience in America, which are too often told in large, overly simplistic narratives.
These unique stories, in combination with our massive amounts of standardized biographical information, create an archive that allows for layers of interaction. By incorporating both a focused study into the history of specific physicians and a broader analysis of the trends within the African American medical community, this trove of content highlights untold chapters in the vast history of the black experience.
HyperStudio Takes the Project into the Information Age
With an eye towards the dissemination of this rare and important content, Ken and his team recently began working with HyperStudio to take better advantage of the affordances of digital humanities.
While still in the initial stages of formalizing the structure of the platform, we are working on a number of intersectional methods to display this trove of content. As with most of HyperStudio’s archival projects, content will be discoverable by both scholars and more casual audiences. To do this, documents and records will be encoded using established metadata standards such as Dublin Core, allowing us to connect our primary materials and biographical records to other, relevant archives.
We’re also planning on integrating our Repertoire faceted browser, which allows for both a targeted search given specific criteria and the ability to freely explore documents that interest the user. Additionally, this project will feature our Chronos Timeline, which dynamically uses events and occurrences to present historical data. We also plan on incorporating geographic, visual and biographical features, as well as a storytelling tool that will enable users to actively engage with our content.
As I round the corner on my first semester at MIT, I can’t help but be excited by this project. Too often existing narratives about marginalized groups go untested and unchallenged. By providing a multi-faceted interface and rich, previously inaccessible content, we’re creating a tool that will help interrogate these traditional views of African American history. For more information on the project as it develops, follow us here on the blog.
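To give a rough sense of what encoding a document with Dublin Core might involve, here is a minimal sketch in Python. The fifteen element names come from the Dublin Core Metadata Element Set; the `make_record` helper and the sample values are hypothetical placeholders for illustration, not the project's actual schema or archive data.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def make_record(**fields):
    """Build a metadata record, rejecting any field that is not a Dublin Core element.

    Hypothetical helper for illustration only.
    """
    unknown = set(fields) - DC_ELEMENTS
    if unknown:
        raise ValueError(f"Not Dublin Core elements: {sorted(unknown)}")
    return dict(fields)


# Placeholder values, not real archive data.
letter = make_record(
    title="Letter to a university alumni office",
    creator="Example Physician",
    date="1919",
    type="Text",
    language="en",
)
```

Because every record draws on the same small vocabulary, a field such as `creator` or `date` can be matched against records held by other archives that use the same standard.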
Image: Leonard Medical School on Wikipedia (source)
By Josh Cowls on October 6, 2015
It’s great to be getting underway here at MIT, as a new graduate student in CMS and an RA in HyperStudio. One of my initial tasks for my HyperStudio research has been to get to grips with the exciting Artbot project, developed by recent alumni Desi Gonzalez, Liam Andrew, and other HyperStudio members, and think about how we might take it forward.
The genesis of Artbot was the realisation that, though the Boston area is awash with a remarkable array of cultural offerings, residents lacked a comprehensive, responsive tool bringing together all of these experiences in an engaging way. This is the gap that Artbot sought to fill. A recent conference paper introducing the project outlined the three primary aims of Artbot:
- To encourage a meaningful and sustained relationship to art
- To do so by getting users physically in front of works of art
- To reveal the rich connections among holdings and activities at cultural institutions in Boston
With these aims in mind, the team built a highly sophisticated platform to serve up local art experiences in two ways: through a recommendation system responsive to a user’s expressed interests, and through a discovery system drawing on meaningful associations between different offerings. Both these processes were designed to be automated, building on a network of scrapers and parsers which allow the app to automatically categorize, classify, and create connections between different entities. The whole project was built using open-source software, and can be accessed via artbotapp.com in mobile web browsers.
I’ve spent some time getting first-hand experience with Artbot as a user, and several things stick out. First, and most importantly: it works! The app is instantly immersive, drawing the user in through its related and recommended events feeds. Experiencing art is typically a subjective and self-directed process, and the app succeeds in mimicking this by nudging rather than pushing the user through an exploration of the local art scene.
Second, it is interesting to note how the app handles the complexity of cultural events and our varied interest in them. On one level, events are by definition fixed to a given time and place (even when they span a wide timespan or multiple venues). Yet on another level, a complex package of social, cultural and practical cues usually governs the decision over whether or not we want to actually attend any particular event. This is where the app’s relation and recommendation systems really become useful, drawing meaningful links between events to highlight those that users are more likely to be genuinely interested in but may not have searched for or otherwise come across.
Finally, the successful implementation of the app for Boston’s art scene led us to think about the different directions we might take it going forward. In principle, although the app currently only scrapes museum and gallery websites for event data, the underlying architecture for categorization and classification is culturally agnostic, suggesting the possibility for a wider range of local events to be included.
The value of such a tool could be immense. It’s exciting to imagine a single platform offering information about every music concert, sporting event and civic meeting in a given locality, enabling residents to make informed choices about how they might engage with their community. But this is crucially dependent on a second new component: allowing users to enter information themselves, thus providing another stream of information about local events. As such, we’re proposing both a diversification of the cultural coverage of the app and a democratisation of the means by which events can be discovered and promoted. We’ve also given it a new, more widely applicable name: Knowtice.
This move towards diversification and democratisation chimes with the broader principles of the platform. ‘Parserbot’ – the core component of Artbot which performs natural language processing and entity extraction of relevant data – is open source, and therefore could in future allow communities other than our launch locality Boston to adopt and implement it independently, shaping it to their own needs. At root, all events require some basic information: a time and date, a location, a purpose, and people to attend. This data is standardisable, making it possible to collect together information about a wide array of events in a similar format. Yet despite these structural similarities, in substantive terms no two events are ever the same, which is why we are committed to providing a platform which facilitates distinctiveness, letting communities express themselves through their events.
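As a rough illustration of what a standardised event format could look like, here is a minimal sketch in Python. The `Event` class, its field names and the sample events are assumptions made up for this example; they are not taken from Artbot's or Parserbot's actual code.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime


@dataclass
class Event:
    """Hypothetical common shape for any local event."""
    title: str
    start: datetime          # time and date
    location: str
    purpose: str             # e.g. "music", "civic", "sport"
    attendees: list[str] = field(default_factory=list)


def to_row(event: Event) -> dict:
    """Flatten an event into a plain dict with an ISO 8601 start time."""
    row = asdict(event)
    row["start"] = event.start.isoformat()
    return row


# Two very different events (placeholder data) share one format.
concert = Event("Evening concert", datetime(2015, 10, 6, 19, 30), "Community hall", "music")
meeting = Event("Neighborhood meeting", datetime(2015, 10, 5, 18, 0), "Public library", "civic")

# Because the fields are shared, mixed events can be sorted together.
upcoming = sorted([concert, meeting], key=lambda e: e.start)
```

The shared fields are what make a concert and a civic meeting comparable, while free-text fields such as `title` and `purpose` leave room for what is distinctive about each one.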
We recently entered the Knight Foundation’s News Challenge with a view to taking the app forward in these new directions. You can view our submitted application (and up-vote it!) here. As we state in our proposal, we think that there’s tremendous potential for a tool that helps to unlock the cultural and social value of local activities in a way that engages and enthuses the whole community. We plan to build on the firm foundations of Artbot to create a social, sustainable, open-source platform to accomplish this broad and bold ambition. Keep checking this blog to find out how we get on!
By Andy Stuhl on March 19, 2015
On January 23, 2015, HyperStudio hosted a workshop that convened more than seventy educators and technologists to discuss the future of annotation. “Collaborative Insights through Digital Annotation: Rethinking the Connections between Annotation, Reading & Writing” drew thoughtful perspectives on the opportunities and challenges facing HyperStudio’s Annotation Studio and other pedagogical tools more broadly. The workshop’s dynamic combination of formal panel conversations and unconference crowdsourced breakout sessions allowed particular topics of interest to emerge as flashpoints over the course of the day; these themes became the organizing basis for the closing remarks delivered by HyperStudio research assistants Andy Stuhl and Desi Gonzalez.
One issue that arose over and over was the question of copyright. What kinds of texts can educators share on Annotation Studio? In our terms and conditions, we ask users of Annotation Studio to respect intellectual property and not post materials that violate copyright. But the question of what constitutes fair use for educational purposes is itself difficult to answer. However, there are a few guidelines that one might want to follow. A useful guide has been put together by the Association of Research Libraries specifically for faculty & teaching assistants in higher education.
When digital annotation tools are used for public humanities projects, these questions become all the more pressing. During a breakout session on using annotations in public humanities projects, Christopher Gleason and Jody Gordon of the Wentworth Institute of Technology shared their digital project on the legacy of former Boston mayor James Curley. As a part of the project, Gleason and Gordon asked students to annotate a historical text describing Curley’s mansion near Jamaica Pond. The student-scholars added comments that would better help them understand the original interiors of the house, complete with definitions and images of historical furnishings. This project stressed a recurring question for Annotation Studio: how do we best deal with issues of copyright, not just of the original text, but also of the content with which the text is annotated? The Annotation Studio team is exploring ways to simplify the addition of metadata, including copyright information, to media documents used in annotations.
Pedagogy first, tool second
Both Ina Lipkowitz and Wyn Kelley have used Annotation Studio in multiple classes in the Literature Department at MIT. But the kinds of texts they teach (from classic novels to recipes to the Bible) and the ways in which they and their students annotate differ wildly. When reading entries from the medieval cookbook Forme of Cury, annotations might be used to translate Middle English words; in Frankenstein, students might share anything from personal impressions to interpretations of the text.
Annotation Studio was built as an easy-to-use web application with a core functionality: the collaborative annotation of texts. This one simple piece of software, however, has yielded a multiplicity of affordances. It’s not the tool that determines how texts are used in the classroom, but rather the texts that determine the tool: for one, we implement new features based on how educators hope to teach texts to their students; moreover, educators constantly find new strategies for using collaborative annotations in their classrooms.
Advancing technical and scholarly literacy together
The workshop demonstrated how annotation is a perfect case study for a bridging of readerly, writerly, and technical skill development central to the digital humanities. In the first panel, Mary Isbell documented how reading/writing assignments and the work of learning and troubleshooting digital tools can be mutually reinforcing components of a DH curriculum: by factoring in space for learning and troubleshooting these tools within the course plan, she’s found that formerly stressful encounters with software become opportunities to engage with and adapt the technical piece of the puzzle. This type of work often includes putting different tools and different types of media into conversation with one another, as Annotation Studio’s multimedia annotation evidences. Through these uses, students, as HyperStudio’s Executive Director Kurt Fendt noted, come to “think across media” and thereby expand their understanding of how meaning is constructed and conveyed differently through different media.
In the Teaching Annotation breakout session, participants brainstormed ways to create new kinds of assignments that integrated the new affordances of digital tools with existing pedagogical goals. This conversation included suggestions about directing students to turn to one another for technical aid in using these tools—this conversation, in turn, was part of a larger one about challenging notions of expertise and conceptual categories in the classroom. This subtle back-and-forth between technical and scholarly engagement offers instructors and students alike new ways to expand and combine their skill sets.
Voices in (and voices of) annotation
Much as digital annotation recasts and recombines different kinds of expertise from different sources, the voices of those annotating and of those annotated are also put into a dynamic exchange. The notion of voices cropped up at the center of insights both about annotation’s potential in the classroom and about considerations we should carry forward in refining our approaches to it. In his keynote, John Bryant demonstrated how annotation and revision can help expose the multiple voices of a singular author, giving scholars a more nuanced access to that author’s voice and to the process of critical thinking via that voice. Panelists including Suzanne Lane and Wyn Kelley touched on how environments like Annotation Studio can put students’ voices in the same plane as the authorial voices they study. Co-mingled voices of informal debate, of traditional student roles, and of teacherly authority can democratize the learning space and inspire confidence; they can also, as participants noted, require a careful negotiation and reassertion of pedagogical roles in order to advance a constructive learning conversation.
These opportunities and challenges are at the foreground of HyperStudio’s design process, as Lead Web Applications Developer Jamie Folsom described, in building more writing-focused features that will help students transform their reader voices into authorial voices. More broadly, the theme of voices opened to all participants exciting ways to think about the project of helping students discover, build, and reinvent their own scholarly voices—a project in which, as the workshop made clear from many angles, annotation has always been and continues to be a very powerful method.
By Andy Stuhl on December 19, 2014
In mid October, a transatlantic group of scholars gathered in New York City to present their research into more than a century of French theater and to discuss the tool that had helped them examine this history. This tool was a faceted browser, developed by HyperStudio as part of the Comédie-Française Registers Project (CFRP).
Under the guidance of Jeffrey Ravel, a professor of history at MIT and the principal investigator of the project, the HyperStudio team has developed ways to catalog, browse, and visualize the contents of thousands of register pages, the hand-written ledgers in which the ticket sales of every performance had been meticulously recorded since the Comédie-Française’s opening in 1680. The faceted browser, the latest of these tools, was made available to this group of scholars a few months prior, and the workshop in New York marked the first time they had convened to discuss their work with CFRP data. In preparation for this meeting, scholars used the faceted browser to examine the dates of the workshop (October 14th and 15th) throughout the theater’s 300-year history. From that launching point, each presenter crafted a research question and probed it through the faceted browser. At the conference, the scholars shared the impressive array of findings from their research, as well as their insights into the design of CFRP tools.
It quickly became clear that researchers drew on the flexibility of the faceted browser in a myriad of ways. For some, screenshots of the faceted browser’s register entries and responsive filters, as well as of the built-in time wheel data visualizer, ably illustrated their arguments. These presentations often relied on juxtaposition: for example, showing the difference in attendance between Molière’s plays and Voltaire’s during the 1781-1782 season. For others, the browser served as a method by which they could generate data that could be entered into outside tools, allowing them to create additional visualizations. This latter category reminded us as developers of digital research tools that users will always incorporate additional tools into their projects. Accordingly, tailoring for every possible use should take a back seat to enabling smooth transitions of data from HyperStudio’s platform into others.
As we go about bringing CFRP tools to a wider scholarly audience, one priority is to develop case studies to familiarize viewers with the interface, while also showing compelling stories about its usefulness. Reflecting on their experiences of learning to use the CFRP faceted browser, many scholars at the workshop noted that such case studies would be very helpful to new users. Their presentations, meanwhile, provided invaluable examples of the stories we might build these studies around and indicated paths toward an interactive design for the case studies. Building on the thorough documentation of related digital humanities tools such as the French Book Trade in Enlightenment Europe, we are working to design a model for web-based CFRP case studies that will inspire readers to jump straight from the page into the data with their own questions.
The need to continue to craft engaging entry points to the Comédie-Française data was driven home by the workshop’s final discussion, which raised questions about the use of CFRP and similar tools in the classroom. Damien Chardonnet-Darmaillacq of Paris West University Nanterre La Défense discussed his introduction of CFRP and the faceted browser to a group of high school students, who enthusiastically took it up in launching their own investigations into French theater history. The students’ excitements and frustrations with the tool demonstrate both the rewards and the challenges of opening a still-developing project to pedagogical use. Chardonnet’s students, he noted, were quick to take on the data-based research approach and to become adept with its tools.
It’s always tempting with large-scale data projects to think that the body of data offers a neat and tidy representation of the underlying texts and events; the workshop’s discussions reminded all that the data from the registers is thoroughly embodied in the history and the physical space of the theater itself. For example, participants raised challenging problems of defining the different categories of seats within the theater and of understanding how these definitions changed across the troupe’s movement into a new theater building in the 1780s. Some called for a visualization tool that would evoke the three-dimensional space of the theater itself in representing data trends. Projects in the digital humanities bring along with them strong and complicating connections to the materiality of texts, performances, and spaces; yet the digital humanities also provide unique approaches to harness this materiality in digital representations through thoughtful design. When the presentations had concluded, participants headed uptown for a performance of a Voltaire play, tying the workshop’s ventures into the realm of data, queries, and visualization back into the lively theater tradition that inspired them.
By Desi Gonzalez on October 21, 2014
I recently celebrated my one-year anniversary as the author of h+d insights, HyperStudio’s weekly newsletter that shares the latest news, projects, resources, and fellowship and conference opportunities related to the intersection of technology and the humanities. (Subscribe here to stay in the loop!) That’s 52 weeks of combing through blogs, tweets, videos, slide shares, and news articles to find the most pressing issues in the field of digital humanities.
This responsibility (and privilege) has afforded me a unique perspective on DH: what’s happening to the field, what the current controversies are, and where the most exciting and cutting edge work is happening. Here, I highlight a few trends I’ve noticed over the past year.
1. What happens in the world affects digital humanities. While humanists often study the past and its artifacts, current events influence the work we do all the time. In April, net neutrality (the idea that Internet service providers and the government should allow equal access to all content and applications, without favoring specific products or websites) was threatened in the U.S. with the announcement that the Federal Communications Commission was considering changes to its policies. Digital humanists immediately responded, recognizing how net neutrality could affect the type of work we do, which is often open-access and done with little funding. Adeline Koh explained the implications: “Imagine this: an Internet where you can access Apple.com in a fraction of a heartbeat, but a non-profit activist website would take five minutes to load.” A group of leaders in DH penned an open letter urging the FCC to protect “the fundamental character of the open, non-discriminatory, creative, and competitive Internet.” Hybrid Pedagogy hosted a riveting discussion on the implications of net neutrality for educators and learners.
Others are recognizing the importance of creating digital archives of major events happening now so that scholars may be able to use them in the future. For example, Ed Summers advocated for the documentation of the aftermath of the Michael Brown shooting and the subsequent protests in Ferguson, Missouri. He wrote a blog post sharing the process he used to archive tweets from the event.
2. Universities are evolving. DH continues to be placed at the center of heated and often hyperbolic debates about the so-called demise of the humanities. No matter which side you land on (or, if you’re like me, if you believe these debates are asking the wrong questions), it is clear that there are certain, very real changes happening in academia right now.
The Modern Language Association undertook the daunting task of producing a comprehensive report on the current state of doctoral education in the humanities. The report’s executive summary offers many recommendations, including redesigning the doctoral program to fit with “the evolving character of our fields,” providing support and training for technology work, and reimagining what a dissertation might look like. (In fact, a few months prior Cathy Davidson had reflected on what it means to write a digital dissertation.)
And speaking of dissertations, Mark Sample’s Disembargo project challenges the concept of the academic embargo, in which dissertations are withheld from being circulated digitally for up to six years. Every ten minutes, a character from his dissertation manuscript is added to the project website—published under a Creative Commons license—at “an excruciating pace that dramatizes the silence of an embargo.”
Finally, more and more scholars are opting for #altac (alternative academic) and #postac (post-academic) careers, paths outside of the traditional tenure-track route, which is becoming increasingly untenable. Sarah Werner considered the relationship between #altac work and gender and shared her advice on pursuing an untraditional career.
3. Digital humanities is happening in the public sphere. As the #altac movement shows, digital humanities is often happening outside of the university. Crowdsourcing continues to be a popular strategy to involve the public in the development and execution of DH projects. Letters of 1916, for example, recently celebrated its first anniversary. Within the past year, families (in addition to cultural institutions, libraries, and archives) have donated letters penned during the Easter Rising in Ireland. Others have considered how the potential of crowdsourcing can be further enhanced: Mia Ridge outlines cultural heritage projects that combine crowdsourced data with machine learning, while Trevor Owens imagines crowdsourcing being used in tandem with linked open data.
Additionally, museums, archives, and libraries outside of academia are spearheading some exciting technology initiatives. Open data from museums and libraries allows broad audiences to do DH in their own homes. Nina Simon highlighted some of the data visualization projects that resulted from the Tate releasing an API of its collection. A consortium of museums involved in the Getty Foundation’s Online Scholarly Catalogue Initiative is reimagining what art publications look like in the 21st century. The Walker Art Center, for example, challenges the notion of what a page is in the digital age.
The above are just a handful of the ideas and projects flowing through the digital humanities community. A blog post can’t do justice to all of the fantastic work being done in the field; to keep up with the latest, I recommend that you follow the #DH hashtag on Twitter and subscribe to h+d insights. I’m excited to see how the next year shapes up!
Image: visualization of the digital humanities community on Twitter by Martin Grandjean (source)
By Rachel Schnepper on September 18, 2014
After an exceptionally busy summer, we are thrilled to be able to announce the release of Annotation Studio 2.0. The new version of Annotation Studio offers the following features:
- Anchored annotations that correspond to the text currently visible
- Touch capability so that Annotation Studio can be used on mobile devices
- Upload of very basic PDF files
- Slide-in menus with context aware tools
- Private subdomains (e.g., MassU.annotationstudio.org) hosted by HyperStudio
- Improved annotation sidebar
- Wider document view
- Breadcrumb navigation
- Expanded help forum at support.annotationstudio.org
User feedback has been incredibly important in determining which features were added to Annotation Studio 2.0. We have listened carefully to what the Annotation Studio community was saying, and to what both instructors and students liked and disliked about Annotation Studio. Accordingly, we have endeavored to preserve what users like best about the application, namely its simplicity and ease of use, while nonetheless adding the features and functions they wanted.
In the weeks to come, we will continue to add new features, including increased functionality to the PDF uploader. As we do so, we hope to continue to hear back from the Annotation Studio community. We encourage you to make use of the Annotation Studio support forum, where users can learn from one another. We hope you continue to share with us how you are using Annotation Studio, like the subjects of our pedagogy case studies. Our goal is to make a tool that reflects the user’s needs, and we can’t do that without your feedback.
Please check out the new version of Annotation Studio here!
By Liam Andrew on August 27, 2014
Books and manuscripts are an archivist’s bread and butter, respectively. Librarians have honed techniques for storing, maintaining, and retrieving their contents for millennia—go into any stack in the library, organized by call number, for ample evidence. But newer media artifacts often don’t fit into old ways of storing and finding information. Digital media brings this problem into full relief, but centuries ago, the newspaper might have been the modern archivist’s first challenge.
Today, archives face the challenge of digitizing their collections, an issue of particular importance for us at HyperStudio, as our research focuses on the potential for digital archives to provide new opportunities for collaborative knowledge creation. For archivists, the digitization of newspapers raises unique questions when compared to their traditional stock. At the DH2014 conference in Lausanne, Switzerland, one panel in particular addressed historical newspaper digitization head-on.
Newspapers are rich archival documents, because they store both ephemera and history. The saying goes that “newspapers are the first draft of history,” but not all news becomes history. In a typical paper, you might find today’s weather sitting next to a long story summarizing a major historic event; historians have traditionally been more interested in the latter. Journalists sometimes divide these types of news into “stock” and “flow”. Flow is the constant stream of new information, meant for right now (think of your Twitter feed). Stock is the “durable stuff,” built to stand the test of time (for instance, a New Yorker longread).
For archivists, everything must be considered “stock”: stored forever. Some historians may be in search of ephemera in an effort to glean insight from fragments of local news snippets, advertisements or classifieds—so everything is of potential historical importance. The Europeana Newspapers project has digitized over 2 million pages with the help of a dozen key partner libraries around Europe, but by their calculations, 90% of European culture is not digitized. The project anticipates reaching 10 million records by 2015, along with metadata for millions more, but it is still a small fraction of Europe’s newspapers.
It is also no surprise that many biases exist even in this wide net of 10 million records. The 10% of culture that is digitized generally consists of culture’s most well-known and well-funded fragments. The lamentable quality of OCR (Optical Character Recognition, a technology that turns scans into searchable text) likewise means that better image scans lead to better discovery. Moreover, groups like Europeana must work across dozens of countries, languages, and copyright laws; some of these will inevitably be better represented and better funded than others. So it seems you’re much more likely to find a major piece in a highbrow English paper than a blurb in the sports section of an obscure Polish daily.
Even taking as a given that everything is potentially important, newspapers present a unique metadata challenge for archivists. A newspaper is a very complex design object with specific affordances; Paul Gooding, a researcher at University College London, sees digitized newspapers as ripe for analysis due to their irregular size and their seriality. A paper’s physical appearance and content are closely linked together, so simply “digitizing” a newspaper changes it massively, reshaping a great deal of context.
Seriality and page placement also extend the ways in which researchers might want to query the archive. For some researchers, placement will be important (was an article’s headline on the first page? Above or below the fold? Was there an image, or a counterpoint article next to it?). Others could be examining the newspaper itself over time, rather than the contents within (for instance, did a paper’s writing style or ad placement change over the course of a decade?). Still others may be hoping to deep-dive into a particular story across various newspapers. Each of these modes of research requires different data, some of which is remarkably difficult to code and store.
In order to learn more about how people use digitized newspaper archives, Gooding analyzed user web logs from Welsh Newspapers Online, a newspaper portal maintained by the National Library of Wales, hoping to gain insight from users’ behavior. He found that most researchers were not closely reading the newspapers page by page, but instead searching and browsing at a high level before diving into particular pages. He sees this behavior as an accelerated version of the way people used to browse through archives—when faced with boxes of archived newspapers, most researchers do not flip through pages, but instead skip through reams of them before delving in. So while digital newspapers do not replace the physical archive, they do mostly mimic the physical experience of diving into an archive; in Gooding’s words, “digitized newspapers are amazing at being digitized newspapers.” Portals like Welsh Newspapers Online are not fundamentally rethinking archive access, but they certainly let more people access it.
The TOME project at Georgia Tech is aiming to rethink historical newspaper analysis from a different angle. Instead of providing an interface for qualitative researchers to dive in, TOME hopes to facilitate automatic topic modeling and entity recognition, to quickly get a high-level glance of a vast archive with quantitative methods. They are beginning with a set of 19th-century American newspaper archives focused on abolition. The project simplifies statistical analysis tools into a visually compelling interface, but at the risk of losing the context that seriality and page placement provide.
Perhaps the biggest challenge is how to present such a vast presence, and such a vast absence, to historians, curious researchers and individuals, all of whom may be after something slightly different. Where Gooding divided queries into three types (“search,” “browse,” and “content”), the TOME group follows John Tukey’s divide between “exploration” and “investigation”: those who are looking for what they want, and those who know what they want. A good portal into a newspaper archive requires all of these avenues to be covered, but it remains to be decided how best to turn news into data, to visualize troves of ephemera, and to represent absence and bias.
Important books and manuscripts — the “great works” that line history books — tend to present a polished and completed version of events. Newspapers offer another angle into history, where routines, patterns, and debates are incidentally documented forever. Where a book is usually written for posterity, the newspaper is always written for today, reminding the archive diver of history’s unprepared chances and contingencies. The historians who mine old newspapers — and the archivists who enable them — have many new digital tools at their disposal to unearth promising archives, but much effort remains to fairly represent news archives, and determine how we might best use them.
By Rachel Schnepper on July 3, 2014
In recent years, academic professional organizations have adopted guidelines for evaluating scholars’ work in the digital humanities. The MLA, for example, after exploring the challenge in a series of articles in a 2011 issue of its journal Profession, adopted a series of guidelines in 2012 that encourages digital humanists to be “prepared to make explicit the results, theoretical underpinnings, and intellectual rigor of their work.” The same year, the AHA acknowledged that with the continued growth of digital humanities, history departments needed to “establish rigorous peer-review procedures to evaluate new forms of scholarship.” How to evaluate and assess the work of digital humanists continues to be a thriving discussion on an international, multi-disciplinary scale, with special issues of the Journal of Digital Humanities devoted to it and countless columns in The Chronicle.
As digital humanities scholarship grows, so too does teaching with digital humanities tools and methods. Just as the monograph as the preferred presentation of research is being challenged with digital humanities projects, so too is the traditional term paper with multimodal work. Increasingly, new technologies, media, and tools are being integrated into classrooms, producing a lively and expanding discourse on digital pedagogy in forums such as HASTAC and the Journal of Interactive Technology and Pedagogy.
Just as new forms of scholarship require new approaches to assessment, so too do innovative digital pedagogies. Unfortunately, even as the discourse on evaluating digital humanities projects as scholarship for tenure continues to grow, the same cannot be said of evaluating the effects of using digital tools as pedagogical resources. In the lush landscape of literature on digital humanities, this is a glaringly depressing bald patch.
This is precisely the challenge we at HyperStudio face as we scale up the use of our digital annotation tool, Annotation Studio, in classrooms from local high schools to national and international universities. When Annotation Studio was in use exclusively in classes at MIT in 2012, we conducted a preliminary assessment investigation using surveys, focus groups, and interviews. The results of this research revealed that most students, despite a lack of experience with both analog annotation and online annotation, were extremely interested in using Annotation Studio. By the end of the term, students readily acknowledged the value of annotating, crediting Annotation Studio with helping them collect evidence from the texts to then construct better arguments in their papers.
The results of our initial assessment were encouraging.
But we want to know more.
One of HyperStudio’s goals in all our projects is to enable users to have a more meaningful experience with, and to dig deeper into, whatever form of media they are engaging with. This is true whether it is a document from Mohammad Khatami’s presidency in Iran from our US-Iran Relations Project or a piece of art. With Annotation Studio, however, in order to develop a tool that provides this deeper, more meaningful experience, we need to fully understand how instructors and students are using it in the classroom.
Consequently, HyperStudio is conducting an ongoing and intensive inquiry in several classes at MIT. We are beginning with extended discussions with faculty members who are using Annotation Studio in their classes, going over reading and writing assignments that use Annotation Studio. Instructors have identified the promotion of collaborative learning and evidence-based classroom debate, careful examination and incorporation of textual evidence in writing assignments, and foreign language reading comprehension as some of the specific performance tasks for which they have used Annotation Studio. By pinpointing the exact pedagogical purposes the instructor had in mind when assigning Annotation Studio, we can better evaluate the tool’s effectiveness through classroom observation, surveys, focus groups, and interviews with students.
While this past semester we conducted assessment with writing and foreign language classes here at MIT, we also recognize that Annotation Studio is increasingly being used in classes outside of MIT. Accordingly, we were excited to talk with professors from Yale University and the University of Washington about their experiences with Annotation Studio as well.
Our goal in conducting these assessment exercises is twofold. First, feedback from instructors determines the directions and priorities of our development of Annotation Studio. At HyperStudio, we believe that the functionality of the application should not determine its pedagogical uses. Rather, the pedagogical uses of Annotation Studio inform its development.
Second, we are committed to uncovering and enhancing student learning. If we want students to become more sophisticated and critical readers, learn how to work with textual evidence when writing essays, and develop enduring skills to increase understanding, then it is vital that we appreciate how these processes are taking place when students use Annotation Studio.
We are eager to unpack and analyze the results of our assessment this spring, to learn how Annotation Studio is transforming and enhancing student learning. As we move through the summer, we are developing with instructors an even more extensive assessment strategy to begin at the start of the fall.
Beyond evaluating how transferable and durable the skills cultivated through Annotation Studio are, we are also examining the analytics data the tool generates. We are capturing the stats and metrics of student use of Annotation Studio, including timestamps on annotations, the use of tags, and private versus public annotations. This data provides users of Annotation Studio with unique insights into how students are reading texts and interacting with one another in the social environment of Annotation Studio. With this data, we can produce visualizations that reveal trends in the reading process and the evolution of interactions with passages in the text, and that provide practical feedback for instructors.
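To make the idea of annotation analytics concrete, here is a minimal sketch in Python of how timestamps, tags, and public versus private annotations might be tallied before being visualized. It is an illustration only, not Annotation Studio’s actual code: the record fields (`user`, `created`, `tags`, `public`), the sample data, and the `summarize` function are all hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical annotation records. The field names are assumptions made for
# this sketch and do not reflect Annotation Studio's real data model.
annotations = [
    {"user": "student_a", "created": "2014-03-03T09:15:00", "tags": ["imagery"], "public": True},
    {"user": "student_b", "created": "2014-03-03T21:40:00", "tags": ["imagery", "tone"], "public": False},
    {"user": "student_a", "created": "2014-03-04T10:05:00", "tags": [], "public": True},
]


def summarize(records):
    """Count annotations per day, per tag, and by visibility."""
    # Group by calendar day using the ISO timestamp on each annotation.
    per_day = Counter(
        datetime.fromisoformat(r["created"]).date().isoformat() for r in records
    )
    # Count how often each tag was applied across all annotations.
    tag_counts = Counter(tag for r in records for tag in r["tags"])
    # Split annotations into public and private.
    visibility = Counter("public" if r["public"] else "private" for r in records)
    return {
        "per_day": dict(per_day),
        "tags": dict(tag_counts),
        "visibility": dict(visibility),
    }


summary = summarize(annotations)
print(summary)
```

Counts like these could then feed a chart of annotation activity over the course of an assignment, or a comparison of which tags a class reaches for most often.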
As we learn more about Annotation Studio, we look forward to sharing our research. We intend to share the results of our research with instructors, so they can refine their assignments and improve their pedagogy. We intend to share the results of our research with students, so they can better understand their own learning processes. And, finally, we intend to share the results of our research with the growing Annotation Studio community, so they can be inspired.
By Liam Andrew on May 13, 2014
The first thing that struck me about I Annotate 2014 was the setting: unlike the standard stuffy, windowless conference hall, the event was held at the Fort Mason Center in San Francisco, a historic, bright, and beautiful space overlooking the Golden Gate Bridge. Given the impeccable weather, the location was one consolation for staying indoors, but what really kept all of us there was a drive to annotate the world’s knowledge.
Organized and run by Hypothes.is, the second annual conference demonstrated how far the annotation community has come in just a year, making inroads in a wide variety of industries and research groups. Whenever you find Wolfram Research next to Rap Genius on a conference attendee list, you know it’s going to be interesting. Attendees represented research labs, startups, foundations, and organizations. They showcased tools, tricks and platforms ranging from HyperStudio’s own Annotation Studio, to Harvard’s edX and H2O, to semantic initiatives at the BBC, Financial Times, and the OpenGov Foundation. Scores of other projects and tools came from academia and industry, such as Rhizi and PeerLibrary. The W3C was also in attendance, in their effort to further the incorporation of annotation into the next generation of web standards. Based on the wide range of attendees and uses, they have their work cut out for them.
Most conference presenters did not try to pose a singular, overarching definition of annotation, instead showcasing their own projects and related examples. To me, this seemed wise; otherwise, we might have argued for days about what annotation is. When we talk about annotation, we are talking about many different practices couched in one ambiguous term. We were all gathered to advance the ways in which media can be published and discussed online, whether in the humanities, sciences, finance, law, or rap music. Each of these fields has different uses for annotation, and the one thing that we all had in common was a single word with many meanings.
It does seem clear that annotation tools provide a way to make online texts “read-write” rather than just “read.” Whether providing peer-review of scientific texts or analyzing your favorite rap lyrics, when you annotate you are providing feedback, complicating communication on a technical as well as social level. You are also forging connections between texts, making content both more discoverable and more nuanced. Much discussion revolved around whether annotations themselves should be annotatable, or even publishable (bringing up questions of copyright). Is annotating an archival act, or a discursive one? This adds a complex layer to the web, but gives unprecedented priority and weight to everyday users, and encourages more focused interaction with a text than a simple comment or published response.
There was also much discussion about possible new frontiers in annotation. How can we give people the ability to annotate images, audio, video, and scientific data? What about more complex media, like video games, or even real-world experiences? Does this stray too far from what “defines” an annotation? Some naysayers suggested what couldn’t or shouldn’t be annotated (perhaps we can go too far in archiving and explaining). Others claimed that the focus was in the wrong place: perhaps an annotation platform should be framed as a social network, with precedence placed on building communities rather than technologies.
In the end, all of these claims are probably true for some of the platforms and less applicable to others. There was a wide breadth of tools in many disciplines, using annotation in myriad different ways; the conference was exploring a technique and architecture rather than an industry, a horizontal instead of a vertical. This made the sessions vary in relevance, but they were always focused and utterly unique. Conference breakout sessions ranged from the technical to the philosophical, and allowed for those with aligned interests to interact after the presentations.
The proceedings were followed by a two-day hackathon, where a handful of coders worked on expanding the open-source Annotator library (which also powers Annotation Studio) and its community to new ends. It was a fitting conclusion to the conference: after talking, it was time to make.
Liam Andrew is a graduate student in Comparative Media Studies and a research assistant in HyperStudio.