Monday 8 February 2010

Announcing the Bentham Papers Transcription Initiative

Jeremy Bentham's body, preserved and on display at UCL.

We at UCL are all terribly proud of Jeremy Bentham (1748-1832), whose body, or "Auto-icon", is on display in the South Cloisters. It is widely told that he was the founder of UCL, which isn't true, although he did influence those who founded our University. I don't think I'll ever get bored of saying "Good morning!" to him every day as I walk past. You'll be pleased to know his case gets locked up tight every evening to allow him some rest.

He was a prolific writer, scholar, jurist, philosopher, and social scientist. A.J.P. Taylor described him as ‘the most formidable reasoner who ever applied his gifts to the practical questions of administration and politics’. Since the 1950s, The Bentham Project has been working towards the production of a new scholarly edition of his works and correspondence, although they've only scratched the surface of the 60,000 pages of writing he produced, which remain in UCL's special collections.

The Bentham Project received some AHRC money a few years ago to start digitising the material, but it was time for a rethink. Enter the Arts and Humanities Research Council’s highly competitive Digital Equipment and Database Enhancement for Impact (DEDEFI) scheme.

I've been asked to join the project in an advisory role. It became clear to me very quickly that in a one-year project there was never going to be enough time for two research assistants (the maximum under the funding) to digitise and transcribe tens of thousands of pages of manuscript material. So what, I thought, if we change the focus of the transcription initiative?

The Guardian newspaper had run a very successful investigation into the UK MPs' expenses scandal in 2009, using an online crowdsourcing application to let its readership help sort through the 450,000 documents that needed closer study. Would it be possible, I thought, to develop a similar tool for cultural heritage documents? Could we persuade the wider historical community to contribute to the transcription effort?

I am pleased to say that UCL Laws, in conjunction with UCL Centre for Digital Humanities, UCL Department of Information Studies, and UCL Library Services, can announce the launch of the Bentham Papers Transcription Initiative, which has secured £260,000 funding from the AHRC DEDEFI scheme.

The Bentham Papers Transcription Initiative is a novel attempt to aid the transcription of Bentham’s work. A digitisation project will provide high-quality scans of the papers, whilst an online transcription tool will be developed to let volunteers contribute to the transcription effort: a “crowdsourcing” tool which will manage contributions from the wider audience interested in Bentham’s work, including school students and amateur historians. It will be the job of the research assistants to manage interaction with the wider historical community, and to monitor the quality of the transcriptions which are added to the database.

The use of such a tool for the transcription of cultural and heritage material is novel (although do shout if you know anyone else planning something similar), and UCL’s CIBER group will monitor the use of the online tool, providing an in-depth study of how such a crowdsourcing application is used during the year-long project.

Work on the project begins on March 1st 2010, and the project will shortly be hiring two research assistants. The online tool will be launched in mid-summer 2010, when you can contribute to transcribing the works of Jeremy Bentham yourself!

Did I mention I was super excited about this? Grin.


Unknown said...

Hi Melissa

The idea of Web 2.0-powered crowdsourced transcription came up a few years ago when we developed the Linnean Online system, and that led to us adding commenting features to the basic EPrints repository (and in turn to the JISC-funded SNEEP project).

It was also suggested that this approach might be valuable for user-contributed metadata (for example, vernacular names of flora and fauna). The Linnean Society hasn't actually put any of this into effect yet, but I'm sure they'll be interested in the results of the Bentham project.

The EPrints commenting/tagging approach is pretty rudimentary, though. Elsewhere we've been exploring with the Bibliographical Society the conversion of a vast and trunkless database of information about the London Book Trades into an online wiki-based hypertext.

That's a brief history of why I suggested that the Bentham project look at harnessing the proven power of MediaWiki to manage collaborative contributions and editing. If we can make it work, it would be great to see it spread.

And, like you say, if anyone's done it before and has any helpful suggestions... tant mieux!

Melissa said...

Thanks for that background info! Looking forward to meeting soon and taking this forward!

Anonymous said...

There, fixed the project name for you.

Jason said...

Hi Melissa,

We too at the New Zealand Electronic Text Centre have an interest in transcription of cultural material, and we've been doing some very exploratory work in crowd-sourcing transcription.

So far, we've produced a slightly modified Drupal installation, the source code for which is hosted in SVN on Google Code.

It seemed expedient to customise an existing CMS (which would have functions such as user management, revision history etc) rather than create a new one from scratch.

Basically, the Drupal installation allows an administrator to upload page images for transcription, and users to visit the website and transcribe what they see in the page images, with statistics kept on who is contributing and how much.
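
To make that workflow concrete, here is a minimal Python sketch. It is not the Drupal code itself: the names `Page`, `TranscriptionSite`, `upload_page`, `transcribe` and `leaderboard` are all hypothetical, everything is kept in memory, and "how much" is assumed to mean characters submitted.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Page:
    """A scanned page image awaiting transcription (hypothetical model)."""
    image_path: str
    # (username, text) tuples, oldest first: a simple revision history.
    revisions: list = field(default_factory=list)

    def current_text(self):
        # The latest revision is treated as the current transcription.
        return self.revisions[-1][1] if self.revisions else ""


class TranscriptionSite:
    """In-memory stand-in for the CMS; a real site would use a database."""

    def __init__(self):
        self.pages = {}
        self.chars_by_user = Counter()

    def upload_page(self, page_id, image_path):
        """Administrator action: register a page image for transcription."""
        self.pages[page_id] = Page(image_path)

    def transcribe(self, page_id, username, text):
        """User action: save a new revision and record the contribution."""
        self.pages[page_id].revisions.append((username, text))
        # Assumption: contribution size is measured in characters submitted.
        self.chars_by_user[username] += len(text)

    def leaderboard(self):
        """Contributors ordered by characters submitted, most first."""
        return self.chars_by_user.most_common()


site = TranscriptionSite()
site.upload_page("p1", "scans/p1.jpg")
site.transcribe("p1", "alice", "The greatest happiness")
site.transcribe("p1", "bob", "The greatest happiness of the greatest number")
print(site.pages["p1"].current_text())
print(site.leaderboard())
```

Keeping every revision rather than overwriting the text is what lets an administrator review or roll back a contribution, which is the main reason for building on a CMS that already has revision history.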

You've probably got your own ideas for an online transcription framework, but I'll be happy to provide more information if you're interested: