Wednesday 16 May 2012

When was the last time you asked how your published research was doing?

A month or so ago, I posted about whether blogging and tweeting about academic research papers was "worth it". Whilst writing up my thoughts, the one thing that I found really problematic was the following:
I also know nothing about how many times my other papers are downloaded from the websites of published journals, or consulted in print in the Library. The latter, no-one can really say about - but the former? It seems strange to me that we write articles (without being paid) and we get them published by people who make a profit on them, then we don't even know - usually - how many downloads they are getting from the journals themselves.
That's true enough, I thought. But whose fault is it that I don't know about access statistics for journals I have published in? Heck, have I ever asked for the access statistics for how many times my papers have been downloaded from the journals they are published in? Has anyone?

So, Reader, I asked for some facts and figures, regarding the circulation of journals, and the download statistics of my papers.

I have to say that the journals were really very helpful, and forthcoming, if surprised:
"I imagine the publishers would be happy to tell an author the cumulative downloads for their papers... So far as I know, you are the first author ever to ask... certainly the first to ask me." said David Bawden, Editor of the Journal of Documentation. Jonas Söderholm, Editor of HumanIT, highlighted some of the issues journals will face if people start asking this kind of question, saying:
"A reasonable request and we would gladly assist you. Unfortunately we do not have direct access to server logs as our web site is hosted as part of the larger University of Borås web. We will take your request as a good excuse to check into the matter though, and also review our general policy on log data."
Most journals got back to me by return of email, telling me immediately what they knew. They were very aware of the limitations of their reporting mechanisms (for example, whether or not the figures excluded robot activity, or the fact that how long a user stays on the website is not known, so accidental click-throughs cannot be identified), and they explained such caveats in detail. Emerald, the publishers of JDoc and Aslib Proceedings, were not comfortable giving me access to wider statistics about their general readership numbers, given this could be commercially sensitive information, which is understandable. They were very happy to give me the statistics relating to my own papers, though.

The only journal not to get back to me with figures was LLC, published by Oxford University Press (the editor replied to say he was not sure he had access to these statistics, but would ask). This is ironic, given I'm on the editorial board. I'll press further, and take it to our summer steering-group meeting.

I suspect that the actual statistics involved are only really very interesting to myself. I had originally planned to make comparisons with the number of downloads from UCL Discovery (Open Access (OA) is better, folks! etc), but I think the picture is foggier than that. What this exercise does do is highlight the type of information that, as authors, we don't normally hear about, which can be actually quite interesting for us, as well as stressing the complex relationship between OA and paywalled publications. Here are some details:

  • One of my papers published in JDoc (Ross, C and Terras, M and Warwick, C and Welsh, A (2011) Enabled backchannel: conference Twitter use by digital humanists. J DOC, 67 (2) 214 - 237) was downloaded 804 times from the JDoc website during 2011, and was number 16 in the download popularity list that year. The total number of paper downloads from JDoc as a whole during that year was 123,228. Isn't that interesting to know? I have a top 20 paper in a really good journal in my discipline! Who knew? It has now been downloaded 1114 times from their website. In comparison, there have been 531 total downloads of that paper from UCL Discovery in the past 6 months. But the time frame for the OA copy from Discovery isn't the same, so comparing is problematic - and there are more downloads from the subscription journal than from our OA repository. Still, it shows a healthy number of downloads, so I'm happy with that.
  • The Art Libraries Journal (only available in print, not online) was quick to tell me that the journal is distributed to 550 members: 200 going abroad to Libraries/Institutions, 150 sent to UK Personal members, and 200 going to UK Libraries/Institutions. My paper published there (Terras, M (2010) Should we just send a copy? Digitisation, Use and Usefulness. Art Libraries Journal, 35 (1)) has had 205 downloads in the last six months from UCL Discovery, so I perceive that as a really good additional advert for OA: the print circulation is fairly limited, but the OA copy is available to all who want it.
  • My paper in the International Journal of Digital Curation - itself an OA journal - (Gooding, P and Terras, M (2008) Grand Theft Archive: a quantitative analysis of the current state of computer game preservation. The International Journal of Digital Curation, 3 (2)) was downloaded 903 times in 2009 out of the 53,261 times the full text of a paper was accessed. (The average was 476, with standard deviation 307.) In 2010 the paper accounted for 919 out of the 120,126 times the full text of a paper was accessed. (The average was 938, with standard deviation 1045.) That compares to only 85 downloads from the UCL repository, but hey, it's freely available online anyway, without having to resort to an OA copy in an institutional repository. It might be worth drawing from this that copies of papers in institutional archives are only really used when the paper isn't available anywhere else, but you would hope that would be obvious, no?
  • The journal Internet Archaeology has an online page with their download statistics readily available (how I wish all journals would do this). The journal gets around 6200 page requests per day. But since article size varies widely, with some split into 100s of separate HTML pages, it is difficult to know how meaningful this is. I was sent a spreadsheet of the stats for my paper published there (Terras, M (1999) A Virtual Tomb for Kelvingrove: Virtual Reality, Archaeology and Education. Internet Archaeology (7)), which suggests that there have been 2083 downloads of the PDF version of the paper from behind the paywall since 2001 (but some may be missing due to the way the reporting mechanism is set up), with none in the past year. That compares to 276 downloads of this paper from UCL Discovery in the past six months, so many more from our institutional repository when comparing like-for-like periods. The HTML version of the table of contents has been consulted 16,282 times since 2001 (this is freely available to all comers) but there have been 67,525 views of all files in the directory since then - and since the paper is made up of hundreds of individual files, it's difficult to ascertain readership. Judith Winters, the Editor of Internet Archaeology, notes "It is curious that when the journal went Open Access for about 2 weeks towards the end of last year, the counts did increase but not dramatically so" - so when a non-OA journal throws open its doors for a limited time (IA did this to mark open access week last year) it's not like access figures go wild. That's really interesting, in itself.
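
For anyone who wants to put their own figures in context, here is a minimal sketch of the arithmetic behind the numbers above: a paper's share of a journal's total downloads, and how far it sits from the journal's average in standard deviations. The function names `share` and `z_score` are my own illustrative choices, not anything the journals supplied, and the only inputs are the figures quoted in the list.

```python
def share(paper_downloads, journal_total):
    """Fraction of a journal's total downloads accounted for by one paper."""
    return paper_downloads / journal_total


def z_score(paper_downloads, mean, std_dev):
    """Distance of a paper's downloads from the journal average, in standard deviations."""
    return (paper_downloads - mean) / std_dev


# Figures quoted in the list above (hypothetical variable names).
jdoc_share_2011 = share(804, 123_228)   # JDoc paper, 2011
ijdc_share_2009 = share(903, 53_261)    # IJDC paper, 2009
ijdc_share_2010 = share(919, 120_126)   # IJDC paper, 2010
ijdc_z_2009 = z_score(903, 476, 307)    # IJDC paper vs. journal average, 2009
ijdc_z_2010 = z_score(919, 938, 1045)   # IJDC paper vs. journal average, 2010

print(f"JDoc 2011 share of downloads: {jdoc_share_2011:.2%}")
print(f"IJDC 2009 share of accesses:  {ijdc_share_2009:.2%}")
print(f"IJDC 2010 share of accesses:  {ijdc_share_2010:.2%}")
print(f"IJDC 2009 z-score: {ijdc_z_2009:+.2f}")
print(f"IJDC 2010 z-score: {ijdc_z_2010:+.2f}")
```

On these figures the IJDC paper sat about 1.4 standard deviations above the journal average in 2009 and almost exactly on the average in 2010. None of this gets around the caveats the editors raised (robots, accidental click-throughs, mismatched time frames), so treat it as a rough yardstick only.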
If you are still reading, then thanks. This stuff gets pretty turgid. But it's been fascinating, for me, to see the (mostly positive) reactions publishers have to being approached about this - and surprising that more people haven't actually asked publishers about these statistics. We are giving away our scholarship to publishers, in most cases: shouldn't we get to know how it fares in the wide, wide world? As citation counts, and h-indexes, and "impact" become increasingly important to external funding councils and internal promotion procedures within universities, why would journal publishers not make this information available to authors? And why don't they do it more routinely?

Will you need this type of information for the next grant proposal or internal promotion you chase? Why would you not be interested in how your research flies? But journal publishers will only start providing authors with this kind of information routinely if enough scholars start to ask about it, and it becomes part of the mechanics of publishing research - particularly when publishing research online.

So if you have published in a print journal which has an online presence, or in an online journal, drop them an email to ask politely how your downloads are going*. Do it. Do it now. Ask them. Ask them!

*Perhaps someone online can provide some input as to whether such a request comes under the rights of individuals in the Data Protection Act in the UK. If you are a named author on a journal article, do access statistics about that journal paper count as personal information? Just a thought...


Arno Bosse said...

In the three years I edited the Journal of the Chicago Colloquium on Digital Humanities and Computer Science I don't think I was ever asked by an author for readership stats. It's a pity because JDHCS is based on Open Journal Systems and so basic reporting of the kind you've listed here is a snap.

Here is a Dropbox hosted CSV file with the current readership as well as an Excel 2008 .xlsx version of the same.

Our stats are roughly comparable. Average number of PDF downloads is 1642. Lowest is 837. Highest is 2237. The older articles (from 2009) generally see a little more readership of course but on the other hand it's clear from the numbers that our second issue in 2010 was actually the most well read overall.

Adam said...

I use SelectedWorks to gather my research, advertise what I have published, keep it open access when possible--and they send me monthly reports on downloads, now broken out by location. I find it a great, if incomplete, yardstick of readership. (See it as scholarship tab on )

Peter Stokes said...

An interesting point which I think needs to be taken seriously. It's especially so in these days of impact, but the figures can also be useful when people question the value of our subject. We have analytics for the DigiPal blog articles and have cited them in reports to funding bodies; they also got a mention in the THES as 'evidence that even highly obscure topics generate more interest than one would ever expect'. We have analytics for Digital Medievalist, but we've never been asked for them by authors to my knowledge. There are other ways of estimating this, too. Whatever else you think about it, one interesting use of is that it gives statistics for hits on your summaries of articles and talks. That may not translate to downloads of the articles, of course, but it does probably give some idea (and has surprising results in my case).

Alan G Pike said...

Melissa, thanks for keeping this conversation going. I think that it is extremely important for grad students on the market and faculty facing T&P decisions to keep track of their publications in this way. I work for the OA journal Southern Spaces and, although our articles cannot be downloaded as .pdfs, we give our authors annual reader reports noting how many hits their articles received, as well as the number of unique visitors culled from our Google Analytics data. Although your experience seems to suggest that this is not something that other journals are in the process of implementing, I think that sharing this information with authors is an important contribution to the scholarly community.