Thursday, 9 May 2013

What People Study When They Study Twitter

So, making good on my Open Access promises: my latest co-authored paper, which will be out in print sometime this year in the Journal of Documentation, is hot off the presses, just as it goes up in preprint behind a paywall on the journal pages! It is a jointly authored paper with Shirley Williams, from the University of Reading, and Claire Warwick, from UCLDIS. And here it is:

Williams, S and Terras, M and Warwick, C (2013) "What people study when they study Twitter: Classifying Twitter related academic papers". Journal of Documentation, 69 (3). Free PDF Download From UCL repository.

In this paper, we identify the 1161 academic papers that were published about Twitter between 2007 (when the first papers on Twitter appeared) and the close of 2011. We then analyse method, subject, and approach, to show what people are doing (or have been publishing!) on the use of Twitter in academic studies, providing a framework within which researchers studying the development and use of Twitter as a source of data will be able to position their work. Oh, we also provide the list of the papers we found, so you can have a look-see yourself.


And the story behind this one? Shirley was introduced to Claire and me by the late (and much missed) Prof. Mark Baker at Reading, when we undertook the Linksphere project. Now, I've written about Linksphere elsewhere - it was an ambitious project which really didn't take off due to a variety of factors - but the good things to come out of it were our RA, Claire Ross, and meeting Shirley. We published a paper on the use of twitter by academics at conferences when the Linksphere project was going. A year or so after the project finished, Shirley was granted a research sabbatical, and asked Claire and me if we would be interested in carrying on that work with her. Kicking around a few ideas, we wondered whether it would be possible to round up all the published work on Twitter - what are people using it for? And then to analyse it, to see if we could classify how people are using it, what the datasets are, what the methods are, and what the domains are. Wouldn't it be nice to have a bibliography on the use of twitter in research papers? And so away Shirley went, working with Claire and me, and building up this nice framework in which we can look at twitter based research.

The paper was accepted into the Journal of Documentation last summer, and this month went up in preprint at the Journal of Documentation website, and is now out in Open Access from UCL's research repository, before it even hits the Library shelves. Which is how it should be, non?

Monday, 15 April 2013

Changes at UCLDH

We’re going into our fourth year at UCL Centre for Digital Humanities, and there have been quite a few changes along the way. Since the centre was founded under the direction of Professor Claire Warwick, Claire has also taken on Head of Department in UCL Department of Information Studies, as well as Vice Dean of Research for the Arts and Humanities faculty. Over the past year, Claire and I have been co-directing the centre. I’m pleased, proud, and a little bit nervous to say that from now on I’ll be taking on full operational duties as Director of UCLDH, still working closely with Claire, who remains committed to Digital Humanities as a subject, and UCLDH in particular. I’d like to take this opportunity to thank Claire for her continued input into UCLDH – and I look forward to working with her in this slightly different capacity over the next few years, as well as the rest of the team at UCLDH, and putting my efforts into building up UCLDH even further after its great start.

Onwards! 

Monday, 8 April 2013

How Many Digital Humanists does it take to change a lightbulb?


Q. How many Digital Humanists does it take to change a lightbulb?
A. Two: The first to change the lightbulb using the available, existing technology. The second to say “You’re not DH unless you make the lightbulb yourself!”.

Q. How many Digital Humanists does it take to change a lightbulb?
A. Yay! Let's Crowdsource!

Q. How many Digital Humanists does it take to change a lightbulb?
A. One. But they have to have a PhD in Byzantine Sigillography AND at least 4 years' experience of XSLT before you are going to let them near that bad boy.

Q. How many Digital Humanists does it take to change a lightbulb?
A. As many as you like, but no REAL humanities academic is going to trust that lightsource.

Q. How many Digital Humanists does it take to change a lightbulb?
A. It depends. Does the lightbulb count as a “scholarly primitive”?

Q. How many Digital Humanists does it take to change a lightbulb?
A. One. But only if they are allowed to include “multimedia experience” in their tenure portfolio.

Q. How many Digital Humanists does it take to change a lightbulb?
A. These are such IN JOKES only the COOL KIDS on twitter will get them. Pout.

(I originally came up with these jokes on the DayofDH2011 - reposting them here on the DayofDH2013 to have a copy on my own blog.)






This is not a blog post

(This is the blog post I've put up on the Day of DH site - where Digital Humanists all over the world are telling people what they are up to on a specific day. Me, I'm working hard in a different way, on holiday). 

This Day of DH sees me not doing much DH at all… but yet. I’m on holiday, on the second week of the school Easter break, with my three young boys, in Scotland, staying with family. I wasn’t going to blog anything at all – but then, hey, this is part of my life as a DHer too, right? It’s not just about the work, it’s about what you do elsewhere? But can you actually switch off from DH, from work, when you are away? At least, it seems not, if you are an academic.

So what did we get up to today? Not a bad night, up only three times (two night terrors and one sea shanty) and then woken early to a boy shouting “Mummy! Robot! Monkey!” repeatedly. A slow walk to the shop for papers and sweeties, some playing with watering cans, a trip to a garden centre to meet an old friend and her kid for coffee, a visit from my Aunt. The endless cleaning and tidying and management of stuff which comes with having three small people, roll calls to ensure people have their shoes and their stuffed animals from one stop to the next. Highlights included driving alongside a wind farm for a mile or so and the boys shouting “BIG. WINDMILLS! BIG. WINDMILLS!” – lowlights include turning my back for two minutes and seeing Twin Two up 8 feet in the air on something he shouldn’t be climbing on in the garden centre – tuts from other parents in the vicinity very forthcoming.

It’s not like I haven’t thought about work. I find it very difficult to switch off when on holiday – it takes me about a week to stop sending myself emails reminding me to do X, Y, and Z when I get back. I’m on the twitters – I find hanging out on twitter gets me through the day when looking after the three weans all day and all night, especially if they are up through the night – and today of all days, it was fascinating to see twitter erupt and turn and shift around a news item. The asynchronous nature suits having a quick shufty at quiet moments – seconds – in the parenting day. But I haven’t been on work email for a week or so, and won’t be for another week or so. Usually I’m glued to it, answering emails at all times of the day, but it’s important for me to step back from it a few times a year. I popped on there a couple of days ago to action something time-limited and laughed at all the emails that had come in setting me deadlines I hadn’t agreed to that I will miss in my absence. Meh – I’m usually quick on the mark but this week? I’m teaching my twins how to do forward rolls instead.

As I do more and more managerial work in my role at UCL Centre for Digital Humanities I wonder really how much of my interaction with computing is through email. (Most of it now). I’m a professional email answerer, really. Been a while since I implemented something myself. I wonder, amidst all the arguments about should DHers code, etc, how the whole “can code, but manages coders” fits in. But this week, I’m not even answering email. Oh no. I’m on holiday. I’m away. And goodness, it is good to step away from email, that harsh, thankless taskmistress. But if I’m not on email…. am I a DHer any more?

But it’s not like I haven’t thought about work. It’s the blessing/curse of academia: obsessive compulsive behavior is rewarded, and it’s hard to switch off the obsession. So in the past week or so I’ve been ruminating on next steps, projects I’m undertaking, research I should do next, blog posts that are brewing, in between having cups of tea at my grandparents or visiting my cousins or dandling poorly boys at 3am. Everything you can do when you are not on email. The nice stuff online and offline, without the work email.

It’s not that I haven’t thought about work. Heck, I even blogged for the Day of DH. An example of the blended life style us DHers live: how hard it is to get away, even when you are away, how connected we all are, how it’s all a balancing act.

So I’m not sure that this is a blog post. I’m not sure that this is a holiday.  I don’t know what it is… must be DH, then.

Thursday, 28 March 2013

What Price a Hashtag? The cost of #digitalhumanities


The academic community I’m most heavily involved in – Digital Humanities – is fairly invested in twitter. At all times of the day there are major figures, students, and newbies in the field on there, just hanging out, debating topics, forwarding links to events, job postings, interesting research and cool things they have stumbled upon. People have studied this – graphing and charting the discussions, especially around the DH conference, and heck, even I have co-authored a paper on the subject.

I’m currently working on a book/project called Defining Digital Humanities and I thought, wouldn’t it be fun to get all – and I mean all – the tweets that contain the hashtag #Digitalhumanities – what fun could be had charting the growth of the discipline, the geolocation of tweets, the networks that exist, the sentiments surrounding it – etc etc. Now, hindsight is a grand thing – I should have thought to start scraping these back in 2006 – but surely it must be possible to get access to this for research? So I asked.

The first approach was to Gnip – who have “full historical access to the twitter firehose available exclusively”. They were really very helpful, and we got into a conversation about my needs, their licensing, and – of course – costs. The upshot is that if you want a hashtag, you can get it for a price, with the text delivered in JSON format. I was quoted between $15,000 and $25,000 for the full historical set (depending on the exact volume of the data, they are now looking into it to give me the final figure – neither I nor they yet know how many tweets there are containing this hashtag).
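To give a flavour of the sort of thing I'd want to do once the JSON arrived, here is a minimal sketch that counts tweets mentioning the hashtag per month, which is the first step towards charting growth over time. It assumes one JSON object per line, and the field names `body` (the tweet text) and `postedTime` (an ISO 8601 timestamp) are my assumptions for illustration: I haven't seen the delivered data yet, so the real layout may well differ. The sample tweets are made up.

```python
import json
from collections import Counter


def monthly_counts(json_lines, hashtag="#digitalhumanities"):
    """Count tweets mentioning `hashtag` per month, keyed as 'YYYY-MM'."""
    counts = Counter()
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        tweet = json.loads(line)
        # "body" and "postedTime" are assumed field names, not confirmed.
        if hashtag.lower() in tweet.get("body", "").lower():
            # An ISO 8601 timestamp starts 'YYYY-MM', so slice the first 7 chars.
            counts[tweet["postedTime"][:7]] += 1
    return counts


# Made-up sample records, purely for illustration.
sample = [
    '{"postedTime": "2013-03-28T10:15:00.000Z", "body": "Costing up #digitalhumanities data"}',
    '{"postedTime": "2013-03-29T08:00:00.000Z", "body": "More #DigitalHumanities chat"}',
    '{"postedTime": "2012-10-30T12:00:00.000Z", "body": "#digitalhumanities and metadata"}',
    '{"postedTime": "2013-03-30T09:00:00.000Z", "body": "Nothing to see here"}',
]

for month, n in sorted(monthly_counts(sample).items()):
    print(month, n)
```

The match is case-insensitive because people type the hashtag every which way; anything cleverer (geolocation, networks, sentiment) would build on the same loop.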

The second place I asked was Datasift – “the leading platform for building applications with insights derived from the most popular social networks and news sources”. They do have access to the historical twitter firehose, but they don’t do one off searches, and licensing will start at $3000 per month to get access to it (on a yearly contract). They will be launching a pay as you go service at some point, they tell me. By the way, you can get $10 worth of free credit for processing if you sign up and play around with some current searches: I set up a search for #digitalhumanities and I had run out of credit within a few hours. (I find the user interface very obfuscating – I’m still wrangling with it to see what that data actually is!).

Now, these costs are very little compared to the costs to access the full firehose and let’s face it – a free service like twitter has to make its money somewhere. These were not vexatious enquiries: I’d really like to do this study. But now I have to find $25k down the back of the sofa to get access to this data (and incidentally, if I do, I won’t be allowed to quote it, only to show the stats that emerge from the analysis). $25k is a fair whack of money in academia-land. It will also take around 6 months (at least) to write it into a grant proposal to raise the money – and how to persuade academic funders that buying this dataset is good use of their money? Frankly, I’m not sure that will fly in the arts and humanities, where complete grant costings can come under £100k for a one year project.

Thinking caps are now on to see how we can get funding put together to get access to the data of the community I – goddamit – helped (in some small way) to create. I love twitter with a passion and it continues to inform and aid my teaching and research. But when we invest so much in a free service, we are selling ourselves. It’s interesting to see how much #digitalhumanities is “worth” to others. Anyone got a free $25k?

Tuesday, 30 October 2012

What's in a name? Academic Identity in the metadata age, or, I didn't see #tarotgate coming


At the end of last week I was pulling together some internal appraisal documentation - the kind of thing where you say "oh look how many people cited my work over the past year". Wandering over to my Google scholar profile to check some citation counts I noted something weird. Alongside my usual digital humanities-ey, digitisation-ey, digital classicist-ey journal papers and book chapters were some strange papers that I definitely had not written. Things like Human Experience and Tarot Symbolism, (link still working to my Google scholar profile at time of writing) or Tarot and Projective Hypothesis.

Now, I'm a committed atheist, which means I don't care for the occult either. And while I try to respect others' research choices (it's a bit like feminism - I may not like what you are doing with your life, but I respect your right to choose what you do with it) this is not something that, professionally, I would choose to be associated with. And it's extremely strange to see your name academically associated with something you don't want to be associated with - especially when academic identity is everything in this game.

Perhaps there is another Melissa M. Terras? I first thought. I'm very lucky that there aren't too many Terrases about - I've never really had to deal with name disambiguation (I know academics who have other colleagues with the same name as them in the same department) so I'm a bit spoiled on that front in real life. Melissa is a really rare name in Scotland (although not other parts of the world) and so I had never met another one until I was 15. I sometimes get confused for Melody Terras, who is in Psychology, and automated algorithms especially like to assign me her works (such as Google scholar, or the algorithm in our open access repository).  But no, the works clearly showed that the Melissa M. Terras was in the Department of Information Studies at UCL. There's only one of us there, and that's me.

A bit of googling told me that the author of the book in which all of these chapters were published was Inna Semetsky. A few seconds more of googling took me to her personal website, which had a big fat CALL ME NOW skype button on it (I'm not linking to it here, as I don't want to encourage anyone to call her. If you want to seek her out, you will have to do so yourself). So I CALLED HER NOW, not really expecting anyone to pick up. She picked up on video chat after a few rings, although it was clearly in the middle of the night wherever she was (which I wasn't to know). She knew immediately who I was: it was clear that the book had been up with the wrong authorship attributed for some time, which I find strange: if you had written a book, and the "Internet" decided it was written by someone else, would you not fight to get it righted? A heated exchange followed. I don't want to say too much about Inna Semetsky - she is entitled to her own privacy and her own research space. Let's just say we didn't exactly hit it off, and that heated exchange continued over email. (Everyone knows that the one way to anger someone from Scotland is to call them English, right?)

By now it had become clear that there was no real malice in this: but I suspected metadata fail. I had previously published a chapter in a book which was published by Sense Publishers - who published Semetsky's book Re-Symbolization of the Self, Human Development and Tarot Hermeneutic. It seemed to me that somewhere in the ingestion process into the Springer system, my name had gotten into the author field - field slippage in a database? that's fairly common? S next to T in the alphabet, so perhaps field slippage in Sense Publishers' author database? - which meant I had been erroneously associated with this work. So now to get it right.

I contacted both Springer and the Press. Sense Publishers were very helpful, but ducked for cover when I hit them with the cease and desist. Springer said they would take it down. But they didn't. I complained again, and lined up the UCL lawyers to begin legal proceedings. Springer said they would take down the content ("We induced the deletion of the content from our platform, which will be done as soon as possible" - but hey, their English is better than my German, so I can't really mock them too badly when they later said they would delete the "hole" book). It took 6 days of constant emailing to complain and escalating legal threats before they eventually assigned the correct authorship information to the publication, which really must've been a 5 minute job. I hate to think how they would handle a request from someone not nearly as... pushy as me.

I repeatedly asked for an explanation from the  press and Springer - explaining a professional interest (and thinking of my dear, neglected blog, and the folks who were all pitching in on twitter by this time, following #tarotgate and the toings and froings from Springer and book author and me). I have received no explanation. You take my name, you pin it to something else, and you expect me not to want to find out why? When I work in an information studies department? No explanation has been given, and really - without malice! - I would like to know the assignation structure of author to material, given it seems so very fragile.

And so I am no longer associated with Tarot in publications databases. Except at time of writing, I still am. Various places crawl and syndicate authorship content online - Google Scholar is still showing me as author of various pieces of Tarot scholarship, and now it's going to be up to me to chase down mentions of my name associated with something I never chose to be associated with, simply because of an automated error, replicated across time and space and electronic repository, in a professional space becoming obsessed with citation counts and authorship and If You Liked This You May Like That, all churned out by thousands of servers and databases and... who cares if a database field slips in all this and an academic name is assigned to the wrong thing? It's the future! It's how scholarship works these days!

I have up til now pretty much ignored the discussions and systems about how to look after your scholarly identity - things like ORCID - why do I need to register! I have an unusual name! Everyone knows that it's me who publishes on the digitisation-ey, digital humanities-ey stuff! Except the machines, the machines they don't care. We're looking at a future where we don't just have to look after the stuff we have published, we now have to weed out the things that we haven't. We have to be vigilant that the joiney-uppey automated systems don't replicate authorship errors uncontrollably. How rare or commonplace is this? I have no way to tell. But when the electronic record can be so easily compromised, how can we trust digital-only publications, without a canonical physical artefact to check?

It takes a long time to build up a scholarly identity. One slipped database field may have permanently associated me with an area I, quite frankly, don't respect. This blog post will go some way to explaining how that happened, so serves a dual purpose - explanation of how, and reference for why it's not me. When was the last time you checked what the Internet said you wrote? Will you ever be able to rectify it, should a mistake be made? Will I? I didn't see that one coming. Maybe I should take up Tarot.

Update: 31/10/12, Response from the Publisher!!!!!! I will paste the email below.

My colleague Georg Kaimann alerted me about the serious mistake in the author information of several chapters in Dr. Semetsky’s book. Please do accept our apologies, also on behalf of our data conversion partner, and be assured that the problem is taken seriously.

High quality metadata, especially the correctness of titles, author names, and affiliations are of utmost importance for us. We have therefore taken the matter up with the production manager at our supplier. It turned out that errors in two places led to the incorrect author information which was published online: Firstly, the operator who created the xml metadata mistakenly used an already filled-out sample template for updating the chapter metadata; and secondly, quality control did not check the metadata in all chapters because they assumed that they were created in the approved way.

Although this is a very rare error (I don’t remember having seen such a case before), we of course want to rule out that it can happen again. At the vendors end, the technical team will work on improving the tool for capturing metadata information to avoid errors related to manual intervention. At Springer’s end, we will investigate if further data checks can be introduced so that errors are caught early in the production process.

Once again, we apologize for this mistake. Be assured that this problem is taken seriously as it affects the relationship between authors and publisher which is of high importance to us.


Best regards

Ilse Wittig
Springer
Production

Manager Quality Assurance

Update: 06/05/13. Further contact from the publishers (Sense Publishers, this time, not Springer). See our conversation below. I would note at this stage that they have indeed kept their promises, the catalogue now reflects the true authorship of the pieces, and even google scholar finally no longer lists me as author of them. It doesn't mean that this didn't happen, though... hence the emails below. Oh, and it's lovely weather here as I type this, hence my final comment.

Dear Melissa:

Now this matter had been solved quite a while ago, could you please delete it from your blog?

http://blogs.lse.ac.uk/impactofsocialsciences/2012/11/26/terras-identity-metadata-age/

Many thanks,
Peter de Liefde
SENSE PUBLISHERS

-----Original message-----
From: Inna Semetsky
Sent: Sunday 28 April 2013 21:46
To: Peter de Liefde
Subject: Problem again

Dear Peter
I sincerely hope that you will contact Melissa to request her to remove this insulting post from the Internet. As you recall she promised to do so after she apologized for her insinuations. Still the post exists. This is degrading to me as the author and to Sense as publishers. Please take it in your hands as it was a case between Sense and Springer -- yet my name is involved. I hope that upon your request Melissa will delete the post which goes viral here http://blogs.lse.ac.uk/impactofsocialsciences/2012/11/26/terras-identity-metadata-age/
I consider this matter a subject of legal action -- my academic status is tarnished because of Springer -- and Springer got all data from Sense.

Please contact Melissa!
Thank you
Inna


Sent from my iPhone

To which I replied:

Thanks for your email!

- it may have been resolved, but it still happened. My blog post will be staying up, both here and on my own personal blog. There is nothing factually incorrect, and I haven't been nasty or untruthful.

best wishes,

Melissa

To which the publisher replied:

Good for you.

Thank God most of the people I deal with are a lot nicer.

All best wishes,
Peter de Liefde
SENSE PUBLISHERS

To which I replied:

Hmmm. I'm too nice to get into a flame war with you over this.

Have a good day, and enjoy the sun.

Melissa

Saturday, 22 September 2012

Showing the Arts and Humanities Matter


Greetings from Dublin airport. It's been a busy week - on Tuesday I hosted the first 4Humanities conference, at UCL, then jetted off to Galway, Ireland where on Thursday I keynoted at the Digital Arts and Humanities PhD Programme annual conference.

The conference at UCL, entitled Showing the Arts and Humanities Matter, gathered together various initiatives that are actively promoting the arts and humanities, to allow discussion of the best way forward to ensure that the benefits and contribution that the Arts and Humanities make to society are recognised. It was a fantastic day, and I learnt a lot - I'm fairly new to this area. Ernesto Priego live blogged and tweeted the event, and there is a storify of the tweets for those who want to catch up on the discussion.

One thing we decided to do was have a practice based artist, Dr Lucy Lyons, as a conference artist in residence, sketching and note taking, using a different sort of technology (pen, pencil, paper) than the ones we usually use, in what she calls "a frenetic, haptic method of note taking and engaging with the speakers". Lucy created a wonderful set of notes and drawings of the day that capture the flavour of the event.

It is interesting to reflect that our discussions seldom wandered into talking about the practice led arts - and that the immediate reaction of many speakers who saw Lucy's drawings was "great! a new avatar for me on twitter!" (how we are all addicted) rather than a discussion of what integrating this process into the conference setting would show or tell us. I'm still processing that, myself - but I loved having a conference artist in residence, and hope to feature this again at future events.

Time for me to check in, I'll tell you about #dahphdie at another time!