Snowflake Indicators | Postmodern Research Evaluation | Part 5 of ?

No two snowflakes are alike. No two people are the same.

                                                                                                   — Horshack

[Image: Snowflakes by Juliancolton2 on flickr]

Earlier posts in this series attempted to lay out the ways in which Snowball Metrics present as a totalizing grand narrative of research evaluation. Along with attempting to establish a “recipe” that anyone can follow — or that everyone must follow? — in order to evaluate research, this grand narrative appeals to the fact that it is based on a consensus in order to indicate that it is actually fair.

The contrast is between ‘us’ deciding on such a recipe ourselves or having such a recipe imposed on ‘us’ from the outside. ‘We’ decided on the Snowball Metrics recipe based on a consultative method. Everything is on the up and up. Something similar seems to be in the works regarding the use of altmetrics. Personally, I have my doubts about the advisability of standardizing altmetrics.

— But what’s the alternative to using a consultative method to arrive at agreed upon standards for measuring research impact? I mean, it’s either that, or anarchy, or imposition from outside — right?! We don’t want to have standards imposed on us, and we can’t allow anarchy, so ….

Yes, yes, QED. I get it — really, I do. And I don’t have a problem with getting together to talk about things. But must that conversation be methodized? And do we have to reach a consensus?

— Without consensus, it’ll be anarchy!

I don’t think so. I think there’s another alternative we’re not considering. And no, it’s not imposition of standards on us from the ‘outside’ that I’m advocating, either. I think there’s a fourth alternative.

SNOWFLAKE INDICATORS

In contrast to Snowball Metrics, Snowflake Indicators are a delicate combination of science and art (as is cooking, for that matter — something that ought not necessarily involve following a recipe, either! Just a hint for some of those chefs in The Scholarly Kitchen, which sometimes has a tendency to resemble America’s Test Kitchen — a show I watch, along with others, but not so I can simply follow the recipes.). Snowflake Indicators also respect individuality. The point is not to mash the snowflakes together — following the 6-step recipe, of course — to form the perfect snowball. Instead, the point is to let the individual researcher appear as such. In this sense, Snowflake Indicators provide answers to the question of researcher identity. ORCID gets this point, I think.

To say that Snowflake Indicators answer the question of researcher identity is not to suggest that researchers ought to be seen as isolated individuals, however. Who we are is revealed in communication with each other. I really like that Andy Miah’s CV includes a section that lists places in which his work is cited as “an indication of my peer community.” This would count as a Snowflake Indicator.

Altmetrics might also do the trick, depending on how they’re used. Personally, I find it useful to see who is paying attention to what I write or say. The sort of information provided by Altmetric.com at the article level is great. It gives some indication of the buzz surrounding an article, and provides another sort of indicator of one’s peer community. That helps an individual researcher learn more about her audience — something that helps communication, and thus helps a researcher establish her identity. Being able to use ImpactStory.org to craft a narrative of one’s impact — and it’s especially useful not to be tied down to a DOI sometimes — is also incredibly revealing. Used by an individual researcher to craft a narrative of her research, altmetrics also count as Snowflake Indicators.

So, what distinguishes a Snowflake Indicator from a Snowball Metric? It’s tempting to say that it’s the level of measurement. Snowball Metrics are intended for evaluation at a department or university-wide level, or perhaps even at a higher level of aggregation, rather than for the evaluation of individual researchers. Snowflake Indicators, at least in the way I’ve described them above, seem to be aimed at the level of the individual researcher, or even at individual articles. I think there’s something to that, though I also think it might be possible to aggregate Snowflake Indicators in ways that respect idiosyncrasies but that would still allow for meaningful evaluation (more on that in a future post — but for a hint, contrast this advice on making snowballs, where humor and fun make a real difference, with the 6-step process linked above).

But I think that difference in scale misses the really important difference. Where Snowball Metrics aim to make us all comparable, Snowflake Indicators aim to point out the ways in which we are unique — or at least special. Research evaluation, in part, should be about making researchers aware of their own impacts. Research evaluation shouldn’t be punitive, it should be instructive — or at least an opportunity to learn. Research evaluation shouldn’t so much seek to steer research as it should empower researchers to drive their research along the road to impact. Although everyone likes big changes (as long as they’re positive), local impacts should be valued as world-changing, too. Diversity of approaches should also be valued. Any approach to research evaluation that insists we all need to do the same thing is way off track, in my opinion.

I apologize to anyone who was expecting a slick account that lays out the recipe for Snowflake Indicators. I’m not trying to establish rules here. Nor am I insisting that anything goes (there are no rules).  If anything, I am engaged in rule-seeking — something as difficult to grasp and hold on to as a snowflake.

Should we develop an alt-H-index? | Postmodern Research Evaluation | 4 of ?

In the last post in this series, I promised to present an alternative to Snowball Metrics — something I playfully referred to as ‘Snowflake Indicators’ in an effort to distinguish what I am proposing from the grand narrative presented by Snowball Metrics. But two recent developments have sparked a related thought that I want to pursue here first.

This morning, a post on the BMJ blog asks the question: Who will be the Google of altmetrics? The suggestion that we should have such an entity comes from Jason Priem, of course. He’s part of the altmetrics avant garde, and I always find what he has to say on the topic provocative. The BMJ blog post is also worth reading to get the lay of the land regarding the leaders of the altmetrics push.

Last Friday, the editors of the LSE Impact of Social Sciences blog contacted me and asked whether they might replace our messy ‘56 indicators of impact’ with a cleaned-up and clarified version. I asked them to add it in, without simply replacing our messy version with their clean version, and they agreed. You can see the updated post here. I’ll come back to this later in more detail. For now, I want to ask a different, though related, question.

COULD WE DEVELOP AN ALT-H-INDEX?

The H-index is meant to be a measure of the productivity and impact of an individual scholar’s research on other researchers, though recently I’ve seen it applied to journals. The original idea is to find the largest number H such that H of a researcher’s publications have each been cited at least H times. Of course, the actual number of one’s H-index will vary based on the citation database one is using. According to Scopus, for instance, my H-index is 4. A quick look at my Researcher ID (which draws on Web of Science data) shows that my H-index there would be 1. Then, if we look at Google Scholar, we see that my H-index is 6. Differences such as these, and the related question of the value of such metrics as the H-index, are the subject of research being performed now by Kelli Barr (one of our excellent UNT/CSID graduate students).

Now, if it’s clear enough how the H-index is generated … well, let’s move on for the moment.

How would an alt-H-index be generated?

There are several alternatives here. But let’s pursue the one that’s most parallel to the way the H-index is generated. So, let’s substitute products for articles and mentions for citations. One’s alt-H-index would then be the largest number P such that P of one’s products each have at least P mentions on things tracked by altmetricians.
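For readers who like to see the arithmetic spelled out, here is a minimal sketch of that calculation in Python. The function name `h_index` and the sample counts are my own illustrative assumptions, not part of any altmetrics tool; the same function works for citations per article (H-index) or mentions per product (alt-H-index).

```python
def h_index(counts):
    """Return the largest h such that h items each have at least h counts.

    `counts` is any iterable of non-negative integers: citations per
    article for an H-index, or mentions per product for an alt-H-index.
    """
    # Rank items from most to least cited/mentioned.
    ranked = sorted(counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank   # the top `rank` items all have at least `rank` counts
        else:
            break      # counts only get smaller from here on
    return h


# Hypothetical mention counts for five products (illustrative only).
example_mentions = [10, 8, 5, 4, 3]
print(h_index(example_mentions))  # prints 4
```

Note that the result can never exceed the number of items counted, which is why tracking only a handful of products caps the index.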

I don’t have time at the moment to calculate my full alt-H-index. But let’s go with some things I have been tracking: my recent correspondence piece in Nature, the most recent LSE Impact of Social Sciences blog post (linked above), and my recently published article in Synthese on “What Is Interdisciplinary Communication?” [Of course, limiting myself to 3 products would mean that my alt-H-index couldn’t go above 3 for the purposes of this illustration.]

According to Impact Story, the correspondence piece in Nature has received 41 mentions (26 tweets, 6 Mendeley readers, and 9 CiteULike bookmarks). The LSE blog post has received 114 mentions (113 tweets and 1 bookmark). And the Synthese paper has received 5 (5 tweets). So, my alt-H-index would be 3, according to Impact Story.

According to Altmetric, the Nature correspondence has received 125 mentions (96 tweets, 9 Facebook posts/shares, 3 Google+ shares, blogged by 11, and 6 CiteULike bookmarks), the LSE Blog post cannot be measured, and the Synthese article has 11 mentions (3 tweets, 3 blogs, 1 Google+, 2 Mendeley, and 2 CiteULike). So, my alt-H-index would be 2, according to Altmetric data.
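The two tallies above can be checked with a short Python sketch. The mention counts are the ones reported in this post; the data layout and the function names are my own assumptions, and I treat the LSE blog post, which Altmetric cannot measure, as having zero mentions there (another assumption; leaving it out entirely gives the same result).

```python
def alt_h_index(mentions_per_product):
    """Largest P such that P products each have at least P mentions."""
    ranked = sorted(mentions_per_product, reverse=True)
    p = 0
    for rank, mentions in enumerate(ranked, start=1):
        if mentions >= rank:
            p = rank
        else:
            break
    return p


def totals(source):
    """Sum each product's breakdown into a single mention count."""
    return {product: sum(parts.values()) for product, parts in source.items()}


# Counts as reported in this post.
impact_story = {
    "Nature correspondence": {"tweets": 26, "Mendeley": 6, "CiteULike": 9},
    "LSE blog post": {"tweets": 113, "bookmarks": 1},
    "Synthese article": {"tweets": 5},
}
altmetric = {
    "Nature correspondence": {"tweets": 96, "Facebook": 9, "Google+": 3,
                              "blogs": 11, "CiteULike": 6},
    "LSE blog post": {},  # assumption: not measurable, so counted as zero
    "Synthese article": {"tweets": 3, "blogs": 3, "Google+": 1,
                         "Mendeley": 2, "CiteULike": 2},
}

for name, source in [("Impact Story", impact_story), ("Altmetric", altmetric)]:
    counts = totals(source)
    print(name, counts, "alt-H-index:", alt_h_index(counts.values()))
```

Run as written, this reports totals of 41, 114, and 5 for Impact Story (alt-H-index 3) and 125, 0, and 11 for Altmetric (alt-H-index 2), matching the figures above.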

Comparing H-index and alt-H-index

So, as I note above, I’ve limited the calculations of my alt-H-index to three products. I have little doubt that my full alt-H-index is considerably higher than my H-index, and that the same would hold for most researchers who are active on social media and who publish in alt-academic venues, such as scholarly blogs (or, if you’re really cool like my colleague Adam Briggle, in Slate), or for fringe academics, such as my colleague Keith Brown, who publishes almost exclusively in non-scholarly venues.

This illustrates a key difference between altmetrics and traditional bibliometrics. Altmetrics are considerably faster than traditional bibliometrics. It takes a long time for one’s H-index to go up. ‘Older’ researchers typically have higher H-indices than ‘younger’ researchers. I suspect that ‘younger’ researchers may well have higher alt-H-indices, since ‘younger’ researchers tend to be more active on social media and more prone to publish in the sorts of alt-academic venues mentioned above.

But there are also some interesting similarities. First, it makes a difference where you get your data. My H-index is 4, 1, or 6, depending on whether we use data from Scopus, Web of Science, or Google Scholar. My incomplete alt-H-index is either 3 or 2, depending on whether we use data from Impact Story or Altmetric. An interesting side note that ties in with the question of the Google of altmetrics is that the reason for the difference in my alt-H-index when using data from Impact Story and Altmetric is that Altmetric requires a DOI. With Impact Story, you can import URLs, which makes it considerably more flexible for certain products. In that respect, at least, Impact Story is more like Google Scholar — it covers more — whereas Altmetric is more like Scopus. That’s a sweeping generalization, but I think it’s basically right, in this one respect.

But these differences raise the more fundamental question, and one that serves as the beginning of a response to the update of my LSE Impact of Social Sciences blog piece:

SHOULD WE DEVELOP AN ALT-H-INDEX?

It’s easy enough to do it. But should we? Asking this question means exploring some of the larger ramifications of metrics in general — the point of my LSE Impact post. If we return to that post now, I think it becomes obvious why I wanted to keep our messy list of indicators alongside the ‘clarified’ list. The LSE-modified list divides our 56 indicators into two lists: one of ’50 indicators of positive impact’ and another of ‘6 more ambiguous indicators of impact’. Note that H-index is included on the ‘indicators of positive impact’ list. That there is a clear boundary between ‘indicators of positive impact’ and ‘more ambiguous indicators of impact’ — or ‘negative metrics’ as the Nature editors suggested — is precisely the sort of thinking our messy list of 56 indicators is meant to undermine.

H-index is ambiguous. It embodies all sorts of value judgments. It’s not a simple matter of working out the formula. The numbers that go into the formula will differ, depending on the data source used (Scopus, Web of Science, or Google Scholar), and these data also depend on value judgments. Metrics tend to be interpreted as objective. But we really need to reexamine what we mean by this. Altmetrics are the same as traditional bibliometrics in this sense — all metrics rest on prior value judgments.

As we note at the beginning of our Nature piece, articles may be cited for ‘positive’ or ‘negative’ reasons. More citations do not always mean a more ‘positive’ reception for one’s research. Similarly, a higher H-index does not always mean that one’s research has been more ‘positively’ received by peers. The simplest thing it means is that one has been at it longer. But even that is not necessarily the case. Similarly, a higher alt-H-index probably means that one has more social media influence — which, we must realize, is ambiguous. It’s not difficult to imagine that quite a few ‘more established’ or more traditional researchers could interpret a higher alt-H-index as indicating a lack of serious scholarly impact.

Here, then, is the bottom line: there are no unambiguously positive indicators of impact!

I will, I promise, propose my Snowflake Indicators framework as soon as possible.

Postmodern Research Evaluation? | 3 of ?

Snowball Metrics present as a totalizing grand narrative. For now, let me simply list some of the ways in which this is so, with little or only brief explanations.

  1. Snowball Metrics are a tool for commensuration, “designed to facilitate cross-institutional benchmarking globally by ensuring that research management information can be compared with confidence” (p. 5; all page references are to this PDF).
  2. Snowball Metrics are based on consensus: “Consensus on the ‘recipes’ for this first set of Snowball Metrics has been reached by a group of UK higher education institutions” (p. 8).
  3. Despite the limited scope of the above consensus, however, Snowball Metrics are intended to be universal in scope, both within the UK (“We expect that they will apply equally well to all UK institutions”) and more widely (“to further support national and global benchmarking”) (p. 8).
  4. Snowball Metrics are presented as a recipe, one to be followed, of course. The word ‘recipe’ occurs 45 times in the 70-page PDF.
  5. Other key words also appear numerous times: agree (including variations, such as ‘agreed’) appears 31 times; method (including variations, such as ‘methods’ or ‘methodology’) appears 22 times; manage (including variations) appears 15 times; impact appears 16 times, 11 times in terms of “Field-Weighted Citation Impact.”
  6. Snowball Metrics are fair and “have tested methodologies that are freely available and can be generated by any organisation” (p. 7).
  7. Snowball Metrics are ‘ours’: they are “defined and agreed by higher education institutions themselves, not imposed by organisations with potentially distinct aims” (p. 7).

To sum up, using their own words:

The approach is to agree a means to measure activities across the entire spectrum of research, at multiple levels of granularity: the Snowball Metrics Framework. (p. 7)

Coming in the next post (4 of ?), I present an alternative ‘framework’ — let’s call it Snowflake Indicators for now.

Postmodern Research Evaluation? | 2 of ?

First, let me say where I am coming from and what I mean by ‘postmodern’. I’m working from Lyotard’s simple “definition” of the term: “incredulity toward metanarratives” (from the introduction to The Postmodern Condition). One interesting question that arises from this definition is the scope of this incredulity — what counts, in other words, as a metanarrative?

Lyotard also distinguishes between what he calls ‘grand’ narratives and ‘little stories’ (les petits récits). Importantly, either a grand narrative or a little story can make the ‘meta’ move, which basically consists in telling a story about stories (where ‘story’ is understood broadly). Put differently, it is not the ‘meta’ toward which the postmodern reacts with incredulity. It is, rather, the totalizing character of the grand narrative that evinces doubt. By its very nature, the claim to have achieved certainty, to have told the whole story, undermines itself — at least from the postmodern perspective.

Of course, the grand narrative is always at pains to seek legitimation from outside itself, to demand recognition, to assert its own justice. Often, this takes the form of appeal to consensus, especially to a consensus of experts and authorities. The irony of the little stories is that they legitimate themselves precisely in not seeking hegemony over all the other stories. Not seeking jurisdiction over the whole, the little stories have the status (a venerable one) of ‘fables’. The little stories are told. We are told to accept the grand narrative.

Post 1 of ?

Postmodern Research Evaluation? | 1 of ?

This will be the first in a series of posts tagged ‘postmodern research evaluation’, a series meant to be critical and normative, expressing my own subjective opinions on the question.

Before I launch into any definitions, take a look at this on ‘Snowball Metrics’. Reading only the first few pages should help orient you to where I am coming from. It’s a place from which I hope to prevent such an approach to metrics from snowballing, and a good place, I think, for a snowball fight.

Read the opening pages of the snowball report. If you cannot see this as totalizing — in a very bad way — then we see things very differently. Still, I hope you read on, my friend. Perhaps I still have a chance to prevent the avalanche.