There is something very appealing about the simplicity of using a single number to indicate the worth of a scientific paper.
But a growing group of scientists, publishers, funders, and research organizations is increasingly opposed to the broad use of the Journal Impact Factor (JIF) as the sole measure used to assess research and researchers.
I don’t have time to be doing this, but it’s important. Making time is a state of mind — as, claims Cameron Neylon, is ‘Open’:
Being open as opposed to making open resources (or making resources open) is about embracing a particular form of humility. For the creator it is about embracing the idea that – despite knowing more about what you have done than any other person – the use and application of your work is something that you cannot predict.
There’s a lot to unpack, even in this short excerpt from Neylon’s post. Whether, for instance, the idea of ‘humility’ is captured by being open to unintended applications of one’s work — surely that’s part, but only part, of being open — deserves further thought. But I really do think Cameron is on to something with the idea that being open entails a sort of humility.
To see how, it’s instructive to read through Robin Osborne’s post on ‘Why open access makes no sense’:
For those who wish to have access, there is an admission cost: they must invest in the education prerequisite to enable them to understand the language used. Current publication practices work to ensure that the entry threshold for understanding my language is as low as possible. Open access will raise that entry threshold. Much more will be downloaded; much less will be understood.
There’s a lot to unpack here, as well. There’s a sort of jiujitsu going on in this excerpt that requires that one is at least familiar with — if it is not one’s characteristic feeling — the feeling that no one will ever understand. What is obvious, however, is Osborne’s arrogance: there is a price to be paid to understand me, and open access will actually raise that price.
In my original talk on “Open Access and Its Enemies” I traced one source of disagreement about open access to different conceptions of freedom. Those with a negative concept of freedom are opposed to any sort of open access mandates, for instance, while those appealing to a positive concept of freedom might accept certain mandates as not necessarily opposed to their freedom. There may be exceptions, of course, but those with a positive concept of freedom tend to accept open access, while those with a negative view of freedom tend to oppose it. The two posts from Neylon and Osborne reveal another aspect of what divides academics on the question of open access — a different sense of self.
For advocates of humility, seeing our selves as individuals interferes with openness. In fact, it is only in contrast to those who view the self as an individual that the appeal to humility makes sense. The plea is that they temper their individualistic tendencies, to humble their individual selves in the service of our corporate self. For advocates of openness, the self is something that really comes about only through interaction with others.
Advocates of elitism acknowledge that the social bond is important. But it is not, in itself, constitutive of the self. On the contrary, the self is what persists independently of others, whether anyone else understands us or not. Moreover, understanding me — qua individual — requires that you — qua individual — discipline yourself, learn something, be educated. Indeed, to become a self in good standing with the elite requires a certain self-abnegation — but only for a time, and only until one can re-assert oneself as an elite individual. Importantly, self-abnegation is a temporary stop on the way to full self-realization.
Self-sacrifice is foreign to both the advocate of humility and the advocate of elitism, I fear. Yet it is only through self-sacrifice that communication is possible. Self-sacrifice doesn’t dissolve the individual self completely into the corporate self. Nor does self-sacrifice recognize temporary self-abnegation on the road to self-assertion as the path to communication. Self-sacrifice takes us beyond both, in that it requires that we admit that content is never what’s communicated. A self with a truly open mindset would have to be able to experience this. Alas, no one will ever understand me!
I noticed two articles on sports concerned with officials’ decisions today, and their juxtaposition raises more questions than either, alone.
The first was an article in The Guardian questioning Hawk-Eye, the technology used in tennis to determine whether balls landed in or out of bounds, and thus usually to determine points won or lost. Anyone who watched the Wimbledon Gentlemen’s Final yesterday witnessed Novak Djokovic yelling “What is going on?!” at the umpire when calls — including challenges that were resolved by appealing to Hawk-Eye — seemed continually to go against him. The article describes the results of a paper that critiques Hawk-Eye’s accuracy and ends with an interesting question:
The paper concludes that Hawk-Eye should be used as an aid to human judgement (their italics), and that, if used with a little more nuance, it could provide added enjoyment of the games involved and public understanding of technology, its uses and its limitations. What do you think? Do you want a simple binary decision in your sports, or would you rather know the accuracy of Hawk-Eye’s output?
I’m old enough to remember a time before Hawk-Eye, and I recall when it was introduced. I remember being skeptical at the time. Why think that a model of what happened would be better than human judgment? And, really, is the issue which is more accurate? Part of sports is overcoming bad calls. Our collective mania for objectivity borders on madness.
To answer the author’s question at the end of the article: neither. I’d prefer if we took such technologies, including instant replay, completely out of all game-time decisions in sporting events. If leagues want to review officials’ decisions later, after the game has been decided, fine. But this rush to judgment, as if we have to have the ‘objectively’ correct answer right then and there, is a bane to sports.
Do we really enjoy Djokovic’s comparatively mild protests more than John McEnroe’s? You cannot be serious!
On the same page as the Hawk-Eye article is a link to another article in The Guardian, this one on a recent local soccer match in Brazil during which a referee stabbed a player, then was mobbed, stoned, and decapitated by the angry crowd. Say what?! Look, I’m from Alabama, so I’m well aware of people who do really stupid things in the name of sports. As an Auburn fan, I’m glad no one has decapitated Harvey Updyke, yet. But it’s interesting how the story of the Brazilian double murder (yes, the player stabbed by the ref also died) is treated — as an image problem:
Brazil faces mounting pressure to show it is a safe place for tourists before 12 cities host the 2014 World Cup and Rio de Janeiro the Olympics in 2016. The Confederations Cup in June was marked by violence as anti-government protestors angered by the amount of money being spent on the events clashed with police.
So, unless Brazil can clean up its act and tone down the violence to a level that’s acceptable to tourists, the World Cup and the Olympics are in trouble? Someone must be Djoking!
When did we lose all perspective about what’s important? Sports are a form of entertainment, one that’s more entertaining when we take it seriously. But it’s possible to take sports too seriously. Killing someone is an extreme example, obviously. But treating murder as an image problem reveals that we take sports too seriously in other ways, as well. As if the real problem is whether Brazil will respond to mounting pressure to show it is a safe place for tourists in time to save the World Cup. Imagine the economic fall-out were people to stay away in droves! As if that were the problem, rather than the problem being our thinking of sports in economic — or technoscientific — terms. Or our thinking of the protests as a problem for sports, rather than an expression of a cultural moment.
So, to rephrase the question raised by the initial article: Do you want simple, binary decisions rendered by someone — or something — else, or would you rather do the hard work of thinking?
SPRU Professor Andy Stirling is beginning a series in The Guardian on the precautionary principle. Stirling’s first article paints an optimistic picture:
Far from the pessimistic caricature, precaution actually celebrates the full depth and potential for human agency in knowledge and innovation. Blinkered risk assessment ignores both positive and negative implications of uncertainty. Though politically inconvenient for some, precaution simply acknowledges this scope and choice. So, while mistaken rhetorical rejections of precaution add further poison to current political tensions around technology, precaution itself offers an antidote – one that is in the best traditions of rationality. By upholding both scientific rigour and democratic accountability under uncertainty, precaution offers a means to help reconcile these increasingly sundered Enlightenment cultures.
Stirling’s work on the precautionary principle is some of the best out there, and Adam Briggle and I cite him in our working paper on the topic. I look forward to reading the rest of Stirling’s series. Although I’m a critic of the Enlightenment, I don’t reject it wholesale. In fact, I think rational engagement with the thinkers of the Enlightenment — and some of its most interesting heirs, including Stirling and Steve Fuller, who’s a proponent of proaction over precaution — is important. So, stay tuned for more!
No two snowflakes are alike. No two people are the same.
Snowflakes by Juliancolton2 on flickr
Earlier posts in this series attempted to lay out the ways in which Snowball Metrics present as a totalizing grand narrative of research evaluation. Along with attempting to establish a “recipe” that anyone can follow — or that everyone must follow? — in order to evaluate research, this grand narrative appeals to the fact that it is based on a consensus in order to indicate that it is actually fair.
The contrast is between ‘us’ deciding on such a recipe ourselves or having such a recipe imposed on ‘us’ from the outside. ‘We’ decided on the Snowball Metrics recipe based on a consultative method. Everything is on the up and up. Something similar seems to be in the works regarding the use of altmetrics. Personally, I have my doubts about the advisability of standardizing altmetrics.
— But what’s the alternative to using a consultative method to arrive at agreed upon standards for measuring research impact? I mean, it’s either that, or anarchy, or imposition from outside — right?! We don’t want to have standards imposed on us, and we can’t allow anarchy, so ….
Yes, yes, QED. I get it — really, I do. And I don’t have a problem with getting together to talk about things. But must that conversation be methodized? And do we have to reach a consensus?
— Without consensus, it’ll be anarchy!
I don’t think so. I think there’s another alternative we’re not considering. And no, it’s not imposition of standards on us from the ‘outside’ that I’m advocating, either. I think there’s a fourth alternative.
In contrast to Snowball Metrics, Snowflake Indicators are a delicate combination of science and art (as is cooking, for that matter, which ought not necessarily involve following a recipe either; just a hint for some of the chefs in The Scholarly Kitchen, which sometimes tends to resemble America’s Test Kitchen, a show I watch, though not so I can simply follow the recipes). Snowflake Indicators also respect individuality. The point is not to mash the snowflakes together — following the 6-step recipe, of course — to form the perfect snowball. Instead, the point is to let the individual researcher appear as such. In this sense, Snowflake Indicators provide answers to the question of researcher identity. ORCID gets this point, I think.
To say that Snowflake Indicators answer the question of researcher identity is not to suggest that researchers ought to be seen as isolated individuals, however. Who we are is revealed in communication with each other. I really like that Andy Miah’s CV includes a section that lists places in which his work is cited as “an indication of my peer community.” This would count as a Snowflake Indicator.
Altmetrics might also do the trick, depending on how they’re used. Personally, I find it useful to see who is paying attention to what I write or say. The sort of information provided by Altmetric.com at the article level is great. It gives some indication of the buzz surrounding an article, and provides another sort of indicator of one’s peer community. That helps an individual researcher learn more about her audience — something that helps communication, and thus helps a researcher establish her identity. Being able to use ImpactStory.org to craft a narrative of one’s impact — and it’s especially useful not to be tied down to a DOI sometimes — is also incredibly revealing. Used by an individual researcher to craft a narrative of her research, altmetrics also count as Snowflake Indicators.
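To make the idea of article-level attention data concrete, here is a minimal sketch of how one might summarize such a record for an individual researcher. The field names (`score`, `cited_by_tweeters_count`, `cited_by_feeds_count`) are modeled on the kind of counts services like Altmetric.com expose, but they are assumptions here, and the sample record is invented for illustration:

```python
import json

# A fabricated example of an article-level attention record; the field
# names are assumptions modeled on the kind of data Altmetric.com provides.
sample_response = json.dumps({
    "title": "An example article",
    "doi": "10.1234/example.5678",
    "score": 42.5,
    "cited_by_tweeters_count": 30,
    "cited_by_feeds_count": 4,  # blog mentions
})

def summarize_attention(raw_json: str) -> str:
    """Turn a raw attention record into a short, readable summary."""
    record = json.loads(raw_json)
    sources = {
        "tweets": record.get("cited_by_tweeters_count", 0),
        "blog posts": record.get("cited_by_feeds_count", 0),
    }
    # Keep only the sources with nonzero counts.
    mentions = ", ".join(
        f"{count} {name}" for name, count in sources.items() if count
    )
    return f"{record['doi']}: attention score {record['score']} ({mentions})"

print(summarize_attention(sample_response))
# → 10.1234/example.5678: attention score 42.5 (30 tweets, 4 blog posts)
```

The point of a summary like this is exactly the one made above: it tells the individual researcher something about her peer community and audience, rather than rendering a verdict on her worth.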
So, what distinguishes a Snowflake Indicator from a Snowball Metric? It’s tempting to say that it’s the level of measurement. Snowball Metrics are intended for evaluation at a department or university-wide level, or perhaps even at a higher level of aggregation, rather than for the evaluation of individual researchers. Snowflake Indicators, at least in the way I’ve described them above, seem to be aimed at the level of the individual researcher, or even at individual articles. I think there’s something to that, though I also think it might be possible to aggregate Snowflake Indicators in ways that respect idiosyncrasies but that would still allow for meaningful evaluation (more on that in a future post — but for a hint, contrast this advice on making snowballs, where humor and fun make a real difference, with the 6-step process linked above).
But I think that difference in scale misses the really important difference. Where Snowball Metrics aim to make us all comparable, Snowflake Indicators aim to point out the ways in which we are unique — or at least special. Research evaluation, in part, should be about making researchers aware of their own impacts. Research evaluation shouldn’t be punitive, it should be instructive — or at least an opportunity to learn. Research evaluation shouldn’t so much seek to steer research as it should empower researchers to drive their research along the road to impact. Although everyone likes big changes (as long as they’re positive), local impacts should be valued as world-changing, too. Diversity of approaches should also be valued. Any approach to research evaluation that insists we all need to do the same thing is way off track, in my opinion.
I apologize to anyone who was expecting a slick account that lays out the recipe for Snowflake Indicators. I’m not trying to establish rules here. Nor am I insisting that anything goes (there are no rules). If anything, I am engaged in rule-seeking — something as difficult to grasp and hold on to as a snowflake.
Can we talk about this? Or if I suggest standards are a double-edged sword, will no one listen?
“For altmetrics to move out of its current pilot and proof-of-concept phase, the community must begin coalescing around a suite of commonly understood definitions, calculations, and data sharing practices,” states Todd Carpenter, NISO Executive Director. “Organizations and researchers wanting to apply these metrics need to adequately understand them, ensure their consistent application and meaning across the community, and have methods for auditing their accuracy. We must agree on what gets measured, what the criteria are for assessing the quality of the measures, at what granularity these metrics are compiled and analyzed, how long a period the altmetrics should cover, the role of social media in altmetrics, the technical infrastructure necessary to exchange this data, and which new altmetrics will prove most valuable. The creation of altmetrics standards and best practices will facilitate the community trust in altmetrics, which will be a requirement for any broad-based acceptance, and will ensure that these altmetrics can be accurately compared and exchanged across publishers and platforms.”
“Sensible, community-informed, discipline-sensitive standards and practices are essential if altmetrics are to play a serious role in the evaluation of research,” says Joshua M. Greenberg, Director of the Alfred P. Sloan Foundation’s Digital Information Technology program. “With its long history of crafting just such standards, NISO is uniquely positioned to help take altmetrics to the next level.”
The post on Snowflake Indicators is coming …
First, let me say where I am coming from and what I mean by ‘postmodern’. I’m working from Lyotard’s simple “definition” of the term: “incredulity toward metanarratives” (from the introduction to The Postmodern Condition). One interesting question that arises from this definition is the scope of this incredulity — what counts, in other words, as a metanarrative?
Lyotard also distinguishes between what he calls ‘grand’ narratives and ‘little stories’ (les petits récits). Importantly, either a grand narrative or a little story can make the ‘meta’ move, which basically consists in telling a story about stories (where ‘story’ is understood broadly). Put differently, it is not the ‘meta’ toward which the postmodern reacts with incredulity. It is, rather, the totalizing character of the grand narrative that provokes doubt. By its very nature, the claim to have achieved certainty, to have told the whole story, undermines itself — at least from the postmodern perspective.
Of course, the grand narrative is always at pains to seek legitimation from outside itself, to demand recognition, to assert its own justice. Often, this takes the form of appeal to consensus — especially to a consensus of experts and authorities. The irony of the little stories is that they legitimate themselves precisely in not seeking hegemony over all the other stories. Not seeking jurisdiction over the whole, the little stories have the status — a venerable one — of ‘fables’. The little stories are told. We are told to accept the grand narrative.
This will be the first in a series of posts tagged ‘postmodern research evaluation’ — a series meant to be critical and normative, expressing my own, subjective, opinions on the question.
Before I launch into any definitions, take a look at this on ‘Snowball Metrics’. Reading only the first few pages should help orient you to where I am coming from. It’s a place from where I hope to prevent such an approach to metrics from snowballing — a good place, I think, for a snowball fight.
Read the opening pages of the snowball report. If you cannot see this as totalizing — in a very bad way — then we see things very differently. Still, I hope you read on, my friend. Perhaps I still have a chance to prevent the avalanche.
On the one hand, this post on the VCU website is very cool. It contains some interesting observations and what I think is some good advice for researchers submitting and reviewing NSF proposals.
On the other hand, this post also illustrates how researchers’ broader impacts go unnoticed.
One of my main areas of research is peer review at S&T funding agencies, such as NSF. I especially focus on the incorporation of societal impact criteria, such as NSF’s Broader Impacts Merit Review Criterion. In fact, I published the first scholarly article on broader impacts in 2005. My colleagues at CSID and I have published more than anyone else on this topic. Most of our research was sponsored by NSF.
I don’t just perform research on broader impacts, though. I take the idea that scholarly research should have some impact on the world seriously, and I try to put it into practice. One of the things I try to do is reach out to scientists, engineers, and research development professionals in an effort to help them improve the attention to broader impacts in the proposals they are working to submit to NSF. This past May, for instance, I traveled down to Austin to give a presentation at the National Organization for Research Development Professionals Conference (NORDP 2013). You can see a PDF version of my presentation at figshare.
If you look at the slides, you may recognize a point I made in a previous post, today. That point is that ‘intellectual merit’ and ‘broader impact’ are simply different perspectives on research. I made this point at NORDP 2013, as well, as you can see from my slides. Notice how they put the point on the VCU site:
Broader Impacts are just another aspect of their research that needs to be communicated (as opposed to an additional thing that must be “tacked on”).
I couldn’t have said it better myself. Or perhaps I could. Or perhaps I did. At NORDP 2013.
Again, VCU says:
Presenters at both conferences [they refer to something called NCURA, with that hyperlink, and to NORDP, with no hyperlink] have encouraged faculty to take the new and improved criteria seriously, citing that Broader Impacts are designed to answer accountability demands. If Broader Impacts are not carefully communicated so that they are clear to all (even non-scientific types!), a door could be opened for more prescriptive national research priorities in the future—a move that would limit what types of projects can receive federal funding, and would ultimately inhibit basic research.
My point is not to claim ownership over these ideas. If I were worried about intellectual property, I could trademark a broader impacts catch phrase or something. My point is that if researchers don’t get any credit for the broader impacts of their research, they’ll be disinclined to engage in activities that might have broader impacts. I’m happy to share these ideas. How else could I expect to have a broader impact? I’ll continue to share them, even without attribution. That’s part of the code.
To clarify: I’m not mad. In fact, I’m happy to see these ideas on the VCU site (or elsewhere …). But would it kill them to add a hyperlink or two? Or a name? Or something? I’d be really impressed if they added a link to this post.
I’m also claiming this as evidence of the broader impacts of my research. I don’t have to contact any lawyers for that, do I?
UPDATE: Brigitte Pfister, author of the post to which I directed my diatribe, above, has responded here. I appreciate that a lot. I also left a comment apologizing for my tone in the above post. It’s awaiting moderation; but I hope it’s accepted as it’s meant — as an apology and as a sign of respect.
Anyone interested in research assessment should read this with care.
It’s been presented in the media as an insurrection against the use of the Journal Impact Factor — and the Declaration certainly does … ehr … declare that the JIF shouldn’t be used to assess individual researchers or individual research articles. But this soundbite shouldn’t be used to characterize the totality of DORA, which is much broader than that.
Honestly, it took me a few days to go read it. After all, it’s uncontroversial in my mind that the JIF shouldn’t be used in this way. So, an insurrection against it didn’t strike me as all that interesting. I’m all for the use of altmetrics and — obviously, given our recent Nature correspondence (free to read here) — other inventive ways to tell the story of our impact.
But, and I cannot stress this enough, everyone should give DORA a careful read. I’m against jumping uncritically on the bandwagon in favor of Openness in all its forms. But I could find little reason not to sign, and myriad reasons to do so.
Well done, DORA.