photo by Dustin Gaffke

Don’t Panic! Elsevier’s #OpenAccess policy is not malicious

Yesterday, I noticed a comment on Twitter where Erin McKiernan expressed confusion over Elsevier’s Open Access policy:

This was promptly jumped on by other Open Access advocates, who were insisting that it is “not OA”, “nasty”, and even “malice”:

I burned through a few dozen tweets yesterday trying to convince Erin, Mike, & Charles that the exclusive license isn’t a problem for OA at all. This is particularly important because (as of Aug 18) Erin has incorporated this criticism into an Open Letter to the Society for Neuroscience regarding the new Open Access journal, eNeuro, which uses the same licensing scheme (my own emphasis added):

Our first concern relates to the copyright policy of eNeuro. The journal’s policy states that authors will retain copyright but must grant the Society an exclusive license to publish. An exclusive license is in conflict with the tenets of open access, as defined by the Budapest Open Access Initiative (BOAI), and does not reflect the aims of the Creative Commons licenses to allow reuse.

The problem here is that the exclusive license is something that you (the author and copyright holder) grant to the publisher. It is a guarantee to them that you will not take your original content elsewhere. Alone, this would indeed be sketchy, but since both journals turn around and release the article under a CC license, anyone (including the author) can reuse and redistribute the article subject to the terms of the CC license*.

Elsevier’s author agreement actually illustrates this really well:

Author-agreement
used without permission

You grant the publisher exclusive publishing rights then they release the paper under the OA-compliant CC license. There is no adverse impact on the Open Access goals. It is entirely consistent with the Budapest Open Access Initiative. The limitation that it does place is preventing a line directly from the Author to the User. This can actually be useful, as I’ll explain soon.

Above, I used a photo by the very kind-hearted Dustin Gaffke, who shared his photo of Panic! At The Disco on Flickr with a CC-BY license. Thus, I was able to download the photo and upload it here with the proper attribution (noting the copyright holder and providing a link back to the source). This is how the CC-BY license works. I can do all kinds of things with his photo… I can make money off of it, printing it on T-shirts and sell them in the alley outside of Panic! At The Disco concerts, as long as I print “photo by Dustin Gaffke”. I could modify the photo, add my own color effects, mash it up with other photos to make my own Creative Work. Just gotta note the original creator and how it was modified. This is how the CC-BY license works.

However, Dustin is still the copyright holder. He can still do whatever he wants with the photo. Maybe Panic! At The Disco likes his photo so much they want to use it for their own promo materials but they don’t want to have to remember to add the proper attribution every time they put it on a mug or poster or concert flyer. He could grant them a license to use the photo without his name (something I can’t do).

Now, the CC-BY license works the same for scholarly works. Releasing under CC-BY doesn’t just let other people read your work without having to pay, but it allows them to redistribute and reuse it. This led to a bit of a kerfuffle last summer, when some scholars realized that their freely downloadable CC-BY papers were getting compiled into textbooks that were being sold for $100+. Now, this anger was largely due to a misunderstanding of what is permitted under CC-BY, but there were some instances where the CC-BY license may have been violated (e.g. poor attribution by omitting a link to the original article and changing the title without noting that the work had been changed). However, we’ll probably never know whether the more dubious practices of Apple Academic Press actually violated CC-BY because they probably will never go to court. Here’s why:

Let’s assume for a moment that we see a similar scenario that would be an egregious violation of CC-BY… a textbook gets printed with entire chapters lifted directly from published papers without any attribution for the original author or manuscript whatsoever.

The only way to enforce licensing is litigation, so who sues?

That burden falls on the copyright holders, i.e. the authors of the manuscript. The publisher of the CC-BY paper can’t really do anything to enforce the CC-BY because the original authors (who still have copyright) could have very easily licensed the material to the textbook publisher, as it is wholly within their right to do so, just as it is within Dustin’s right to sell rights to the Panic! At The Disco photo.

Because of this issue, the publisher doesn’t have a case. I would also guess that the academics probably don’t have the time/money/wherewithal to hire lawyers.

However, if the authors have granted an exclusive license to the publisher as Elsevier and the Society for Neuroscience require, they are guaranteeing to the publisher that they will not license the work to anyone else. The publisher can then distribute the work with the CC-BY license. The exclusive license between the copyright holder and the publisher in no way restricts, modifies, or changes the CC-BY license.**

What the exclusive license does allow, though, is for the publisher to pursue legal recourse in the event of a CC-BY violation. They know that (a) they have an exclusive license from the copyright holder and (b) they distributed the work under a CC license. Remember how there is no direct line between the Author and the User in Elsevier’s handy flowchart? So if the work shows up somewhere without attribution, it is obviously a violation of the CC license. The only additional restriction here imposed by the exclusive license is on the author, who must now conform to the CC-BY license as well. They can no longer stick excerpts into a textbook chapter without attributing the original paper.

Is it confusing? A bit, but Elsevier’s Author Guidelines lay it out pretty clearly.

Is it necessary? Not strictly.

Is it helpful? Yes, in at least one scenario.

Is it still OA? Yes.

Is it “malice”? Absolutely not.


* If you are an IP lawyer, feel free to correct me on any of this.

 

Sebastian Seung and Anthony Movshon engage in a public debate about the connectome

An open letter to Larry Swanson: Why it is important for neuroscientists to debate the Brain Initiative in public

Dear Larry Swanson,

I’m writing in response to your letter to SfN members regarding the Brain Initiative, primarily because I strongly disagree with one of your key points. In the letter, you write:

While we should all continue to explore and discuss questions about the scientific direction, it is important that our community be perceived as positive about the incredible opportunity represented in the President’s announcement. If we are perceived as unreasonably negative or critical about initial details, we risk smothering the initiative before it gets started.

and

SfN encourages healthy debate and rigorous dialogue about the effort’s scientific directions. Testing of assumptions, methodological debate, and constructive competition are central to scientific progress. I urge you to bring all this to the table through our scientific communications channels and venues, including the SfN annual meeting in San Diego this fall and The Journal of Neuroscience.

(my own emphasis added)

To summarize your request, you think that we should disagree only in “our scientific communications channels” while ensuring that, to the taxpayers who will be funding this initiative, “our community be perceived as positive” about it. Not only do I find it offensive and patronizing that you would ask us to be disingenuous to the very public which supports our efforts, but I think that your request is short-sighted and undermines the work of neuroscientists who seek to cultivate a public that is informed and literate in matters of the brain.

The debate has already begun in the public sphere, whether you like it or not. And the public is looking to neuroscientists to make sense of the vague official announcements that have happened thus far. Will we actually fix Alzheimer’s in five years? Will we record from every neuron in the human brain? Why do we want to do this? Without our informed input to the debate, “we risk smothering the initiative before it gets started” due to bad reporting. While you ask us to stick to “our” channels of scientific discourse, like the paywalled journals and exclusive conferences that the public cannot access, it was only 4 days after the New York Times story broke that this gem of fear-mongering claimed that the Brain Initiative would allow Barack Obama to read people’s minds. If we don’t talk about the Brain Initiative, bad reporters will. And if bad reporters talk about the Brain Initiative, we risk creating a public which is fearful of the very work that we do.

That is, while pieces like those by Partha Mitra, Scicurious and other neuroscientist bloggers may not have portrayed a community of neuroscientists that are 100% united behind this initiative, they serve a much greater purpose. As you noted, we are “capturing the world’s attention” and this affords us an “unparalleled” opportunity to educate the public about the actual work that neuroscientists do, the opportunities and the limitations. This type of discourse is critically needed in the public sphere to improve neuroscience literacy, fight bunk neuroscience reporting, and in the long run cultivate a public that is excited, engaged, and ready to fund neuroscience.

Further, most of the debate among neuroscientists on Twitter, Quora, and blogs concerns where the money for this initiative will be coming from and whether it will cut into existing (and already strained) research budgets. I think that we are entirely valid in being concerned that other good neuroscience will be “smothered” by the Brain Initiative, precisely due to the “growing challenges caused by shrinking or flat national government budgets for science research.” And again, putting this debate in the public sphere allows neuroscientists to inform the public on the opportunities and limitations of different experimental approaches, and the importance of investing in basic research for long term health outcomes.

Ultimately, I believe that we should not wait until SfN in November to debate this and that we should not isolate the debate from the public that supports us by locking it off on the pages of “our” journals that are inaccessible to said public. Rather, having this debate in the open will ensure a public that supports and trusts our work for decades to come.

Thank you,

Justin Kiggins

"Occupy Oakland October 11" CC-BY quinn norton

the “value” of protesting

A quick thought, inspired on twitter:

Money is protected as “free speech”, hence lobbying is legal. We can give money to other persons to speak for us.

Could we nail down the “value” of protesting? That is, what is the amount of money I would need to donate to my cause such that I could stay home and sit on my couch instead of protesting?

Given a bill (say, to tighten gun controls), let’s say that the pro-gun folks dedicated $100 million against it and had 1,000 protestors outside of Congress, while the anti-gun folks rummaged up only $1 million but got 100,000 protestors.

For simplicity, let’s say that the vote is a tie, meaning that both sides were equally effective.

We could then just solve for x, where x is the value of a single protestor…

100,000,000 + (x * 1,000) = 1,000,000 + (x * 100,000)
99,000,000 = x * 99,000
x = $1,000

… which sounds very high to me (made up numbers, of course).
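
As a quick check on the arithmetic, here is a minimal Python sketch that solves the tie condition for x. The function name is hypothetical, and it assumes (as the back-of-the-envelope model does) that each side’s influence is simply its dollars plus x dollars per protestor. With the made-up numbers above, it evaluates to $1,000 per protestor.

```python
from fractions import Fraction


def value_per_protestor(money_a, protestors_a, money_b, protestors_b):
    """Solve money_a + x * protestors_a == money_b + x * protestors_b for x.

    Assumption: influence is linear in dollars and protestors, and the
    two sides were exactly tied.
    """
    if protestors_a == protestors_b:
        raise ValueError("equal protestor counts: x cannot be determined")
    # Rearranged: money_a - money_b == x * (protestors_b - protestors_a)
    return Fraction(money_a - money_b, protestors_b - protestors_a)


# Made-up numbers from the post: $100M and 1,000 protestors against,
# $1M and 100,000 protestors for.
x = value_per_protestor(100_000_000, 1_000, 1_000_000, 100_000)
print(f"${float(x):,.0f} per protestor")
```

Exact fractions are used so that a tie between lopsided budgets does not pick up floating-point noise before the final formatting step.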

Analyzed across, say, all of the bills before congress during a given session, though, one could perhaps get an estimate of the value of protesting.

Hmm… surely a political economist has done this?


"Low hanging fruit" CC-BY Caza_No_7

moving beyond the low-hanging #altmetrics fruit

Here are the #altmetrics that I want to see for individual research papers:

  • How many journal clubs are discussing this?
  • How often was this mentioned at campus pubs?
  • How many entrepreneurs saved this into their Mendeley library?
  • How many lives were saved because of this?
  • How many patents were inspired by this?
  • What is the increase in GDP attributable to this research?

Ultimately, any efforts to move beyond basic citation metrics to assess the impact of scientific research should move us closer to measuring the actual impact of said research. Scientists often cite the benefit of basic research for the advancement of economic prosperity and human health, and this is the kind of impact we should be trying to assign metrics to.


"TED Talk" CC-BY Steve Jurvetson

the fundamental “purpose” of brains… maybe I should give a TED talk

I have this pet theory that nervous systems are fundamentally organs for predicting the future so the organism can respond appropriately. And to do this, nervous systems rely on correlations in the environment and temporal precedence to infer causality. Which is part of why we are so quick to apply causal explanations to spurious correlations.

The major differences in “complexity” or “intelligence” or “consciousness” across animals are largely just differences in the scope and resolution of the inputs and the temporal scales of evidence accumulation and prediction.

This definition of “nervous system” would then expand to include whatever non-neuronal components all organisms (bacteria, Venus flytraps) use to accomplish similar inference/prediction goals and would emphasize that the difference between human and non-human animals is one of degree (and specificity), not kind.

But I don’t have any data. Just a pet theory. Maybe I should give a TED talk.

Edit: dammit, someone’s already done it: Jeff Hawkins on How Brain Science Will Change Computing


"_MG_9934" CC-BY RoyZilla92

in the future, the scientific literature will follow you

Of the 1+ million new scientific papers published each year, which ones should a scientist read?

It’s obvious that any single researcher can’t read them all. Nor do we want to… the overwhelming majority aren’t relevant to us or our work. We limit what we read and we employ a variety of methods to choose what new research we do read… a strategic foraging task utilizing a hodge-podge set of tools, including subscribing to journals or RSS feeds and saving Pubmed searches.

But times are a-changin’ for academic publishing and some of the changes are very similar to those in journalism:

The publisher of a major international newspaper once told me that he delivers “the five or six things I absolutely have to know this morning.” But there was always a fundamental problem with that idea, which the Internet has made starkly obvious: There is far more that matters than any one of us can follow. In most cases, the limiting factor in journalism is not what was reported but the attention we can pay to it.

The goal of personalized news (news that is tailored specifically to me) is hot but unrealized. I recently came across an article by Jonathan Stray proposing three principles that ought to govern personalized news: interest, effects, and agency.

You should see a story if:

  1. You specifically go looking for it.
  2. It affects you or any of your communities.
  3. There is something you might be able to do about it.

This got me thinking about what these principles mean for “following the literature”. In particular, how would one develop a strategy for research literature discovery (typically known as “following the literature”) that embodies these principles, where the literature follows the researcher?

Interest

Anyone who wants to know should be able to know. From a product point of view, this translates into good search and subscription features. Search is particularly important because it makes it easy to satisfy your curiosity, closing the gap between wondering and knowing.

This is perhaps the most obvious minimal requirement… to be able to find things that I’m looking for. This is a passive feature of a system for research literature discovery… I take the action. I decide when I want to know about something. I search for “Lastname et al, 2003” after seeing the reference at the bottom of a figure in a presentation. I go hunting for a recent paper that someone mentioned to me at the coffee cart.

There are already some very good tools out there for this. Pubmed & Google Scholar are my go-to sources. So, for scientists at least,* I’m going to consider this a problem basically solved and move on…

Effects

I should know about things that will affect me. Local news organizations always did this, by covering what was of interest to their particular geographic community. But each of us is a member of many different communities now, mostly defined by identity or interest and not geography. Each way of seeing communities gives us a different way of understanding who might be affected by something happening in the world.

Did I get scooped? Did someone cite my work? Is there a new paper that changes the way I interpret my not-yet-published results?

These are harder, more time consuming questions to answer by relying exclusively on a search-based interface. Historically, these questions would have been answered by subscribing to specialized “society journals”… The Journal of Obscure Sub-Subfield. More common today are custom filters and saved searches. For example, here’s Bradley Voytek’s strategy:

Similarly, Drugmonkey polled his readers a while back on how they keep abreast of the literature. The responses (summarized here) include everything from tools like pubcrawler to relying on blogs, journal RSS feeds, to making graduate students do it.

It’s obvious that everyone is looking out for papers that will affect them… but can we do better? All of these together (1) are kludgy, (2) require a lot of time and effort to set up and tailor, and (3) require the researcher to already know what will affect them. To a certain extent, the third isn’t a huge problem… it’s obvious that my collaborators’ and competitors’ work will affect me, and a huge part of our role as scientists is knowing what the state of the field is and how it will affect our work.

But what about the “unknown unknowns”?

What if a system could leverage existing sources about what will affect me to predict which new papers I should know about? What if it could use a researcher’s publications, personal library of papers, and network of friends & colleagues to assess newly published papers for “likelihood of effect”?

A big breakthrough in this direction was launched recently with Google Scholar’s “My Updates” feature (see Jonathan Eisen’s summary here), which analyzes a researcher’s past publications to predict relevance of papers to them. One shortcoming of this approach is that its usefulness will be more limited for graduate students (especially those in their early years) who will have fewer publications to their names than a tenured professor.

A similar approach (still in Beta) is Mendeley Suggest, which (apparently) leverages a user’s library, compares it with other users’ libraries, and makes suggestions for relevant papers. Mendeley Suggest certainly brought a few papers to my attention that affect my work, but they were all 5-10 years old. Not exactly a great tool for keeping tabs on new papers. Practically speaking, a better approach might be to take something like the ScipleRSS, which ranks the results of a batch of journal RSS feeds according to a set of weighted keywords, and use the Mendeley API to set the keyword weightings based on a user’s actual library (if I ever have time, I’d like to actually do this). Having access to the user’s use-statistics for the papers in their library, as well as social aspects (what papers were recently added by Mendeley contacts? by other users in Mendeley Groups?) could make such a prediction system even more powerful. Regardless of whether this is the best approach, there’s room for improvement here and Mendeley seems like a good place to start.
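
To make the keyword-weighting idea concrete, here is a rough Python sketch: derive keyword weights from the titles in a user’s library, then rank incoming feed items by the summed weight of the keywords they contain. Everything in it is hypothetical. The function names, the stop-word list, and the sample titles are invented for illustration, and it makes no real ScipleRSS or Mendeley API calls; it only assumes the library and feed are available as lists of title strings.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not a real standard list).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "for", "on", "to", "with"}


def tokenize(text):
    """Lowercase a title and split it into non-stop-word tokens."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def keyword_weights(library_titles):
    """Weight each keyword by the fraction of library titles that contain it."""
    counts = Counter()
    for title in library_titles:
        counts.update(set(tokenize(title)))
    n = len(library_titles)
    return {word: c / n for word, c in counts.items()}


def rank_feed(feed_titles, weights):
    """Sort feed titles by total keyword weight, highest first."""
    def score(title):
        return sum(weights.get(w, 0.0) for w in set(tokenize(title)))
    return sorted(feed_titles, key=score, reverse=True)


# Invented sample data standing in for a user's library and a day's RSS items.
library = [
    "Auditory cortex responses to birdsong",
    "Neural coding of song in the auditory forebrain",
    "Song learning in zebra finches",
]
feed = [
    "Crystal growth on a triangular lattice",
    "Auditory coding of learned song",
    "Song preferences in finches",
]
weights = keyword_weights(library)
ranked = rank_feed(feed, weights)
for title in ranked:
    print(title)
```

A real version would presumably weight by reading statistics or recency as well, but even this crude overlap score pushes the unrelated paper to the bottom of the list.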

Beyond these emerging projects, it’s hard to imagine exactly where the future lies… if a system had enough data about you, your research, and the broader context you work in (say, your publications, grants, manuscripts, library, conference attendance info, which posters you stood in front of, the funding climate in your field), would it be able to make broader predictions for which papers would “affect” you? Would it be able to know that some paper in a totally different field solved a problem in a way that informed your own work, even though the research programs might not share any keywords?

It’s not clear where the boundary is, but it is clear that this area is ready for some innovation, at least as it relates to improving the efficiency of alerting researchers to relevant new work.

Agency

Ultimately, I believe journalism must facilitate change. Otherwise, what’s the point? This translates to the idea of agency, the idea that someone can be empowered by knowing. But not every person can affect every thing, because people differ in position, capability, and authority. So my third principle is this: Anyone who might be able to act on a story should see it. This applies regardless of whether or not that person is directly affected, which makes it the most social and empathetic of these principles.

Science must also facilitate change… it must change the way that we view the world and thus, the way we respond to it. Our primary goal as scientists is to produce knowledge. But as John Archibald Wheeler described it, “We live on an island surrounded by a sea of ignorance. As our island of knowledge grows, so does the shore of our ignorance.” Every “answer” begets multiple more questions. (So really, we’re talking about a hyper-island in some multidimensional ocean.) One of the primary challenges of a scientist is knowing in which direction to extend the island of knowledge.

And yet, all scientists are constrained by resources preventing us from extending the entire island at once. Not just money, but by the technical skills of trainees in the lab, the types of equipment that a lab has access to, the experiments that are available through collaborations. In deciding the direction that a lab should take (which projects to put into a grant, which ones to pilot, which grad students to accept), a researcher has to take all of these resource constraints into account, while recognizing the unique resources and opportunities available to determine the key areas where a lab can be productive.

Ultimately, the need for being able to have access to research we are interested in and to be alerted to research that affects us is meant to support this primary goal of producing new knowledge.

But could a research literature discovery system support this endeavor more directly through ensuring that a scientist knows about research that they might be able to act on?

There is a section in Michael Nielsen’s Reinventing Discovery, where he imagines a future scenario in science:

 You’re a theoretical physicist working at the California Institute of Technology (Caltech), in Pasadena. Each morning, you begin your work by sitting down at your computer, which presents to you a list of ten requests for your assistance, a list that’s been distilled especially for you from millions of such requests filed overnight by scientists around the world. Out of all those requests, these are the problems where you are likely to have maximal comparative advantage. Today, one of the requests immediately catches your eye. A materials scientist in Budapest, Hungary, has been working on a project to develop a new type of crystal. During the project an unanticipated difficulty has come up involving a very specialized type of problem: figuring out the behavior of particles as they [diffuse] on a triangular latticework. Unfortunately for the materials scientist, diffusion is a subject they don’t know much about. You, in turn, don’t know much about crystals, but you are an expert on the mathematics of diffusion, and in fact, you’ve previously solved several research problems similar to the problem puzzling the materials scientist. After mulling over the  diffusion problem for a few minutes, you’re sure that the problem will fall easily to mathematical techniques you know well, but which the materials scientist probably doesn’t know at all. [...]

Long story short, you collaborate & everyone wins. This vision will require some major changes in the way science is done before it will be fully realized. However, a key principle here is that the theoretical physicist is alerted to a problem not based on her interests or whether her research program is affected by the work of the materials scientist in Budapest, but rather because she has agency in the topic. And we can similarly imagine a framework where a scientist is alerted to new publications not just because they are interested in the topic or because it affects them, but rather because they are uniquely situated to do the next experiment. Maybe they are an evolutionary anthropologist with a dataset, who could be alerted to a recently published model of Neandertal ancestry that they are uniquely equipped to validate or invalidate. Or a molecular biologist with the perfect combination of reagents, gizmos, and trainees to try to replicate claims of a bacterium being able to survive on arsenic.

Is this possible? I think so. Stack Exchange is already aiming to predict which users are best equipped to answer which questions (a goal with striking similarities to Nielsen’s imagined future for scientists). Imagine a system that has knowledge of a lab’s skills based on the CVs or LinkedIn profiles of the members and the “Methods” sections of publications, combined with a detailed knowledge of the lab’s inventory (say, based on some lab management software like Labguru or Quartzy) and potential collaborations (based on past collaborations from the publication record and knowledge of the lab’s social network, pulled from Mendeley or LinkedIn or ResearchGate). Then cross-reference all of this against, say, the methods sections of recent publications… or better yet, against semantic analysis of post-publication peer-review commentary about a publication (which is where most of the ideas about the “next experiment” are likely to be found) such as F1000, Twitter, research blogs, hypothes.is, etc.
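
As a toy illustration of that cross-referencing step, the Python sketch below scores each lab by the fraction of methods a follow-up experiment would need that the lab already has, and flags labs above a threshold. All of it is assumed for the sake of the example: the function names, the 0.75 threshold, and the lab and method labels are invented, and a real system would have to extract these sets from CVs, inventories, and methods sections rather than receive them ready-made.

```python
def agency_score(lab_capabilities, required_methods):
    """Fraction of the required methods that the lab already has."""
    required = set(required_methods)
    if not required:
        return 0.0
    return len(required & set(lab_capabilities)) / len(required)


def labs_to_alert(labs, required_methods, threshold=0.75):
    """Names of labs whose capabilities cover enough of the required methods."""
    return sorted(
        name
        for name, capabilities in labs.items()
        if agency_score(capabilities, required_methods) >= threshold
    )


# Invented example: which lab could attempt a replication of a new paper?
labs = {
    "lab_a": {"bacterial culture", "mass spectrometry", "arsenic assay"},
    "lab_b": {"fmri", "eye tracking"},
}
required = {
    "bacterial culture",
    "mass spectrometry",
    "arsenic assay",
    "dna sequencing",
}
alerted = labs_to_alert(labs, required)
print(alerted)
```

Exact set overlap is the simplest possible stand-in for “agency”; matching free-text methods sections would need something fuzzier, but the shape of the computation would be similar.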

The key here is that not only are scientists affected by what others publish, but we have the opportunity to produce research that affects the rest of the world. And maybe we can start building tools that help us know where to focus our energies and resources in order to do exactly that.


* It should be noted that the current model of scientific publishing puts up a barrier to the “Anyone who wants to know” part, as access is currently limited to institutional subscribers or those who are willing to pony up $30+ per article… which isn’t “anyone”. The key barrier here for scientific publishing is enabling “anyone who wants to know” to have access to scientific publications.


Why do scientists tend to prefer PDF documents over HTML when reading scientific journals?

I’m taking a play out of Bradley Voytek’s playbook and re-posting one of my answers to a question on Quora


Why do scientists tend to prefer PDF documents over HTML when reading scientific journals?

For me, there is one primary reason that I prefer the PDF versions of scientific documents:

PDFs have less clutter

They certainly don’t have to, but since publishers apply the standards of print design to the design of their publications, there is more efficiency in the PDF format.

For example, here are the HTML and PDF versions of the topmost portion of a recent Nature paper… this is essentially what I see in my browser.

Circled in blue are the things that I am looking for… This is the actual content of the article in question. In red is all the stuff I don’t care about.

That’s a lot of red, mostly for either advertisements or other services by the publisher.

Now for the PDF of the same paper…

The only thing in red here is the DOI. This is actually important, but I don’t usually need to read it and I felt bad failing to put any red on the PDF. And the figure shows up on the first page, which means that a quick glance gives me some information about the topic of the paper.

But seriously, look at that difference. In the PDF, nearly the whole space is dedicated to content and what’s not content is well formatted whitespace. In the HTML, I have to actively ignore the advertisements, not to mention the bright red hyperlinks to references in the main text.

What can publishers do?

Format HTML documents for readability and stop trying to distract me into reading something else.

I have to read lots of papers and I need to be able to easily find the things that I’m looking for. Second guessing whether I am reading journal content or an advertisement or navigation doesn’t help at all. Reading the HTML version is like trying to read a fitness magazine, where ads and content blur into one incomprehensible mess.

(A second, less important reason is that PDFs feel more stationary. I have more certainty that a PDF I download today is the same as the one my colleague downloaded last week. I know that PDFs can be generated on the fly, but this is how it “feels”.)

by Justin Kiggins