Richard Tol’s 97% Scientific Consensus Gremlins
By Collin Maessen

Last year Cook et al. released a paper that analysed the scientific consensus on anthropogenic global warming in the peer-reviewed scientific literature.
What they did in that study was look at almost 12,000 abstracts from 1991 to 2011 that matched the search terms “global climate change” or “global warming.” They found that among the abstracts that expressed a position on global warming, 97% endorsed the consensus position that humans are causing global warming. They also contacted 8,547 authors to ask if they would rate their own papers, and received 1,200 responses. These self-ratings again showed that 97% of the papers expressing a position endorsed the consensus that humans are causing global warming.
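To make the arithmetic concrete, here is a minimal sketch of how that percentage is derived. The category counts are the abstract totals as I read them from Cook et al. (2013); even if they are slightly off, the principle of the calculation stands:

```python
# Abstract counts as reported in Cook et al. (2013).
endorse = 3896    # abstracts endorsing human-caused global warming
reject = 78       # abstracts rejecting it
uncertain = 40    # abstracts expressing uncertainty about the cause

# The consensus figure is computed only over abstracts that actually
# take a position; the roughly 8,000 abstracts with no position on
# the cause of warming are excluded from the denominator.
expressed_position = endorse + reject + uncertain
print(f"{endorse / expressed_position:.1%}")  # → 97.1%
```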
For anyone aware of other studies that did something similar, these results weren’t a surprise: Oreskes 2004, Doran 2009, and Anderegg 2010 showed comparable results. No matter how these studies approach the subject, they find this level of agreement among experts and in the scientific literature. This remarkable agreement exists because a scientific consensus is reached based on the weight of the research available in the literature. It’s also this kind of scientific evidence that led to the scientific consensus on, for example, evolution, plate tectonics, the big bang, germ theory, and so on. Such a consensus only arises through meticulous study and hard work by scientists.
Which is also the reason the Cook et al. study is so relentlessly attacked by science deniers and pseudo-sceptics. It’s the only tactic they really have, as they can’t base their case on scientific research; they just don’t have the supporting evidence to show that they are right. Most of the time they can only allude to nefarious goings-on that prevent such evidence from getting into the literature. But this ignores that falsifying a well-established scientific theory or concept advances the career and reputation of a scientist far more than confirming it does.
One attacker of the Cook et al. paper surprised me though: the econometrician Richard Tol. Since the release of the Cook et al. paper he has been criticising it and attempting to show his criticisms have merit. Most of this has played out via his blog, on Twitter, and in the comment sections of various websites. What he said there seems to be the basis for his paper critiquing Cook et al. After four attempts at three different journals, his fifth attempt at getting this paper accepted finally succeeded. After reading it I can understand why he had trouble getting it accepted, and I don’t understand how it managed to survive peer review.
The biggest issue I have with the paper Tol wrote is that it cites some questionable sources, or cites material that doesn’t support what he’s claiming. He references one of these when he says that “Legates et al. tried and failed to replicate part of Cook’s abstract ratings, showing that their definitions were inconsistently applied.” The paper he’s referring to is ‘Climate Consensus and ‘Misinformation’: A Rejoinder to Agnotology, Scientific Consensus, and the Teaching and Learning of Climate Change’.
The biggest problem with this claim is that this paper didn’t try to replicate the Cook et al. paper. Legates et al. used different categories than the Cook et al. paper, which gave them a lower consensus percentage (which is why they did this). If that wasn’t already bad enough, they also excluded several endorsement categories and calculated their consensus percentage against the entire literature sample, which includes papers that didn’t say anything about global warming. That’s the reason they found a 0.3% consensus in this paper instead of 97%.
But that’s nonsensical: you can’t use papers that don’t say anything about the question you’re trying to answer. Take, for example, a literature search on HIV to answer the question of whether HIV causes AIDS. When you do this you won’t only get papers that talk about this link; the majority will talk about something entirely different. For example, how HIV is being tested as a possible carrier of genetic material in gene therapy (don’t worry, the vector doesn’t contain the RNA of HIV, so it can’t cause AIDS). A very interesting topic, and very promising for helping people with genetic disorders, but it doesn’t tell you whether HIV causes AIDS. This simple analogy shows how weak the reasoning in this paper is.
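The arithmetic behind that 0.3% figure is easy to sketch. The narrow-category count below is an illustrative assumption chosen to reproduce their result, not an exact figure taken from Legates et al.:

```python
# Denominator trick: keep only the narrowest endorsement category,
# then divide by the WHOLE sample, including the thousands of
# abstracts that say nothing about the cause of warming.
total_abstracts = 11944  # entire Cook et al. literature sample
narrow_endorse = 41      # illustrative count for the strictest category

print(f"{narrow_endorse / total_abstracts:.1%}")  # → 0.3%
```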
However, what truly amazed me was that he got away with citing blog posts. For example:
Twelve volunteers rated on average 50 abstracts each, and another 12 volunteers rated an average of 1922 abstracts each. Fatigue may have been a problem, with low data quality as a result.
For his fatigue point he cites the blog post ‘I Do Not Think it Means What You Think it Means’ (archived here), which uses out-of-context quotes from private conversations. These were obtained by a hacker who managed to bypass the security measures of a private forum to gain access to this material. This means that Tol uses illegally obtained material from a private forum, quoted out of context, to support a claim. Not one of these quotes talks about fatigue; he’s basing this particular claim on a partial quote from a comment on that blog post (as explained in his footnote):
Indeed, one of the raters, Andy S, worries about the “side-effect of reading hundreds of abstracts” on the quality of his ratings.
This is the full quote:
Like Sarah, I sometimes get a “deja lu” feeling. But I’m not sure if that’s real or just a side-effect of reading hundreds of abstracts. I’ll maybe note the title when it happens so that John can check the database.
What Andy is talking about isn’t fatigue, just that some abstracts look really similar. He wasn’t sure if his impression that some abstracts were repeated in the database was real or just a side-effect of reading hundreds of abstracts. It was indeed a side-effect: there weren’t any duplicate abstracts in the database, just abstracts on similar subjects with similar results and similar phrasing. That will create a déjà vu feeling when you’re rating abstracts over the course of a couple of months, but it’s not fatigue.
Tol citing stolen material and showing a willingness to use ethically questionable sources is part of a larger pattern. For example, on his blog he cites a different hack, one that exploited a security hole to gain access to proprietary data used for the Cook et al. paper (archived here):
[The hacker] has now found part of the missing data.
Unfortunately, time stamps are still missing. These would allow us to check whether fatigue may have affected the raters, and whether all raters were indeed human.
Rater IDs are available now. I hope [the hacker] will release the data in good time. For now, we have to make do with his tests and graphs.
In an audio presentation (archived here) Tol refers to this hacker as a “researcher”:
Cook’s university is sending legal threats to a researcher who found yet another chunk of data.
The reason this ‘legal threat’ was sent is that the materials that were stolen are proprietary, and releasing them would violate the ethical approval for the Cook et al. paper. Some raters were promised anonymity (several aren’t credited in the Cook et al. paper). This was also done to prevent attacks on raters for how they rated papers, as this isn’t relevant for verifying whether the ratings were done correctly. With the hack of the private forum it’s impossible to anonymize this last bit of data to prevent that. The legal statement by the university that Tol is referring to explains this. Yet Tol still misrepresents this statement and apparently has no qualms about using stolen materials.
But this whole point about fatigue is nonsense. What Tol is referring to is survey fatigue: the tendency of people to quit, or to become less accurate, when they fill in long surveys or a lot of surveys. But this was a team of raters who were free to rate abstracts at their leisure; they could start whenever they wanted, continue as fast or as slow as they wanted, and take a break whenever they wanted. There wasn’t a deadline for submission or for finishing the ratings. This is a method similar to the one used by Oreskes 2004, which Tol refers to as one of the “excellent surveys of the relevant literature.”
You also expect raters to become more proficient in this type of situation. The author of the book that Tol cited to make his fatigue point confirmed that this is the case for this set-up. And if fatigue had degraded the ratings, you wouldn’t expect the 97% consensus from the abstract ratings to be matched by the 97% consensus from the authors rating their own papers.
However, one of the biggest points made by Tol is that the 97% consensus is actually a 91% consensus. This is based on a misapplication of Bayesian statistics, which should be part of Tol’s expertise as an econometrician. He assumes that 6.7% of all abstracts were incorrectly rated, no matter the category they’re in, and then applies this to the entire dataset. However, the same data he used for this error rate tells us that the error rate differs per category. When the error correction is applied correctly it actually reaffirms the consensus found in Cook et al.
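A toy calculation (my own simplification, not Tol’s exact procedure) shows how assuming one uniform error rate lands almost exactly on his 91% figure, by silently converting a chunk of the large “endorse” pile into rejections:

```python
# Toy model, NOT Tol's exact calculation. Counts approximate the
# Cook et al. abstract ratings; 6.7% is the uniform error rate Tol assumed.
endorse, reject = 3896, 118   # "reject" here includes "uncertain"
error_rate = 0.067

# Uniform assumption: 6.7% of endorsement ratings were wrong, and every
# wrong endorsement rating is treated as a missed rejection.
spurious_rejects = error_rate * endorse                  # ~261 papers
corrected = (endorse - spurious_rejects) / (endorse + reject)
print(f"{corrected:.0%}")  # → 91%

# Category-specific error rates tell a different story: most rater
# disagreements were between adjacent endorsement levels or between
# "endorse" and "no position", not endorse vs. reject, so a per-category
# correction leaves the 97% figure essentially intact.
```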
In other words, if you account for what the raters actually did and then correct for this, the consensus is just as strong. Tol’s incorrect method basically creates almost 300 papers rejecting the consensus out of thin air. One hell of a mistake to make for an econometrician who has access to the information he needs to calculate the different error rates. Especially since this misapplication of Bayesian statistics was already present in rejected versions of Tol’s paper. One of the reviewers had the following to say (bolding mine, archived here):
In large part, these figures suggest mostly that it was human beings doing the research and thus humans can get fatigued in such a large and daunting research project. This can lead to lower quality data, but rarely are data perfectly transcribed and meet all assumptions, especially when humans are categorizing and qualitatively assessing literature, and there’s no indication that this biased the findings. In fact, all indications, such as the self-rating by paper’s authors, suggest that the analysis was robust and the findings not influenced by fatigue or low data quality.
Another reviewer comment says the following about Tol’s accusations of hiding data (Tol mentions this in my earlier quotes about stolen materials, and it is still present in his paper):
This section is not supported by the data presented and is also not professional and appropriate for a peer-reviewed publication. Furthermore, aspersions of secrecy and holding data back seem largely unjustified, as a quick google search reveals that much of the data is available online (http://www.skepticalscience.com/tcp.php?t=home), including interactive ways to replicate their research. This is far more open and transparent than the vast majority of scientific papers published. In fact, given how much of the paper’s findings were replicated and checked in the analyses here, I would say the author has no grounds to cast aspersions of data-hiding and secrecy.
But the most damning is the response from Environmental Research Letters in their rejection letter:
I do not see that the submission has identified any clear errors in the Cook et al. paper that would call its conclusions into question – in fact he agrees that the consensus documented by Cook et al. exists. […]
Yes, you read that right. Tol actually agrees with the consensus, and I quote from the paper that got accepted by Energy Policy:
There is no doubt in my mind that the literature on climate change overwhelmingly supports the hypothesis that climate change is caused by humans. I have very little reason to doubt that the consensus is indeed correct
He even stated in an earlier version of this paper that “It does not matter whether the exact number is 90% or 99.9%.” Yet in the version of the paper I just cited he also said that “The claim that 97% of the scientific literature endorses anthropogenic climate change (Cook et al., 2013, Environmental Research Letters) does not stand”.
So to which Tol should I listen? The one who says that the scientific consensus on global warming is real, or the Tol who says it wasn’t found? The latter position is based on errors and unsubstantiated accusations. Which leaves the question of why he is doing this, and in this way. Fortunately he has explained why:
I have three choices:
a. shut up
b. destructive comment
c. constructive comment

a. is wrong
c. is not an option. I don’t have the resources to redo what they did, and I think it is silly to search a large number of papers that are off-topic; there are a number of excellent surveys of the relevant literature already, so there is no point in me replicating that.

that leaves b
Even for a “destructive comment” the quality is appalling, which is the reason this paper was rejected multiple times with scathing remarks from reviewers. Probably the only reason he managed to get this error-riddled paper into the scientific literature is that the journal that accepted it, Energy Policy, normally addresses “the policy implications of energy supply and use from their economic, social, planning and environmental aspects.” Which could simply mean they lacked the expertise to pick up on the mistakes Tol was making. However, it does baffle me why they allowed dubious sources and unsupported claims to slip past them.
This paper is just a slightly more formal version of his unfounded attacks on the Cook et al. paper. From the very start he has attacked it with the same baseless accusations that are present in this paper. He has made them on his blog, on Twitter, and in the comment sections of other blogs; he even showed up here on Real Sceptic to repeat them. They were even repeated during a Committee Hearing, where he said:
The 97% estimate is bandied about by basically everybody. I had a close look at what this study really did, and as far as I know, as far as I can see, this estimate just crumbles when you touch it. None of the statements in the paper are supported by any data that is actually in the paper. So unfortunately – I mean it’s pretty clear that most of the science agrees that climate change is real and most likely human-made, but this 97% is essentially pulled from thin air. It’s not based on any credible research whatsoever.
But it’s not pulled from thin air: the data and method it’s based on are robust, and the result was confirmed by the author self-ratings, which gave the same 97% consensus.
Tol also instantly dismissed one of the first articles pointing out the flaws in his paper, an article based on the reviewer comments, saying that no errors were mentioned in it. He has even gone as far as stating on Twitter that his original paper was unfairly rejected by Environmental Research Letters and that he had addressed all points of criticism from their reviewers. But by now it should be obvious that most of these points aren’t addressed at all in the published version. To me it looks like Tol cannot be critical of himself when he thinks he’s right, which leads to him rejecting valid criticism.
A good example of this behaviour is what happened with his paper ‘The Economic Effects of Climate Change’, which he had to correct. Tol attributed the mistakes that led to this correction to “gremlins,” but that’s an understatement. The errors were so significant that the results changed from showing economic benefits from warming below about 2°C to showing that impacts are always negative. Yet Tol has said that “Although the numbers have changed, the conclusions have not. The difference between the new and old results is not statistically significant. There is no qualitative change either.”
When Andrew Gelman, a statistical heavyweight, further analysed and critiqued the paper, Tol got very defensive. To the point that Gelman saw the need to say the following:
There’s no shame in being confused—statistics is hard. But if your goal is to do science, you really have to move beyond this sort of defensiveness and reluctance to learn […] I’m sure you can go the rest of your career in this manner, but please take a moment to reflect. You’re far from retirement. Do you really want to spend two more decades doing substandard work, just because you can? You have an impressive knack for working on important problems and getting things published. Lots of researchers go through their entire careers without these valuable attributes. It’s not too late to get a bit more serious with the work itself.
At one point Tol said to Gelman: “You’re a sound statistician. You got all the data. Get to work. Show me how it’s done.” Which is exactly what the team behind Cook et al. have been saying all this time. Tol has the data to replicate their research and check their results, yet he didn’t do this. He only managed to produce a paper that Environmental Research Letters described in their rejection letter as “[reading] more like a blog post than a scientific comment.” The now-published version isn’t much better; it still has a small army of gremlins in it. John Cook and his team have already identified 24 errors in Tol’s paper, 11 of which were already identified by the reviewers of Environmental Research Letters.
Fortunately science is self-correcting, and this paper will fade from the literature in due course, hopefully sooner rather than later. That won’t prevent the usual suspects from spreading misinformation based on this flawed paper. But if this was the best Tol could do, “shutting up” would have been his best option in this case. Especially considering what Tol himself said would be the consequence of publishing a flawed paper:
If I submit a comment that argues that the Cook data are inconsistent and invalid, even though they are not, my reputation is in tatters.
Featured comments
-
Tol’s figure s4 shows that his larger Scopus sample (the one he prefers) includes far more disciplines less likely to be relevant to human-caused global warming, such as: psychology, arts and humanities, computer science, business management and accounting, medicine, and genetics/molecular biology.
So Tol seems to disagree with you, and with himself.
-
I don’t really care whether or not the taking of private information from SkS was criminal or not. Even if it was technically criminal, it is relatively trivial and prosecuting it would be a waste of time, in my opinion. The interesting thing in this discussion is that the people who published this stuff fail to see any ethical problems.
In my professional life as a consultant and, once upon a time, an executive, I would not dream of using information in reports or presentations that had questionable provenance. By any professional code of conduct, using inappropriately obtained private information for personal or corporate advantage is an ethical breach. It doesn’t matter whether that information came from going through someone’s trash, from eavesdropping in a bar, phishing, hacking someone’s website or breaking into a safe. Using it is wrong and, in a reputable company, it can get you disciplined.
The truly staggering thing is that some people who applaud the publishing of the private information from the SkS hacks, also complain bitterly when public comments are quoted and analyzed, as in the Lewandowsky et al Recursive Fury paper.
“If I submit a comment that argues that the Cook data are inconsistent and invalid, even though they are not, my reputation is in tatters.”
I think Tol’s recent behaviour suggests his reputation is just that.
Dear Collin Maessen, in accordance with your policy that factual statements must be backed up when challenged, I must request you support (or retract) several remarks you made in this post.
First and foremost, you labeled me a hacker, claiming I hacked a web server. Such an action would be a criminal offense. That is a serious charge. I do not believe you have any basis for making it. Please explain what I did that justifies labeling me a criminal, keeping in mind the fact I have described exactly what I did that led to obtaining the data in question.
Second, you claim releasing “the materials that were stolen are proprietary” would violate the ethical approval for the Cook et al paper. Please provide evidence an ethical approval actually exists. Keep in mind, the University of Queensland refused to provide it to me when I requested it.
Third, please explain how my using or disseminating data could possibly amount to a violation of an ethics approval I was not party to. On what basis do you claim an ethics approval between two parties justifies legal threats against a third party?
Finally, you claim I wrote a post “that uses out of context quotes from private conversations.” Please explain which quotes in my post lacked relevant context.
Brandon, you might not be aware that I’m part of Skeptical Science, which means I know exactly what you did and what you didn’t share about what you did. The details that you didn’t share would make it rather obvious that it was hacking, even if it was at the script-kiddie level.
The site also told you that content was restricted, with all the login screens you encountered. But eventually you found content that was accidentally left unprotected: content you knew, from public statements, was meant to be private. Accessing that data makes it “digitale huisvredebreuk” (digital trespassing) in my country.
As you bring up ethics: the ethical thing to do would have been to notify SkS and leave the data alone. But that’s not what you did; you downloaded it and started using it. The same goes for the private conversations you used. By definition that’s out of context, as it was meant to be private and you’re not showing all of it. You also show a distinct lack of understanding of how ethical approvals are used or what their intent is. But that doesn’t surprise me considering your behaviour.
Considering I don’t feel much for another bout of unpleasant exchanges with you, I’ll consider this the end of this discussion. If you have any formal complaints, please submit them via my contact form.
Citing blog posts is fine if they are relevant. Is there any evidence that the forum data was ‘obtained illegally’?
The arguments about the statistical methods seem beside the point to me. The first paragraph of section 4 in Tol’s paper provides the key argument against the 97% paper: simply that the search terms used sucked in a large number of papers that contained no expertise on anthropogenic attribution. The quality of the sample was sacrificed for quantity, imho.
There’s a reason why I include links in my articles, one of them explains how the forum data was obtained.
The SkS document ‘24 Critical Errors in Tol (2014)’ explains perfectly well why Tol’s claim about the used search terms doesn’t hold water.
That’s not what I’m getting at. Let me put it another way: what percentage of the papers do you suppose were written by experts on global warming attribution?
(…the explanation link doesn’t load)
Then you’re asking a question this paper doesn’t even try to answer. What you’re asking about is what the experts say; that’s more a survey of people than a survey of the literature. One such study is Anderegg 2010, which I cited at the beginning of this article. It also finds a consensus of 97% among climate researchers.
There will always be a close match between the level of agreement among experts and the agreement in the literature. I’ve already stated why this is in the above article.
(The SkS website was a bit sluggish for a while as it was getting a lot of traffic.)
Warren,
Maybe you could explain why this is relevant. Nowhere does Cook et al. even suggest that the goal was to determine the fraction of abstracts/papers that attribute anthropogenic influences. It wasn’t a study that aimed to determine the strength of the scientific evidence that specifically addressed anthropogenic warming. As the abstract very clearly says
Now you could quite rightly argue that the Cook et al. study did not address the actual strength of the scientific evidence; it simply illustrated the level of endorsement of the consensus position. However, given that Cook et al. were never intending – or claimed – to address the strength of the actual evidence, that argument would seem to be irrelevant.
Thanks for this comment, very helpful…although I’m now confused about the point of the paper. It might be helpful to use Collins & Evans’ simple classification of expertise, concisely explained at
http://en.wikipedia.org/wiki/Interactional_expertise#Classification
Contributory experts are the people who know *how* to do the work in a particular field. Interactional experts spend a long time reading and talking to contributory experts, until they can use the same language and show some understanding of the issue. I might suggest that in your time as a climate blogger, ATTP, you have moved towards being an interactional expert (as have some other climate bloggers). The wiki piece has a very good example involving plumbing. If I hung out with plumbers for a while, I’d develop an understanding of a central heating system (interactional). But if you wanted one installed, you still wouldn’t get me to do it; you’d ask a proper plumber (contributory). Similarly, if I wanted to know about AGW attribution I’d ask someone who does that work themselves, not someone who is just familiar with the literature.
So, the problem with the methodology is that it’s pulling in loads of interactional experts, who are drowning out the contributory experts. That’s *not* to prejudge what the % would be amongst the latter, it might be more or less than 97%. It’s just to argue that the paper is misunderstanding what expertise is about, and providing data which isn’t particularly relevant to our current understanding of AGW attribution imo.
Warren, might I suggest you read the paper? Currently you come across as not having read it, which isn’t really helping the discussion. At the moment you’re raising a point that, again, is never made in the original paper.
It’s also irrelevant to determining the strength of the consensus in the literature. Experts, even those in other fields, generally speaking know what is or isn’t established. It’s not that hard for them to ask the relevant questions so they get the answers they need. Sometimes this isn’t even necessary for well-established and confirmed findings such as anthropogenic global warming.
Warren,
Let me see if I can explain the relevance. The point of the paper is not to try and understand AGW attribution. If you want to know that, you can read the IPCC documents or even delve into the literature. The point of the paper was very simply to address claims that there is much more disagreement about the fundamentals of AGW than is actually the case. That’s all that it’s doing. Nothing more, nothing less.
Essentially, it’s not a study that’s for the benefit of scientists or others working or interested in climate science. It’s for the benefit of those who don’t spend their time on blogs or reading scientific papers but might have been to typical mis-information sites (like WUWT or Bishop-Hill) where they’ll be told that the level of agreement is much smaller than it actually is.
You seem to have an issue with using interactional papers. This, however, is essentially the only way to do it. Imagine you’re trying to understand how well scientists accept gravity (apologies for using gravity, but it is a good analogy). To do this you could go through the literature and find all papers that have anything to do with gravity (whether studying it or using it). You’ll discover that virtually all use Newton’s Laws (or – in some cases – GR). There is no accepted alternative. Similarly in climate science. You consider all the literature – or as much as is possible given the resources – and investigate what they “use” when it comes to any aspect that addresses past or future warming. You’ll discover that they largely accept what the IPCC regards as the most likely influence of anthropogenic warming.
So, you seem to be arguing that they should only consider papers that directly address the warming (i.e., investigate what’s causing it). The problem with doing only this is that you won’t get a sense of how well this is accepted by others who need to consider the warming to, for example, investigate possible impacts. So, by not restricting the papers to only those that directly address the cause of warming, you get a much better sense of how well the fundamental ideas are accepted by the overall research community.
Tol’s figure s4 shows that his larger Scopus sample (the one he prefers) includes far more disciplines less likely to be relevant to human-caused global warming, such as: psychology, arts and humanities, computer science, business management and accounting, medicine, and genetics/molecular biology.
So Tol seems to disagree with you, and with himself.
Not sure why this is relevant…but ok!
How do you mean “not sure why this is relevant”? This is a direct response to the following that you said (bolding mine):
In other words, what Tol wants to do would increase the sampling of the “papers that contained no expertise on anthropogenic attribution.” This means the suggested change would do exactly what you claim the current search terms did.
This exact point was raised in the 24 errors document that I referenced in the above article. I quoted the relevant part in one of my responses to you.
The BS hacking of SkS may not have been illegal in the sense that it would be criminal by itself. (There is an interesting case on this where a financial announcement from a company was URL-hacked by a Reuters reporter who scored a scoop; the judgement was that the company did not hide the page well enough.) However, the subsequent set of demands on UQ and Cook really pushed the envelope, IEHO, over to harassment if not extortion, and there is no doubt that the data belongs to UQ. In that case you have lost-wallet law: the wallet is still the property of the loser, not the finder, and the finder, if she does not return it, would be subject to civil and in some cases criminal penalties.
Eli prefers the circumlocution extralegal to illegal in that case, because the URL hack itself probably was not criminal.
The earlier hack of the SkS forum is a whole other kettle of fish, because the site was password protected.
[snip]
Eli – Actually, the BS hacking was and is _completely illegal_ under current US law.
For a case precedent, the conviction of Andrew ‘Weev’ Auernheimer was for using a script to generate a series of AT&T URLs and harvesting email addresses. While AT&T had not protected those URLs (and apparently was able to correct the security flaw in about an hour after discovering it), this was considered theft under the Computer Fraud and Abuse Act – and, I’ll add, a reasonable expectation of privacy wrt material that may not have been fully locked down (Katz v. United States 1967).
The AT&T case is almost a copy of how BS claims to have hacked SkS, trying various URL permutations (with a script?) to see what might be publicly unlinked but not locked down. While Auernheimer’s conviction was overturned on a jurisdictional technicality, the legal precedent stands – and indicates that BS broke the law. [Auernheimer’s sentence was 41 months and $73,000 in damages, quite arguably extreme, but there you have it…]
So, yes, the URL hack of SkS was wholly illegal as per the Federal CFAA. Similar laws and language are also on the books in many US states, ranging in severity from misdemeanor to felony level. And despite BS’s many protestations regarding legality, ignorance of the law is no defense.
The whole affair has been _fascinating_ to watch…
KR: In addition to the federal statutes governing the theft of intellectual property and data, there are also Illinois state statutes that come into play in the “BS Affair.”
John Hartz
[snip]
Possibly your vague allusion to “other statutes” is alluding to something else. Obviously, no one can begin to debate whether you are right or wrong about some other Illinois statute if you allude to it so vaguely that no one can know what it is.
KR,
If you think the “AT&T case is almost a copy of how BS claims to have hacked SkS,” you are either unfamiliar with the details of the AT&T case or unfamiliar with what Brandon says he did. (Possibly you, like Eli, have goofed and taken Weev’s characterization of all he did as being the same as the prosecutors’ claims.) Good luck guys. But ignorance of the details of the AT&T Weev case doesn’t make that case the same as what Brandon did.
[snip]
Based on his apparently lax standards, there are likely technical problems with the ethics of Tol’s paper quite apart from the exact “where” of his obtaining the “what” of the “whom” he has chosen to use as research data in support of his paper’s scattershot of hypotheses.
No matter the “where” the source of data involving living human beings or the “why” of its employment, the “how” of obtaining it requires a formal process entailing what’s called “informed consent” from the “whom” producing that information, assuming such person is alive. Folks who are curious about this can check the University of Sussex ethics instructional material for researchers for more background.
Social science researchers are not exempt; human subjects are not only people taking pills, or being opened up.
If no human subjects and their thoughts and feelings were required for Tol’s hypothesis on human nature, presumably no subjects would have been included as part of Tol’s speculations. As it stands, Tol did employ at least one human subject clearly within the scope of ethics guidelines in his work.
So the question is, given that Tol had no compunctions against including material that was stolen from a private communications system in his work, was he squeamish about proceeding without permission from an ethics review board in using that material?
Well, while Tol has acknowledged the latest hacker [snip] in his paper, he’s not included material from the [hacker] foray in T14. Talk about WUWT; why would a person with Tol’s stature include [someone] with too much time and bandwidth on his hands in his acknowledgement?
[snip]
Perhaps some (KR?) might want to contribute over at Lucia’s echo chamber? Carrick is quite certain that there was no actionable offense.
I wouldn’t bother. They’ve been dismissing and attacking any sane advice from the very start. There are better things you can do with your time than to give someone legal advice they won’t take and who will attack you for giving it.
I personally have no great interest in visiting the Lucia echo chamber. They are quite self-righteous about hacking, and seem to think that the only data illegal to steal is sealed under 10 feet of concrete.
Looking into this a bit further, the CFAA may _not_ be applicable in this case because the SkS hack, while a case of “(A.2) …intentionally accesses a computer without authorization or exceeds authorized access, and thereby obtains […] (C) information from any protected computer”, isn’t past the $5,000 threshold, on a federal computer, or directly affecting interstate commerce. It’s not quite a big enough hack to be covered by Federal laws.
On the other hand, under the relevant Illinois law (where the hacker resides), (720 ILCS 5/) Criminal Code of 2012, the hack would fall under “Computer tampering”:
“(a) A person commits computer tampering when he or she knowingly and without the authorization of a computer’s owner or in excess of the authority granted to him or her:
(1) Accesses or causes to be accessed a computer or any part thereof, a computer network, or a program or data;
(2) Accesses or causes to be accessed a computer or any part thereof, a computer network, or a program or data, and obtains data or services”
Since the data was _well known_ to be protected (not public) information, as per the UQ statements regarding privacy and the ongoing discussion between the Cook et al authors and Richard Tol, since the hacker (see above) bounced off several login screens, and the hacker boasted of having obtained private information, “knowingly” accessing private information has been clearly demonstrated.
Poor security on the part of the website does not make this hack legal – that would be a case of unlawful entry, as opposed to breaking and entering; and such entry is still a crime. Also relevant is a reasonable expectation of privacy, as per Katz v. United States – the URLs the hacker chased were not publicly linked. Forcible entry is not a prerequisite to burglary, although it often makes prosecution easier.
In Illinois the crime of computer tampering is a class A misdemeanor, one step below felony, for the _first_ offense, with imprisonment up to 364 days and/or fines up to $25,000.
—
Directly relevant to various claims put forth regarding the hack and legality is People v. Janisch, 2012, Illinois. To prove guilt in this case, the prosecution had to prove the following:
“First Proposition: That the defendant knowingly accessed data; and
Second Proposition: That the defendant obtained data; and
Third Proposition: That the defendant acted without the authorization of the computer’s owner; and
Fourth Proposition: That the defendant knew (s)he acted without the authorization of the computer’s owner.”
All four propositions hold in the SkS hacking case. Janisch appealed her conviction, and promptly lost.
“…the possessory interest required in the context of criminal law is no longer limited to having a real or personal interest in property; rather, the victim need only have an interest greater than that of defendant, and in defendant’s case, the victim held such an interest and did not authorize defendant’s access to the data.”
—
Most of the various excuses for the SkS hacker come down to claims that unless the data requires explosives and a bulldozer to extract, the hacker is in the clear – in other words, blaming the victim. Such claims, as in assault cases, do not hold up in court. The SkS hack was criminal, the hacker’s ethics are apparently non-existent, and the blaming of the victim by the hacker’s apologists is offensive nonsense.
I don’t really care whether or not the taking of private information from SkS was criminal or not. Even if it was technically criminal, it is relatively trivial and prosecuting it would be a waste of time, in my opinion. The interesting thing in this discussion is that the people who published this stuff fail to see any ethical problems.
In my professional life as a consultant and, once upon a time, an executive, I would not dream of using information in reports or presentations that had questionable provenance. By any professional code of conduct, using inappropriately obtained private information for personal or corporate advantage is an ethical breach. It doesn’t matter whether that information came from going through someone’s trash, from eavesdropping in a bar, phishing, hacking someone’s website or breaking into a safe. Using it is wrong and, in a reputable company, it can get you disciplined.
The truly staggering thing is that some people who applaud the publishing of the private information from the SkS hacks, also complain bitterly when public comments are quoted and analyzed, as in the Lewandowsky et al Recursive Fury paper.
Andy,
You don’t understand the substance of the complaints about Fury. The complaints are not that comments are quoted and analyzed. Their substance differs from that.
As for this
The information Brandon obtained wasn’t obtained in any of these ways. [snip]
Lucia, you may want to read KR’s comment above. Brandon obtained, by his own admission, data that he knew was not and should not be publicly available. By that he fits all four propositions: 1) he knowingly looked for and then accessed this data, 2) he downloaded the data and even spread it to others, 3) he did not receive permission from the data owner to do so, and 4) he knew he did not act with permission from the data owner, since he was aware that UQ had resisted access to further data, and only after obtaining the data contacted the owner (where he did not ask whether he could release the data, but rather demanded a good reason not to release the data).
It isn’t much different from walking into somebody’s house, knowing full well it isn’t your house and having no defendable reason nor explicit/implicit permission to enter the house (*). It doesn’t matter that the door is wide open, and it doesn’t matter whether you do or do not take something. It’s trespassing, and thereby actionable.
(*) Note that “no defendable reason” includes “getting back what he/she had stolen from me”
Marco,
You are mistaken about Brandon’s access meeting this
We were discussing Illinois law and that case was in Illinois (Brandon and I live in Illinois.) [snip]
An Illinois statute defines what is or isn’t authorized access:
The notable part here is that you need authorization from the owner to access data. Public statements had already made clear that certain data wouldn’t be accessible to the public, for example identifiable information from the self-ratings, along with the reason why. It’s also publicly stated which data should be available, and Brandon has written about how he knew that the data he obtained wasn’t meant to be publicly accessible. That makes what he did unauthorized access.
Collin,
Your interpretation is nonsense. The ‘public’, that is everyone, was authorized to access those pages. Good luck trying your theory in court. Google indexed some pages at that site.
You might claim it’s nonsense, but it’s what the statute states (one that you used multiple times in several discussions). Or, for example, the Dutch law that I originally referenced. Permission is what matters, not that something was accidentally left unprotected and can be accessed. Especially when Brandon has admitted that he knew it wasn’t meant to be publicly accessible, and later stated that this accidentally unlocked door was locked on him.
Lucia, your interpretation is very creative, but no judge will accept it. Authorization requires more than just not sufficiently protecting certain URL’s. And as Brandon has essentially admitted, he *knew* this data was not supposed to be openly available. That is, he *knew* it was not authorized for free public access. His public statements make that clear. That shuts the door on any defense using the third proposition.
Collin,
Note the statute says “network”. With respect to your argument, you substituted a concept about “data”. Good luck convincing a judge that he has to understand that the Illinois State Legislature meant “data” when they wrote “network”.
Lucia, I work in IT so I’m not mixing up any terminology here. Please note that the statute you referenced defines what a network is and what kind of capabilities that includes. It’s the very term used when the statute mentions what would be considered authorized access:
But that’s not the full context, I was talking about what the entire statute says on this matter (although I didn’t state that clearly). For example:
That’s what I’m referring to when I talk about accessing data (the network authorization section refers back to the part above). I’m getting the impression that you’re just selectively reading these types of laws. I’ve seen multiple examples of this behaviour from you, as well as you dismissing sound advice from those who have an idea of what they’re talking about. You don’t even see the ethical issues of knowing data was private and then still accessing it.
Collin,
I’m seeing selective reading of statutes from you, and I see you rejecting sound advice from people. If you want to change the subject to your view on ethics – fine. [snip] But your view on ‘ethics’ is separate from what Illinois law is. Whether you like it or not, your interpretation of that law does not fly in Illinois courts. If access to the network is authorized and data is placed on that network, access to that data is authorized. That’s the way the courts interpret it. That network was public facing. Access to that network and those pages was authorized to the public. If the owner of that network put information (i.e. ‘data’) on those pages, he implicitly authorized the public to access it, notwithstanding your theory that somewhere else, in some other venue, he might have been reluctant to hand it over to people.
I have no idea what’s up in the Netherlands, but this is the case in Illinois, the US and Texas ([snip]).
That’s your defence, just saying I do that without any reference to what was selective? This after I explained why the passage you’re quoting is selective, as it doesn’t take into account what the rest of the statute says, and showed those parts. I can only say good luck if you actually want to use your current interpretation as a defence in court, because these laws will absolutely not be interpreted the way you interpret them. But I won’t spend more time trying to explain this, as multiple people have already tried and you’ve so far only rejected what they said. Especially considering what you think would constitute permission to access data when clear statements were already available about what could and couldn’t be accessed.
Also, I already mentioned the ethics of the hack, both in the above article and in my comments. And yes, it is separate from Illinois law. But I never said those two points are one and the same; every time I raised it, it was a separate, independent point. It doesn’t change that, even if you disregard the legality question (or if it actually was legal), what Brandon did wasn’t exactly ethical.
Whatever data was obtained, legally or otherwise, why isn’t it being disclosed? [snip]
The reasons for not releasing the data seem trivial. If you are being honest – just put it all out there [snip]
I already answered this point in the above article. I included the following link to a statement from the University of Queensland which states the following:
I also linked to the Skeptical Science page where all this data is available (the statement released by the University of Queensland also does this):
One of the reviewers noticed this and wrote a comment on it:
All the data you need is available. Just not the data that can be used to attack individuals.
Again I ask, what’s the problem? The participants’ identity? This is my point – it seems pretty trivial [snip]. I seriously doubt these participants face prosecution, torture or death regardless of how they responded. At best they may be embarrassed about something if the survey isn’t on the level. [snip]
You don’t see the problem after what I explained in my previous comment? Participants were promised anonymity (several raters aren’t mentioned in the paper) and the university has an obligation to honour that, especially when that data isn’t needed for replication. Also, participants wouldn’t face anything as extreme as the examples you mentioned, but neither is the risk as low as you portray it. There are more than enough examples of scientists, or those involved in science communication, facing harassment.
How papers were rated and the ratings themselves are available. There’s nothing stopping anyone from replicating the research and checking the results. Focussing on irrelevant data for doing that is just a distraction from that.
William,
There’s another issue though. In any research you will generate much more data than you could possibly release. What you need to release is what’s necessary for the result to be tested/reproduced. You’re not obliged to release data simply so that others can audit what you’ve done. In the case of Cook et al. all the necessary data is available. You can actually go to an abstract, read it, rate it, and then compare your rating with the rating given by both of the individual raters and the final rating. You can download all the abstracts and redo the entire study. Continuing to ask for more and more data when the authors have released all that is necessary either shows a complete lack of understanding of the scientific method or an intent to simply continue “auditing” until they find something to criticise. Considering who’s doing it, my guess is a combination of the two.