A simple recipe for the manufacturing of doubt

By Klaus Oberauer
Posted on 19 September 2012
Filed under Cognition
and Stephan Lewandowsky
Professor, School of Experimental Psychology and Cabot Institute, University of Bristol

Mr. McIntyre, a self-declared expert in statistics, recently posted an ostensibly unsuccessful attempt to replicate several exploratory factor analyses in our study on the motivated rejection of (climate) science. His wordy post creates the appearance of potential problems with our analysis.

There are no such problems, and it is illustrative to examine how Mr. McIntyre manages to manufacture this erroneous impression.

Our explanation focuses on the factor analysis of the five “climate science” items as just one example, because this is the case where his re-“analysis” deviated most from our actual results.

The trick is simple when you know a bit about exploratory factor analysis (EFA). EFA serves to reduce the dimensionality in a data set. To this end, EFA represents the variance and covariance of a set of observed variables by a smaller number of latent variables (factors) that represent the variance shared among some or all observed variables.

EFA is a non-trivial analysis technique that requires considerable training to use competently, and a full explanation is far beyond the scope of a single blog post. Suffice it to say that EFA takes a bunch of variables, such as items on a questionnaire, and replaces that multitude of items with a small number of “factors” that represent the common information picked up by those items. In a nutshell, EFA permits you to go from 100 items on an IQ test to a single factor that one might call “intelligence.” (It’s more nuanced than that, but that captures the essential idea for now.)

One core aspect of EFA is that the researcher must decide on the number of factors to be extracted from a covariance matrix. There are several well-established criteria that guide this selection. In the case of our data, all acknowledged criteria yield the same conclusions.

For illustrative purposes we focus on the simplest and most straightforward criterion, known as the Kaiser criterion, which states that one should extract only factors with an eigenvalue > 1. (If you don’t know what an eigenvalue is, that’s not a problem: all you need to know is that this quantity should be > 1 for a factor to be extracted.) The reason is that factors with eigenvalues < 1 represent less variance than a single observed variable, which negates the entire purpose of EFA, namely to represent the most important dimensions of variation in the data in an economical way.
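To see how the eigenvalue rule works in practice, here is a small illustrative sketch in Python (using hypothetical simulated data, not our survey responses): five items are generated from a single latent dimension plus measurement error, and the eigenvalues of their correlation matrix are checked against the > 1 cutoff.

```python
import numpy as np

# Hypothetical example (not the survey data): five questionnaire items
# that all reflect one latent attitude plus independent measurement error.
rng = np.random.default_rng(0)
n_respondents = 5000
latent = rng.standard_normal(n_respondents)
loading = 0.9
items = (loading * latent[:, None]
         + np.sqrt(1 - loading**2) * rng.standard_normal((n_respondents, 5)))

# Eigenvalues of the items' correlation matrix, largest first.
eigenvalues = np.sort(np.linalg.eigvalsh(np.corrcoef(items, rowvar=False)))[::-1]

# Eigenvalue > 1 rule: retain only factors whose eigenvalue exceeds 1.
n_factors = int((eigenvalues > 1).sum())
print(eigenvalues.round(2))  # one large eigenvalue, the rest well below 1
print(n_factors)             # 1
```

With these settings the first eigenvalue comes out near 4.2 and the remaining four near 0.2, so the rule retains a single factor, which is exactly the pattern one expects when one latent dimension drives all the items.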

Applied to the five “climate science” items, the first factor had an eigenvalue of 4.3, representing 86% of the variance. The second factor had an eigenvalue of only .30, representing a mere 6% of the variance. Factors are ordered by their eigenvalues, so all further factors represent even less variance. 
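For standardized items, a factor’s share of the total variance is simply its eigenvalue divided by the number of items, which is where these percentages come from. A quick check of the arithmetic (in Python, purely for illustration):

```python
# Five standardized items contribute a total variance of 5, so each
# factor's share of the variance is its eigenvalue divided by 5.
n_items = 5
for eigenvalue in (4.3, 0.30):
    share = eigenvalue / n_items
    print(f"eigenvalue {eigenvalue}: {share:.0%} of the variance")
# eigenvalue 4.3: 86% of the variance
# eigenvalue 0.3: 6% of the variance
```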

Our EFA of the climate items thus provides clear evidence that a single factor is sufficient to represent the largest part of the variance in the five “climate science” items.  Moreover, adding further factors with eigenvalues < 1 is counterproductive because they represent less information than the original individual items. (Remember that all acknowledged standard criteria yield the same conclusions.)

Practically, this means that people’s responses to the five questions regarding climate science were so highly correlated that they reflect, to the largest part, variability on a single dimension, namely the acceptance or rejection of climate science. The remaining variance in individual items is most likely mere measurement error.

How could Mr. McIntyre fail to reproduce our EFA?

Simple: In contravention of normal practice, he forced the analysis to extract two factors. This is obvious in his R command line:

pc=factanal(lew[,1:6],factors=2)

In this and all other EFAs posted on Mr. McIntyre’s blog, the number of factors to be extracted was chosen by fiat and without justification.

Remember, the second factor in our EFA for the climate items had an eigenvalue far below 1, and hence its extraction is nonsensical. (As it is by all other acknowledged criteria as well.)

But that’s not everything.

When more than one factor is extracted, researchers can rotate factors so that each factor represents a substantial, and approximately equal, part of the variance. In R, the default rotation method for factanal, which Mr. McIntyre did not override, is Varimax rotation, which forces the factors to be uncorrelated. As a result of rotation, the variance is split about evenly among the factors extracted.
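The effect of rotation is easy to demonstrate. The sketch below (Python, with made-up loadings rather than anything from the paper) implements the standard Varimax algorithm: because the rotation is orthogonal, it leaves the total explained variance and each item's communality untouched, but it redistributes that variance between the two factors.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal Varimax rotation of a factor loading matrix."""
    n_items, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # One step of the standard SVD-based Varimax fixed-point iteration.
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated**3
                          - rotated @ np.diag((rotated**2).sum(axis=0)) / n_items)
        )
        rotation = u @ vt
        if s.sum() < objective * (1 + tol):
            break
        objective = s.sum()
    return loadings @ rotation

# Made-up unrotated loadings: five items, two factors, first factor dominant.
unrotated = np.array([[0.90, 0.30], [0.80, 0.40], [0.85, -0.30],
                      [0.90, -0.35], [0.80, 0.10]])
rotated = varimax(unrotated)

# Sum of squared loadings per column = variance attributed to each factor.
print((unrotated**2).sum(axis=0).round(2))  # lopsided before rotation
print((rotated**2).sum(axis=0).round(2))    # more evenly split after rotation
```

Note that the total of the squared loadings is identical before and after rotation; only its division between the two factors changes, which is why rotating a forced two-factor solution makes the second factor look far more important than the unrotated eigenvalues warrant.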

Of course, this analysis is nonsensical because there is no justification for extracting more than one factor from the set of “climate science” items.

There are two explanations for this obvious flaw in Mr. McIntyre’s re-“analysis”. Either he made a beginner’s mistake, in which case he should stop posing as an expert in statistics and take a refresher in Multivariate Analysis 101. Or else he intentionally rigged his re-“analysis” so that it deviated from our EFAs, in the hope that no one would see through his manufacture of doubt.





Comments 1 to 50 out of 589:

  1. As I have said in other responses here: you can't know someone else's intentions. So I choose explanation 1 - and I'm so looking forward to the Excel experts' response to this one. A beginner's mistake, you say? Stop posing as an expert? Take a refresher of Multivariate Analysis 101? Ouch.
    Look, a squirrel!
  2. I agree with Bluebottle - attempting to understand someone's internal motivations is a fraught exercise.

    However, McIntyre would have to be extremely (self snip) naïve to imagine that no-one would notice a deliberate manipulation of the analysis, especially when the error is as egregious as it is. I'm not saying that McIntyre is not so extremely (self snip) naïve, but it would be disconcertingly surprising were it true.

    This doesn't leave many viable alternatives.
  3. Perhaps one question related to the paper: would it not be advisable to review the title (ie Nasa Faked the Moon landings, therefore [climate] science is a hoax)

    as this seems to be a major cause for concern amongst the sceptics, mainly because the media have chosen to report on the paper with a particular emphasis on this.

    Given that I think there were only 3-4 sceptic 'moon' responses in the dataset (subject to some debate), the heading just led to media reporting and activists saying that ALL sceptics are moon landing conspiracy theorists, which of course is not shown.

    The free market results perhaps deserve to be the headline, as this appears to be the stronger result.

    This, I think, is why there has been so much attention on this paper: because of how the media (and some 'climate activists') have chosen to present it to the public.

    At the Skeptical Science blog - John Cook's website (your co-author on another of your recent papers, which by the way I thought was actually very informative) - there are some similar observations about the title being perhaps 'unwise'; some put it a bit more strongly.

    (ie Tom 35#, for example and later on)
    http://www.skepticalscience.com/AGU-Fall-Meeting-sessions-social-media-misinformation-uncertainty.html

    or in the comments at this very blog (comment 82): http://www.shapingtomorrowsworld.org/news.php?p=2&t=83&&n=166#1460

    that the most problematic issue is merely the title.


    The media (and activists) seem to care little for statistics, but they love a good sound bite; perhaps then the title could be re-considered lest your paper be misquoted by the media.

    They might then have to actually read it; some just seem to cut/paste from the press release, which is a common complaint these days from many scientists about the quality of science journalism (journalists don't have the time anymore due to commercial pressures).

    Perhaps one of the authors could respond about concerns about the title, after further consideration about this?
  4. There's the first squirrel.
  5. Watching the Deniers at 23:02 PM on 19 September, 2012
    Barry, I think the Tom Curtis comments have well and truly outworn their usefulness (squirrel).
  6. Barry Woods @3
    Are you by any chance the same Barry Woods who said: "the conclusions and title of the paper are utterly fraudlent [sic]"
    here
    http://rankexploits.com/musings/2012/multiple-ips-hide-my-ass-and-the-lewandowsky-survey/#comment-102532
  7. This is really great stuff (for a non-statistician who studied stats so long ago that probably most of what she learned has been superseded).

    Thank you and I hope to see more posts like this and the previous ones.

    (It would be great if we could have one thread on topic without foolish, irrelevant distractions a la #3.)
  8. Watching the Deniers at 23:07 PM on 19 September, 2012
    I believe Barry is just trying to be helpful. Just rename the paper, that's all.
  9. How could Mr. McIntyre fail to reproduce our EFA?

    I feel sure this will be answered by Mr. McIntyre, but meanwhile could I suggest Klaus Oberauer link to the specific McIntyre article he is referring to and respond more tightly to its specific statements and claims?

    Of the two "sides", it is currently easier to see what issues McIntyre refers to, and his progress seems to be evidently ongoing from the post I think you are referring to, which dealt with EFA and says:
    Is Lewandowsky et al 2012 (in press) replicable? Not easily and not so far. Both Roman M and I have been working on it without success so far. Here’s a progress report (on the first part only).

    This makes your statement...
    Or else, he intentionally rigged his re-“analysis” so that it deviated from our EFA’s in the hope that no one would see through his manufacture of doubt.

    ...seem like a projection to me.
  10. 6#

    Indeed - as the problem is the title - and the word 'therefore', based on 4 responses (which, as we have discussed, may be problematic)

    a question: when I quoted from a blog previously, my quote was snipped. I have been very careful since to merely link to the blog and quote a reference.

    may I also quote people, or will 6#'s quote be snipped.
  11. Thanks for the response, Barry. I was just wondering whether you still think the conclusions of the paper are utterly fraudulent, or is it now only the title that is utterly fraudulent?
  12. Until now, I've always thought conspiracy theorists had thick skin. I've come across a few hard-core conspiracy theorists in my travels and mockery hasn't fazed them.

    Is it the fake moon landing conspiracists or the (climate) science hoax conspiracists who are hypersensitive and complaining so loudly and endlessly?
  13. Prof Lewandowsky sent out his paper to Dr Adam Corner a month prior to the press release. May I ask if the authors sent it to any other colleagues?

    Adam Corner wrote the Guardian article, with its sub-heading, and it received a great deal of attention; but at the time he was unaware of the few moon responses, or the identities of the 'pro-science' blogs.

    A concern amongst psychologists like Adam is motivated reasoning and ideology amongst sceptics, but this surely also applies to psychologists themselves.

    Ie Adam was marching at Copenhagen carrying a placard - Act Now - as a then Green party candidate; many sceptics are concerned when they see scientists apparently step across the line into activism. Perhaps this is a valid area of research for your colleagues.

    (motivated reasoning?)

    http://t.co/Hdqz9Wbn
    original greenparty source and write up by Dr Adam Corner
    http://t.co/ezqsBusb

    as I have been quoted, I assume I may quote Adam?

    where he tweets about 'deniers' at Copenhagen
    http://twitter.com/AJCorner/status/6429777167

    On reflection, comment 6# seems to be against this blog's moderation policy?

    As we have seen, with a large number of articles here and elsewhere, there are a lot of people interested. Is my concern that the title is adding to a polarised debate perhaps a fair one?
  14. Another squirrel.
  15. A fake squirrel at that!
  16. 6# does seem to be against policy - getting close to cyber stalking?

    or I could say that Doug Bostrom, who has made lots of responses here recently, is a team member of Skeptical Science, ie a casual reader may not be aware of this. Prof Lewandowsky writes articles at Skeptical Science and is a co-author of John Cook's.

    This is as materially relevant a comment as number 6#

    (ie delete this comment and 6# or keep both, I don't mind)
  17. mods? 'squirrel' comments? (ref comment policy)
  18. Barry, let's hope the mods come in and clear out all the off topic comments, squirrels and all.
  19. Barry, I believe it is about 3am Perth time, so the moderators are likely asleep. I'll get snipped soon enough if that is what they deem fitting. However, if you say one thing here and you say another elsewhere, it is legitimate in my view to draw that to the readers' attention as well.
    You have raised the issue of the title multiple times on this blog. Can you please take it as read that the people in this discussion are now fully aware of your concerns in this regard?
    I mention squirrels precisely because they keep getting pointed out, as I predicted from the start. Why don't you and I do everyone a favour and step aside, and let people who really know something about contemporary stats get down, as Sou suggested, to the nitty gritty of the post above. If you can't add to that dialogue, one can only reasonably assume you're more interested in pointing out squirrels.
  20. 19# that is also my point: I quoted people in earlier articles, and they were removed. So I agree 'quotes' may be legitimate, but going around other blogs and saying 'are you this person' is borderline 'cyber-stalking' in my opinion.

    'squirrels' is your opinion..!

    Watching the Deniers seemed to concede my point about the title, which is why we are all probably here. So it seems on topic.
  21. Stephan,

    "in the hope that no one would see through his manufacture of doubt" <<<< would that count as a Conspiracy Theory?
  22. So far there have been at most three comments on topic. The rest are noise and distractions (including this one).

    May I respectfully suggest that from here on in all off topic posts go on another thread? There are enough to choose from. Some of the longer threads have meandered in many directions so it would no longer matter that they are far from topic.

    Hypersensitive conspiracy theorists of whichever flavour who don't like the title might consider posting on one of those threads.
  23. Barry@20
    We happen to read some of the same blogs and you use the same name. That's not cyber-stalking, in my view. I haven't gone out of my way to follow your comments elsewhere - indeed, it's hard not to come across your name because it appears so often and in so many places: a quick Google Blogs search just now finds you mentioned in the past week alone on RealClimategate.org, Climateaudit.org, bishop-hill.net, wattsupwiththat.com, judithcurry.com and joannenova.com.
  24. Thank you, a very interesting description of the data treatment, and in particular the reduction of variables.

    Looking at the McIntyre attempt at replication, he also complains that there are (in your paper) two factors identified as accounting for conspiracist ideation - and that his analysis gave different variances for those factors. However, he forced his analysis here, too:

    pc=factanal(lew[,c(13:15,17:24,26)],factors=3)

    Would that be the cause of his different results? Including a third factor, which I expect has an eigenvalue <1?
  25. From Mr McIntyre's post that you refuse to link to, a psychologist writes (snip)
    Moderator Response: extensive quote snipped
  26. From Mr McIntyre's most recent post, which you obviously weren't aware of before you started this thread, the very same psychologist writes -
    (snip)
    Moderator Response: Extensive quote snipped
  27. Oh, can anyone explain the quantile plots of the residuals in Mr McIntyre's post? What do they mean?

    I'm here to learn.
  28. Please forgive my ignorance, but if you are extracting only one factor using 'EFA', what are you then correlating that one factor with?
  29. Which McIntyre article are you dissecting, Dr Oberauer? It is normal in scientific discourse, and courteous in professional matters, to provide references to the articles being discussed.

    My pet statisticians would like to know which method you used to determine the number of factors. You seem to have overlooked mentioning it in your paper. Or did you explore the use of all the common methods and find no difference?

    They're also begging me to ask if you applied the same analyses to white and pink noise to check the "balance" of questions and that your chosen analysis doesn't colour the results.
  30. berfel - The relevant page is, I believe, http://climateaudit.org/2012/09/16/trying-unsuccessfully-to-replicate-lewandowsky/ - the various forced factor counts appear in his R code there.

    Given the rather serious error that Dr. Oberauer has noted in McIntyre's analysis, it is entirely unsurprising that McIntyre has failed to replicate any of this paper's work.

    Note that this same issue (of inappropriate choices of eigenvalues) also appears in McIntyre & McKitrick 2005, their critique of Mann 1998/1999 - they changed various criteria for the principal component analysis, but did not re-evaluate the resulting number of significant principal component factors; just using the same number as Mann et al did. This led to errors in their analysis and conclusions, as discussed in Wahl & Ammann 2007.

    I find it unfortunate that McIntyre is still misjudging principal components, given the peer-reviewed rebuttal of his earlier work.
  31. Regarding McIntyre and poor choices of eigenvalues/principal components, there is a more succinct discussion by Mann at:

    http://www.realclimate.org/index.php/archives/2005/01/on-yet-another-false-claim-by-mcintyre-and-mckitrick/

    In the Mann 1998 treatment, only the first 2 components are significant at >95%, with PC#1 demonstrating the "hockey stick". In the McIntyre & McKitrick 2005 treatment PC#4 shows the "hockey stick", with the first 5 components being significant at >95%. Yet MM only used the first two components to draw their conclusions, thus excluding the sharply rising signal - misjudging the significance of the various components.
  32. Barry: ...Doug Bostrom, who has made lots of responses here recently, is a team member of Skeptical Science, ie a casual reader may not be aware of this.

    Hi, Barry! Nice to meet you!

    Anything specific to say about remarks I've made here? That is to say, have I said something that strikes you as being in pursuit of some hidden objective?

    Do you really want to go there, here of all places? :-)

    (snip)
    Moderator Response: Off-topic snipped
  33. Doug, I believe the lunar connection has already been established :D
  34. Brandon Shollenberger at 06:29 AM on 20 September, 2012
    First off, it seems strange to me to discuss something at length without providing any sort of reference to it. As such, the article in question can be found here. Second, Steve McIntyre is criticized because:

    When more than one factor is extracted, researchers can rotate factors so that each factor represents a substantial, and approximately equal, part of the variance. In R, the default rotation method, which Mr. McIntyre did not overrule, is to use Varimax rotation, which forces the factors to be uncorrelated. As a result of rotation, the variance is split about evenly among the factors extracted.


    This makes little sense to me. The implication is McIntyre did not use the "correct" rotation. While I don't doubt that is true, how is that his fault? The rotation used for this analysis was never disclosed. Seeing as there are an infinite number of rotations available, how could we possibly expect him to know which one the authors used when they didn't say it?

    And how does this contradict McIntyre's post which says, "Is Lewandowsky et al 2012 (in press) replicable? Not easily and not so far." I think it's fair to say something is not easily replicable if it requires you do something you aren't told was done.
  35. "Mr. McIntyre, a self-declared expert in statistics..."

    I have to say this section is just wrong, and not needed if the facts in this post can stand by themselves - I make no comment on that part.

    After graduating with a BSc in maths and finishing first in a national competition, McIntyre did attend Oxford University in England and was offered a scholarship in Mathematical Economics at MIT, which he declined - so I think he knows his way around a spreadsheet and statistics more than most, maybe more than the authors of the above comment or maybe not.

    Play and contest the issue Gentlemen, not the man - your case is only made weaker by doing so.
  36. Brandon Shollenberger at 06:40 AM on 20 September, 2012
    Now then, rather than discuss what criteria should be used, such as the Kaiser criterion mentioned in this post, I'd like to focus on what would happen if Steve McIntyre had extracted a different number of factors.

    McIntyre used two factors for the five "climate science" items. Had he used one, he would have gotten an explained variance of 82.7%, different from the published results, 86%. However, when he used two factors, he got an explained variance for two factors of 86%. Seeing as he couldn't get the same result with one factor but could with two, that seems a reasonable table to publish.

    For the conspiracy items, McIntyre used three factors, getting the results 18.4% and 33.1%, as opposed to the "correct" values of 42% and 51.6%. Had he instead used only two factors, he'd have gotten 25.1% and 44%. While those are closer to the reported values, they are still not the same.

    What this shows is McIntyre's choice of factors, whether right or wrong, is not responsible for the difference in his results. The difference is entirely due to an undisclosed methodological choice. In other words, he's being criticized for not doing something the authors never said they did.
  37. @zt #28 - this may have been answered already, and I'm not an expert on this type of analysis but i think, in the post, they're discussing the factors associated only with the climate items. There are another set of questions associated with conspiratorial beliefs (and I think a third set associated with free market ideology). So they would have extracted one factor from the climate questions and correlated this factor with factors extracted from the other question types. (This is also implied by KR post at comment #24)
  38. @Brandon #24 - it would be standard procedure not to extract factors which don't account for substantial additional variance. Since only one factor was extracted, no rotation method is necessary. So knowledge of the standard procedures would have allowed one to work out how the data were analyzed and to replicate the analysis outlined in the blog post.
  39. "McIntyre did attend Oxford University in England"

    Philosophy, Politics and Economics, the same as David Cameron. It all makes sense now.
  40. An update to my post @24:

    McIntyre's evaluation of the explanatory power for "conspiracist ideation" would be thrown off not only by the inclusion of a 3rd (eigenvalue <1) latent variable, but by the default factor rotation for orthogonality, ie for zero cross-correlation.

    While I am unfamiliar with exploratory factor analysis (EFA), and in my own work would separate components by orthogonality, to zero correlation, I expect that in psychology it would be quite inappropriate - these factors (after the latent variable distillation of eigenvector extraction) are expected to map to mental concepts, to world models, and those may well be correlated.

    Which reinforces the indications that McIntyre is quite unfamiliar with the field of study, the techniques used therein, and is unsurprisingly having difficulties with the analysis. In addition (IMO) to a rather consistent failure on his part to properly evaluate the explanatory power and significance of principal components or eigenvalues...
  41. Brandon Shollenberger at 08:13 AM on 20 September, 2012
    gm:

    it would be standard procedure not to extract factors which don't account for substantial additional variance. Since only one factor was extracted, no rotation method is necessary. So knowledge of the standard procedures would have allowed one to work out how the data were analyzed and to replicate the analysis outlined in the blog post.


    Your claim is shown false by the fact I did the same analysis with one factor, and I did not get Lewandowsky's results. This shows the number of factors selected is not the reason Steve McIntyre's results were different.
  42. McIntyre is correct.
  43. Brandon Shollenberger at 08:25 AM on 20 September, 2012
    By the way, KR is discussing something I suspect is off-topic, the issue of PC retention in Mann's original hockey stick. I understand if the entire discussion gets removed, but while it's up, I feel I should point something out.

    KR repeats a claim that portrays McIntyre as having selected the wrong number of PCs, and thus getting faulty results. In reality, the PC retention criterion claimed to have been used in MBH was not actually used (this can be seen as it would give different retention rates than the paper used). Moreover, McIntyre never disputed the fact the same signal appeared in the fourth PC. Instead, he often highlighted that fact. The point he made was by fixing a methodological flaw in the paper, the hockey stick shape shifted from a high PC to a low PC. That shows it was present in far less data, and it was not representative of the data as a whole.

    This position is borne out by the fact the hockey stick shape of the graph depends entirely upon a couple proxies. If you remove those, the shape disappears. This means the shape is not robust to the removal of a small amount of data (even though the paper claims it was). This fact was even admitted by Michael Mann himself on page 51 of his book.
  44. @41 Brandon: Well, my claim is still true since it has nothing to do with reanalysis of the data and only whether the aforementioned procedures are standard procedures.

    You claim that my claim is false because of your assertion that you did the same analysis with different results. Which analysis? What did you do? What were your assumptions?
  45. Brandon Shollenberger at 10:10 AM on 20 September, 2012
    gm, you claimed "no rotation method is necessary" and "knowledge of the standard procedures would have allowed one to... replicate the analysis." You now claim you are right because what you said "has nothing to do with reanalysis of the data." That's rubbish. You explicitly said one could replicate Lewandowsky's results.

    Lewandowsky took issue with Steve McIntyre forcing the extraction of two factors for the climate items. I reran the analysis, extracting the number of factors he extracted (1), a number which fits with the Kaiser criterion. When I did, I got different results than Lewandowsky. This shows knowing the standard procedures does not allow one to replicate the analysis.

    That is, unless you and/or Lewandowsky are going to say there's some other problem involving "standard procedures" you've never bothered to highlight.
  46. @- Brandon Shollenberger
    "Your claim is shown false by the fact I did the same analysis with one factor, and I did not get Lewandowsky's results. This shows the number of factors selected is not the reason Steve McIntyre's results were different."

    Your conclusion is not logical.
    That your results differ when using one factor means the difference is the result of something other than the number of factors used in your case.
    It does not follow that another analysis that did use two factors also failed to match for the same reason as you.
  47. (snip)
    Moderator Response: Inflammatory and off-topic snipped
  48. Brandon Shollenberger at 11:39 AM on 20 September, 2012
    izen, I'm afraid I don't understand your response. My test showed Steve McIntyre's attempt to replicate Lewandowsky's results would have failed regardless of the number of factors he used. By definition, this means the number of factors he used is not the reason he failed to replicate Lewandowsky's results.

    It could, of course, be a reason. If McIntyre's effort failed because of the number of factors he used, plus one or more other reasons, that would be perfectly in line with what I found and what I said.

    But if that's the case, that shows Lewandowsky's results are not easily replicable. In fact, it shows there is no way of knowing how to replicate them as he has not provided enough information.
  49. @- Brandon Shollenberger
    "But if that's the case, that shows Lewandowsky's results are not easily replicable. In fact, it shows there is no way of knowing how to replicate them as he has not provided enough information."

    Why should they be 'easily' replicable ?
    This is clearly complex stuff if even McIntyre's effort failed.
    So far we have a sample of two who have failed to replicate the results. But given the difficulty and specificity of these analysis techniques I am sure that number could be increased.

    For the vast majority of us without the mathematical, statistical and theoretical understanding at that level, we have to take it on trust that the original work was done with enough care to avoid gross error in such an esoteric field. Perhaps differences in result are due to more subtle variations in method and 'standard' approaches.

    Or you could confirm the paper's conclusions by theorising some sort of collusive deception? :-)
  50. izen (49),

    but that is not how science works. In science you provide the information to allow other experts to replicate your results, thus allowing an advancement of general knowledge. The role of peer review is to identify gross errors and screen out substandard work. The real crucible of science is when the information is sent out into the wild to be carefully scrutinized and replicated by the general scientific community with the expertise to do so.

