24 October, 2009

Failing to understand how science works.

I often get involved in discussions about interpretations of scientific studies or papers in the soft sciences (climate science, biology, etc.) that veer from objective analysis of the data being presented to assertions, on the part of those I am conversing with, that the underlying edifice of science is somehow engaged in a conspiracy. I've often found it perplexing how people can draw that conclusion, given the rigorous nature of scientific investigation and the pains scientists take before drawing conclusions from acquired data. Sure, examples of failure to do this exist (Fleischmann and Pons, anyone?), but they are exceedingly rare, and if they do get to the peer review stage they are rightly cremated for the mistakes made. Yesterday I engaged in such a discussion about one of the media-darling "controversy" subjects, Anthropogenic Global Warming (AGW). Earlier this week, yet another massive study of climate data pointed to a massive change in the sampled data over the last few centuries that cannot be explained by anything but the rise in human industrial activity and population growth during the same period. To those who study science, whether or not the conclusions drawn by researchers are accepted as valid is not decided on the back of a single study; it is drawn from a consideration of dozens to hundreds of such studies. I have been fortunate enough, through my interest in climate science, biology, geology and ocean science, to have personally read hundreds of such papers or studies. The preponderance and wide sweep of the conclusions regarding AGW is indisputable to me precisely because of this wide base of samples across seemingly disparate areas of scientific investigation. I am aware of how improbable it is for such an alignment to occur by chance among thousands of scientists working on different data sets and using different analysis methods, yet still arriving at a consensus.
However, had I not had this base of sampled papers to draw from, I might very well have drawn a different conclusion, but only if I were also unaware of the importance of seeking additional sources of information on the phenomena under investigation. It is this last realization that many lay people, journalists-cum-climate-scientists and ignorant people (those taking a position based on their political leaning and then looking for data to support it) never come to. This came up in a recent discussion on a friend's Facebook feed about an article linking to the following study:


I won't comment on the contents of the link, which cites a new study that is devastating enough to the anti-AGW crowd on its own. I will, however, post the exchange that occurred between myself and some anti-AGW-leaning laymen to highlight the issues mentioned above that scientists face when having to discuss, explain or teach science to those who think that reading a blog post or a few articles is a substitute for the training of degree-holding researchers.

First comment:
David Saintloth
there is no "both sides" , there are agenda driven loons or genuinely confused people (scientists and laypersons) about how to make sense of the data (bad interpretations) and then there is the majority of climate scientists and their massively aligned data showing human caused global warming in the last century without doubt.

the media loves the idea of presenting "sides" as that sells papers and gets ratings. People are so used to the idea of "sides" that they expect a "debate" or a panel for every topic, but I think everyone in the media is performing a disservice, nay a crime against humanity, by not representing TRUTH. Some journalists think their job is to be the dispassionate presenter of opinion. I think that is flat wrong...journalists should be dispassionate investigators of one thing, TRUTH. They should stand up for it when it is found and not waste time creating "sides" or debates when the data indicates no support. You can't educate people by letting them continue to believe stupid things based on their broken thought processes. Yes, "broken": most people have no idea how to think, they jump from one conclusion to the next based on anecdotal evidence without realizing the subjective bias of their claims. It is all of our jobs to notice this and call it out for what it is. I find that when I've done this in conversation people get offended; for whatever reason people assume they can discuss any topic, seemingly forgetting that not everyone has the same expertise in a given subject. Most people will defer to expertise on certain subjects when confronted with it, but some subjects (like religion and politics) people feel obligated (with their massive amount of ignorance on the associated subjects) to passionately defend anyway. If the media did more of the work of standing up for truth:

"Evolution is a fact, there is no debate...next story..."
"A public health care option increases choice...next story..."
"Global warming is very real and we caused it..let's fix it."
"Abortions before a given period do not equate to extinguishing a human life..before that point they should be legal...next story..."
"there is no hard evidence for Gods or ghosts...next story..."

ignorance philosophies staunchly defended by people who "think" they know would die much faster and allow our species to advance faster than ever, I can dream...

Second comment:
["John" redacted name]
Is that like the dendrochronology study that cherry picked the data from 12 trees out of many others in order to get the results those scientists were looking for? You know, the one that spawned the Hockey Stick graph? And then when the raw data got out it was discovered that there was no Hockey Stick graph after all?

When it is discovered that the science being done is not science at all, it puts the whole enterprise and the studies based on that data in jeopardy.

Third comment:
David Saintloth
"When it is discovered that the science being done is not science at all, it puts the whole enterprise and the studies based on that data in jeopardy."

No, the most it can do is indicate that *the single study* had faulty methods. It is bad logic and faulty thinking to even assume that other scientists doing work in the field are automatically making weak or invalid conclusions. Most studies are published in peer-reviewed journals, so any that end up with problems as you suggest rarely get to the public (outside of self-publication on a personal web site). I don't know what dendrochronology study you are talking about, by the way; can you provide a link to support your assertions?

Also, any scientist that executes a new study based on previous study data without re-acquiring that data is again performing faulty science. Data is fresh only once. The chance that any new results that cite a given study fail to uncover the faults of the previous method are actually rare. This is part of the self correction mechanism that is built into the process, keeping the conclusions fresh by keeping the data fresh...mind you this is true for empirical and theoretical investigations.

Fourth Comment:
[redacted , irrelevant to climate discussion]

Fifth Comment:
["John" redacted name]
This particular dendrochronology study is the most famous one, the one used by Al Gore to provide the Hockey Stick graph. One link is here:


Another is here:


This study was peer reviewed endlessly and in many journals, and ultimately, peer review failed.

This is not the only one with errors. There is another study with NASA sensors overestimating the loss of ice in the Arctic:


And yet another with the rise of sea levels being overestimated:


And then there are other mysteries, such as the missing heat from the oceans:


I don't think it's the issue of taking sides. It's the issue of getting the science right. And once you start getting major studies as sources of fraud, such as the infamous Yamal study/hockey stick graph, what else is fraudulent? It is not automatic proof that everything is fraudulent, but it does require actually getting ALL of the raw data, and for a very long time, the Yamal raw data was hidden. Was it hidden for a reason, perhaps?

Sixth Comment:
David Saintloth
Now for "John",

You provide links to a bunch of blogs and news sites as "evidence"? The very first link is so laughable in its attempt to paint a story that I am amazed anyone could read it with a straight face without screaming in agony "show me the data to support this counter claim."

In fact, looking up the author reveals him not even to be a published climate scientist, so already he's on shaky ground. Reading his attempt at trying to understand the papers he's trying to lambaste seals the deal on his agenda, at least it did for me. As if to confirm this, one of the first comments lists a link to:


Which is written by real climate scientists, who perform an analysis of McKitrick's assertions and explain why his claims stand on no ground. The summary:

"What is objectionable is the conflation of technical criticism with unsupported, unjustified and unverified accusations of scientific misconduct. Steve McIntyre keeps insisting that he should be treated like a professional. But how professional is it to continue to slander scientists with vague insinuations and spin made-up tales of perfidy out of the whole cloth instead of submitting his work for peer-review?"

This dovetails with what I was telling Peter above: laymen with blogs, journalists who wished they could have been scientists and political opportunists, as well as just plain ignorant folks, feel they have the right to state their opinion on any subject based on a surface read of a few articles or on the few popular media articles that seem to "support" their desired position...but that doesn't reveal truth. If you want to find out the truth you go to climate scientists in their domains of collaboration...and you find they laugh at the bad interpretations of climate data being made. This stuff isn't easy; if it were, it wouldn't require undergraduate and graduate studies to attain degrees in...yet online everyone thinks they can open their maw and profess why they think x or y is wrong and, lacking the thorough scientific drive to check their sources, fall into a drain reinforcing their false beliefs. That said, not all studies are perfect and many have faults, but again it is bad science to look at one or a few that have been cited as being faulty (in at least the case of this Yamal study, wrongly so) and then wonder if:

"once you start getting major studies as sources of fraud, such as the infamous Yamal study/hockey stick graph, what else is fraudulent? "

that is talk that dances dangerously close to conspiracy theories. Scientists make no assertions about the totality of data uncovered in a space based on faults in a few of those studies. This is actually a big problem with the interface between science education and the wider community of laypersons. As scientists we have two jobs: to discover the truths using the rigorous methods of science and submit those findings for peer review, but also to explain why those findings are relevant...because if scientists don't, the bloggers will come in with their armchair analysis to call their work "faulty" when in fact the consensus among experts in the field is the complete opposite.

The realclimate science link underscores this perfectly with:

"After a while it is clear that no scientific edifice has collapsed and the search goes on for the ‘real’ problem which is no doubt just waiting to be found. Every so often the story pops up again because some columnist or blogger doesn’t want to, or care to, do their homework. Net effect on lay people? Confusion. Net effect on science? Zip."

It takes time for scientists to argue against the misinformed reasoning of those that are not in their fields. I have it lucky, as in my work (engineering) it is much more difficult for lay people to come in with half-baked ideas on what my research leads to. Assertions can be tested tangibly and found to be true or false, with no wiggle room for agenda baiting or twisted data filtering to present a view out of ignorance or intention.

Seventh Comment:
["John" name redacted]

Let's start with the Yamal study before getting down to other criticisms of the character of those who wrote the analysis of the Yamal study. First off, is it or is it not true that only a small sample was taken from the much larger data set?

Can we at least establish that small thing, without having to attack the messenger, in this case, Mr McKitrick? Is it true or not?

Certainly in the realclimate link you have, they don't actually try to deny that this wasn't done in any of the other studies mentioned. I am left to conclude that leaving out critical parts of the data was indeed true.

So, if it is true, then my original point stands, regardless of whether the links came from 'blogs' or 'news sites', that there was deliberate fiddling in the study, pointing to the failure of peer review. Notice that I also did not say that this automatically makes every other study false, but that those studies should be open and their original data made available, unlike the Yamal data, where it wasn't.

Eighth Comment:
David Saintloth
"First off, is it or is it not true that only a small sample was taken from the much larger data set? "

This question, no disrespect, underscores the misunderstandings of science that I alluded to earlier. The sample data that forms all studies that are termed "Yamal" studies can be, and has been, used to assert various conclusions about climate change.

Straight from the real climate article:

"They used a subset of the 224 trees they found to be long enough and sensitive enough (based on the interannual variability) supplemented by 17 living tree cores to create a “Yamal” climate record.

A preliminary set of this data had also been used by Keith Briffa in 2000 (pdf) (processed using a different algorithm than used by H&S for consistency with two other northern high latitude series), to create another “Yamal” record that was designed to improve the representation of long-term climate variability.

Since long climate records with annual resolution are few and far between, it is unsurprising that they get used in climate reconstructions. Different reconstructions have used different methods and have made different selections of source data depending on what was being attempted. The best studies tend to test the robustness of their conclusions by dropping various subsets of data or by excluding whole classes of data (such as tree-rings) in order to see what difference they make so you won’t generally find that too much rides on any one proxy record (despite what you might read elsewhere). "

Same data set , different samples , different studies using different algorithms BUT conclusions aligned with AGW. (Excluding countless others since on different data sets)

****So your question when answered in the affirmative (as it must) doesn't reveal anything nefarious about what the results of analysis of those samples indicate.****

The data is pristine, and all empirical data sets are filtered in some ways in order to enable analysis, some filter methods (for example, using cores from young trees in the set exclusively as opposed to centennials) are imprecise in illuminating a true trend but as long as those analysis caveats are stated in the papers there is no problem..this is where the laypeople jump the ship.

"but that those studies should be open and their original data made available, unlike the Yamal data, where it wasn't."

Again, this misses the plot. The *core data* is available to anyone to use and create samples from and draw conclusions from THOSE results (the study not the cores!)..it is THAT study method that IS public, but if I as a researcher have created a particularly elegant algorithm for teasing out facts from acquired data there is no need for me to expose my proprietary advantage in analysis if YOU can go to the same core data, pick your OWN samples independently, use your own algorithm and get the SAME results within a small margin of error. It is precisely this independent observation process that solidifies the veracity of the conclusions. Again, note how this says nothing of all the other myriad sample sets beyond dendrochronology (let alone dendrochronological data from different areas of the globe) that have again yielded vast consensus from the majority of experts in those areas based on common (ice core, geological) data sets. This is why scientists are loath to get involved in these discussions, as they highlight vast ignorance on the part of those that are asking what **they think** are deep and probing questions.

04 October, 2009

Numeroom.com compared to Google Wave: how are they different?

In a post from several months back I mentioned the Google "Wave" server technology that had been announced by many of the IT media shops. Google provided many tech videos on the service on YouTube, and after watching a few of them I got the gist of the service: basically an open source collaboration server for a more real-time collaboration experience between users on the server. Shortly afterward I was asked by Juliette Powell what the main differences were between the numeroom.com service and Google Wave. I explained some of the architectural differences, based on what I knew of them at the time, in this post, which drew some great input in its comments. On the business end, though, numeroom.com collaboration provides a solution for small businesses that is more efficient than going with any server-based system they must host themselves, for several important reasons listed below:

  1. No need to host the collaboration server and services yourself. This is a huge win for small to medium sized businesses that are not capable of, or interested in, hosting their own collaboration solutions on site. For them the additional management headache exceeds the monthly cost of simply licensing the service from numeroom.com. These businesses will not see any advantage in going with Google Wave, as they would have to host the server on site to keep their business processes and content secure from prying eyes, making a subscription service ideal for their particular needs.
  2. No need to hire a manager for the server. Google Wave servers have to be managed. Though Google touts the ease of management, the fact that they need to be managed means that someone has to be paid or delegated to do it for any stand-alone server. The numeroom.com service enables easy delegation of the functions most critical to enabling collaboration in the business: creating users, creating categories and creating workflows can all be delegated to users very easily. The core complexities of the service are managed by Apriority LLC, so there is no need for the business to hire experts to handle these aspects of the server; it can simply delegate them up to Apriority LLC.
  3. No need to pay for the costs associated with maintaining or upgrading the server. Hosting a server incurs the cost of a machine to host the service, the possible need to ensure redundancy should that machine go down, the need to license operating systems to run the server(s), and the need to hire individuals to manage the service. These are costs that many small to medium sized businesses are not interested in dealing with. numeroom.com subscriptions eliminate these hassles by hiding the service away in a secure data center, where service is distributed across a cluster of AgilEntity servers and management is split between the hosting provider (the physical machines) and the management team of Apriority LLC, reducing the overall cost of the subscription service.

So, in addition to the architectural differences mentioned in the previous post, these front-end considerations highlight the advantage that a subscription collaboration service with branding and security has over installing your own Google Wave server and dealing with the management hassles that might entail, particularly when you are a small to medium sized business trying to run as efficiently as possible in tough economic times. I am excited about Google Wave's attempt to address the need for a collaboration server and service, a need I saw several years ago; it makes me confident that my solution is primed to allow businesses and individuals to conduct their business or social collaboration activities in the hyper-efficient ways that will be a hallmark of the years to come.

02 October, 2009

Ardipithecus ramidus: what can we really say definitively about it??

The breaking news today was the announcement by researchers that they have found a new species of hominin that predates "Lucy", previously the oldest known fossil find of the hominin line (which includes human beings). The lead researchers are interpreting the morphology of this new find to indicate that it had "advanced" bipedal capabilities that do not lend credence to the idea that the hominin line and the chimp line share a common ancestor. However, this line of reasoning is not necessarily proven by the "Ardi" finds. I excerpt a section of an answer to this that I posted on a friend's Facebook wall to explain:

"The chimp line could have diverged earlier (as molecular data suggests it did) and Ardi is simply an intermediate species along the line from last recent common ancestor with chimps and modern day hominin (of which the only extant species is us)lineage. Strictly speaking chimps are an "offshoot" (or reciprocally ...the hominin line is an offshoot) from the last common ancestor. Molecular data has this occurring some where between 4.5 and 6 mya so evidence of the true origin species could still be in the ground preserved some where OR it simply was never captured. Don't forget that having anything at all preserved is a geological super mega lottery, the molecular data has already told us the general story of what happened...the sparse anthropological data is just filling in the details between the milestones as a bonus at this point. ;) The controversy that anthropologists are making over it would be moot if they could get some dna from the finds. They can then definitively determine if the gene line is ancestral to ours and or chimps. Then there is the possibility that chimps could be a de-evolution of a previously advanced state in the last common ancestor some sort of tree living great ape.

As usual, interpretation of the finds is muddying the waters of the discovery. Was Ardi an immediate ancestor of ours? Maybe. Did modern chimps evolve from Ardi's line? Maybe. Or did they predate Ardi by connection through an older common ancestor? Maybe. Does this discovery kill the savanna hypothesis, as the lead researchers are claiming? Nope. The simplest explanation tends to be the best. To go from an elegant ecological change leading to bipedalism to a complex interplay of food for sex makes things more complex, and more complex means more improbable: possible, yes, but still more improbable. Now if a specimen is found around the 6 mya sweet spot that looks more like Ardi than a chimp, then it simply means chimps are a devolution of the Ardi body form, possibly to adapt strongly to the jungle living that chimps do, while the hominin line diverged into the savanna forms that eventually led to us. All fun stuff indeed, but as usual in somewhat soft sciences like anthropology, interpretation of results is what breeds the controversy!!

My guess for why we can't find fossils that represent the chimp root point is the difficulty of preserving bones in the jungle habitat that chimps inhabit...unlike hominins which lived away from trees more and more and were able to have their remains preserved in places where they were not subject to total elimination by the environment."

So the conclusion that Ardi is even our ancestor is not definitively proven. It is possible that during this period there were many variant populations of hominins, of which Ardi was simply a line that diverged and then went kaput as the Rift Valley continued to form and the savanna habitat emerged as a result. We know that evolution does not occur in the neat "tree" fashion that older descriptions of lineage used to portray; the actual behavior is more like an interconnected web or bush of lineages (see image),

cross-breeding in many cases to form short-lived intermediate forms, many of which were never fossilized for us to find millions of years later. So though Ardi does appear to be a primitive ancestor along the line that evolved into Lucy, that does not mean it is precisely such a species. We can make such links between us and more advanced hominins like Homo erectus, which successfully left Africa and has finds preserved in different habitats and times, because the remains show morphological continuity from the earliest dated finds onward. A known example of a parallel hominin line is the Neanderthals of Europe: this robust modern species evolved separately, from ancestral populations of Homo erectus adapting to European climate conditions more than a million years after leaving Africa. We know they are not ancestral to Homo sapiens but are cousins on a side branch. It could be that Ardi is on just such an older side branch from the line that led to us, but the dearth of finds of OTHER side branches that likely existed at the time makes it more difficult for us to make any definitive ruling. By the time of erectus there were no other side branches, at least according to fossil evidence, but the further back the finds go, the more likely it is that there was a higher diversity of similar populations, particularly at a point of increased geologic change such as the onset of Rift Valley formation. Unfortunately molecular data can't provide more information on this stage in history, but comparative analysis may reveal some aspect of the diversity, or we can hope that more finds are made of older or comparably dated fossils of still other species.

So though the find is very exciting, it is no magic bullet. We'd need more samples from possible hominin species that existed at the time to whittle down the relations between chimp and hominin and between the various hominins and us.