
Incremental truth and Wikipedia

Update: 3/17/2008

A reader made me aware that this post might be taken as an endorsement of the practice of citing Wikipedia in research work. That is not my point; I am simply highlighting the trend toward greater truthfulness that all Wikipedia articles tend to follow as time passes and the number of contributors increases. In the most mature Wikipedia articles, one will note a plethora of valid, first-source technical citations at the bottom of the article that can be used for academic sourcing. Again, the post below simply illustrates the trend toward "incremental truth" that attends Wikipedia articles.

I originally commented on this article in an email to my brother; the entire comment is transcribed below. (The article may require free registration to access.)

This article is a perfect example of why Wikipedia is so cool! On one hand, I agree with professors who say it shouldn't be used as a citation source. The reasons for this are simple.

a) An article still in open edit mode (i.e., anyone on the net can edit it semi-anonymously) can be changed by anyone; I know I've edited several dozen Wikipedia articles myself. However, not everyone providing edits is pledged to apply the Wikipedia mantras of POV-free input and citations for the source of the knowledge provided. In open-edit articles, this leaves the possibility that a crackpot or two can edit an article with bad information, or in Wikipedia parlance, vandalize it.

b) Even when the participants agree, if they are only a subset of the people with expertise in a subject, the article quality is still poor, even if it has few "edit wars" in its history. A low number of participants tends to correlate with lower-quality articles.

c) Kids can edit an article to include ideas they feel should be in there instead of what actually happened in history. Thus an article used as a citation could be fluffed up by students looking for material for their papers. This problem, though, is correlated with the first two points: such articles are soon viewed by more knowledgeable Wikipedians who sound the alarm, revert the changes, and the quality of such articles goes way up again.

That said, the opposite side of the coin allows all of these weak points to be mitigated. First, if an article suffers frequent vandalism or edit wars, it can be locked against future editing or restricted by a removal of open edits (only users with accounts can edit). Many "mature" articles are in fact in this state; I've found these articles to be the most citation-filled and accurate on the site. Thus it seems quality is directly correlated with time. Wikipedia article histories start out with large swings in quality, as a low number of participants contribute, but over time the quality line approaches not just a consensus of the participants but a correlation with the actual facts of the subject, as participation grows and the article is viewed by more users. Also, as the number of participants in an article increases, edit wars tend to go up, but the quality of the article goes up as well: POV references are quickly flagged and removed, and the editors tend to include experts in the field. The overall picture is that as more time goes by, Wikipedia articles *always* tend to get better, and this is the power of the medium.

As soon as I started reading the article, I realized the folly of the writer when he cited the historical error regarding the Jesuits. I thought, "I bet it isn't wrong now." Sure enough, by the end of the article the author felt confident enough to assert:

"And yes, back at Wikipedia, the Jesuits are still credited as supporting the Shimabara Rebellion."

I thought to myself, "and this is why Wikipedia is so cool," as I typed in the article for Japan. Just as I expected, the disputed passage had been changed (possibly in response to the author's very article!) between the time the NYT published it (two days ago!) and the time I read the article.

Any direct statement that the Jesuits aided the rebellion is gone from the first article; the second goes so far as to state that the Catholic Church didn't consider the rebels who died to be martyrs because they used violence to achieve their aim. Lastly, the "discussion" page of the Shimabara article shows an actual reference to this NYT article by Noam Cohen (only two days ago!), mentions the controversy over the Jesuits (which was deleted), and requests a contribution from the historian mentioned in the article to vet the edited version!!! Awesome!!! This is something that simply can't happen with traditional encyclopedias, where publicly revealed non-factual knowledge remains stale for *years* before it is corrected in the next edition. No amount of "current" debate on the mistakes can force an update of the edition; for this amazing expedience of updates, Wikipedia gets my vote. The magnitude of error in peer-reviewed articles is lower, but the correction time is much longer than for a Wikipedia article, which tends to have a higher magnitude of error but blindingly fast correction times. Over time, both achieve similar quality.

Still, given the error-magnitude problem mentioned above, I would be cautious about using a Wikipedia article as a citation for a paper if I were still in college. Following a couple of rules of thumb can help ensure the veracity of the information extracted: a long edit history with many citations, many participants, and a discussion page with few "running" controversies or edit wars all indicate that an article is mature. These are the articles I tend to give more credence to. As with anything, one shouldn't accept what an article states at face value; critical analysis should always be employed!

