[p2p-research] the wikipedia decline

Paul D. Fernhout pdfernhout at kurtz-fernhout.com
Thu Nov 26 16:04:18 CET 2009


J. Andrew Rogers wrote:
> On Wed, Nov 25, 2009 at 9:22 PM, Paul D. Fernhout
> <pdfernhout at kurtz-fernhout.com> wrote:
>> The deeper issue is also just the semantic web.
>>
>> I set up a Halo Semantic MediaWiki for OSCOMAK, but it just seemed really
>> awkward. But, that might be a next logical step for Wikipedia too, and maybe
>> some of the semantic stuff might help building these new tools for the
>> community?
> 
> 
> The semantic stuff is currently limited by theoretical computer
> science, as in we lack algorithms required to make it scale beyond toy
> datasets. Old, old issues. To date, this has been the real problem
> with semantic technology -- it sucks if you want to do anything useful
> with it.
> 
> On the upside, a lot of research dollars are being spent on semantic
> web technology and significant progress is being made.  On the
> downside, almost none of the theoretical computer science technology
> that will make it possible is open source (or in many cases, even
> published). As usual, life is a mixed bag.

Our tax dollars at work. :-(
http://www.pdfernhout.net/open-letter-to-grantmakers-and-donors-on-copyright-policy.html
http://www.pdfernhout.net/on-funding-digital-public-works.html

There is some open stuff out there, but it demands too much of users.
Related:
   "Metacrap: Putting the torch to seven straw-men of the meta-utopia"
   http://www.well.com/~doctorow/metacrap.htm
"""
Metadata is "data about data" -- information like keywords, page-length, 
title, word-count, abstract, location, SKU, ISBN, and so on. Explicit, 
human-generated metadata has enjoyed recent trendiness, especially in the 
world of XML. A typical scenario goes like this: a number of suppliers get 
together and agree on a metadata standard -- a Document Type Definition or 
scheme -- for a given subject area, say washing machines. They agree to a 
common vocabulary for describing washing machines: size, capacity, energy 
consumption, water consumption, price. They create machine-readable 
databases of their inventory, which are available in whole or part to search 
agents and other databases, so that a consumer can enter the parameters of 
the washing machine he's seeking and query multiple sites simultaneously for 
an exhaustive list of the available washing machines that meet his criteria.
   If everyone would subscribe to such a system and create good metadata for 
the purposes of describing their goods, services and information, it would 
be a trivial matter to search the Internet for highly qualified, 
context-sensitive results: a fan could find all the downloadable music in a 
given genre, a manufacturer could efficiently discover suppliers, travelers 
could easily choose a hotel room for an upcoming trip.
   A world of exhaustive, reliable metadata would be a utopia. It's also a 
pipe-dream, founded on self-delusion, nerd hubris and hysterically inflated 
market opportunities. ...
"""
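
To make the scenario in that quote concrete, here is a minimal Python sketch
of a shared washing-machine vocabulary and a query run across several
suppliers' catalogs. Everything specific in it is an assumption made up for
illustration: the field names and units, the two suppliers, and all the
figures. It is not taken from any real metadata standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WashingMachine:
    # Hypothetical shared vocabulary: size, capacity, energy consumption,
    # water consumption, price. Names and units are invented for this sketch.
    supplier: str
    size_cm: int
    capacity_kg: float
    energy_kwh_per_cycle: float
    water_l_per_cycle: float
    price: float


def query(catalogs, max_price, min_capacity_kg):
    """Return every machine, across all catalogs, that meets the criteria."""
    return [
        machine
        for catalog in catalogs
        for machine in catalog
        if machine.price <= max_price and machine.capacity_kg >= min_capacity_kg
    ]


# Two made-up supplier catalogs published in the agreed format.
acme = [
    WashingMachine("Acme", 60, 7.0, 0.9, 45.0, 499.0),
    WashingMachine("Acme", 60, 9.0, 1.1, 55.0, 749.0),
]
zenith = [
    WashingMachine("Zenith", 55, 6.0, 0.8, 40.0, 399.0),
    WashingMachine("Zenith", 60, 8.0, 1.0, 50.0, 579.0),
]

matches = query([acme, zenith], max_price=600.0, min_capacity_kg=7.0)
for machine in matches:
    print(machine.supplier, machine.capacity_kg, machine.price)
```

The query itself is trivial. The sketch only works because every record is
assumed to be filled in completely and honestly, in the same units, which is
exactly the assumption the essay goes on to attack.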

I don't completely agree, though. I think better tools can help (structured 
arguments, structured interlinking for related data within a tool), and the 
community will co-evolve with those tools. There was a time when people 
thought it was too much trouble to give titles to emails, too.

Nonetheless, Cory Doctorow's points about people all need to be appreciated 
and addressed in the design of systems. The list of points from the table of 
contents:
"""
# 2. The problems
     * 2.1 People lie
     * 2.2 People are lazy
     * 2.3 People are stupid
     * 2.4 Mission: Impossible -- know thyself
     * 2.5 Schemas aren't neutral
     * 2.6 Metrics influence results
     * 2.7 There's more than one way to describe something
"""


Part of his conclusion: "This sort of observational metadata is far more 
reliable than the stuff that human beings create for the purposes of having 
their documents found. It cuts through the marketing bullshit, the 
self-delusion, and the vocabulary collisions."

One might also conclude that the costs to the "Noosphere" of a competitive 
economic system, which breeds confusion through spam and deceptive 
advertising and limits the time people have to contribute free content 
related to their interests and understanding, may now vastly outweigh any 
benefit claimed for a system that rewards individual entrepreneurs for 
operating businesses or individual workers for going to work. So, the 
semantic web itself may demand a change in our economic organizing 
principles. :-)

--Paul Fernhout
http://www.pdfernhout.net/


