[p2p-research] the wikipedia decline
Paul D. Fernhout
pdfernhout at kurtz-fernhout.com
Thu Nov 26 16:04:18 CET 2009
J. Andrew Rogers wrote:
> On Wed, Nov 25, 2009 at 9:22 PM, Paul D. Fernhout
> <pdfernhout at kurtz-fernhout.com> wrote:
>> The deeper issues is also just the semantic web.
>>
>> I set up a Halo Semantic MediaWiki for OSCOMAK, but it just seemed really
>> awkward. But, that might be a next logical step for Wikipedia too, and maybe
>> some of the semantic stuff might help building these new tools for the
>> community?
>
>
> The semantic stuff is currently limited by theoretical computer
> science, as in we lack algorithms required to make it scale beyond toy
> datasets. Old, old issues. To date, this has been the real problem
> with semantic technology -- it sucks if you want to do anything useful
> with it.
>
> On the upside, a lot of research dollars are being spent on semantic
> web technology and significant progress is being made. On the
> downside, almost none of the theoretical computer science technology
> that will make it possible is open source (or in many cases, even
> published). As usual, life is a mixed bag.
Our tax dollars at work. :-(
http://www.pdfernhout.net/open-letter-to-grantmakers-and-donors-on-copyright-policy.html
http://www.pdfernhout.net/on-funding-digital-public-works.html
There is some open stuff out there, but it demands too much of users.
Related:
"Metacrap: Putting the torch to seven straw-men of the meta-utopia"
http://www.well.com/~doctorow/metacrap.htm
"""
Metadata is "data about data" -- information like keywords, page-length,
title, word-count, abstract, location, SKU, ISBN, and so on. Explicit,
human-generated metadata has enjoyed recent trendiness, especially in the
world of XML. A typical scenario goes like this: a number of suppliers get
together and agree on a metadata standard -- a Document Type Definition or
scheme -- for a given subject area, say washing machines. They agree to a
common vocabulary for describing washing machines: size, capacity, energy
consumption, water consumption, price. They create machine-readable
databases of their inventory, which are available in whole or part to search
agents and other databases, so that a consumer can enter the parameters of
the washing machine he's seeking and query multiple sites simultaneously for
an exhaustive list of the available washing machines that meet his criteria.
If everyone would subscribe to such a system and create good metadata for
the purposes of describing their goods, services and information, it would
be a trivial matter to search the Internet for highly qualified,
context-sensitive results: a fan could find all the downloadable music in a
given genre, a manufacturer could efficiently discover suppliers, travelers
could easily choose a hotel room for an upcoming trip.
A world of exhaustive, reliable metadata would be a utopia. It's also a
pipe-dream, founded on self-delusion, nerd hubris and hysterically inflated
market opportunities. ...
"""
I don't completely agree though. I think better tools can help (structured
arguments, structured interlinking for related data within a tool), and the
community will co-evolve with those tools. There was a time when people
thought it too much trouble to give titles to emails, too.
Nonetheless, Cory Doctorow's points about people all need to be appreciated
and addressed in the design of systems. The list of points from the table of
contents:
"""
# 2. The problems
* 2.1 People lie
* 2.2 People are lazy
* 2.3 People are stupid
* 2.4 Mission: Impossible -- know thyself
* 2.5 Schemas aren't neutral
* 2.6 Metrics influence results
* 2.7 There's more than one way to describe something
"""
Part of his conclusion: "This sort of observational metadata is far more
reliable than the stuff that human beings create for the purposes of having
their documents found. It cuts through the marketing bullshit, the
self-delusion, and the vocabulary collisions."
One might also conclude that the costs to the "Noosphere" of a competitive
economic system, one that breeds confusion through spam and deceptive
advertising and limits the time people have to contribute free content
related to their interests and understanding, may now vastly outweigh any
potential benefit one might suggest comes from a system that rewards
individual entrepreneurs for operating businesses or individual workers for
going to work. So, the semantic web itself may demand a change in our
economic organizing principles. :-)
--Paul Fernhout
http://www.pdfernhout.net/