[p2p-research] A free software model for open knowledge

Sun Mar 21 18:25:40 CET 2010

  Sent to you by Ryan via Google Reader: A free software model for open
knowledge via Open Knowledge Foundation Blog by jwalsh on 3/16/10

Notes describing the talk on the work of the Open Knowledge Foundation
given last week at Jornadas SIG Libre.

I was happily surprised to be asked to give this open knowledge talk at
an open source software conference. But it makes sense - the free
software movement has created the conditions in which an open data
movement is possible. There is lots to learn from open source process,
in both a technical and organisational sense.

In English we have one word “free” where Spanish, like most languages,
has two, gratis and libre, signifying separately “free of cost” and
“freedom to”. The Open Source Initiative coined “Open Source” as a
branding or marketing exercise to avoid the primary meaning “free of
cost”. So whenever I say “open” I want you to hear the word “libre”.
[Later I was told that libre can have at least 15 different meanings.]

The best way to talk about the work of the Open Knowledge Foundation is
to look at its projects, which form an open knowledge stack similar to
the OSGeo software stack.
Open Definition
The Open Knowledge Definition is based on the OSI Open Source Software
Definition (which OSGeo uses as a reference for acceptable software
licenses). It permits no restrictions on field of endeavour, so
non-commercial-use licenses are not open in the OKD’s sense. An open
data license will pass the cake test.
Open Data Commons
Open Data Commons is run by Jordan Hatcher, who started work on the
Open Database License (ODbL) with support from Talis, followed by
extensive negotiation with the OpenStreetMap community. The ODbL is a
ShareAlike license for data that avoids two problems: the
inapplicability of copyright to facts, and the greediness of the
ShareAlike clause when it comes to the use of maps in PDFs, etc.

PDDL is a license that implements the Science Commons protocol for open
access data, explicitly placing the data in the public domain.

The Panton Principles are four precepts for publishers of scientific
research data who wish that data to be freely reusable. Being openly
able to inspect, critique and re-analyse data is critical to the
effectiveness of scientific research.
Open Data Grid
The Open Data Grid is a project in early incubation, based on the Tahoe
distributed filesystem. It needs development effort on Tahoe to really
get going. The aim is to provide secure storage for open datasets
around the edges of infrastructure that people are already running.

People are handwaving about the Cloud, but storage and backup are not
problems that it is really meant to solve. People make different claims
about the Cloud - cheaper, greener, more efficient, more flexible. Can
we get these things in other ways?

There is a saying: “never underestimate the bandwidth of a truck full
of DAT tapes”.
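
The arithmetic behind that saying can be sketched in a few lines. The
figures below (number of tapes, capacity per tape, journey time) are
hypothetical and chosen only for illustration; they do not come from
the talk.

```python
def effective_bandwidth_mbit_s(tapes: int, gb_per_tape: float,
                               hours_in_transit: float) -> float:
    """Throughput, in megabits per second, of physically shipping tapes."""
    total_bits = tapes * gb_per_tape * 1e9 * 8   # decimal gigabytes to bits
    seconds = hours_in_transit * 3600
    return total_bits / seconds / 1e6


# Hypothetical load: 10,000 tapes of 20 GB each, delivered in 24 hours.
rate = effective_bandwidth_mbit_s(tapes=10_000, gb_per_tape=20,
                                  hours_in_transit=24)
print(f"{rate:.0f} Mbit/s")
```

Latency is a day rather than milliseconds, but under these assumptions
the sustained rate works out at roughly 18.5 Gbit/s.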
Comprehensive Knowledge Archive Network (CKAN)
CKAN is inspired by free software package repositories such as Perl’s
CPAN, R’s CRAN and Python’s PyPI. It provides a wiki-like interface to
create minimal metadata for packages, with a versioned domain model and
an HTTP API.
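
As a rough illustration of what “minimal metadata with a versioned
domain model” can mean, here is a small sketch in which every edit to a
package record is kept as a new revision. This is not CKAN’s actual
schema or API; the class, the field names and the license identifier
are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PackageRecord:
    """A catalogue entry that keeps every edit as a revision (hypothetical)."""
    name: str
    revisions: List[Dict[str, str]] = field(default_factory=list)

    def edit(self, **changes: str) -> int:
        """Layer changes on the latest revision; return the new revision number."""
        latest = dict(self.revisions[-1]) if self.revisions else {}
        latest.update(changes)
        self.revisions.append(latest)
        return len(self.revisions)

    def at(self, revision: int) -> Dict[str, str]:
        """Metadata as it stood at a 1-based revision number."""
        return dict(self.revisions[revision - 1])

    @property
    def current(self) -> Dict[str, str]:
        """Latest metadata, or an empty dict if nothing has been recorded."""
        return dict(self.revisions[-1]) if self.revisions else {}


# Hypothetical package: created with an unknown license, clarified later.
record = PackageRecord(name="example-climate-data")
record.edit(title="Example climate data", license="unknown")
record.edit(license="odc-pddl")
print(record.at(1)["license"], "->", record.current["license"])
```

Keeping whole snapshots is wasteful but makes it trivial to see what a
record said at any point, which is the wiki-like property that matters
here.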

CKAN supports groups, which can curate a package namespace - e.g.
climate data - and assess priorities for turning datasets into fully
installable packages.

CKAN’s open source code is being used in the data package catalogue for
the data.gov.uk project, part of the Making Public Data Public effort
in the UK.
datapkg
The Debian of Data: datapkg takes Debian’s apt tool as inspiration for
fully automatable installation of data packages, with dependencies
between them. It is currently at a usable alpha stage, with a Python
implementation.
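
To make the idea of dependencies between data packages concrete, here
is a minimal sketch of how an apt-style installer might work out an
install order. It is not datapkg’s code or interface; the function and
the package names are hypothetical.

```python
from typing import Dict, List, Set


def install_order(dependencies: Dict[str, List[str]], target: str) -> List[str]:
    """Return packages ordered so that each one comes after its dependencies."""
    order: List[str] = []
    visiting: Set[str] = set()

    def visit(package: str) -> None:
        if package in order:
            return                      # already scheduled
        if package in visiting:
            raise ValueError(f"dependency cycle involving {package!r}")
        visiting.add(package)
        for dep in dependencies.get(package, []):
            visit(dep)                  # dependencies are scheduled first
        visiting.discard(package)
        order.append(package)

    visit(target)
    return order


# Hypothetical data packages and what each one depends on.
example = {
    "spending-by-region": ["region-boundaries", "spending-raw"],
    "region-boundaries": ["country-codes"],
    "spending-raw": [],
    "country-codes": [],
}
plan = install_order(example, "spending-by-region")
print(plan)
```

This is a depth-first topological sort: a package is only appended once
everything it depends on has been appended, and a cycle is reported
rather than followed forever.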
Where Does My Money Go?
The next challenge really is to bring the concerns and the solutions to
a mainstream public. Agustín Lobo spoke of “a personal consciousness
but not an institutional consciousness” when it comes to open source
and open data. Media coverage and exemplary government implementations
help to create this kind of consciousness.

Pressure for increased open access is coming from academia - for the
research data underlying papers, for the right to data mine and
correlate different sources, for library data open for re-use. Pressure
is also coming from within museums, libraries and archives - memory
institutions who want to increase exposure of their collections with
new technology, and recognise that open data, linked to a network of
resources, will work for sustainability and not against it.

The next generation of researchers, who are kids in school now, will
grow up with an expectation that code and data are naturally open. It
will be interesting to see what they make!

Meanwhile OpenStreetMap is feeding several startups, and more
commercial presence in the open data space will be of benefit. It
illustrates that one does not have to be proprietary to be commercial.

Now higher-profile government projects opening data are helping to
bring open data into the mainstream. To what extent is open a
fashionable position, and to what extent is it reflected throughout the
way of working?

Open process means early release, public sharing of bugs, public
discussion of plans: everything in Nat Torkington’s post on Truly Open
Data. It brings the opportunity to fail in public, to learn from
others’ problems, and to collaborate self-interestedly.

I had a great time at SIG Libre 10. Oscar Fonts’ talk on OpenSearch
Geospatial interfaces to popular services has me itching to add an
OpenSearch +Geo interface to CKAN, as well as to work on getting the
apparent version skew in the Geo extensions resolved amicably.

Genís Roca spoke thought-provokingly on Retorno y rentabilidad (there
isn’t really an equivalent English word for rentabilidad; “rentability”
is less exploitative or focused than profitability). Rentability,
especially for online services, can come in ways that sustain an
organisation predictably, and don’t involve fishing in the pockets of
ultimate end-users.

Ivan Sanchez showed areas of OpenStreetMap Spain with a stunning level
of detail, trees and fences, MasterMap-quality coverage. I’m inspired
to pick up JOSM and Merkaartor to add building-level detail from
out-of-copyright 1:500 Edinburgh town plans at the National Library of
Scotland’s map services.

Agustín Lobo talked about the distributed work and cross-institutional
support and benefit of the R project, and the impact of open source on
open access to data in science. He mentioned a Nature open peer review
experiment that was discarded - I am thinking it wasn’t curated enough.
The talk helped me to connect the OKF’s work to the rest of the
Jornadas.

Many people asked for details of the shiny slides, made with prezi.com;
they should show embedded in this page, I hope. I stupidly forgot to
put URLs on the slides, which is partly why I have written this blog
post.

A Free Software Model for Open Knowledge on Prezi
