Update on the Wayback Machine, which has been archiving information posted
to the web since 1996:
http://www.newscientist.com/news/news.jsp?id=ns99991534
Web mega-archive raises legal questions
09:55 09 November 01
Jeff Hecht, Boston
A website called the Wayback Machine last week opened a gateway to more
than 10 billion archived Web pages. But it also opened a can of worms.
The site has been using software robots to record Web pages since 1996.
But these include pages that were later removed by site owners because
they contained material that was pirated, illegal, or deemed too
sensitive.
When the archive went live last week, the US Nuclear Regulatory
Commission had to ask archive founder Brewster Kahle to pull sensitive
information the site had resurrected on America's nuclear reactors -
which the NRC had removed from its website after the 11 September
attacks.
New Scientist's search of the Wayback Machine confirms the NRC material
has gone. But the potential problems don't end there. We also found
copyrighted material that had been removed from a site because it was
posted without its owner's permission.
"There are definitely some potential legal issues with this archive,"
says Cindy Cohn, legal director at the Electronic Frontier Foundation.
Dumb bot
The Wayback Machine's robots periodically record every page on the web,
unless they are stopped by bot-blocking software. But the bots won't
know if, for example, a site has been taken down because a judge has
ruled it defamatory, or to settle a lawsuit out of court. Nor can they
identify pirated material or child pornography.
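
The article does not say what form the "bot-blocking software" takes. One
common convention is a robots.txt file that a site owner publishes to tell
crawlers what they may fetch. The sketch below is only an illustration of
that convention using Python's standard urllib.robotparser module. The
robots.txt contents, the example.com URLs, the may_fetch helper and the use
of "ia_archiver" as the archive crawler's user-agent name are all
assumptions, not details taken from the article.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site owner might publish. "ia_archiver" is
# assumed here to be the archive crawler's user-agent name.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # Parse the rules from a string so the example needs no network access.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # The archive crawler is shut out of the whole site.
    print(may_fetch("ia_archiver", "http://example.com/index.html"))
    # Other crawlers are only kept out of /private/.
    print(may_fetch("SomeOtherBot", "http://example.com/index.html"))
    print(may_fetch("SomeOtherBot", "http://example.com/private/x.html"))
```

A rule like this only tells a well-behaved crawler to stay away. As the
article notes, it says nothing about why a page was later taken down.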
Like the Wayback Machine, the Google search engine also stores copies of
Web pages in case servers fail. But this "cache" is flushed every three
weeks, says Google, adding that it removes sensitive material when
asked.
US law imposes fines of between $200 and $150,000 for copyright
infringements. And even though unwitting infringers usually pay the
lower amount, the presence of thousands of copyrighted works stored on
pirate websites could be ruinous for the Wayback Machine's backers, who
include the Library of Congress and the National Science Foundation.
"It would be a real shame if copyright concerns were to deprive us of
this valuable public service," says Fred von Lohmann, a senior lawyer at
the EFF.