Re: Transparency and IP

From: Samantha Atkins
Date: Thu Sep 14 2000 - 01:52:37 MDT

"Michael S. Lorrey" wrote:
> Samantha Atkins wrote:

> > If you know better I would very much like to know how to improve on
> > this. I've been wanting my library online for many years now.
>
> Actually, I've done this numerous times on a Xerox Docutech system, it rips
> through 'de-spined' books fast, and if you are smart you can 'de-rip' the
> resulting file on the Docutech back to the network, then do a PS to HTML
> conversion, and voila.

Ah, well, if I ever get my own data center or work in a major corp that
will let me borrow theirs, I guess I can finally get my books online.
Otherwise a Docutech is outside even my bloated tech-toy budget.

> You can also buy scanners that have feeders that are adjustable. I have one here
> at Datamann that can scan 21 pages per minute at 600 dpi and produces scans that
> can be very easily OCR'd.

That might work. What price range are you talking about? Any
particular products or additional things to look for?

> The only errors you really get with these processes are with older books of
> heavy serif type that is closely kerned, or with faded type. There do tend to be
> errors as a result of those features...

I've used flatbeds with pages cut out of books and experienced 3-8%
error rates with common OCR packages. These were modern books.

- samantha
