On Wed, Mar 01, 2000, Roland J. Cole <cole[_at_]spi.org> wrote:
>
> On Wed, Mar 01, 2000, Don Roemer <droe2[_at_]earthlink.net> wrote:
> >
> > On Tue, 29 Feb 2000, Roland J. Cole <cole[_at_]spi.org> wrote:
> > >
> > > We at SPI do thousands of pages of OCR on public domain documents.
> > > We know all the careful and sometimes very creative work that goes
> > > into choosing how to OCR, what to OCR, how to represent in text what
> > > was a picture, etc. I have no question that you have a copyright in
> > > the electronic version as a derivative work.
> >
> > It is a shame that few courts agree with you. Where is your originality?
> > You discuss "sweat of the brow" issues that were dispensed with a long
> > time ago. The overruling of "copying a copy" in Alfred Bell sounded
> > the death knell for rights in a work that may have cost someone untold
> > amount of dollars to [re]produce.
>
> 1. I agree that it is a shame that few courts would agree with me.
> However, I am not sure you and I are talking about the same thing.
>
> 2. I agree with Feist (a simple alphabetical phone book), and I oppose
> much of the thinking behind the "Collections of Data Antipiracy Act" --
> simply assembling facts (or simply copying something) should not give
> the assembler or the copier a copyright in otherwise public domain
> material.
>
> 3. Rather my point was a different one, albeit a point that you may
> still disagree with. My point is that, at today's state of the art,
> going from a paper document to a fully-corrected, properly formatted
> web site with links, headings, etc. requires enough creativity to
> meet what I think either is or should be well beyond the minimum
> creativity required for copyright. In my own case, we make thousands
> of judgments about what to correct, what to omit, etc. that should
> fully qualify for "selection, arrangement, and display" protection
> without any claim to the words thereby selected, arranged, and
> displayed.
>
> Yes, there are programs that purport to "automatically" turn a paper
> document into a web site. Some of them do come close, although all
> I have used require at least some tweaking. There are programs that
> turn a paper document into a word processing document (usually
> Microsoft Word). Some of those do work with little or no tweaking
> on some documents.
>
> Thus, I would be tempted to ask for some more analysis beyond
> "digitizing/OCR is or is not mere copying." In my experience, it
> sometimes is almost or entirely pure copying, with or without pure
> hand-correction but many more times than not, producing a finished
> product requires a number of activities that I think should (and I
> hope/predict in some case would) receive copyright protection.
Part of the debate here may stem from different premises. Clearly, if someone takes an ancient text and converts it to electronic format, making corrections, placing illustrations around it, making notes, etc., those annotations are protected. But the value to the book publisher was not (I think) in these annotations, but in the text itself. Presumably they are not interested in reproducing screen shots of the web pages, but rather in taking the text and electronically typesetting it. This essentially strips the text of all the value-adding, copyright-protected elements added by the OCR process -- however involved that may be -- and then adds new ones. It doesn't matter to the publisher whether the text is in an ornate and well-designed series of web pages filled with hyperlinks to other original sources and brilliant commentary, or in an ASCII file with no formatting whatsoever. In fact, they probably prefer the latter.
It is the text that falls into the public domain, not just the image of the text. While the new web-based image of the text may be protectable inasmuch as it creates something new, the underlying text is still in the public domain. The mechanical process of reproducing the text, even if difficult, is not protectable. In this context, "mechanical" does not mean done automatically by a machine. Comparing post-scanning electronic text to the printed original to check for mistakes made by the software is an arduous task, but it is mechanical in the sense used here, and, like it or not, it is not protected in the United States.
Obviously, some of the confusion stems from the fact that placing a text in digital format is different from reproducing it in physical form. In physical form, any reproduction of the text requires either (a) degrading the original (a photocopy is a lesser form than a book, and a photocopy of the photocopy is harder to read) or (b) going to great effort to render the text in an equivalent format (retypesetting the book from scratch). This gives some economic advantage to those who control the text in its most desired form (those who already have it typeset and printed), who may then exploit this advantage to get people to pay for their product (otherwise, people would just hand-copy any text they needed, and no books would ever be sold). Digitized text, however, does not suffer from these limitations. The bounds of copyright law were created in a paradigm bound by the physical constraints of a non-digital world. There was no point in protecting certain modifications of text in the public domain because the cost to a potential copier was the same whether they went to the original version or the new one; a copier still had to retypeset the text. The law therefore doesn't recognize the difference between a copy made from a physical object and a copy made from a digital file, even though making the copy from one is much easier than from the other.
It is easy to see the value of text placed in digital format. The law just doesn't protect the process of doing so.
David R. Hale, Esq.
Astrachan, Gunst, Goldman & Thomas, P.C.
20 S. Charles Street, 6th Floor
Baltimore, Maryland 21201
(410) 783-3539
(410) 783-3530 (facsimile)
dhale[_at_]aggt.com
Received on Thu Mar 02 2000 - 19:37:41 GMT