Re: Databases, creativity, and copyright -- what's missing is wrong?

From: Michael Scarpitti <MScarpit[_at_]asnt.org>
Date: Mon, 28 Sep 1998 15:57:43 -0400

On Sun, Sep 27, 1998, Karsten M. Self <kmself[_at_]ix.netcom.com> wrote:
>
> I've just been reading an analysis of Feist Publications v. Rural
> Telephone Service, which established that a phone directory is not
> subject to copyright protection.
>
> Opinion at:
> http://caselaw.findlaw.com/scripts/getcase.pl?navby=search&linkurl=
> <%LINKURL%>&graphurl=<%GRAPHURL%>&court=US&case=/data/us/499/340.html
>
> The analysis suggests the case was ruled for Feist based on the
> principle that:
>
> > (a) Article I, § 8, cl. 8, of the Constitution mandates originality as a
> > prerequisite for copyright protection. The constitutional requirement
> > necessitates independent creation plus a modicum of creativity. Since
> > facts do not owe their origin to an act of authorship, they are not
> > original, and thus are not copyrightable. Although a compilation of
> > facts may possess the requisite originality because the author typically
> > chooses which facts to include, in what order to place them, and how to
> > arrange the data so that readers may use them effectively, copyright
> > protection extends only to those components of the work that are
> > original to the author, not to the facts themselves. This
> > fact/expression dichotomy severely limits the scope of protection in
> > fact-based works. Pp. 344-351.
>
> with the body of the opinion suggesting that Rural's white pages listing
> fell short of this mark, being an alphabetical compilation of all phone
> listings within Rural's service area, as required by Kansas state law.
>
> Among the fallout of this ruling is the database bill of the current
> legislative session.
>
> As someone with a background in database compilation and analysis, I
> can't help but wonder whether the court was looking in the right place
> in determining "originality" and "a modicum of creativity" in Rural's
> white pages. There are two tricks in compiling a substantial
> database:
>
> - Getting the facts
> - Getting the facts right
>
> I'm currently making a good wage validating a third-party's attempt
> to clean up dirty data in a 100 million record financial accounts
> database. An error rate of 0.1% could result in more than $100 million
> in mis-attributed charges, not to mention legal risks for incorrectly
> reporting or acting on information. Simply compiling data is not
> sufficient -- the data must be accurate.
>
> Errors can be introduced in many ways -- data may be miskeyed, falsely
> provided, superseded, or out of date. Test, training, and system data
> may enter a production database. Incorrectly classified data may be
> included in a file.
>
> Cleaning and validating the data alone cost over $1 million; validating
> the validation has occupied three analysts for over a month. Much of
> the processing is automated, but manual intervention, interpretation,
> and further analysis are required. Some of the techniques are quite
> inventive. Though a small fraction of records are disposed of
> differently, the value added is tremendous. Our results suggest that
> cleaning and validation affected 2-5% of records, or up to $5 billion in
> charges.
>
> In focusing on the data presented in a compilation, rather than the data
> and errors removed, by conscious design, did the Court misplace its
> attribution of creativity and originality in compiling a database?
>
> To borrow from the Tao Te Ching:
>
> > THE VALUE OF THE UNEXPRESSED.
> >
> > The thirty spokes join in their nave, that is one; yet the wheel
> > dependeth for use upon the hollow place for the axle. Clay is shapen
> > to make vessels; but the contained space is what is useful. Matter is
> > therefore of use only to mark the limits of the space which is the
> > thing of real value.
>
> I'm suggesting that it's the missing data, and the lack of errors, in
> a strict compilation, that provide value, constitute the unique
> attribute, and are the originality of a database.
>
> If data selection, quality assurance, and validation routines are
> original and creative activities, were they raised in the arguments in
> this case? I find no indication that they were considered significant
> in the ruling. If these activities are sufficient to impart originality
> to the resulting database, shouldn't databases be considered
> copyrightable, and protected (as a whole) under the existing 1980
> Copyright Act?
>
> Is the 1998 Database Act really necessary?

It would seem that, to the clients, the value of the service provided is exactly this: that there are as few errors as can be managed. Whether this reflects work by the "cleaners" is irrelevant. If I went through Webster's New International Dictionary (released 1934) and corrected a few typos in the 600,000 entries (if I could find any), that would not affect the copyrightability of the original work one way or the other. If I worked for Merriam-Webster, such information might be of value, though.

If I took the trouble to write my own 600,000-entry dictionary, derived from my own sources, I could indeed copyright it.

Producing a new, fully annotated and revised edition of Shakespeare's works, however, as I have argued previously, would pass the test of originality. It is not "merely" a database.

Michael A Scarpitti
Assistant Editor
Materials Evaluation
(800) 222-2768 X207
(614) 274-6003 X207
e-mail mscarpit[_at_]asnt.org

Received on Mon Sep 28 1998 - 20:01:28 GMT
