Yahoo, European Commission Back Digital Library Plans
Search giant Yahoo and PC maker Hewlett-Packard are at work on a permanent online archive of multilingual text and multimedia content, the firms revealed Mon. The announcement came as rival Google continues to battle with publishers over its year-old Google Print Library Project and the European Commission vows to launch its own digitization project.
The Google and Yahoo schemes differ in scope and slant. Google began on the assumption that it would digitize material unless denied access. Yahoo is digitizing only public domain material or material it has permission to archive.
The Yahoo/HP-backed Open Content Alliance (OCA) aligns worldwide govt., high tech, nonprofit and cultural organizations. Members include Adobe, the European Archive, the Internet Archive, the U.K. National Archives, O'Reilly Media, the Prelinger Archives, the U. of Cal. and the U. of Toronto. Archived documents soon will be accessible through OCA’s site -- www.opencontentalliance.org -- and Yahoo’s search engine.
OCA aims to “encourage the greatest possible degree of access to and reuse of collections in the archive,” while respecting content owners and copyrights, it said. Those who contribute to the OCA must agree to principles in the alliance’s detailed guidelines -- which pleases publishers. The idea for OCA was hatched early in 2005 by the Internet Archive and Yahoo as a way to “offer broad, public access to a rich panorama of world culture.”
OCA differs from Google’s project mainly in that Yahoo and partners are working with publisher permission in every case, which pleases copyright owners, said Sally Morris, head of the Assn. of Learned and Professional Society Publishers (ALPSP). Google, on the other hand, “always has sought publisher permissions and the main reason why this has not continued appears to be… the feeling that it would not be possible to secure permissions for more than 15% of the works Google wanted to cover,” Morris told us.
Another key distinction is that OCA participants are “focusing on works that their copyright owners wish to make freely accessible in their entirety,” Morris said: “There is a huge wealth of content which publishers and other owners already make freely available and that this project will perform a great service in making it all visible and readily available.” An Assn. of American Publishers spokeswoman told us Mon. her group is “very encouraged by what we've heard” about OCA. “It seems as though they are going about this the right way, respecting the rights of creators to determine how their works will be used,” she said.
Google, which wants to index millions of copyrighted books from a few university libraries -- including Harvard, Stanford and Michigan -- has won fewer friends with its effort. Last month the Authors Guild filed suit against Google; in Aug., the company set a self-imposed moratorium on scanning copyrighted materials until Nov. 1 in response to negative reaction from publishers (WID Aug 15 p2).
The Internet offers a chance to realize the dream of the Library of Alexandria for all-inclusiveness and extend it with universal access, Internet Archive Dir. Brewster Kahle said on Yahoo’s blog. Kahle’s organization will host OCA’s material and sometimes help with digitization while Yahoo indexes the content and funds the group’s inaugural project -- digitizing American fiction and nonfiction collected by the U. of Cal. Adobe and HP are contributing processing software as U. of Toronto and O'Reilly Media add books to the heap, he said. Prelinger Archives and the U.K. National Archives will contribute movies and more. The first round of digitized material will be available this year, Kahle said.
OCA’s guiding principle is to offer high-resolution, downloadable, reusable files of material in the public domain, he said: “When we are dealing with in-copyright materials, the Internet Archive has been leveraging the Creative Commons licenses to great effect.” Kahle said copyright issues remain, but “at least we can get substantial work going on the public domain.” He said donors should have the option to restrict bulk re-hosting of substantial parts of collections. U. of Cal. and Yahoo have decided not to impose any restrictions, Kahle said. That means another library’s website could rehost these works and other search engines could integrate them into their page-flipping systems.
Yahoo’s foray into digital archiving is “a sign of the great need for this kind of information” above all else, Electronic Frontier Foundation (EFF) staff attorney Jason Schultz told us. “It’s undeniable that when Yahoo and Google both get involved in giving people access to this content, there’s a high demand,” he said. OCA’s focus on works in the public domain is positive because users will be able to get an entire book, not just snippets, which is what Google plans. Google’s approach will “show people some of the effects of the draconian copyright extensions we've had” because the resource well will be pretty shallow, Schultz said. OCA’s approach is “the safest, most sure road you can take because you've either got explicit permission or it’s in public domain.”
Other aspects of OCA’s work that interest EFF are the add-on services and 3rd-party applications expected to appear once users get free access to the materials. “If anyone can access the database and utilize the information, there’s a lot more room to play and experiment; whereas with the Google system, it’s very locked down and the only innovations you'll get are Google innovations,” Schultz said. The sky’s the limit for incorporating OCA material into educational curriculum, Wikipedia entries and other “really exciting aesthetic additions,” he said. Users’ ability to “annotate, remix and communicate” will be greater with OCA. But those innovations may have less effect than Google’s work due to OCA’s more limited scope. “Google’s doing this for a very good reason, so it’s not that one is good and one is evil,” Schultz said. “They are different approaches and I hope they'll be complementary. There’s something to be said for having access to as many books as possible.”
Push for European Digital Libraries Begins
The European Commission (EC) also is giving chase to Google. It unveiled a strategy Fri. to make all of Europe’s written and audiovisual heritage available online. Digitizing the Continent’s historic and cultural heritage will “make it usable for European citizens for their studies, work or leisure and will give innovators, artists and entrepreneurs the raw material that they need,” the EC said.
“Without a collective memory, we are nothing, and can achieve nothing. It defines our identity and we use it continuously for education, work and leisure,” Information Society and Media Comr. Viviane Reding said: “The Internet is the most powerful new tool we have had for storing and sharing information since the Gutenberg press, so let’s use it to make the material in Europe’s libraries and archives accessible to all.”
EC policymakers will take comments on their proposal in an online consultation ending Jan. 20. The results will be shaped into a proposal on digitization and digital preservation to be presented in June. Education and Culture Comr. Ján Figel stressed the importance of European cooperation. The project is “an obvious necessity,” he said: “It is about ensuring preservation and access to our common cultural heritage for the future generations.”
The EC archiving project includes books, film, photos, manuscripts, speeches and music. Compilers will have to mine a mountain of work. Europe’s libraries hold about 2.5 billion books and bound periodicals, plus millions of hours of film and video in broadcast archives. EC guidance covers digitization, online accessibility and digital preservation. Some EC member states have proposed similar actions, but the approach is fragmented, officials said. To avoid redundancy and incompatibility, the EC urged countries and major cultural institutions to join its effort.
The Commission will add coordination and funding through its research programs and the eContentplus program. Consultation results also will be incorporated into related initiatives like a 2006 review of EU copyright rules and the 2007 implementation of community R&D programs. A high-level group on digital libraries will advise the EC on how best to address EC-level challenges. Collaboration among countries will be facilitated by an update of the Lund action plan, which defines how to foster and improve digitization of cultural and scientific heritage. To measure progress, the process will use quantitative indicators, the Commission said.
The EC is contributing €36 million for research for the rest of 2005, with digital preservation efforts to expand considerably in years to come, the EC said. The eContentplus program, a community effort to make digital content in Europe more accessible, usable and exploitable, will contribute €60 million toward making national digital collections and services interoperable and aiding multilingual access and use of cultural material.