Democratizing and Preserving Knowledge

Technology has brought to much of the world a true “digital commons,” creating a virtual public square that scoffs at geographical boundaries and resists the ravages of time and destruction. But ceaseless innovation — think “born-digital” data like email, websites, blogs, text messages, and podcasts — has made saving our cultural legacy more urgent and complicated, not less.

KEY TAKEAWAYS

• Libraries today are playing an even greater role in society than ever, functioning as a “digital commons” by providing free and open access to ideas. But at the same time, they face many new challenges, including questions of how to transfer massive amounts of information into functional archives that will stand the test of time

• Joining other foundations, Carnegie Corporation of New York has helped libraries navigate this digital landscape with 16 worldwide grants totaling $18 million disbursed over the last three fiscal years

• One Corporation-funded initiative, the Afghanistan Project, created a multilingual website gathering in one place precious historical materials from that country dating from the 1300s to the 1990s

• Mass digitization proceeds apace. Egypt’s Bibliotheca Alexandrina, for example, has developed cutting-edge software to capture Arabic script, enabling it to digitize 130,000 volumes since opening in 2002

• The question of sustainability is particularly acute for data that is “born digital,” such as email correspondence and bureaucratic record keeping, but libraries are applying a whole range of new skills and technologies and resources to actually preserve and steward born-digital information properly

• Digital files are actually more fragile than paper since they can easily disappear when the platform on which they are stored becomes obsolete. As a result, libraries must always make choices that are both flexible and reversible


The stunning Tianjin Binhai Library opened in the cultural center of Binhai district in Tianjin, a coastal metropolis outside of Beijing, China, in October 2017. A “social space that also promotes reading and inspiration,” according to its director, the library quickly became a social media sensation, the unique wavelike, terraced bookshelves encircling the massive atrium from floor to ceiling making the perfect Instagram backdrop. The eye-poppingly beautiful design is the work of the Dutch firm MVRDV, and the library has proven to be a smashing success with the public, Tianjin Binhai Library having become “the urban living room it was intended to be.” (Photo: Zhang Peng/Lightrocket via Getty Images)

The New York Public Library (NYPL) is one of the world’s biggest research libraries, with 18 million patrons walking through its doors each year and millions more accessing its resources online from other parts of the country and the world. It sees its mission as threefold: to create and nurture lifelong learners, to advance knowledge by providing free and open access to materials, and to promote full citizenship by strengthening communities and giving people the resources they need to understand and engage with the societies in which they live.

These are ambitious goals, ones shared by many major research institutions. They also hint, to anyone who has been paying attention in recent years, at the ways in which libraries see themselves as keystones of open societies. With the expansion of digital technologies, libraries have even more tools at their disposal to achieve their missions.

“If you think about it,” says William P. Kelly, the Andrew W. Mellon Director of the Research Libraries at NYPL, “we have the very real opportunity now to create our own Libraries of Alexandria.” That legendary place, founded by Ptolemy I around 290 BCE, was celebrated as one of the centers of scholarship in the ancient world, housing much of the world’s early recorded knowledge until its destruction during various sieges of the port city over the next few centuries. And with the possibility of gathering and distributing information, not just in physical but in digital form, there is an opportunity to share such electronic resources among global networks of research institutions, making them available not only to people who can travel to a particular place but to anyone with means to access the Internet. Thus, the “public” that The New York Public Library serves is indeed potentially boundless.

“In an era in which the public sphere is disappearing under waves of privatization, librarians insist that knowledge is for everyone, and that they are called to share rather than to hoard. That is where a lot of libraries are at the moment. It’s a driving vision that unites us all,” says Kelly.

As Vartan Gregorian, president of Carnegie Corporation of New York, writes in the introduction to this issue of the Carnegie Reporter, Andrew Carnegie understood more than most “the value of libraries as the primary institution for the cultivation of the mind and the development of the community”; this understanding led the philanthropist to establish over 1,600 free libraries across the U.S. during his lifetime, and to advocate for the importance of these institutions in bolstering democracy by nurturing informed citizens in their quest for knowledge.

But in our digital age, that quest for knowledge has become in some ways more complicated, not less. While one of the Internet’s greatest gifts has been the democratization of information, it brings with it new challenges, such as preserving the immense universe of digital data; ensuring access to the Internet regardless of income at a time of sharp inequalities; and creating an online infrastructure that is not only able to support an enormous amount of information but can organize it into navigable and readable formats, such that we can all use it to better understand our world.

Kelly puts the problem in concrete terms. “Compare the archives of two presidents, Lyndon B. Johnson and Barack Obama. Johnson’s archive consists of upwards of 45 million pages, practically every written document created by his office. Each of these must be cataloged and preserved so that the record of his administration remains discoverable. That’s a massive job.”

“Now,” he goes on, “consider President Obama’s archive, which primarily includes material that was ‘born digital’ — communications, such as email, without a print existence. That archive includes more than 1.5 billion ‘pages.’ Granted that some of these records are inconsequential — emails ordering lunch, for example. But think of the challenge involved: someone has to decide what’s important to preserve, and then organize that vast body of information in a usable manner.”

The difficulties and opportunities inherent in this new digital landscape have driven Carnegie Corporation of New York to continue its founder’s legacy of supporting libraries as part of its larger mission to advance and diffuse knowledge and understanding. In the last three fiscal years alone (FY2016–18), the foundation gave 16 library-related grants totaling $18,010,000 in support of that goal. The Corporation shares this commitment with a number of other nonprofits and foundations that are also working toward preserving and disseminating the holdings of libraries and archives in a sustainable digital environment.

Must All Good Things Come to an End? “Destruction” is the fourth in the series of five large canvases called The Course of Empire, an ambitious work by the great American landscape painter Thomas Cole (1801–1848). In this allegorical cycle, Cole sought to depict the life cycle of a great civilization from its glorious rise to its devastating, ignominious fall. In “Destruction,” with storm clouds gathering ominously above, a once magnificent city is being destroyed by war and savaged by fire. In the 21st century, the destruction of cultures by war (civil and otherwise) is still sadly with us, and furthermore, today the vandals are both real and digital. The ravages of time will always take their toll. Does digitization offer the chance to staunch — or at least slow — the loss of history, the destruction of cultures? (Photo: Collection of the New-York Historical Society. Digital Image Created by Oppenheimer Editions)


Preserving Cultures by Preserving Knowledge

In September 2018 a fire engulfed the National Museum of Brazil. When it was brought under control six hours later, practically all of the museum’s holdings — including the oldest human remains ever discovered in the country, dinosaur fossils, and the last surviving audio and textual traces of some of Brazil’s extinct indigenous languages — had been destroyed. There were no back-up copies of any of these artifacts in the form of photographs, 3D scans, digital audio files, or the like. This was due in part to the sheer difficulty of digitizing a collection that comprises 20 million discrete objects (the collection of linguistic materials alone encompassed more than 100,000 documents), and in part to funding cuts to the museum by the national government. The loss of the linguistic material in particular means that not only is vital data unavailable to future researchers, but the very DNA of the cultures represented in the collection has vanished.

Tragedies like this one — and the loss of information and knowledge that they represent — bring into focus the many efforts to wield the power of digitization to preserve and maintain access to cultural knowledge. Whether the threat comes from fires or floods, political unrest, budget cuts leading to institutions’ inability to maintain fragile materials, or simply the ravages of time, the question of how best to safeguard information in libraries worldwide is on a lot of people’s minds these days, as are questions about how to use a vast array of new digital tools and methods to do so.

Daniel Reid, executive director of the Whiting Foundation, cites this urgency as the reason why his organization, which focuses on support for writers and scholars, established its Cultural Heritage program in 2016. The program extends grants to organizations doing the work of documenting and digitizing materials around the world that are threatened by man-made or natural causes. They have joined forces in this effort with other foundations working in the sector, including the Prince Claus Fund in the Netherlands, which offers “emergency first aid” to cultural heritage, and the Arcadia Fund in London. The Andrew W. Mellon Foundation and Carnegie Corporation of New York are also active in the arena.

“Our program started in response to the very prominent media coverage of the cultural destruction that has been happening as part of conflicts around the world, particularly in the Middle East,” Reid said. “We started investigating this as a possible area of support when we were seeing all too many video clips of destruction by ISIS in Iraq and Syria specifically. But those weren’t the only places where it was happening. Boko Haram in Nigeria and plenty of other extremist groups around the world are specifically targeting cultural heritage. And often, particularly, they’re targeting documentary cultural heritage like manuscripts and archives, since the written word is such an ideologically charged kind of heritage.”

Reid emphasizes that, as important as it is to record at-risk cultural heritage, it is vital, too, to disseminate it in ways that prioritize the needs of the stewards of that culture — those in and from the troubled regions. He points to a meeting organized by the Smithsonian Institution at the Iraqi Institute for the Conservation of Antiquities and Heritage in Erbil, Iraq, for communities — many of them religious minorities — that had been affected by ISIS. “The participants had many, many needs for recovery, but at least for some of them, there was a strong desire to prioritize aspects of their cultural heritage. They felt passionate about preserving their distinctive heritage for themselves and for their kids,” Reid says. “But there was also a real desire to share each culture as widely as possible with the world, so that people would understand what it is about, what these people have gone through and done, both now and in the past.”

Open Societies Need Libraries

A recognition of the role libraries play in creating open societies is perhaps what led young people to form a human cordon around the relatively new Bibliotheca Alexandrina to protect it from mobs during the Arab Spring uprisings in Egypt in 2011. The city of Alexandria is a stronghold of the Muslim Brotherhood in the country, and the library, which opened in 2002 with a dedication to maintaining a secular, humanistic approach to knowledge, was an inevitable target of extremist ire.

The library’s founding director, Ismail Serageldin, has spoken of what the library, built very close to the ancient site of the original Library of Alexandria, symbolizes for some in his country. “The extremists and the Islamists dislike the library very much. But that’s okay, that’s normal, because we stand for exactly the opposite of what they stand for,” he said. Despite this antipathy, Serageldin told a reporter after the unrest in the spring of 2011 that “not a single stone has been thrown at the glass façade. The population loves the library and they protect it. It says a lot that people reacted this way.”

Serageldin attributed the reaction to the library’s important role as “a focal point for the promotion of reform and for civil liberties.” He noted, “We’re spreading the values of democracy, freedom of expression, tolerance, diversity, and pluralism that I’m hoping are taking root in the younger generation.”

As the “New Library of Alexandria,” the Bibliotheca Alexandrina aims to achieve the same importance today, both as a living space and as a repository of physical materials, and to maintain what it calls the “lingering ambiance and hue of history” through its digital archives, which make historical documents, books, pictures, and much more available to specialists and culture enthusiasts from around the world. Along with a library holding 1.6 million volumes (including a sizable gift of books from the Bibliothèque nationale de France), the Bibliotheca Alexandrina comprises four museums, a planetarium, a children’s science center, a library for the blind, and eight research institutions. This “arena for cultural pluralism” is a magnet for 1.5 million annual visitors — students, scholars, and a general public hungry for access to its significant offerings.


But the library also strives to be at the forefront of advances in information technology — with Serageldin describing the library as “born digital.” The Bibliotheca Alexandrina has developed cutting-edge optical character recognition software to capture Arabic script, making it possible to digitize great portions of its collections. Staffed by 120 employees over two shifts a day, seven days a week, the onsite digitization lab has processed 130,000 volumes, making it the largest collection of digitized Arabic books in the world. Its Reissuing Modern Classics initiative, supported by a grant from Carnegie Corporation of New York, aims to make freely available in digital form between 100 and 150 books that date from the late 19th and 20th centuries, a period in which the Muslim world faced challenges to its traditional culture, including questions over women’s public life, freedom of conscience, the rise of democracy, secularism, and so on.

In a 2010 interview, Serageldin told American journalist Caryle Murphy that the initiative was meant to bring balance to the narrow view of Islam that circulates online in large part because it is much easier to find digital copies of works supporting radical viewpoints than it is to find modern challenges to extremist ideas. Speaking with exasperation of a 14th-century Islamic thinker whose work fuels contemporary extremist ideologies, Serageldin exclaimed, “There are umpteen zillion editions of Tamiyya’s fatwas … for heaven’s sake. But you can’t find Qasim Amin? The guy who wrote in 1898 The Liberation of Women from an Islamic perspective?”

Virtual Repatriation

The idea of “virtual repatriation” — returning to a country its cultural patrimony as a way to help it rebuild after centuries of colonial occupation, armed resistance and war, and civil strife — is at the heart of the Afghanistan Project. Funded by Carnegie Corporation of New York and led by the Library of Congress, the three-year initiative aimed at creating a multilingual website that would gather in one place documents, many of them one-of-a-kind treasures, related to Afghanistan, a country Vartan Gregorian has called “the vortex of all cultures.” Materials dating from the early 1300s to the 1990s, including precious illuminated manuscripts, maps, books, prints, photographs, newspapers, and periodicals, were gathered for scanning from collaborating institutions, such as the Library of Congress, the British Library, the National Library and Archives of Iran, and UNESCO.

In 2016 the Afghan Minister of Information and Culture Abdul Bari Jahani and other government officials were presented with hard drives containing high-resolution digital reproductions of more than 163,000 pages of documents digitized through the project. These will be used by libraries and universities throughout Afghanistan, accessible even without an Internet connection — an important consideration in a country whose infrastructure is still in need of repair after decades of military conflict.

At the presentation, Gregorian (himself an expert on Afghan history) remarked, “You can conquer Afghanistan, but you cannot dominate Afghanistan. The spirit of independence, freedom, and self-respect is there. Why not have the entire history of Afghanistan repatriated? These documents are the repatriation of the Afghan legacy, the Afghan memory, and that is why we started the project.”

Library of Wonders Opened in 2002, the Bibliotheca Alexandrina was erected near the site of the fabled Library of Alexandria, established in the third century BCE and one of the great centers of scholarship in the ancient world. (This legendary library was eventually destroyed over the course of several centuries after repeated sieges of the coastal city of Alexandria.) Seeking to recapture the spirit of the ancient library, the Bibliotheca Alexandrina “aspires to be the world’s window on Egypt, Egypt’s window on the world, a leading institution of the digital age, and a center for learning, tolerance, dialogue, and understanding.” The building itself is quite striking, its main reading room standing beneath a high, glass-paneled roof that tilts out like a sundial and with walls of gray Aswan granite inscribed with the characters of 120 different human scripts. The library complex houses a conference center, several museums and galleries, research centers, a planetarium, and more. (Photo: Derek Hudson/Getty Images)


Mary-Jane Deeb, chief of the African and Middle Eastern Division at the Library of Congress and a curator of the Afghanistan Project, echoed these sentiments, highlighting the often overlooked losses to a people that result from armed conflict. “It’s terrible when you don’t have a record of your past and of your history, you lose your sense of identity, of who you are,” Deeb said. “Because identity is rooted in the history of the country, of the ancestors, of the stories — real or mythical — of your culture. Those are critical elements of identifying, and wars have a way of destroying those.” For his part, Minister Jahani looked ahead. “It is for the future generations,” he observed. “The future generations should be and must be thankful for this collection.”

Cooperation Across Borders

The Afghanistan Project was executed as part of the World Digital Library (WDL). Launched in 2005 by the Library of Congress with the support of UNESCO to promote international and intercultural understanding and to help narrow the digital divide for underresourced countries in a rapidly changing world, the WDL represented, according to its website, an important “shift in digital library projects from a focus on quantity for its own sake to quality.” The idea of harnessing the power of the digital to secure historical and cultural materials, especially in areas that have been subject to colonialism, war, internal strife, and natural disasters, was first floated by the then head of the Library of Congress, James Billington, in the wake of the September 11 attacks. Growing out of the Library of Congress’s bilateral digital collaborations with Russia, Spain, France, and other nations, the World Digital Library now partners with more than 190 archives, museums, and libraries from 81 countries, and has amassed a virtual library of 19,000 works in 132 different languages. With Carnegie Corporation of New York as one of its earliest supporters, the WDL has served upwards of 80 million virtual visitors since its launch.

In some countries, potential partners lacked adequate infrastructure and funding, so the World Digital Library made the key decision to supply equipment, software, training, and financial support to help them commit to the long and arduous work of digitization. “We realized that in order for it to be a true World Digital Library, we were going to have to shoot for universal participation,” says John Van Oudenaren, the recently retired director of the WDL who began work on the project in 2005. “The barriers were that some countries had no capabilities, so we were going to have to offer technical assistance to get people involved. The first Carnegie grant provided us with funding to set up a digitization operation in Uganda, and then we had other funders who supported similar operations in Egypt and Iraq.”

The National Library of Uganda proved a notable success, eventually contributing about 100,000 pages of content. Van Oudenaren explains how this was accomplished: “It organized a competition so that people and organizations all around the country could come with their artifacts, get preservation treatment, and have them scanned by the equipment that we provided them with. We had the tribes, the churches, the universities, the government agencies all come in. Not all of it was appropriate for our purposes because of the type of content or copyright issues, but it has the makings of Uganda’s own digital library, for them to use in their own schools and research centers.”

Collaborating partners are largely left to decide exactly which materials they want to send to the World Digital Library. Staff in Washington then create metadata so the materials can be cataloged, in addition to writing richly informative descriptions of each item. All of the new texts — metadata and descriptions — are then translated into seven languages.

Van Oudenaren points out that the decision to create the World Digital Library under the auspices of UNESCO allowed it to approach organizations in countries with which the U.S. has strained diplomatic relations. “We were able to get Iran on board — they didn’t make a huge contribution, but the National Library and Archives of Iran was a contributor. We worked with the National Library of Cuba early on. We had the Israelis sitting next to the Arabs, we had Russia and Ukraine. We have all these countries that either don’t get along with each other, or don’t get along with us, or both, actually kind of working side by side in this project. Flying the UNESCO flag was helpful in that regard.”

The Value of Open Access

A cataclysmic event led to the founding of another similar resource, the Digital Library of the Middle East (DLME), conceived in the wake of the 2015 attacks by ISIS on the Mosul Museum in Iraq, and, though less publicized, on libraries in the city. Expected to launch in 2020, the DLME is an initiative of the Council on Library and Information Resources (CLIR), in partnership with the Qatar National Library, the Antiquities Coalition, the Digital Library Federation, Stanford University Libraries, and other global institutions, and is supported by grants from the Mellon Foundation and the Whiting Foundation, among others.

Charles Henry of CLIR has written about the purpose of the DLME, which he hopes will make the trafficking of a country’s cultural patrimony into the global art market more difficult by recording the existence of collections and objects. “The DLME is envisioned as both a technical marvel but also a virtual place that facilitates social justice and provides a sustained, evolving platform for worldwide access as a public good,” he said. “The crisis in the Middle East is urgent and heartbreaking; our immediate goals are to construct a digital library that will inhibit looting, track material objects of cultural significance, and help to safeguard one of the world’s greatest cultural repositories. Over time, we hope for peace, when the DLME can engage a new generation of scholars and readers who can gaze anew on such stunning evidence of our collective human achievement.”

An American university’s establishment of a global outpost — in this case, the Abu Dhabi campus of New York University (NYU) — led to the creation of another important open-access resource focused on the Middle East: Arabic Collections Online (ACO), a publicly available digital library of Arabic-language content. As Sally Cummings, a development communications manager at NYU Libraries, explains, “There is a wealth of Arabic language material in academic libraries that could have an appreciative global readership if we were to make it accessible, even in places where libraries are few, or where travel is constrained. In addition, for areas that are war-torn, it’s hard to build and maintain libraries.” Yet, she adds, “There is Internet access just about anywhere, and there are many, many speakers and readers of Arabic around the world. We thought it would be really great to get some of the privately held volumes in distinguished Arabic-language collections digitized and put them online for everybody, no matter if they are in Cincinnati or in Beirut. Not just technical stuff, not just textbooks, not just scholarly philosophy, but a broad range of poetry, economics, business, and fiction.”

Arabic Collections Online is aiming to put 23,000 Arabic-language books online, chosen from NYU’s collections and those of its partners — Princeton, Cornell, Columbia, The American University in Cairo, American University of Beirut, and United Arab Emirates National Archives. The choice of books aims for variety, and the main goal is access, explains Cummings. “We want private readers, casual readers, as well as doctoral candidates, teachers, and professors at universities. We want them all to use it — that is our hope.”

Digital Isn’t Forever

The digitization of libraries and archives fosters the expansion and diffusion of knowledge through cultures and even across borders. But while the promise of these initiatives is great, there are also challenges. Done at scale, the process involves scanning hundreds of thousands, if not millions, of books, creating descriptions that will allow the volumes to be findable via search engines, and storing vast quantities of information. All of this is very costly.

Moreover, and perhaps surprisingly, these digital formats are in fact both less stable and more fragile than the paper documents they are meant to preserve in perpetuity. Paper archives can survive years of benign neglect, as long as they are shielded from damp, pests, fire, and so on. Digital files, by contrast, are subject to one of the greatest threats of our technological age — that of obsolescence. As hardware and software develop apace, digital files must be migrated from platform to platform, with the potential for corruption of data or loss of information along the way. Richard Ovenden, the head of the Bodleian Library at the University of Oxford — an institution that in 1602 opened to scholars a collection of books, manuscripts, and printed matter spanning centuries, if not millennia — explains that this poses a particularly intractable problem. “Alongside the process of digitization comes a growing sense of urgency over the preservation of digital material, and how do we manage that when it’s actually much harder to preserve than paper and parchment are?”

The sentiment is echoed by Louisa Yates, director of collections and research at Gladstone’s Library in Wales, which was founded in the 1880s by British Prime Minister William Ewart Gladstone to house his own collection of 32,000 books. Today, the library also comprises about 300,000 handwritten documents, including Gladstone’s papers and letters and other manuscript materials. It is one of the world’s few “residential libraries,” where visiting researchers can stay in one of 26 “boutique bedrooms” in the building itself while conducting research alongside fellow resident scholars, other visitors, and the general public. With the help of funding from Carnegie Corporation of New York, the library has set out to digitize about 15,000 of Gladstone’s letters and another 5,000 of his hand-annotated books, making these unique items available to a far greater audience than can be accommodated by the library’s handful of librarians in its (generally fully booked) reading room, which boasts 26 desks and “a growing number of comfortable armchairs.”


Building capacity for the Gladstone digitization project has been an eye-opener for Yates, who explains that instead of outsourcing the actual scanning and acquisition of metadata as is often the practice for archives and libraries, it was decided that a digitization studio would be set up in the library itself. Yates and her team have developed techniques for harvesting metadata from its archival materials that can be used by a team of volunteers — who at the moment are local but will eventually come from around the world — with the hope of eventually amassing digital transcriptions of every letter and document in the archive.

As part of the learning curve, Yates notes, “One thing we found out is that digital objects require much more intervention — and much more early intervention — than physical objects. Even cloud storage has only really become viable within the last 18 months, and it’s not certain where we’ll be in the next 18 months. So one thing that’s been very interesting coming out of this process is that it’s much better to make choices that are flexible and reversible than to commit to a single approach, because committing to one storage facility or one type of file may not be sustainable as technology changes.”

The question of sustainability is particularly acute for born-digital data, such as email correspondence and bureaucratic record keeping, says the Bodleian’s Ovenden. He offers the example of the papers of former Prime Minister Edward Heath, who brought Britain into the European Economic Community, the forerunner of the European Union. “His archive contains a huge variety of digital media that I’d almost forgotten about. Five and a quarter-inch floppy drives! Twelve-inch floppy drives! Taking data off these formats and making it readable and understandable today requires us to use devices, both hardware and software, and forensic techniques that seem more associated with the CIA or MI5 than with libraries.”

However, Ovenden says the task is not insurmountable. “All this information can be retrieved if you have the know-how. But it does mean that we have to apply a whole range of skills and technologies and resources to actually steward information properly, and to give it back to the people.”

Donald Waters, formerly the senior program officer for Scholarly Communications at the Mellon Foundation, which has focused on the ways technology can bolster humanities research, emphasizes that the future of scholarship as such relies on the preservation of digital materials — and not simply on the preservation of physical documents or objects. “We need a very robust infrastructure to keep that stuff alive. It involves preserving journals, it involves preserving news, it involves preserving information captured on websites and social media, it involves preserving software. The objective here is to provide robust infrastructure so that scholars can confidently work in the digital medium, and know that their work will survive.”

The World Digital Library is an important object lesson in this case. In 2016 the Library of Congress decided to deprioritize the WDL in favor of other digitization and preservation projects that focused more exclusively on U.S.-based resources. As of now, until it finds new sources of funding and support, the World Digital Library is in limbo, with no new materials being added to its collections.

“We always thought that the Library of Congress was going to keep the lights on, and that the World Digital Library was just a smaller version of the massive digital preservation and digital archiving functions that the Library of Congress undertakes,” Van Oudenaren explains. “But the library management is not keen on hosting the content of other institutions, which is what the World Digital Library essentially does. A lot of content is from the Library of Congress, but a lot of it is from other institutions. And of course, the WDL’s software and platform are different from the library’s own. So that’s why it is in danger.”

NYPL’s Kelly concurs about the challenges faced by libraries and archives in creating a sustainable digital future, but points to both a new ethos in which communities are increasingly aware of libraries’ role in creating democratic spaces, and new technological advances — text recognition software, machine learning, crowdsourcing, and artificial intelligence, among others — that will facilitate the process of making the world’s knowledge available to the greatest number of people. “When you start out from the principle of full and open access,” he insists, “a lot of other goals and decisions fall into place.”

Maintaining Authenticity

The question of sustainability extends to one of authenticity when it comes to digitizing library collections. While there is an inevitable risk of degradation or deterioration in the conversion of paper archives into digital copies, in ideal scenarios one might be able to create a new version using the original physical object. And for born-digital material — where there is no “original” hard copy — the problem is compounded.

If libraries don’t preserve the original digital objects when they are migrated to new platforms as technologies evolve, behavior can be lost along with information. New technologies often don’t interpret digital information the same way older ones did, and in this process of translation an accurate sense of how a particular web object once “worked” on-screen is lost. For example, certain types of animated data visualizations or 3D renderings of objects may cease to function in an online science article if particular software applications become obsolete, creating lacunae for future researchers trying to access past scholarship.

According to a 2007 RAND Corporation report, Addressing the Future of Preserving the Past, “If preservation methods cannot preserve their full range of behavior, the future scholarly record will bear only a static, snapshot representation of the first generation of these inherently digital objects, which are likely to become increasingly numerous and important to scientists and scholars over time.”

Another issue is that digital objects — say, websites — change over time; while this poses obvious advantages over the analog era (no need to print a new issue every day!), it does make preservation challenging. In the face of this rapid and relentless updating of information, a number of libraries worldwide, including the Bibliotheca Alexandrina and the Bodleian Library, have undertaken a project of “archiving the whole web,” in which they make static copies of every page produced on the Internet to keep as a historical record.

The increasing role of private companies when it comes to housing our digital information is developing into another quagmire, warns Ovenden. And as more and more of our personal data in the form of photographs, music, or other material is hosted by Silicon Valley giants who are not obligated to retain materials in perpetuity, preservation is increasingly at the whims of profit and loss. It’s not only future scholars and researchers who may lose out — it’s all of us.

“In the process of turning information into knowledge, we used to rely on libraries and archives and museums to act as trusted independent entities that were not there to make profits. That process of turning information into knowledge was part of the commonwealth,” Ovenden says. He adds, however, that at some point “society began to believe that because the so-called free services were being provided by the tech companies, that such processes can be essentially outsourced. It was so easy to do when it first became available, and big tech companies were offering the possibility of doing it at scale. But it’s becoming clearer that those companies were not benign, that we needed to keep an eye on them — and now we are seeing the consequences of not regulating that when Myspace and Google+ and Vine were shut down, resulting in the loss of massive amounts of information.”

Don Waters from the Mellon Foundation concurs. “The problems that we’re dealing with have to do with building infrastructure that survives, and the sustainability of that infrastructure has a variety of dimensions. The mechanisms required to preserve content have to be embedded in organizations — and services — that can survive financially, technically, and organizationally.”

The New Pleasures of the Archive

Writing in this magazine in 2017, Carnegie Corporation of New York’s Gregorian described the process of taking all of the information now available to us and synthesizing it in a way that makes it meaningful. “Thanks to technology,” he wrote, “never has the world been more accessible to scholars, to all of us, willing to master the skills and languages to allow for the inquiry into societies only superficially resembling our own culture.… Access to knowledge is no longer an obstacle. If we hunger for it, we can find it.” The way to find such meaning, Gregorian argued, is through a process of “intellectual wandering”: making connections between and across realms of thought and expertise.

One of the real pleasures of manually digging through libraries and archival collections, of course, is something akin to the wandering Gregorian describes: poring over book spines on packed, seemingly endless rows of shelves, digging through boxes of yellowing correspondence not knowing what one will find, flipping through crumbling sketchbooks and diaries that, until that moment, have remained closed for decades if not centuries.

The happenstance of coming across something that awakens our fascination and curiosity is something that can turn the least scholarly among us into researchers, too, as long as we can do it with the right tools and skills to understand what we’re seeing, to analyze its veracity and import, and to place it in a larger context. In the age of paper and parchment, or bricks and mortar, that kind of support was offered by librarians. In the digital landscape, it is provided by a variety of finding tools and search functions, some as simple as a keyword query, and others driven by the most complex predictive technologies.

While digital research may not offer quite the same pleasures as physical libraries, many of those involved with the digitization of collections and archives are keenly aware of the other serendipities — even scholarly miracles — that new technologies can deliver.

For Richard Ovenden, innovative search and discovery tools are crucial when it comes to thinking about what sort of insights digital archives might produce in the future. “I think this is something that we worry about,” he admits. “I think one of the problems we have is the tendency to try and reproduce the so-called Amazon Effect — ‘people who bought this also bought that.’ I think what we actually need is the opposite — ‘people who search for this will never have thought of searching for that, so why don’t you take a look?’ Almost like an anti-referral engine.”