Digitizing Books, Obscuring Women’s Work: Google Books, Librarians, and Ideologies of Access

By Raina Bloom and Anna Lauren Hoffmann

  


From a broad historical and cultural standpoint, Google Books concerns the imposition of ideals of technological rationality and efficiency typical of search engine technology onto entire collections of recorded human knowledge. As a large-scale information infrastructure, it radically reorganizes relations between the technologies, institutions, and individuals that work to preserve, organize, and make available the world's library collections. These activities have historically involved a wide range of actors—from authors to publishers to preservationists to, most importantly for our discussion, librarians. While highlighting the promotion of values like information equality and individual liberty, the dominant narrative of Google Books foregrounds scanning technology and the Books platform as a kind of technological solution, obscuring or erasing the efforts of many other kinds of information workers and professionals, both historical and current. Similarly, the focus of critics has been on the potential negative impact Google's platform and scanning efforts might have on values like privacy and intellectual freedom. While important, these critiques still uncritically accept Google Books' solution-oriented stance, focusing instead on the potentially destructive social consequences of the project.

Here, we begin by recounting the dominant narrative of Google Books and the ideology of access it embodies. We will surface an alternative account of the Google Books project that questions the implications of its role as a large-scale collector, organizer, and disseminator of information. If we interpret the Google Books project using the gendered history of librarianship as our lens, we can identify a different way to consider and perform the notion of access to information—in this case, access to the information contained in library books. The gendered history of librarianship grants discursive space to interrogate the Books project's technologically rational ideology of access, pervasive in discussions of information and communication technologies. This reinterpretation invites us to consider the ways in which education, service, and community are absent from our commonly-held ideology of access and what we stand to lose through failing to note their absence.

Google Books, Briefly: Moral Imperatives and Technological Solutions

Proponents of large-scale book digitization projects routinely position digitization as a moral imperative. Evoking ideals of plentitude and egalitarianism, they argue that collections of digitized books stand, through their indefinite reproducibility, to promote information equality and cross-cultural awareness and understanding (World Digital Library n.d.; Hart 1992; Open Content Alliance n.d., n.p.). Digitized books are also assumed to be less susceptible to damage or decay. University of Michigan President Mary Sue Coleman (2006)—defending her institution's decision to participate in Google's book scanning efforts—argued that 'nature, politics, and war have always been the mortal enemies of written works' and 'by digitizing today's books, through our own efforts and in partnership with others, we are protecting the written word for all time' (265-266). Here, the University of Michigan's partnership with Google is cast as a morally righteous collaboration; the project is positioned as a great and noble preserver of valuable cultural resources.

Early in the Google Books project's development, former Google CEO Eric Schmidt implored the public to 'imagine the cultural impact of putting tens of millions of previously inaccessible volumes into one vast index, every word of which is searchable by anyone, rich and poor, urban and rural, First World and Third, en toute langue—and all, of course, entirely for free' (Schmidt 2005, para. 9). This dream of an 'egalitarianism of information' based on the digitization and indexing of books even precedes Google itself; before developing a Web search engine, co-founders Larry Page and Sergey Brin sought to develop a sort of Web crawler that would index the contents of digitized books and analyze the connections between them (Google Books n.d.). 'Even then,' the company claims, 'Larry and Sergey envisioned people everywhere being able to search through all of the world's books to find the ones they're looking for' (Google Books n.d., n.p.).

Google approached the Books project with a decidedly engineering-based mindset. From the start, the primary obstacles to the creation of a massive, keyword-searchable collection of digital books were perceived as technological. The company needed, first and foremost, to overcome then-current methods for digitizing books that risked damaging books in the scanning process and had proven limited in their ability to handle the irregularities of print books, especially older texts. Further, in order for modern Optical Character Recognition (OCR) software to work, book pages need to be relatively flat; however, the physical reality of books and their bindings makes flat pages difficult (previous methods for scanning had resorted to using glass plates to press pages of books flat or tearing out books' bindings altogether).

Given these constraints, one of the driving questions asked early on—'how long would it take to digitally scan every book in the world?'—neatly betrayed the company's fundamentally engineering-based approach (Google Books n.d.). Notably, professionals at the University of Michigan estimated that scanning the entire university library collection would take 1,000 years, while Larry Page insisted it could be done in six—a technical breakthrough, indeed (Coleman 2006, 264). Eventually, Google developed a method for streamlining the scanning process by eliminating the need to physically flatten pages. This innovation not only opened up the possibility of Google Books as the 20 million-plus volume collection we know today, but also considerably sped up the scanning process.

Google Books required more than just engineering savvy; it also needed content. Initially, the company sought support from publishers willing to contribute in-print books for scanning (Newman 2011; Grimmelmann 2009). However, Google's collection did not start to expand rapidly until 2004, through partnerships with the New York Public Library and libraries at the University of Michigan, Harvard, Oxford, and Stanford (Google, Inc., 2004). The 'Google Print' Library Project (as it was then called) held more promise than partnerships with publishers, since it afforded the company access to more than 15 million titles with only a handful of partnerships, compared to just hundreds of titles made available by thousands of publishers (Newman 2011, 5). In 2005, Google Print was renamed Google Books to better communicate the initiative's mission to the public (Google, Inc., 2005).

While the Library Project rapidly expanded Google's collection, it was perceived as a potential threat to the copyright interests of authors and publishers. Interest groups and publishers argued that Google's development of a vast archive of library collections for commercial benefit violated copyright law, and various lawsuits were filed against the company (Newman 2011; Samuelson 2009; Grimmelmann 2009). However, as none of the lawsuits forced Google Books offline, the project was able to move forward despite litigation. During that time, Google Books failed to have the wholly transformative impact on the publishing industry anticipated by proponents and critics alike. Eventually, Google Books settled into the broader context of the Web. As Grimmelmann (2013) summarizes, 'what was once viewed almost as science fiction has become part of our daily reality— everyone, it seems, has used Google Books…' (n.p.).

This everyday utility of Google Books proved central to the November 2013 and October 2015 rulings that Google's book scanning efforts are protected by fair use. In the 2013 ruling, Judge Chin argued that the platform 'expands access' to books, offers new possibilities for digital and historical research, and increases the materials available to disabled persons (as with text-to-speech capabilities for digitized text) (Author's Guild v. Google, Inc. 2013, 9-12). Ultimately, Judge Chin affirmed the optimism of the project's biggest proponents by asserting that 'indeed, all society benefits' (Author's Guild v. Google, Inc. 2013, 26). Judge Chin's ruling was later affirmed by the 2nd Circuit Court of Appeals (Author's Guild v. Google, Inc. 2015).

Understanding Google Books: Dominant and Obscured Narratives

The preceding discussion of Google Books sketches certain dominant features of the project's narrative that help to identify Google's overarching ideology of access. Overall, the story of Google Books tends toward the roughly chronological and teleological, as told in the preceding section and elsewhere (for example Google Books n.d.). This sort of narrative foregrounds what the project 'does' from an aspirational standpoint: through the power of engineering, Google Books overcomes technological obstacles and enables a heightened egalitarianism of information through the widespread searchability of digitized versions of library books. Judge Chin's 2013 ruling builds on this narrative, highlighting the broad role of fair use in enabling innovative digital information initiatives for both private and public benefit.

When attending to normative dimensions of information systems, however, it is important to pay attention to the types of narratives being constructed (Star 1999, 384). Dominant narratives often work to prioritize some issues, values, and stakeholders while obscuring others. For example, understanding Google's choice to partner with publishers foregrounds political and economic considerations of information control and intellectual property, while the library partner program highlights the struggle to balance efficiency (libraries had more to offer Google than publishers in terms of volume) with legal concerns (potential copyright infringement). Critics of the project extend this narrative by focusing on the same issues of control and intellectual property (see: Grimmelmann 2010; Vaidhyanathan 2011; Zimmer 2012). Overall, the dominant narrative presents a more-or-less unified ideology, either in line with or opposed to Google's—to borrow language from Star (1999)—'presumably monolithic agenda' (385) to overcome technical hurdles and reshape information policy in its efforts to organize and make accessible the world's information.

In view of these foregrounded themes and values, we identify Google's 'ideology of access' as part of a broader ideology of information technology. According to Birdsall (1997), an ideology of information technology is 'a conjunction of neo-conservative politics, laissez-faire free market economic values, and technological determinism,' the logic of which transforms citizens into consumers and information into a commodity to be bought and sold in open and competitive markets (54-55). Waller (2009), for example, has shown how the Books project betrays Google's conception of information as only valuable insofar as it can be harnessed for marketing purposes. In addition, Agre's (1995) discussion of ideologies that produce particular conceptions of 'information' helps us to see further how, for Google, access depends, in part, on a framework of information as an explicit and commodifiable good. Finally, Google's insistence on a technologically rational notion of universal access further trades on ideas of the so-called 'Californian ideology' typical of Silicon Valley companies and entrepreneurs that tout cultural cachet and hype the liberating potential of information technology (Barbrook & Cameron, 1996). The commitment to Google's scanning technology as liberatory is particularly evident in the moral justifications employed by Schmidt, Coleman, and others. Google's ideology of access is a technorationalist one, centered on distributive notions of access, that is, the idea that the presence of resources, made fundamentally discoverable through an uncomplicated search interface, constitutes access, full stop.

The pervasiveness of this idea—of universal access as both liberatory and a technical problem to be solved—speaks to the discursive power wielded by technological solutions in our post-Enlightenment world, where 'the pursuit of technology and science' is synonymous with 'human betterment…and material prosperity' (Smith 1994, 3). However, critics since Rousseau have challenged this straightforward relationship between technology and progress, emphasizing the ways in which technology can have both positive and negative effects (see, in particular: Mumford, 1964; Winner, 1986). Notably, French theorist Jacques Ellul (2003) argued that human beings, to engage technological systems and artifacts, modify their value systems to be consistent with technological ideals (rather than develop technological systems in line with human ideals). In turn, technology becomes 'the creative force of new values, of new ethics' (Ellul 2003, 396).

Common to these critiques is a rejection of the idea that technology is value-neutral. Rather, technological systems and artifacts exhibit value systems and ideals that exert some influence on the value systems and ideals of the societies within which they are embedded. Consequently, the social meanings we ascribe to technology and the rationality inscribed in the design of technological systems are not mutually exclusive (Feenberg, 2003, 608). In setting out a feminist definition of technology and society, Bush (2009) describes technology as encompassing all of 'the resources, tools, processes, personnel, and systems developed to perform tasks and create immediate particular and personal and/or competitive advantages in a given ecological, economic, and social context' (121). Wajcman (2004) further describes technological systems as 'never merely technical,' since 'their real-world functioning has technical, economic, organizational, political, and even cultural elements' (35). Following Bush and Wajcman, we view Google Books not as an emancipatory technical solution, but as an ideologically-driven system that organizes and makes possible certain technical, organizational, and political relationships while foreclosing or obscuring others.

Surfacing a Feminist Ideology of Access Through Librarianship

Like any ideology, Google's ideology of access presents access to information as a natural or given category rather than as a concept contingent upon the institutions, structures, and values that permit access to information in practice. If we return to Eric Schmidt's vision of Google Books' potential and again 'imagine the cultural impact of putting tens of millions of previously inaccessible volumes into one vast index, every word of which is searchable by anyone, rich and poor, urban and rural, First World and Third, en toute langue—and all, of course, entirely for free' (Schmidt 2005, para. 9), we can see how Schmidt's professional experience and value systems permit an emphasis on certain dimensions of access (cost, searchability) while disregarding others (user-end technological limitations, information literacy skills, lack of full-text availability).

Following Star and Ruhleder's (1996) definition of infrastructure, Google Books is both shaping and shaped by communities of practice; its scanning initiative is informed by partner libraries and, in turn, informs and overcomes the localized practices of these libraries to make their collections 'universally accessible.' Importantly, this process of overcoming localized practices includes removing collections of books from contexts traditionally informed by gendered work and subjecting them to the technical rationality of Google. Building on this, we can emphasize—borrowing from Agre (1995)—that the problem of access is 'an object of certain professional ideologies' that 'cannot be understood except through the practices within which [they are] constructed by the members of those professions in their work' (225). In the case of Google Books, this means that if we reconsider the project not as distinct from—but continuous with—the past and current work of women in the form of librarians and other library workers, we will arrive at a different and necessary understanding of the way that Google Books participates in the construction and restriction of the meaning of access.

Librarians have constructed a markedly different ideology of access from Google's. In library philosophy and practice, access to information is understood as a complex, considered, local endeavor, grounded in professional practice that privileges notions of service without a profit motive, answering as exclusively to user needs as possible. In her germinal book Librarianship: The Erosion of a Woman's Profession, Roma Harris argues that librarianship's values are centered around service, community, and an ethic of care (Harris 1992). This value system is a gendered phenomenon and requires a discussion of the gendered nature of the profession itself—especially as it has been theorized and practiced since the end of the 19th century in North America.

Historically, libraries have played an important role as 'place' (Weigand 2003, viii). Within this place, librarians do what is generally thought of as women's work, engaging in tasks that we associate with paid and unpaid feminized labor. Mary Ritter Beard, writing in 1915, includes librarians in her survey of 'the labors of women for civic improvement of all

In ‘Power, Knowledge, and Fear: Feminism, Foucault, and the Stereotype of the Female Librarian,’ Radford and Radford (1997) use Foucault’s theories about power and knowledge to argue that the stereotypical image of the female librarian is a function of the larger culture grappling with its anxiety about powerful, knowledgeable women who do not conform to other culturally-produced images of women. In their content analysis of New York Times obituaries of librarians from 1977 until 2002, Juris Dilevko and Lisa Gottlieb (2007) note a bias toward male librarians (a full 63% of the obituaries are for male librarians, despite Census data from the same period indicating that the profession was 80–85% women, a statistic that remains true today) (United States Census Bureau 2011). They use their findings as a springboard to a discussion of academic and public librarians, gender, and anxiety over the ‘traditional service and care-based ethic’ of the profession (Dilevko and Gottlieb 2007, 176).

This gendered history is vital for understanding the context of the collections upon which Google Books is built. Like most other library practices, access is framed by the service and care-focused orientation that underpins the entire profession. Harris (1992) characterizes this sort of service as a kind of democratic professionalism, in which practitioners do not make prescriptive assertions, but use specialized knowledge and skills to assist patrons in discovering their information needs.


Just as processes of professionalization pushed women out of computing professions starting in the 1970s and most markedly in the 1980s (Misa 2010), moving books out of the library and onto Google’s servers works to obscure the professional contributions made by women in information technology. By extension, obscuring these contributions also obscures the professional ideologies that gave rise to them, including libraries’ community-oriented, care- and service-centered ideology of access.

Google’s vision of information access, manifest in both its Books project and its search engine, does not account for the need to educate users in information seeking and evaluation, instead addressing these difficulties with a simplified interface and results ranking that attempts interpretive work to understand an individual user’s needs. While Google insists on technological marvel as a solution to a complex problem, librarians such as those associated with the ERIAL (Ethnographic Research in Illinois Academic Libraries) Project report that ‘the majority of students – of all levels – who participated in this study exhibited significant difficulties that ranged across nearly every aspect of the search process’ (Duke and Asher 73).

Contrary to Google’s repeated assertions, its notion of universal access does not mean access for all. Education, as we have just noted, imposes a considerable barrier not accounted for in Google’s ideology, as does the failure to account for community needs, whether geographic or cultural. We can look beyond the relatively narrow focus of the Books project to understand how far-reaching the implications of this ideology can become. Large-scale information communication technology companies, like Google and Facebook, have repeatedly failed to appropriately respond in the face of ‘real name’ policies that undermine the ability of communities of potential users, like Native Americans and transgender people, to access information (boyd 2012; Snyder, 2014). Google’s notion of universal accessibility also fails to acknowledge varying levels of access to ready and reliable ICT infrastructure across the globe. When there exist significant global disparities in internet availability, cost, and speed (International Telecommunications Union 2015), Schmidt’s vision for the Books project transcending social class and geography becomes distant and shortsighted.

Libraries specifically seek to affirm and reaffirm access in instances where a user’s age, sexual orientation, or gender presentation might create cause for challenges to materials or practices in an individual community. In sharp contrast to Google’s universalized ideology of access, the American Library Association’s Library Bill of Rights is disseminated with interpretive documents on, among many subjects, access to digital materials for specific groups such as minors and those who may encounter obstacles on the basis of sex, sexuality, and gender presentation (American Library Association, 2015). The word ‘access’ is not used in the Library Bill of Rights itself, but these interpretive documents deploy it, making the meaning of the word in a library context clear. Access is situated as a process affected by medium, locality, and the demographic realities of individual users. In these documents, access is also not synonymous with information technology. These interpretive documents make assertions that include contact with and support from librarians as part of the spectrum of library access. Moreover, it is stressed that access for all library users, when affirming the rights of minors and sexual minorities, should be free.


Librarians, like Google, stake their reputation on connecting people to the information that they need. The difference in how this is done by librarians versus how Google does it is radical and driven by the contextual distinctions in which each infrastructure frames its ideal notion of access. The Books project can only be construed as a solution to the problem of access to information with the support of an ideology that permits it to be perceived as such. Just as Feenberg (2003) once showed how the Fordist assembly line is only understandable as ‘progress’ within the logic of capitalism and technological rationality, Google Books is only ‘beneficial’ (to borrow Judge Chin’s description) when framed by an ideological commitment to values of universality, efficiency, and technological rationality. Poor scan quality, low-quality metadata, and stringent limitations on resource use put in place by both market forces and the force of law are mere trade-offs, less urgent in the face of a powerful technological solution that provides simple, commercialized access to information.

Against the simplicity and universality of Google, librarians offer complex, localized engagement with information. By accepting an uncomplicated narrative of the Books project as benefitting all society, as if society were a homogeneous monolith with universal, uniform needs, we ultimately accept Google’s ideology of access, pushing the work and values of librarians aside, further marginalizing the role of librarians in wider cultural conversations about information. This dismissal leads to the tragic loss of ways of thinking about and realizing access that resist and reject Google’s vision.


