Women’s Ways of Structuring Data

By Christine Masters Jach

Smoothly functioning infrastructures are invisible. Examples of infrastructures range from those physically constructed, such as transportation and public utility systems, to those that are more elusive or fluctuating, such as systems of economic exchange. When systems work well, people do not realize their immersion within them because they facilitate the ease of daily experiences. For example, we are not always aware of how much we rely on the power grid until a transformer breakdown causes our lights to go out. Infrastructures are complex and sometimes require work to understand and map out, yet once we are aware of how they exist, we find it hard to believe that we could have overlooked them in the first place. According to Geoffrey Bowker and Susan Leigh Star (1999), “the trick [to seeing infrastructure] is to question every apparently natural easiness in the world around us and look for the work involved in making it easy” (p. 39). Star and Ruhleder (1996) define infrastructure as having several qualities: “embeddedness,” “transparency,” “reach or scope,” “learned as part of membership,” “links with conventions of practice,” “embodiment of standards,” “built on an installed base,” “becomes visible upon breakdown,” and “is fixed in modular increments, not all at once or globally.” Information systems scholars have examined infrastructures within a variety of contexts, working to reveal both their material and their symbolic natures.

Just as infrastructures themselves are often invisible, women’s roles within them have been rendered even more invisible. Whether or not it has been articulated with this particular vocabulary, a goal of feminism has been to make visible the ubiquitous cultural, political, social, and economic infrastructures and the roles of women within them. While infrastructures are often transparent, the structures that arise from them can be more consciously designed. Of particular importance, now that our informational infrastructures have become globalized, are the structures that collect and store data. Popular web applications, from Wikipedia to Pinterest to Facebook, are built upon huge data structures. The content of these sites often comes under scrutiny; for example, activist groups have attempted to address and correct the ways that women are underrepresented on Wikipedia’s pages (Wadewitz, 2013). Even beyond questions of content, however, we might ask how the underlying classification and organizational schema themselves might be gender-biased. We could also look at how the categories residing in data structures perpetuate Western-centric values. Because data structures are shared worldwide, we need to consider questions of privilege and power within them.

Addressing Gendered Standards and Classifications

The gender problem within data classification systems has been around for a long time. Working from the field of library information systems, Hope Olson (2001) describes how the original architects of library classification systems decided on organizational schemes. Charles Cutter, who published Rules for a Dictionary Catalogue in 1904, advised uniformity of categories except in cases where it would be more convenient for “the” public to have things listed in a non-uniform way. Olson argues that his language indicates a belief in a singular public whose members all share the same worldview; in other words, “a universality is present in Cutter’s view, but it is the singular public who defines it” (p. 642). Problematically, asserts Olson, Cutter’s “singular public” is not inclusive of all community members, but rather, it is “a particular part of humanity that shares cultural, social, or political interests. That idealized community excludes individuals and groups who do not share its interests” (p. 643). Some of the earliest cataloguing systems, upon which many current practices are based, privilege hierarchical relationships in which broader terms sit above the narrower terms nested underneath them (pp. 644-645). Sub-categories are not evenly distributed, and they favor a male-privileged worldview. Olson gives the following example to illustrate how this happens:

The subdivision “- Relations with women” subtly reinforces the subject/object roles of men and women. There is no parallel under “Men” (one cannot express Simone de Beauvoir’s relations with men as one can express Jean-Paul Sartre’s relations with women). This anomaly reflects mainstream culture’s positioning of men as knowing subjects in our society and women as objects to be known, the objects of men’s relationships. (p. 647)

Another result of this categorization system is that works “embodying multiple marginalizations” are “either ghettoized in an obscure corner of the catalog (all women or all African Americans lumped together) or dispersed in a diaspora of little ghettos. Separated from mainstream subject classifications, where they are pushed to the margins, they will not disturb library users looking for books on ‘real’ topics” (pp. 658-659).

Olson does not stop at critique, however. Instead, she looks for alternative systems of organization and ways of searching for library information that would avoid the problems of marginalization and ghettoization that are often due to hierarchical classification structures. She contemplates the benefits and problems that come with “free text searching,” remarking that this strategy could be useful in finding “topics not representable in a controlled vocabulary” but would also return too many results (p. 660). Another suggestion is to use alternative names so that, for example, a search for “wimmin” would return the same values as a search for “women” without instructing users to search again for “women.” Yet another possibility would be to use past transaction logs to aid with current searches (p. 661). She emphasizes that change will come only when women work to modify the already established systems. By suggesting these alternative search structures, Olson argues that cataloguing systems should stop assuming that there will be just one type of user who represents a singular public. Library catalogues need to relinquish some of their structural power to users of all identities. Such a hypothetical catalogue would communicate ideas of inclusivity and equality.
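The alternative-names suggestion can be illustrated with a small sketch. It is not drawn from Olson or from any real catalogue: the alias table, the record identifiers, and the function names are all assumptions made for illustration. The idea is simply that variant spellings resolve to one shared key, so neither spelling is treated as the "wrong" one and no user is told to search again.

```python
from collections import defaultdict

# Hypothetical alias table: each variant spelling resolves to one shared key.
ALIASES = {"wimmin": "women"}

def normalize(term):
    """Lowercase a term and map it to its shared key, if it has one."""
    term = term.lower()
    return ALIASES.get(term, term)

def build_index(records):
    """Index record ids under the normalized form of each subject term."""
    index = defaultdict(set)
    for record_id, subjects in records.items():
        for subject in subjects:
            index[normalize(subject)].add(record_id)
    return index

def search(index, term):
    """Return matching record ids without redirecting the user."""
    return sorted(index.get(normalize(term), set()))

# Placeholder records (invented identifiers, not real catalogue entries).
records = {
    "rec-1": ["Women"],
    "rec-2": ["wimmin"],
    "rec-3": ["Cataloguing"],
}
index = build_index(records)
print(search(index, "wimmin"))  # ['rec-1', 'rec-2']
print(search(index, "women"))   # ['rec-1', 'rec-2']
```

Both searches return the same records because the equivalence is applied when indexing and when querying. Who decides which terms count as equivalent remains an editorial choice that the code does not settle.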

Olson’s analysis of library cataloguing systems provides just one example of how we might think about reorganizing data structures to reflect gender and race equality. Another example of efforts to reframe women’s historical writings in feminist terms can be found in a feminist-oriented, curated data structure: the Orlando database.

Feminist Databasing

Published online by Cambridge University Press in 2006, Orlando: Women’s Writing in the British Isles from the Beginnings to the Present provides information on 1,300 women writers and bibliographic references on over 25,000 titles. It does not provide the texts themselves, but it does provide “new biographical and critical accounts of the lives and works of its subjects, together with contextual materials relevant to critical and historical readings” (Brown et al., 2006). It was created and edited by a team of three women, Susan Brown, Patricia Clements, and Isobel Grundy, along with a large team of co-investigators, technical personnel, research associates, post-doctoral fellows, and research assistants (Brown et al., 2006). According to the scholarly background webpage, efforts towards recovering the work of women writers have been underway as the “Orlando Project” since the 1960s. As the site conveys, “This phenomenally vigorous scholarly work of inclusion—of writers omitted from traditional historical accounts, at least partly by reason of gender or race or class—is arguably the major feature of recent literary historical scholarship” (Brown et al., 2006). Within the scholarly introduction, the following describes how Orlando positions itself and its purpose:

Orlando focuses on gender, and it emphasizes the intellectual, material, political, and social conditions (including writing by men) that have, over time, helped to shape writing by women. It sees gender as an indispensable tool for historical analysis that helps to shape the questions we ask about the production, reception, and features of written texts and about the ways in which these have been understood throughout the history of women’s writing. (Brown et al., 2006)

Here, the editors of Orlando convey its basic rhetorical context, audience, and purpose: it arises out of a need to recover women’s contributions to literary history, it seeks to emphasize the conditions that have shaped women’s writing and reframe historical analysis, and its primary audience appears to be literary critics and scholars. While it primarily focuses on “literature,” the database does include “women known as writers of science, household advice, or popular genres, and those known (if at all) mostly for non-literary reasons who also left significant writing” and some male writers who provide textuality (Brown et al., 2006). Looking at what the database intends to communicate through its stated purposes, however, provides only one level of understanding. Looking at how the data are sorted and classified yields a more thorough analysis.

Orlando is organized not hierarchically, but through a system of tagging. The editors fully realize that this system of tagging is highly interpretive, based on what the historians, as architects of the system, prefer to prioritize and communicate as most important. They note this realization in their writings about the process of tagging during the project (Brown et al., 2004; Butler et al., 2000). Butler et al. (2000) explain that their work does not involve applying tags to existing texts. Rather, they tag the descriptive histories that they compose in the database. Using SGML, they create three distinct document type definitions (DTDs): biography, writing, and events. They model these structurally after the Text Encoding Initiative (TEI), adding interpretive tags as they see fit. They write,

For example, the biography DTD has tags for birth, family, education, and political affiliations; writing documents use tags for such specific information as genre, intertextuality, literary awards, and relations with publishers; events documents contain chronological events that have such information as organization names and places tagged. (Butler et al., 2000, p. 112)
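To make the three-document-type arrangement concrete, here is a minimal sketch in Python. It does not reproduce Orlando's actual DTDs or element names: the tag spellings below are hypothetical stand-ins for the kinds of information the quotation lists, and XML stands in for SGML. The sketch shows only how a document type can constrain which interpretive tags a tagger may use.

```python
import xml.etree.ElementTree as ET

# Hypothetical tag sets, loosely named after the information listed above.
# Orlando's real element names are not given in the source.
ALLOWED = {
    "biography": {"birth", "family", "education", "politicalAffiliation"},
    "writing": {"genre", "intertextuality", "literaryAward", "relationsWithPublisher"},
    "events": {"orgName", "place"},
}

def unknown_tags(xml_text):
    """Return tags in a document that its document type does not allow."""
    root = ET.fromstring(xml_text)
    allowed = ALLOWED[root.tag]  # the root element names the document type
    found = {el.tag for el in root.iter() if el is not root}
    return sorted(found - allowed)

# A placeholder biography document that misuses a writing-type tag.
sample = "<biography><birth>placeholder</birth><genre>placeholder</genre></biography>"
print(unknown_tags(sample))  # ['genre']
```

A check like this catches a tag used in the wrong document type, but it cannot tell whether a permitted tag was applied with a consistent meaning, which is the harder problem the team reports.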

As Butler et al. describe, the process of applying tags to their interpretive histories is complex and problematic. Because they have so many different people working on tagging, many of them postdoctoral students who average a little more than one year working on the project, it is nearly impossible to achieve consistency. They report that as of 2000, there were 238 “unique element types” in their DTDs and 230 “unique attributes.” The process of deciding on criteria is described as collaborative: “we had the sense of a shared common understanding of what each tag and attribute meant” (p. 112). They provide an example of one DTD element, “political affiliation,” that encapsulates and documents the process of creating it and testing it (p. 113). However, they encountered a need to edit for consistency among variables and automated this process using a database. They found that beyond core attributes such as names and places, it was often extremely difficult to systematically manage various tags.
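The source does not describe how the automated consistency editing worked, so the following is only a sketch of one simple kind of check a team might run, not the Orlando project's method. The function name and the sample values are assumptions. It groups attribute values that differ only in capitalization or spacing, so that an editor can decide which form should stand.

```python
from collections import defaultdict

def variant_groups(values):
    """Group values that differ only by case or whitespace."""
    groups = defaultdict(set)
    for value in values:
        key = " ".join(value.lower().split())  # collapse case and spacing
        groups[key].add(value)
    # Keep only keys that were entered in more than one surface form.
    return {key: sorted(forms) for key, forms in groups.items() if len(forms) > 1}

# Invented attribute values, as different taggers might have typed them.
entered = ["Novel", "novel", "novel ", "poetry"]
print(variant_groups(entered))  # {'novel': ['Novel', 'novel', 'novel ']}
```

A check of this kind only surfaces mechanical variation. Whether two taggers meant the same thing by the same label is an interpretive question that no such script resolves.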

The Orlando editors offer considerable reflection on the tagging process, but they do not offer an extended discussion of why or how certain terms were chosen and applied. For example, they do not offer an explanation of the possible genres that works have been assigned or the thought process behind assigning them. There does not appear to be any reflection on the ways that genre can be rhetorical, reflective of a particular worldview, or a type of social action (see, for example, Carolyn Miller, 1984). The non-core tags and lesser attributes are likely composed according to a sort of folksonomy shared by a certain group of literary scholars. There is nothing wrong with that: Orlando is an edited collection, and its structures arise from the knowledge, input, designs, and expectations of its collaborators. Tag data are “cleaned up” via automated database algorithms, and taggers must go through a training session where they are probably taught specific protocols to follow. However, many of the classifications and standards that organize the Orlando database are invisible. In the case of genre identification, the taggers are probably presumed to share a common understanding of genre. There are parts of the tagging system that are consciously articulated as standardized (Figures 1, 2, and 3 serve as examples, and there are other tag diagrams as well). The total of all the mapped nodes does not represent all possible tags within the database. Other classifications are left up to the discretion of the tagger, presuming a shared knowledge system, as in the genre example.

The consciously articulated purpose of this data structure is to facilitate easy searches on the part of the user. However, there is a purpose not articulated by the editors, probably because it is unconscious: the editors seek to reinforce existing knowledge structures within the culture and society of literary scholarship. These tag diagrams, provided as keys within the pages of Orlando, allow viewers to click on individual terms and view descriptions for each attribute. For example, if we click on “Cultural Formation” within the “Life” tag diagram, a listing appears that contains a definition of the term, related tags, and examples. In this case, “Cultural Formation” has two sub-elements: 1) “class issue, nationality, issue, race and ethnicity, religion, sexuality,” which contain discursive accounts of these categories; and 2) “race, colour, class, national heritage, nationality, geographical heritage, ethnicity, denomination, language (within cultural formation), political affiliation, and sexual identity,” which “are designed to name or define aspects of identity” (Figure 4). Further clicking on the terms for these levels reveals more information but not a complete list of options for labeling. Presumably, taggers would assign labels as they see fit, based on their knowledge of the author and her literary works. In this way, links between the infrastructure of the Orlando database and the larger infrastructure of the literary community are created and reinforced. Because this infrastructural work is historical (it involves mapping out knowledge of the past), the taggers must be interpretive and reprocess already collected data. The Orlando editors recover women’s writings and establish their validity by building a knowledge structure around them. In order to persuade an audience that this knowledge is valid, the editors use already existing standards and classifications that have currency in larger literary or cultural circles. By using a system of tags that are already familiar to literary scholars, Orlando legitimizes women’s history by fitting it into an existing structural framework that has been traditionally male-centered. This process accomplishes important feminist recovery work, promoting awareness of women writers within a traditionally male-dominated cultural infrastructure. At the same time, it raises questions about the data-structuring process itself. Can there be specifically feminist ways of working with data? Can there be such a thing as a feminist data structure?

Conclusion: Conscious Structuring

The word “structure” carries connotations similar to those of the word “system.” In a recent interview for “DOCC 2013: Dialogues on Feminism and Technology,” Lucy Suchman and Katherine Gibson discuss the intersections of feminism, technology, and systems (Suchman and Gibson, 2013). Suchman proposes that the term “system” itself has a modernist, rationalist association about which she feels ambivalent. Gibson agrees, offering that an alternative would be to view things in relation. Gibson emphasizes that the term “economic system” has been used as a master signifier to describe one mode of economics, capitalism, as the dominant economic reality against which every other type of economic activity always gets positioned. In actuality, she explains, there are many other forms of economic activities and relations that are pervasive but not focused on extensively. Many are often associated with women: for example, gift-giving and reciprocal economic activities that involve spheres of production and reproduction. She has become increasingly interested in relational types of thinking that eschew a belief in one dominant “system” (Suchman and Gibson, 2013). Interestingly, the concept of infrastructure is quite relational rather than “systematic,” at least in the modernist, rationalist way that Suchman and Gibson understand that term. The term “infrastructure” connotes a web of interdependencies that are contingent upon relationships and that build with practices over time. Systems are in fact infrastructures; yet when we think there is only one overarching primary system, we tread into dangerous waters. If someone claims that a system is somehow “natural” without attempting to invert it and see how it arises through a multitude of dependencies, red flags should go up. What we think of as “natural” is also transparent and infrastructural. If a structure is to reflect feminist principles, then, it should work towards being infrastructurally non-transparent, in the sense that it is outlined and viewable, yet transparent in the sense that it does not hide its motivations.

Ultimately, a feminist data structure might take cues from what Jo Freeman (aka Joreen) advocates in “The Tyranny of Structurelessness” (1970-1973). Writing about group organization within the feminist movement, Joreen notices that the ideal of “structurelessness” does not work; a few “informal elites” always end up directing what happens unless a group adopts principles of democratic structuring. If we carry this line of thinking into the realm of organizing data, a feminist data structure would be one in which classification categories are consciously articulated and decided as democratically as possible by those who will access or interact with it, not just by an elite few. The recent Feminist Wikipedia movement follows this model. For literature and writing scholars, feminist structuring might mean not taking categories within genre classifications as given or natural, especially when genres have arisen within historically Western and male-dominated literary contexts. Assumptions about categories should not be taken for granted, but constantly questioned. Feminist data structuring processes, with equality as the goal, would involve a great amount of reflection, articulation, and collaboration. Ideally, these strategies could also be applied to address racial and global marginalizations within data structures.


Bowker, Geoffrey. (2005). Memory Practices in the Sciences. Cambridge, MA: MIT Press.

Bowker, Geoffrey, and Susan Leigh Star. (1999). Sorting Things Out: Classification and Its Consequences. Cambridge, MA: MIT Press.

Brown, Susan, et al. (2004). “Intertextual Encoding in the Writing of Women’s Literary History.” Computers and the Humanities 38. 191-206.

Brown, Susan, et al. (2006). Orlando: Women’s Writing in the British Isles from the Beginnings to the Present. Cambridge: Cambridge University Press. Retrieved from http://orlando.cambridge.org/ezproxy.lib.purdue.edu/.

Butler, Terry, et al. (2000). “Can a Team Tag Consistently? Experiences on the Orlando Project.” Markup Languages: Theory and Practice 2.2. 111-125.

Freeman, Jo. (1970-1973). “The Tyranny of Structurelessness.” Jo Freeman.com. Retrieved from http://jofreeman.com/joreen/tyranny.htm.

Miller, Carolyn. (1984). “Genre as Social Action.” Quarterly Journal of Speech 70. 151-167.

Olson, Hope. (2001). “The Power to Name: Representation in Library Catalogs.” Signs 26.3. 639-668.

Star, Susan Leigh, and Karen Ruhleder. (1996). “Steps Toward an Ecology of Infrastructure: Design and Access for Large Information Spaces.” Information Systems Research 7.1. 111-134.

Suchman, Lucy, and Katherine Gibson. (2013). “Feminism, Technology, and Systems 2: Infrastructures.” DOCC 2013: Dialogues on Feminism and Technology. Anne Balsamo, Producer. Retrieved from http://vimeo.com/79740274#at=0.

Wadewitz, Adrienne. (2013). “Wikipedia’s gender gap and the complicated reality of systemic gender bias.” HASTAC Scholars Blog Posts. HASTAC. Retrieved from http://www.hastac.org/blogs/wadewitz/2013/07/26/wikipedias-gender-gap-and-complicated-reality-systemic-gender-bias.


Figure 1

Figure 2

Figure 3

Figure 4

Version of Record: Masters, Christine L. (2015). Women’s Ways of Structuring Data. Ada: A Journal of Gender, New Media, and Technology, No.8. doi:10.7264/N37M066H

This work is licensed under a Creative Commons Attribution 4.0 International License.

Source: http://adareview.fembotcollective.org/ada-issue-8-gender-globalization-and-the-digital/womens-ways-of-structuring-data/