Who's Talking about Scholarly Communication? An Examination of Gender and Behavior on the SCHOLCOMM Listserv

INTRODUCTION This study analyzes the gender dynamics of the American Library Association's SCHOLCOMM listserv in order to determine the accuracy of concerns expressed by participants in early 2016 regarding the dominance of male voices on the listserv.
METHODS Utilizing the SCHOLCOMM listserv archive, openly available online, the authors analyzed metadata related to individual messages in order to create a comprehensive list of participants, which was then analyzed to determine gender identity. The authors utilized this information to correlate the frequency of new messages and replies sent to the list with the gender identity of participants.
RESULTS While men represented 35% of the SCHOLCOMM list's participants, they contributed over half of the messages sent to the listserv and two-thirds of those sent as replies on existing message threads.
DISCUSSION The opinion of several SCHOLCOMM participants that male voices were overrepresented in listserv discussions proved to be true. The gender identity breakdown of those most active on the list may also influence the perceptions and/or behaviors of other listserv participants, however, and should be investigated further.
CONCLUSION While this study substantiates the opinion of several listserv participants that male SCHOLCOMM participants account for a disproportionately large amount of listserv discussion, we argue that the dynamics of the listserv can and should be changed in order to better represent the participant population.

provided by the ALA on every post to the list from its inception in February of 2003 to the end of December in 2015. By identifying each participant uniquely and correlating their gender identity with the frequency with which they posted and replied to the list, it seeks to determine what difference, if any, exists in how frequently men and women participate in discussions on the SCHOLCOMM list.

LITERATURE REVIEW

Gender Distribution in LIS and Scholarly Communication
For the purposes of this study, it is important to begin to parse the distribution of gender identities within the field of library and information science (LIS) broadly, and more specifically within scholarly communications librarianship. According to the Oxford University Press Librarian Census, which surveyed the field of librarianship from 1880-2009, women made up 83% of the field at the end of the study, having declined from 92% in 1930 (Beveridge, Weber, & Beveridge, 2011). Likewise, a 2014 survey of ALA members found that 81% of respondents were women (American Library Association, 2014). This suggests that women still strongly predominate in the profession, though this varies based on type of library and position. Within academic libraries, where the field of scholarly communications librarianship is largely situated, gender distribution is not as clearly delineated. Based on the most recent openly accessible Association of Research Libraries survey of member institutions (Kyrillidou & Young, 2006), women made up 63% of all professional staff within academic libraries, but only 39% of academic librarians. Within scholarly publishing, a field with close ties to academic and scholarly communications librarianship, a 2014 analysis of the Society of Scholarly Publishers (SSP) members found that 58% were women (Kane & Meadows, 2014). As an analysis by West, Jacquet, King, Correll, and Bergstrom (2013) found, however, even in academic fields where genders were relatively evenly distributed, gender distribution within subfields can vary widely. It is therefore unclear which trends, if any, are applicable to scholarly communications librarianship, which has yet to be singled out for analysis.

Online Communication in Scholarly Communications Librarianship
Established in the mid-1970s, the subfield of scholarly communications librarianship has a robust community of practitioners and, as with other academic research areas, various mechanisms for online interaction across the greater community. A study by Sugimoto, et al. (2012) of librarians' information dissemination and consumption practices found that, while the main modes for sharing information are the more traditional conference presentations and research articles, many librarians are now engaging with the professional community via social media, listservs, and blogs (p. 151). Within scholarly communications librarianship, several online resources are deeply embedded in the community's practices around sharing knowledge, including organizational blogs such as the SSP's Scholarly Kitchen, individual university blogs such as that of Duke University or Indiana University, independent blogs such as In the Open, and the SCHOLCOMM listserv. There is likely heavy crossover within these online communities in terms of users and levels of participation; the SCHOLCOMM listserv was considered for this study because it is open and available to any individual who would like to participate.
Journal of Librarianship and Scholarly Communication

Gender in Online Scholarly Communities
The issue of gender dynamics in online communities has been widely studied across a variety of fields and disciplines (e.g., Herring, 1992; van Doorn & van Zoonen, 2009). Many of these studies are unique to their realms of investigation and may not necessarily apply directly to scholarly communications librarianship; they do provide, however, a clearer view of the gender dynamics at play in online scholarly communities, especially communities of librarians. According to Herring, while earlier research presumed that the mediation of the online world would eliminate gender differences, it has been shown through various studies that gender continues to have a deep impact on the ways in which users interact with one another (as cited in Sierpe, 2001, p. 340). This impact varies from study to study, however, perhaps because of differences in topics of conversation and user contexts; these can play a role in defining the general style and frequency of communication (Baym, as cited in van Doorn & van Zoonen, 2009, p. 262). While McGee and Briscoe (2003) found that women were more active on a general faculty listserv (p. 139), Sierpe (2001), using a methodology similar to that of the current study, found that men contributed nearly 59% of all messages on an LIS forum, though they comprised only 40% of those subscribed to the forum (p. 345). Sierpe also found that male top contributors were more likely than female top contributors to be active in many discussions, and that men were more likely to contribute multiple times to the same discussion (p. 346). In terms of more traditional publishing in scholarly communications librarianship, the gender gap is clearer. Gul, Shah, Hamade, Mushtaq, and Koul (2014) found that only about one-quarter of the authors published by the Electronic Library Journal were women, or teams of women (p. 496).

METHODS
The ALA website provides a public-facing archive of all emails distributed via the various ALA discussion lists; SCHOLCOMM is no exception. These archives are formatted as a list wherein each line represents an individual message and consists of the subject, author, and timestamp separated by commas. Each line links to the content of that particular email, though only the metadata (subject, author, and timestamp) were collected for this study. This formatting is very convenient in that these lists can be easily copied and pasted into a text editor (such as Sublime Text, which was used in this study) and saved as a Comma-Separated Values (CSV) file. After a small amount of clean-up, these CSV files can be imported into Microsoft Excel or a statistical analysis tool, such as the R statistical software, in order to evaluate the data contained therein.
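The line format described above can be illustrated with a minimal Python sketch. It is not the authors' workflow, which relied on a text editor, Excel, and R. The sample lines, the timestamp format, and the names `Message` and `parse_archive_line` are hypothetical, and the sketch assumes that neither the author nor the timestamp field contains a comma, so that any commas inside the subject can be preserved by splitting from the right.

```python
from typing import NamedTuple


class Message(NamedTuple):
    subject: str
    author: str
    timestamp: str


def parse_archive_line(line: str) -> Message:
    """Split one archive line into subject, author, and timestamp."""
    # Split from the right so that commas inside the subject survive.
    # Assumption: the author and timestamp fields contain no commas.
    subject, author, timestamp = (part.strip() for part in line.rsplit(",", 2))
    return Message(subject, author, timestamp)


# Hypothetical sample lines; not taken from the SCHOLCOMM archive.
sample_lines = [
    "Re: Open access policies, and what comes next, Jane Doe, 2015-03-02 10:15",
    "Job posting: Scholarly Communications Librarian, Example Library, 2015-03-03 09:00",
]
messages = [parse_archive_line(line) for line in sample_lines]
```

Splitting from the right is one way to reduce the manual clean-up that comma-laden subject lines would otherwise require.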
Initially, one CSV file was created for each month in the SCHOLCOMM archives' existence; these files were combined into thirteen Excel workbooks, each covering an entire year. From these workbooks, a master list of SCHOLCOMM participants was established. This was essential, as a particular author's name did not always remain uniform throughout the archive; one author, for example, had at least five distinct versions of their name appear in the data. As a result, a fair amount of sleuthing needed to be done in order to establish a single identity for each author.
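The reconciliation of name variants was done by hand in this study. The following Python sketch shows, under stated assumptions, how part of that work could be assisted: a rough comparison key handles case, punctuation, and spacing, while a manually curated alias table covers variants that no simple rule can resolve. The names `name_key`, `ALIASES`, and `canonical_name`, and all sample names, are hypothetical.

```python
import re


def name_key(raw: str) -> str:
    """Reduce an author string to a rough, case-insensitive comparison key."""
    # Drop punctuation such as quotes and periods, then collapse whitespace.
    cleaned = re.sub(r"[^\w\s]", "", raw)
    return " ".join(cleaned.lower().split())


# Hypothetical, manually curated aliases for variants the key cannot reconcile.
ALIASES = {
    "j doe": "jane doe",
    "jane q doe": "jane doe",
}


def canonical_name(raw: str) -> str:
    """Map a raw author string to a single identity."""
    key = name_key(raw)
    return ALIASES.get(key, key)


variants = ['"Jane Doe"', "JANE  DOE", "J. Doe", "Jane Q. Doe"]
identities = {canonical_name(v) for v in variants}
```

The alias table stands in for the manual sleuthing described above; automated keys alone would not have merged initials or middle names.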
Once each participant was uniquely identified, pivot tables were used to count the number of emails sent to the SCHOLCOMM list by each author for each year. This was done in order to gauge overall participation in the list. It was important, however, to attempt to address how frequently male and female participants interacted with others on the list; many of the emails sent to the list are of a closed nature, such as job postings, calls for papers, or conference announcements. In order to get a rough idea of interactions via the list, the number of "reply" emails sent by each participant, i.e., emails sent as a response to a posting on the list, was also counted. This was established using an Excel formula which noted whether or not "re:" appeared in the subject field of each message. Some manual checking was required at this point but, thanks to the fact that the list's archive is organized into conversation threads by default, this was not as onerous as it could have been. These yearly data were organized into a single Excel workbook, and pivot tables were again used to combine the yearly data into a single spreadsheet covering the entire history of the SCHOLCOMM list.
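The counting step can be sketched in Python as follows; the study itself used Excel formulas and pivot tables. The records are hypothetical. Note one deliberate departure from the check described above: the sketch matches "re:" only at a word boundary, which avoids false positives on subjects such as "software:" that a plain substring test would flag and that would otherwise need manual correction.

```python
import re
from collections import Counter

# Match "re:" only at a word boundary (an assumption of this sketch, stricter
# than a plain substring check), ignoring case.
REPLY_PATTERN = re.compile(r"\bre:", re.IGNORECASE)


def is_reply(subject: str) -> bool:
    return REPLY_PATTERN.search(subject) is not None


# Hypothetical (year, author, subject) records; not taken from the archive.
records = [
    (2014, "Jane Doe", "Call for papers: open access symposium"),
    (2014, "John Roe", "Re: Call for papers: open access symposium"),
    (2015, "John Roe", "RE: Embargo policies"),
    (2015, "Jane Doe", "New repository software: release notes"),
]

# Counts keyed by (author, year), analogous to the pivot tables described above.
total_posts = Counter()
reply_posts = Counter()
for year, author, subject in records:
    total_posts[(author, year)] += 1
    if is_reply(subject):
        reply_posts[(author, year)] += 1
```

Even with the stricter pattern, forwarded messages or edited subject lines could be misclassified, so some manual checking would still be needed.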
It is important to note that not all SCHOLCOMM interactions appear in the archive. Replies can and do happen off-list, and there is no viable method for collecting off-list messages for inclusion in the data. Such messages fall outside of the scope of this study, however, as public participation in the SCHOLCOMM list is of primary interest. It should also be noted that, on occasion, a line reading "Message not available" appeared in the archive for the listserv. There is no way of determining why a particular message may not appear in the archive, according to the list's Server Administrator, but possible explanations include deletion of messages by administrators due to privacy concerns and a variety of technical issues (personal communication, December 1, 2016). No such "Message not available" instances exist before 2011, and all "Message not available" items appear to be replies sent to the list. In total, 132 of these errors appear in the SCHOLCOMM archive for the years 2003 to 2015. This would constitute approximately 2.6% of total messages, and they have not been included in the final data set analyzed as part of this study.
After establishing yearly counts for both total number of messages sent to the list and number of replies sent to the list, the gender identity of each participant was coded for analysis.
Solely for the purposes of this study, four gender categories were established: M, signifying male-identifying participants; F, signifying female-identifying participants; U, signifying participants who either do not identify as either male or female or whose gender identity could not be established; and N/A for non-human participants such as organizational and SPAM emails. Positively identifying each participant's gender identity with absolute certainty would require asking them directly; with 650 list participants, over a third of whom had only ever sent a single email to the SCHOLCOMM list, this was not perceived to be a viable option. Instead, each participant was placed into one of the above categories as follows: Names which the authors considered to be regularly associated with either the male (e.g., John or Robert) or female (e.g., Christina or Maria) gender, based largely on common first names in the United States, were categorized accordingly; this accounted for the majority of participants. For any name that could conceivably be considered agendered, or for any name that was unfamiliar to the researchers, a title or place of employment was established based on messages sent to the list if possible. Searches were then performed through online search engines or the participant's institution or employer in order to find any biographical information; the gender identity of the participant was then based exclusively on any gendered pronouns contained in said biographical information. A total of 375 female and 228 male participants were identified. In situations where it was not possible to locate such information (e.g., participants did not include a full name, a title, or a place of employment in their messages; there was no biographical information available; or biographical information did not include any gendered pronouns), the participant was placed in the "U" category as described above. Though the authors had intended the "U" category to include any gender nonconforming individuals, no participants were positively identified as such. A total of 21 participants were assigned to the "U" category. Finally, institutional or organizational participants were easily identified by the name given in the data as a post's author, e.g., NISO or ARL Communications. SPAM emails were not as immediately obvious, as they were often sent from accounts with human names, but were generally evident from the message's subject line. These were all grouped into the "N/A" category, which totaled 26 participants.
Though the authors have determined that the method outlined here is adequate for the purposes of the current study, it is fully acknowledged that this is an oversimplification of gender identity and sex. In particular, despite the biological connotations that may accompany the terms "male" and "female", they are used to signify gender identity in the present study. Furthermore, as discussed above, research for this study was undertaken without contacting the listserv for permission or for further information, such as preferred gender identities, in order to avoid unnecessary time constraints or skewing of the data due to low participation rate. The section below discussing possible avenues for future research outlines steps which may be taken to build on these findings while also supporting a more nuanced understanding of gender dynamics on the listserv.

RESULTS
The number of unique male and female individuals who posted to SCHOLCOMM along with the total number of posts from male and female participants each year is presented in Table 1, with percentage representations of those data given in Table 2. Overall, male participants comprised 35.1% of individuals posting to the SCHOLCOMM list but contributed 51.2% of the messages sent to SCHOLCOMM from 2003 to 2015. In comparison, female participants comprised 57.7% of the individuals posting to the list and accounted for 45.8% of the messages posted to the list.
This gender disparity becomes more pronounced when considering only those messages that were sent as replies to other posts on the list. Table 3 displays the number of unique male and female individuals who posted replies to the list in a given year along with the total number of replies posted by male and female participants in that year, with percentages given in Table 4. Female participants accounted for 53.8% of individuals replying to the list, but only 32.9% of reply emails. Male participants, by contrast, accounted for 66.7% of reply emails while comprising only 43.8% of individuals replying to the list.
Some simple statistical treatments, performed with the R statistical software, were used in order to examine the association between gender and interaction with the SCHOLCOMM list. Two measures of list interaction were used, both based on replies posted to the list: a) whether or not an individual had ever replied to a list message and b) the number of replies and the number of non-replies sent to the list by an individual. As above, only male and female list participants were considered for these statistical treatments. First, a 1-factor χ² test of independence using the chisq.test() function in R was performed comparing the gender of each participant with whether or not they had ever posted a reply to the list. The resulting χ² value, 11.476, indicates a statistically significant association between these two variables, with a p-value of less than 0.005. A similar test was run by considering each email sent to the list and comparing the gender of the author with whether or not the email was sent as a reply to another message. The resulting χ² value, 272.17, indicates a highly significant association, with a p-value of less than 2.2 × 10⁻¹⁶.
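For readers who wish to see the mechanics of such a test, the following is a minimal Python sketch of a χ² test of independence on a 2 × 2 table; the study itself used R. The contingency table shown is hypothetical and is not the study's data. The `yates` option reflects the fact that R's `chisq.test()` applies a continuity correction to 2 × 2 tables by default; whether the authors kept that default is not stated. The only figure taken from the study is the reported statistic of 11.476, whose p-value at one degree of freedom is recomputed at the end.

```python
import math


def p_value_df1(chi2: float) -> float:
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


def chi_square_2x2(table, yates=True):
    """Chi-square statistic and p-value for a 2x2 contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                # Continuity correction, capped so it never exceeds the difference.
                diff -= min(0.5, diff)
            chi2 += diff ** 2 / expected
    return chi2, p_value_df1(chi2)


# HYPOTHETICAL counts (rows: men, women; columns: replied, never replied).
hypothetical_table = [[30, 20], [20, 30]]
chi2_plain, p_plain = chi_square_2x2(hypothetical_table, yates=False)
chi2_corrected, p_corrected = chi_square_2x2(hypothetical_table, yates=True)

# p-value implied by the statistic reported in the study (1 degree of freedom).
p_reported = p_value_df1(11.476)
```

A significant result of this kind indicates that the two variables are unlikely to be independent; it does not by itself measure how large the difference between groups is.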
Table 5 shows the contingency tables for each χ² test, along with the resulting statistical information. It is important to note that two male participants replied to the list with a great deal more frequency than any other list participants, combining for 464 replies total. No other participant replied to the list more than 65 times. As can be seen in Table 6, even with these two individuals removed, the χ² test still indicates a highly significant association between gender and both the likelihood of an individual to send a reply and the number of replies sent, with p-values of less than 0.005 and less than 2.2 × 10⁻¹⁶, respectively.
It seemed a natural step to consider the same statistical tests on the data for each year. This was done but is not included in this study for two reasons: First, a χ² test of independence generally requires at least five occurrences to exist in each category in order to be considered valid and, as a result, several years would have been left out of the analysis. Second, the resulting χ² and p-values did not add information to the study beyond what can be obtained through simple observation of the yearly raw data as discussed below.
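The validity condition mentioned above can be checked mechanically. The Python sketch below assumes the common formulation of the rule of thumb, in which every cell's expected count (rather than its observed count) should be at least five. Both tables are hypothetical and are meant only to contrast a sparse single year with a pooled table; the function names are likewise illustrative.

```python
def expected_counts(table):
    """Expected cell counts under independence for an r x c contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def meets_rule_of_five(table, minimum=5.0):
    """True if every expected cell count reaches the chosen minimum."""
    return all(e >= minimum for row in expected_counts(table) for e in row)


# HYPOTHETICAL tables (rows: men, women; columns: replied, never replied).
sparse_year = [[3, 2], [4, 6]]
pooled_years = [[30, 20], [20, 30]]

sparse_ok = meets_rule_of_five(sparse_year)
pooled_ok = meets_rule_of_five(pooled_years)
```

Under this rule, a year with only a handful of repliers fails the check even though the pooled data pass it, which is consistent with the decision to report only the pooled tests.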

DISCUSSION
These data show a contrast in how men and women both utilize and interact with the SCHOLCOMM listserv. Overall, the average male participant posted to the list just over eleven times throughout the list's existence, almost twice the average for female participants at just over six posts. This means that, despite comprising only about 35% of the participants on SCHOLCOMM, male participants accounted for over 50% of the activity on the listserv.
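The relationship between these shares can be made explicit with a short calculation. The Python sketch below uses only the percentages reported in the Results section (35.1% of participants and 51.2% of messages for men; 57.7% and 45.8% for women); the function name `representation_ratio` is illustrative. A ratio above 1 means a group sent more messages than its share of participants would suggest.

```python
def representation_ratio(message_share: float, participant_share: float) -> float:
    """Share of messages divided by share of participants."""
    return message_share / participant_share


# Percentages as reported in the Results section.
men_ratio = representation_ratio(51.2, 35.1)
women_ratio = representation_ratio(45.8, 57.7)

# How many times more often the average man posted than the average woman.
relative_rate = men_ratio / women_ratio
```

The resulting relative rate of roughly 1.8 agrees with the per-participant averages of just over eleven and just over six posts.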
This gap grows more conspicuous when considering replies to the list, which can be used as a basic measure of interaction. Again, though women accounted for over 50% of the individuals who interacted with the list via posted replies, over 60% of the replies posted to the list were from male participants. In fact, the average number of replies sent by male participants was 9.4 as opposed to only 3.7 for female participants, a difference of almost two-thirds. When ordering the SCHOLCOMM list participants by number of replies, there is a clear distinction between male and female participants. As can be seen in Table 7, of the 20 individuals who replied to the list most frequently, 16 were men. The top six men together had 682 replies to the list, more than all female list participants combined.
These observations, along with the statistically significant association between gender and list interaction, help to provide a more complete understanding of the gender dynamics on the SCHOLCOMM listserv and support the opinion expressed by several participants in 2016 that male voices on SCHOLCOMM are overrepresented. However, it is important to bear in mind that the observed association does not provide an explanation as to why men are more likely to interact on the listserv. Similarly, this study considered only the number of posts by individuals, which does not provide any information on the content, purpose, or effect of participation; suggestions for further research are outlined in the conclusion below.
Participation in SCHOLCOMM has changed significantly since its inception in 2003, however, and it is helpful to consider the data on a year-by-year basis. As can be seen in Figure 1, both the number of participants and the amount of participation on the listserv remained fairly steady from 2003 to 2010. However, an increase in both can be seen beginning in 2011 and continuing through 2015.
That trend is also present when limiting to replies only, as is evident from the trend data in Figure 2. Again, the number of participants and the number of replies remain fairly
Though the identification of outliers would likely be beneficial, it would require deeper statistical analysis and would likely include many of the most frequent contributors to the SCHOLCOMM listserv due to the high number of participants with only a single post. We therefore leave this task to future researchers.

CONCLUSION
Participants on the ALA's SCHOLCOMM listserv voiced concerns in early 2016 over the state of the list; these concerns included a perceived overabundance of male voices on the list, which participants felt discouraged contributions from other groups. This notion was immediately met with dissenting opinions and, specifically, the rebuttal that such opinions could not be taken seriously without proof. Our study substantiates the initial opinions, showing that male participants are more active both in sending out initial messages and in replying to threads.
The identification of any underlying causes for the overrepresentation of male voices on SCHOLCOMM, however, was outside the scope of this study. Though future research can and should be undertaken to better understand the dynamics of the SCHOLCOMM listserv, initial steps towards more inclusive communication can be immediately implemented by participants. List members can expressly seek contrasting opinions and contributions to contentious topics, challenge themselves to speak up if they are not frequent contributors to the list, or critically evaluate their level of activity in the SCHOLCOMM community before posting to the list. Perhaps most importantly, any list member can amplify the contributions of women, members of underrepresented groups, and fresh voices on the list by repeating their ideas and attributing them to the original author. Building in a practice of listing and requesting preferred gender pronouns (e.g., she/her/hers, they/them/theirs) would allow participants to further actions like amplification by making underrepresented groups more clearly identifiable. This self-identification does, however, bring with it the possibility of discrimination, and therefore this is only recommended if the community has a written code of conduct prohibiting such behavior, which SCHOLCOMM does indeed have, or a stated mission involving inclusivity. Gender dynamics in online communication are not fixed; such dynamics can and should be altered to better represent the diverse makeup of the community.

Limitations and Suggestions for Further Research
As discussed above, this study's identification of unique participants as well as their gender identity was basic and likely involved some amount of error. The number of individuals as well as variations in the names of authors posed major challenges in undertaking this initial research. Similarly, because identification of participants' gender identity was undertaken indirectly, it is possible that certain participants' preferred gender identities were misrepresented.
In terms of potential future research, the authors hope that this study can serve as a basis for future analysis of online interactions in the field of scholarly communications librarianship. Expanding the statistical analysis to include position, rank, and years of experience would likely provide a more comprehensive picture of the dynamics of the SCHOLCOMM listserv. While, anecdotally speaking, no clear difference in experience level was observed during the data-gathering process, a more systematic analysis would provide a definitive picture of whether there are larger dynamics informing the findings of this current study. Studying the participation of individuals who are transgender or those who do not conform to the male/female gender binary would also provide a more comprehensive understanding of dynamics on the listserv, as well as more broadly in online communication in the field. Additionally, delving deeper into the issue of gender identity and the intersection of other underrepresented groups would provide a better understanding of the findings of this study.
While the research undertaken for this study only included the metadata associated with messages, a discourse analysis looking more closely at the actual content of the SCHOLCOMM listserv, such as the one described in McGee and Briscoe (2003), would allow for a better understanding of the actual roles and contributions of participants. The comprehensive preservation of data spanning the history of the listserv would also allow future researchers to undertake a more thorough historical analysis than is presented here, though in the years before 2011, SCHOLCOMM did not see a large amount of participation. This makes statistical analysis of many of those years through a χ² test of independence impossible. The identification of outliers is similarly challenging due to the large number of list participants who posted or replied only once. As such, the simplistic statistical treatments utilized for this study leave space for future researchers to apply more sophisticated tools and to delve further into the content of the SCHOLCOMM listserv archives.

Figure 1. Yearly trend data, male and female SCHOLCOMM participants and total post count

Table 1. Unique male and female SCHOLCOMM participants and total post count per year

Table 2. Unique male and female SCHOLCOMM participants and total post count, percentages

Table 3. Unique male and female SCHOLCOMM repliers and total replies per year

Table 4. Unique male and female SCHOLCOMM repliers and total replies, percentages

Table 5. Contingency tables and statistical information related to list replies

Table 6. Contingency tables and statistical information related to list replies with two most frequent repliers removed

Table 7. Summary of the 20 most frequent repliers to the SCHOLCOMM list