Dissertation to Book? A Snapshot of Dissertations Published as Books in 2014 and 2015, Available in Open Access Institutional Repositories

Johnson, A.M., Goldberg, T., & Detmering, R. (2017). Dissertation to Book? A Snapshot of Dissertations Published as Books in 2014 and 2015, Available in Open Access Institutional Repositories. Journal of Librarianship and Scholarly Communication, 5(General Issue), eP2177. http://dx.doi.org/10.7710/2162-3309.2177


INTRODUCTION
Searching the ProQuest Dissertations & Theses Global database for dissertations in English produces 64,842 results dated 2014 and 67,802 results dated 2015. For graduate students, publishing a dissertation may be absolutely necessary in a competitive job market, or it might simply be a personal goal. Whatever their motivation, many students have concerns about the presence of their dissertation in an open access institutional repository (IR) harming their chances of subsequently publishing it as a monograph. They worry that publishers will have no incentive to put out a book when all or most of the content is already freely available on the web. What these students may fail to grasp fully is that the development from dissertation to publishable manuscript is long and arduous, and in the case of many dissertation writers, it may not result in a publication. Initially, the authors wanted to answer the question, "Does the presence of a dissertation in an open access repository diminish the chances of it being published as a book?" The question, though, was far more complicated to answer than it seemed at first. Intrigued by students at their institution who were concerned about the likelihood of being able to publish material from their dissertation if it was in their IR, the authors undertook the present study to determine how many dissertations published as books were actually in IRs, and whether one could discern any patterns among publishers in two years' worth of data. We analyzed in particular whether publishers behaved in the ways they claimed to behave as reported by Ramírez and colleagues (Ramírez, Dalton, McMillan, Read, & Seamans, 2013; Ramírez, McMillan, Dalton, Hanlon, Smith, & Kern, 2014). To address this question, the following exploratory research questions were examined.
1. How many dissertations that were published as books can be found in ProQuest Dissertations & Theses Global?
2. How many dissertations that were published as books can be found in both ProQuest Dissertations & Theses Global and an institutional repository?
3. How long on average does it take a dissertation to be published as a book?
4. How many libraries on average hold books that were originally dissertations?
5. What subject areas are most likely to be published as books?
6. Which publishers are most likely to publish dissertations?
The goal was to provide a snapshot of the current state of academic books resulting from dissertations in the humanities and social sciences and to provide data as a foundation for future studies in this area. Additionally, this study adds to the research on dissertations and their potential publication and examines the question of what publishers are doing with dissertations from a different angle than previous studies.

LITERATURE REVIEW
Both the open access (OA) movement and institutional repositories (IRs) have been popular topics in the scholarly library literature for the last ten or more years, resulting in hundreds of articles being written about these areas and their impact on libraries. Consequently, this review is limited to the library literature on electronic theses and dissertations (ETDs) and specifically limits its treatment of OA and IRs to this context. Although the literature on ETDs is varied and extends back to the early 2000s, several recent articles examine the workflow or other technical aspects of ETDs in an IR. Li, Theimer, and Preate (2014) discuss the process by which Syracuse University moved their open access theses and dissertations from ProQuest to an in-house system called SURFACE, and how they subsequently created a process for in-house digitization of pre-2011 theses. Similarly, Wang, Bulick, and Muyumba (2014) describe their implementation of an IR with the attendant concerns of long-term preservation of the materials. These two case studies are typical examples of the literature in that they provide practical information for those wishing to implement an OA IR. Addressing the choices libraries face in this area, Clement and Rascoe (2013) compared ProQuest to institutional repositories, noting commonalities and distinctions and concluding that there is no "single best system." Clement (n.d.) has also published on the number of institutions dropping their requirements for students to submit their dissertations to ProQuest in favor of their own institutional repositories, and has noted elsewhere that ProQuest's holdings of dissertations have always been incomplete (Clement, 2013).
Other articles examined the number of IRs with ETDs in Asian countries (Ahmed, Alreyaee, & Rahman, 2014) or usage of ETDs by in-state or out-of-state users (Coates, 2014). While these articles take for granted the assumed benefits of OA theses and dissertations, some earlier pieces expressly make the case for their value, arguing against potential objections. For instance, Royster (2008) notes that OA dissertations from his institution received far more downloads than their counterparts housed behind firewalls, and Suber (2008), a professor of philosophy, lists nine reasons to mandate OA for ETDs, including improving discoverability and increasing scholarly impact.
Despite such advantages, institutional efforts to make ETDs more accessible to the general public have been controversial. In "Back to Grey: Disclosure and Concealment of Electronic Theses and Dissertations" (Schöpfel & Prost, 2014), the authors surveyed institutional repositories around the world and found that a percentage of ETDs were still inaccessible due to embargo or on-campus access restrictions. They outline the motivations of the different actors in the ETD process (author, dissertation supervisor, dissertation committee, library, etc.) and the degrees of openness in terms of reader rights, copyright, institution rights, policy, machine readability, and posting workflows.
Outside of library science and primarily in humanities disciplines, there has been considerable concern that the widespread electronic availability of OA theses and dissertations will hinder publishing opportunities and harm book sales, with much of the conversation taking shape in blogs, news items, and policy documents (Cirasella & Thistlethwaite, 2017). Some of the concern is a result of the significant revisions that must be performed on a dissertation in order to create a publishable manuscript (Brown, 2011; Germano, 2005).
The length of time this can take may factor into students' uncertainty. As Truschke (2015a) of Dissertation Reviews notes in the first of a three-part series of blog posts investigating dissertation publishing, anxieties among graduate students regarding their prospects for publication may be contributing to a substantial increase in the embargo rates for dissertations in various humanities and social science disciplines. In addition, several representatives for university presses have argued that open access ETDs may damage promising careers for young scholars, with Thatcher (2007), former director of Penn State Press, contending that libraries are buying fewer revised dissertations as books and thus publishers are publishing fewer of them. On the other hand, according to a thorough review by Cirasella and Thistlethwaite (2017), the majority of university publishers, including Columbia University Press and Harvard University Press, do not appear to consider the availability of an ETD in the evaluation of a book proposal. Likewise, based on interviews with editors from eight North American academic presses, Truschke (2015b) argues that publishers are largely unconcerned with the availability of an ETD. Moreover, Cirasella and Thistlethwaite (2017) point out that there is a lack of evidence to support the notion that libraries are using a book's status as a revised dissertation as a reason not to purchase it.
Nevertheless, some professional organizations have promoted the importance of embargoes as a means of protecting the publishing prospects of graduate students and early-career scholars. For example, the American Historical Association (AHA) has issued a statement supporting university ETD policies that allow up to six-year embargoes (2013). However, the AHA statement has led to contentious debates about the merits of embargoes (Hattem, 2015; Patton, 2018), and the Harvard University Library's Office for Scholarly Communication (2013) published its own position statement criticizing the AHA for failing to offer adequate evidence for the embargo policy. There has also been opposition to open access ETDs among creative writers. The Association of Writers and Writing Programs (AWP, 2016) has issued statements supporting either embargoes or no electronic access to ETDs, and creative writing students at the University of Iowa successfully opposed the university's open access intentions (Foster, 2008). Also in this area, Kaufka and Bryan (2007) make a case against open access electronic theses, citing their research examining a 38-year sample of MFA graduates at Bowling Green State University. In this study, 64% of the graduates published a book after five years and 43% took more than ten years. It is essential to note, however, that creative writing presents a special case, given the expectation of first-publishing rights among publishers in that field (Churm, 2011). Still, opposition to OA among creative writers may contribute to negative views among scholars and graduate students in other disciplines, even if such views are not warranted based on the available evidence.
Several recent articles have attempted to provide evidence that open access ETDs do not harm chances for subsequent publication. Bennett and Flanagan (2016) examined the impact of open access dissertations and found no correlation between downloads and the number of citations for the titles they surveyed; they also found that subsequent published work seemed to be necessary for the dissertation to receive more citations. Two companion articles published in College & Research Libraries in 2013 and 2014 report on research that surveyed journal and book publishers about their willingness to publish material from theses and dissertations, particularly those available in an institutional repository (Ramírez et al., 2013; Ramírez et al., 2014). Overall, they found that in the case of the humanities and social sciences, 82.8% of journal editors and 53.7% of university press directors would consider publishing material already available in an open institutional repository, and 89.5% of science editors would either welcome or consider these submissions, provided they were revised. Interestingly, the size of the publisher was a factor. Smaller publishers were less likely to want to consider ETDs available in IRs. Some university presses would not consider them under any circumstances in certain subject areas including romance literature, applied and social psychology, and mathematical methods in social sciences (Ramírez et al., 2013). Libraries, it appeared (at least anecdotally, from publisher comments), were implicated, in that vendors were reporting that some libraries were eliminating books published from dissertations as part of their approval plan profiles. Still, publishers are continuing to publish, and a study done in 2016 profiled art history books published by university presses from 1998 to 2013 that were listed as revised dissertations in a major online scholarly book ordering system (Hérubel, 2016). This study downloaded the titles listed in YBP Library Services (GOBI's predecessor) whose original format was "dissertation" to create a corpus of titles, the same methodology that the authors of this article used in their study. Hérubel's data did not show a decrease in publishing from 1998 to 2013, but in fact showed a slight increase in the number of overall titles published by university presses. His data also demonstrated that the percentage of art history titles originating as dissertations increased over the time period as well.

METHODOLOGY
To determine how many current academic monographs are based on open access dissertations, the authors searched an online ordering tool for books based on dissertations and compared the results against an ETD database and an open web search tool in order to determine whether publishers were publishing ETDs available in open access repositories. Data regarding the number of books found to have a matching dissertation in ProQuest Dissertations & Theses Global and then subsequently found in an IR, year of dissertation completion, call number assigned, and publishers were examined.
To identify books based on dissertations, one of the authors downloaded all citations with the search words thesis or dissertation, limited to years 2014 or 2015, and available in print or cloth using GOBI Library Solutions from EBSCO, an online ordering tool. GOBI records whether the book in their database was based on a dissertation or thesis, and this search was designed to garner as many of those book titles as possible. This same database of books was used by Hérubel (2016) in his study of art books based on dissertations. E-books were not included because electronic editions of books are often published subsequent to print editions, and not every print edition has an electronic version. The Ramírez articles (Ramírez et al., 2013; Ramírez et al., 2014) mentioned previously surveyed print publishers. The authors also felt that mixing in electronic editions could complicate the data in ways that would be difficult to address. GOBI allows the search to be limited by formats such as dissertations, translations into English, Bible commentary, and workbooks. The dissertation format includes all books based on dissertations; consequently, this limiter was chosen. Language is an additional limiter; the data in this study was limited to English. The data was collected in spring 2016. A search for dissertation or thesis, limited to the format dissertation and the language English, resulted in a total of 2,682 book titles for the year 2014 and 2,464 titles for the year 2015, the most recent full years of data available. The authors chose to look at two years of data. While this is not enough to see trends, it is enough to provide a foundation for a future study of trends. GOBI provides a significant amount of information in the download including title, author, publisher, place of publication, date, call number, and a variety of fields such as series title, recommendation level, the institution where the dissertation was completed, ISBN, number of pages, audience level, and some fields internal to the GOBI system.
The authors made the decision to focus on books published with assigned call numbers in the A-P Library of Congress Classification classes, which include only the humanities and social sciences. The rationale was that science dissertations are more often turned into journal articles as opposed to books. Books in call letters Q-Z (Library of Congress Classification classes that deal with science) were removed from the data set, as were duplicates (often books published in both print and cloth), titles that dealt with how to write a dissertation or thesis, and titles that were republications of older materials, resulting in 1,878 titles for 2014 and 2,090 for 2015.
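The class and duplicate filters described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `BookRecord` fields and the function names are hypothetical and do not reflect the actual layout of a GOBI download, and the removal of how-to titles and republications, which required human judgment in the study, is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BookRecord:
    # Hypothetical fields; a real GOBI export has many more columns.
    title: str
    author: str
    call_number: str  # Library of Congress call number, e.g. "PN56 .A5"
    binding: str      # e.g. "cloth" or "paper"


def lc_class(call_number: str) -> str:
    """Return the leading LC class letter in upper case, or '' if absent."""
    stripped = call_number.strip()
    if stripped and stripped[0].isalpha():
        return stripped[0].upper()
    return ""


def in_scope(record: BookRecord) -> bool:
    """Keep classes A through P; drop Q through Z and unclassed records."""
    letter = lc_class(record.call_number)
    return bool(letter) and "A" <= letter <= "P"


def filter_records(records: list[BookRecord]) -> list[BookRecord]:
    """Apply the class filter, then keep one record per author/title pair."""
    seen: set[tuple[str, str]] = set()
    kept: list[BookRecord] = []
    for record in records:
        if not in_scope(record):
            continue
        key = (record.author.casefold(), record.title.casefold())
        if key in seen:
            continue  # e.g. a second binding of the same book
        seen.add(key)
        kept.append(record)
    return kept
```

Keying duplicates on author and title is a simplification; in practice editions with slightly different subtitles would still need a manual check.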
Library student assistants were then assigned to look up the author of each title in the sample in ProQuest's Dissertations & Theses Global database. An earlier pilot sampling by the authors of this article had determined that searching the Networked Digital Library of Theses and Dissertations (NDLTD) was not producing enough additional dissertations to justify the time needed to search both resources. ProQuest's database was selected because it is considered by many to be the standard subscription resource for dissertations and theses (Procious, 2014). Although ProQuest Dissertations & Theses Global Database is advertised as "the largest single repository of graduate dissertations and theses," it is important for librarians to understand that it is not anywhere close to comprehensive. Other studies have noted this as well (Procious, 2014). Still, it provided the largest, most expedient means to locate the corresponding dissertation. If the author was found, and the title of the dissertation resembled the title of the book in its subject matter, the student recorded a "Yes" for the dissertation's presence in ProQuest, the title of the dissertation, and the year it was completed in an Excel spreadsheet. They then searched the title of the dissertation in Google to determine if it was present in an open access institutional repository. If a link to the dissertation in an open access repository was found, the student recorded a "Yes" for the dissertation's presence in its corresponding IR. The authors reviewed the students' work and corrected any mistakes that they found. Occasionally when the title was searched in Google, the dissertations were found in ProQuest's open access repository, a new service offered by ProQuest where the student can pay a $160 fee to ensure perpetual hosting and open access availability of their dissertation. The dissertations found there were simply counted as being in an IR, because this study was interested solely in whether the dissertation was available in any OA IR, regardless of type.
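The matching in this study was done by hand, with students judging whether a dissertation title resembled the book title. For readers who want to replicate the step at scale, a rough screening aid might look like the following Python sketch. It substitutes a simple string-similarity score from the standard library's `difflib` for human judgment, so it is not the method used here; the function names and the 0.6 threshold are assumptions chosen purely for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lower-case a title, replace punctuation with spaces, collapse spaces."""
    lowered = title.casefold()
    no_punct = re.sub(r"[^\w\s]", " ", lowered)
    return " ".join(no_punct.split())


def title_similarity(book_title: str, dissertation_title: str) -> float:
    """Return a similarity score between 0.0 and 1.0 for two titles."""
    matcher = SequenceMatcher(
        None, normalize(book_title), normalize(dissertation_title)
    )
    return matcher.ratio()


def flag_for_review(
    book_title: str, dissertation_title: str, threshold: float = 0.6
) -> bool:
    """Flag a pair as a likely match that a person should then confirm."""
    return title_similarity(book_title, dissertation_title) >= threshold
```

Because books are often retitled during revision, a low score would not rule out a match; a tool like this could only shortlist candidates for the kind of manual review the authors describe.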
The authors of this study searched for 80 dissertation titles in the IR of institutions associated with the dissertations to determine if searching the institutional repository directly for the dissertation title produced more results if it was not found in ProQuest. No instances occurred where the dissertation was located using this method, indicating that it was no more effective to search the IR directly than to simply search the dissertation title in Google. An early pilot of the study where an attempt was made to search the open web for the author of the book and find the title of the author's dissertation through sleuthing was found to be time-consuming and nonproductive; of 183 dissertations searched, only 1 was found that was in an IR but not also in ProQuest. Consequently, for authors not found in ProQuest, no attempt was made to search the author and/or author and keywords in Google due to the amount of time required and the uncertainty of matching the author to the correct dissertation.
The titles were then sorted by call number in order to see what subjects were most likely to be published as books. Publisher data was gathered to examine which publishers were most likely to publish monographs based on dissertations. The publisher was the press listed by EBSCO on their GOBI record. No effort was made to determine which imprint belonged to another press. For example, Archaeopress and Archaeopress Archaeology were counted separately even though they appear to be related; Palgrave Macmillan, a subsidiary of Springer Nature, was counted as a separate publisher.
In order to determine how many libraries held the print copy of the book title, one of the authors of this study looked up the library holdings in OCLC for the books based on dissertation titles that were found in both ProQuest and institutional repositories. This would create a baseline for the average number of libraries holding books based on dissertations that were also available in institutional repositories. For each title, the author chose the record for the print title with English as the language of the cataloging agency. If there were duplicate print records, the print record with the most holdings was chosen. The study authors' institution subscribes to Demand Driven Acquisitions (DDA) through GOBI Library Solutions, and holdings are attached to e-books according to the library's profile. The library does not own most of the e-books to which its symbol is attached, and presumably, the same would be true for other DDA participants. Therefore, the print record was used, and e-book records were ignored.
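The record-selection rule described above (print only, English-language cataloging, most holdings among duplicates) can be expressed as a short Python sketch. The `CatalogRecord` fields and the `"print"` and `"eng"` codes are hypothetical stand-ins for whatever a catalog export provides; nothing here calls an actual OCLC interface.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass(frozen=True)
class CatalogRecord:
    # Hypothetical fields for illustration only.
    record_id: str
    fmt: str                  # "print" or "ebook"
    cataloging_language: str  # e.g. "eng"
    holdings: int             # number of libraries attached to the record


def choose_print_record(records: list[CatalogRecord]) -> Optional[CatalogRecord]:
    """Pick the English-cataloged print record with the most holdings."""
    candidates = [
        r for r in records
        if r.fmt == "print" and r.cataloging_language == "eng"
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.holdings)


def average_holdings(titles: list[list[CatalogRecord]]) -> float:
    """Average the chosen record's holdings across titles that have one."""
    chosen = [choose_print_record(records) for records in titles]
    counts = [r.holdings for r in chosen if r is not None]
    return mean(counts) if counts else 0.0
```

Ignoring e-book records in this way mirrors the authors' reasoning that DDA profiles attach library symbols to e-books the library does not actually own.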

Number of book titles for which a corresponding dissertation was found in ProQuest and/or an institutional repository
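Even though two years of data is not enough to see trends, the data does show a slight decrease in the number of dissertation titles found in GOBI and a slight increase in the number of titles found in both ProQuest and an IR between 2014 and 2015. For 2014, 627 or 33.4% of the titles downloaded from GOBI were found in the ProQuest Dissertations & Theses Global database and 1,251 titles or 66.6% were not. Looking at the 2015 data, the percentages are similar, with 31.4% or 656 titles found in the ProQuest database and 68.6% or 1,434 titles absent from the database. When the title was found in ProQuest, it was subsequently searched in Google. For the 2014 titles, 234 or 12.3% were found in both institutional repositories and the ProQuest Dissertations & Theses Global database. The 2015 data included 318 titles or 15.2% that were found in both institutional repositories and ProQuest.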
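The percentages reported for 2014 and 2015 follow from the title counts given in this study. A minimal Python sketch of that arithmetic is shown here; the dictionary layout and the `share` helper are illustrative names only, not part of the authors' workflow.

```python
def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Counts as reported in the study for each year's filtered GOBI list.
counts = {
    2014: {"gobi_titles": 1878, "in_proquest": 627, "in_proquest_and_ir": 234},
    2015: {"gobi_titles": 2090, "in_proquest": 656, "in_proquest_and_ir": 318},
}

rates = {
    year: {
        "proquest_pct": share(c["in_proquest"], c["gobi_titles"]),
        "proquest_and_ir_pct": share(c["in_proquest_and_ir"], c["gobi_titles"]),
    }
    for year, c in counts.items()
}
```

Recomputed this way, the 2014 figure for titles in both ProQuest and an IR comes to about 12.5% rather than the reported 12.3%, which may reflect a small rounding or transcription difference in either the count or the percentage; the other three figures match as reported.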

Time lapsed between dissertation completion and book publication
The date of the dissertation was recorded when it was found in the ProQuest database. In 2014, for the dissertation titles found, the average year in which the dissertation was completed was 2008 (rounded up), revealing a six-year gap between dissertation completion and book publication. For the 2015 data, the average year of dissertation completion was 2007, resulting in an average gap of eight years between dissertation completion and book publication.

Average number of library holdings for book titles with a corresponding dissertation in an open access repository
For the 2014 list of books with a corresponding dissertation in ProQuest and an IR, the average number of libraries reporting in OCLC that they held the print title was 83. For 2015, the data showed an average of 71 libraries holding those titles.

Book titles by LC class
The number of titles for each Library of Congress call number class was calculated. There was very little change between the two years examined. For the 2015 data, three call number classes (P, B, and H) were almost evenly split by percentage with 404 (19%), 397 (19%), and 375 (18%) titles respectively. Class D had 247 (12%) titles. All other classes accounted for 32% of the titles, with each class having well under 10%. [Table 2]

Which publishers are accepting and publishing revised dissertations as books?
The books on the 2014 list were published by 232 unique publishers. There were 251 publishers in 2015. The top ten publishers accounted for 50% of the monograph titles published in 2014 and 53% published in 2015. [Figures 2 and 3] In 2015, 109 publishers published only one title, while 100 publishers published only a single title in 2014. In comparing the lists of the top ten publishers for all book titles and the lists of publishers of books with corresponding dissertations in an IR, the lists were very similar. For 2014, two publishers (Springer and Cambridge Scholars Publishing) from the total list were not found on the IR list [Figure 4], whereas in 2015, only one (University of California Press) was not present on the IR list [Figure 5]. Alternatively, Pickwick Publications and Lexington Books were on the IR list for 2014 but not the list of total books for 2014, and Cambridge Scholars Publishing was not on the IR list for 2015 but was on the list of total books for 2015.
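The publisher figures above (unique publishers, top-ten share, single-title publishers) are the kind of summary that can be derived from a plain list of publisher names, one per title. The Python sketch below shows one way to do so; the function name and the returned keys are hypothetical, and the study's underlying per-title publisher data are not reproduced here.

```python
from collections import Counter


def publisher_summary(publishers: list[str], top_n: int = 10) -> dict:
    """Summarize how concentrated a list of per-title publishers is."""
    counts = Counter(publishers)
    total = sum(counts.values())
    top = counts.most_common(top_n)
    top_titles = sum(n for _, n in top)
    return {
        "unique_publishers": len(counts),
        "top": top,
        # Share of all titles accounted for by the top_n publishers.
        "top_share_pct": round(100 * top_titles / total, 1) if total else 0.0,
        # Publishers that appear exactly once in the list.
        "single_title_publishers": sum(1 for n in counts.values() if n == 1),
    }
```

Because the study counted each press name exactly as it appeared in the GOBI record, related imprints would remain separate entries in a tally like this unless the names were merged beforehand.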

DISCUSSION
While the data cannot support a definitive conclusion about the likelihood of a dissertation that appears in an IR being published as a book, the data does provide a useful snapshot of the current state of dissertations that are subsequently published as books. Our data set produced 1,878 book titles for 2014 and 2,090 book titles for 2015. While these dissertations took a number of years to become publishable manuscripts, in comparison to the over 60,000 dissertations produced in 2015 listed in ProQuest, the fact that they were published at all sets them apart. While in any given year the percentage of dissertations published as books might vary, it is overall quite small.
The data here showed that only 33% of the books produced in 2014 and labeled as being dissertations had a corresponding dissertation in the ProQuest database. In 2015, that percentage dropped to 31%. With only two years of data, it is not possible to detect trends, but it will be interesting to see if that trajectory continues. This finding also supports earlier authors' contentions that ProQuest is far from comprehensive.
One of the more revealing findings of this study was the number of dissertations available in both ProQuest and an institutional repository. In 2014, 12% were found in both repositories, and in 2015, 15% were found in both. Again, it is not possible to see trends with only two years of data, but this will be a data point to monitor in the future. As more institutional repositories appear and ingest more theses and dissertations, will universities cease to send their research materials to ProQuest? Institutions, such as that of this study's authors, having weighed the cost of printing, binding, storing, and circulating these materials in print, as well as the costs associated with having them published in ProQuest, have opted to discontinue sending their theses and dissertations to ProQuest. As mentioned in the literature review, other authors have already noted universities making this transition as a trend (Clement, 2013; Clement & Rascoe, 2013). If so, what will the implication of this change be? Google makes it relatively easy to find a specific dissertation in a repository, but it does not currently facilitate searching a large body of dissertations at once in the manner of ProQuest. Will the NDLTD be able to take its place?
One of the findings that could be significant if more data were examined is the average number of libraries holding a title in OCLC. For 2014, the average number of libraries holding the book titles that appeared as dissertations in both IRs and ProQuest was 83. In 2015, the number had dropped to 71. Does this portend what some publishers have predicted, namely that libraries will stop buying dissertations published as books that originally appeared in IRs? Could it be simply the result of budget cuts and many libraries not buying as many print books of any kind? Since these are recent titles, could the numbers for both years increase as collection specialists purchase these titles, as long as they remain available in GOBI? While the authors of this study wanted to focus on the most recent data available, it is possible that in this case, recent data does not provide the fullest possible picture.
This dataset only examined humanities and social sciences titles, but it was interesting to note the consistency of the subject areas represented in the book titles. Call number classes P, B, H, and D represented two-thirds (66-67%) of the dissertations published as books over the course of the two years. While this finding is not surprising, it does suggest several not mutually exclusive possibilities: that these areas are of most interest to publishers, that these areas are studied by more graduate students and are thus written about more frequently, and/or that authors of these subjects are more likely to pursue publication of books due to the nature of scholarly communication and prestige in those fields.
The list of the top ten publishers that published revised dissertations in the two years of data the authors examined was also not surprising. Many were large commercial publishers who publish numerous academic titles in any given year. The common perception that most books based on dissertations are published by university presses is debunked by this data. This data could also provide graduate students with possible publishers with which they were previously unfamiliar, although it would be important for dissertation authors to investigate the reputation of particular publishers in their field since publisher quality and reputation are important factors in hiring and tenure decisions. Further research that examines the publishers in the top ten list could yield additional information, such as the number of dissertation-based monographs published relative to the total number of publications in a given year. However, these data clearly seem to indicate that for the most part, publishers do what they say they do, which is consider dissertations available in open access IRs on a case-by-case basis.

Limitations and Suggestions for Further Research
There are several areas that might lend themselves to further research. The authors focused on dissertations available in GOBI that could be found in ProQuest and an IR. The authors initially attempted to search the NDLTD database, but decided to focus only on ProQuest and IRs. A study that compared ProQuest to the NDLTD using a similar methodology to that of Procious (2014) would help scholars ascertain whether that database might be a viable alternative.
Time limitations were also a factor in the authors' decision not to search the library holdings for titles appearing only in ProQuest. A future study could compare that number to the number of libraries holding titles appearing in ProQuest and an IR that this study found in order to provide a more nuanced picture.
for graduate students to understand the range of publishers that they could contact. University presses publish a relatively small number of dissertations, but there are many other small publishers as well as large commercial publishers who are willing to consider a revised dissertation manuscript. As the relative prestige or influence of a publisher may impact future prospects for promotion and tenure, graduate students should be advised to begin thinking early about their options for book publication. Finally, it will be interesting to see if the numbers noted here, showing a decrease in dissertations submitted to ProQuest's database and an increase in submissions to IRs, will constitute a trend over time. Given the complexity of the publishing environment and the diversity of dissertations produced each year, there are no simple answers to questions surrounding the viability of publishing a dissertation as a book. However, it is hoped that this research will add more nuance to ongoing discussions among librarians, teaching faculty, and graduate students.

Percentage of Published Books in GOBI Based on Dissertations Found in ProQuest and Institutional Repositories

Figure 2. Top Ten Publishers of Books Based on Dissertations in GOBI 2014

Table 1. Total Books in GOBI Listed as Dissertations by LC Class in 2014 (1,878 titles)

Table 2. Total Books in GOBI Listed as Dissertations by LC Class in 2015 (2,090 titles)
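The class-level percentages quoted in the text for 2015 can be recomputed from the four class counts the article reports and the 2,090-title total. The Python sketch below does this; only classes P, B, H, and D are itemized because those are the only per-class counts given in the text, and the variable and function names are illustrative.

```python
TOTAL_2015 = 2090

# Per-class counts reported in the text for 2015.
reported_2015 = {"P": 404, "B": 397, "H": 375, "D": 247}


def class_shares(counts: dict[str, int], total: int) -> dict[str, int]:
    """Return whole-number percentage shares, plus the remainder as 'other'."""
    shares = {cls: round(100 * n / total) for cls, n in counts.items()}
    other = total - sum(counts.values())
    shares["other"] = round(100 * other / total)
    return shares


shares_2015 = class_shares(reported_2015, TOTAL_2015)
combined_top_four = round(100 * sum(reported_2015.values()) / TOTAL_2015)
```

The same function could be applied to the 2014 counts in Table 1 to compare the two years directly.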