UCSF Archives & Special Collections awarded $99,325 LSTA grant for textual data extraction from historical materials on HIV/AIDS

The Archives and Special Collections department of the University of California, San Francisco (UCSF) Library is pleased to announce the award of a $99,325 “Pitch-An-Idea, Local” grant for the first year of a two-year project. The grant comes from the Institute of Museum and Library Services’ (IMLS) Library Services and Technology Act (LSTA) funding, administered through the California State Library. The Archives will take the nearly 200,000 pages of textual HIV/AIDS historical materials that have been digitized through various digitization projects, including the National Historical Publications and Records Commission (NHPRC)-funded project “Evolution of San Francisco’s Response to a Public Health Crisis” and the National Endowment for the Humanities (NEH)-funded project “The San Francisco Bay Area’s Response to the AIDS Epidemic,” and will extract unstructured textual data from them using Optical Character Recognition (OCR) and related software. The project team will prepare the text as a research-ready, unstructured textual dataset for use in digital humanities, computationally driven cultural heritage, and machine learning research into the history of the HIV/AIDS epidemic.

The 24-month project, entitled “No More Silence — Opening the Data of the HIV/AIDS Epidemic,” began on July 1, 2018. The digitized materials from which text will be extracted include handwritten correspondence, notebooks, typed reports, and agency records. Together they represent a broad view of the lived experience of the epidemic, including documentation from People with AIDS, their friends and families, and the scientists and public health officials working to slow the epidemic. All historical materials represented in this dataset have been previously screened to address privacy concerns. The resulting unstructured textual dataset will be deposited in the UC Dash data-sharing repository for public access and use by any interested parties, and will also be deposited in other similar data repositories as appropriate.

“During my tenure at UCSF,” says Dr. Aimee Medeiros, health sciences historian and professor in the Department of Anthropology, History, and Social Medicine at UCSF, “I have been inspired by the library’s enthusiasm and dedication to public access and the use of practices in the digital humanities to help maximize access to HIV/AIDS material.” This project will build on that legacy by bringing these valuable historical materials into the realm of digital humanities and scientific research and making them computationally actionable.

Please find a full summary of the project on our blog. We are always interested in hearing from colleagues involved in similar work. For inquiries, contact University Archivist Polina Ilieva or Digital Archivist Charlie Macquarie.