Support Projects for Enhancing Function
Developing a Comprehensive Database of Pre-Modern Books
using Kuzushiji Optical Character Recognition (OCR) Technology
Fiscal 2017
We aim to dramatically improve the use of rare books and old materials held at the Theatre Museum by applying kuzushiji OCR technology, to create a new environment that promotes research, and to enhance databases related to rare books.
Summary of research findings, fiscal 2017
In this project, we aim to use kuzushiji OCR technology to build a new environment for researching old materials related to the theater. To this end, we have been developing an open-access tool for viewing these materials. The tool lets the user view and navigate a digital facsimile of the manuscript while transcribed characters are superimposed on it and revealed intuitively as the user moves the cursor. In 2017, the fruits of the previous year’s work went online on the kuzushiji viewer page of the project website. By making these kabuki texts available to view in this way, we have made progress in our endeavor to help researchers analyze kuzushiji texts.
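As a rough illustration of how a viewer of this kind can reveal a transcribed character under the cursor, the sketch below pairs each character with a bounding box on the facsimile image and looks up the box that contains the cursor position. It is a minimal sketch under stated assumptions and does not describe the project's actual implementation: the names CharRegion and char_at, the pixel coordinates, and the sample characters are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class CharRegion:
    """One transcribed character and its bounding box on the page image (pixels).

    Hypothetical structure for illustration; not the project's data format.
    """
    char: str
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        # Left/top edges are inclusive, right/bottom edges are exclusive,
        # so vertically stacked boxes never both claim the same pixel.
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def char_at(regions: Sequence[CharRegion], px: int, py: int) -> Optional[CharRegion]:
    """Return the first region under the cursor position, or None if there is none."""
    for region in regions:
        if region.contains(px, py):
            return region
    return None


# Hypothetical sample data: three characters in one vertical column.
page = [
    CharRegion("義", 900, 100, 60, 70),
    CharRegion("経", 900, 170, 60, 70),
    CharRegion("千", 900, 240, 60, 70),
]

hit = char_at(page, 920, 180)
print(hit.char if hit else "(no character here)")
```

A linear scan is enough for the few hundred characters on a single spread; a real viewer would presumably run this lookup in the browser on each cursor movement.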
With a view to making more content available for this tool this year, the team worked on digitally rendering one joruri maruhon (“Yoshitsune Senbon Zakura,” 100 two-page spreads), six kaomise banzuke, and six yakuwari banzuke. We not only increased the volume of data available through the viewer but also attempted to reform the work process, and we assessed the potential of this process for training students to read and decipher kuzushiji texts. To this end, we entrusted a student with preparing the data for the joruri maruhon text, for which a transcription already exists. We instructed the student to prepare the data by collating the original manuscript with the transcription while incorporating corrective feedback from experts. We concluded that the work process did indeed help the student develop the ability to decipher the text. The kaomise and yakuwari banzuke texts had no existing transcription, so we entrusted their data preparation to experts, who proceeded with transcribing the texts while we searched for an efficient work process for preparing such data.
We also pursued effective ways to communicate our accomplishments in 2017. To make the viewer more user-friendly, we started upgrading the viewing environment and provided a highlight function that makes it easier for the user to shift back and forth between the manuscript and the transcribed characters. Additionally, to make the results more widely accessible, we incorporated the data set of character forms into the Cultural Resource Database and installed a viewer terminal in the permanent exhibition room of our Museum, which reopens in March 2018 to commemorate the 90th anniversary of its foundation.
Moreover, we started communicating our efforts to other institutes and organizations to broaden our endeavor to help researchers decipher texts and reassess their significance. As part of this strategy, Ryuichi Kodama, director of the Theatre Museum, attended the symposium “The Future of Digital Cultural Assets 2017” (Roppongi Academy Hills, November 29) and delivered a lecture titled “Digitizing Materials and Researching Theatrical Productions: The Theatre Museum’s Attempt to Utilize Kuzushiji OCR Technology.” Also, Kotaro Shibata delivered a presentation titled “The Potential of Kuzushiji OCR Technology” at the symposium “Kabuki Research in the Digital Age,” hosted by the Kabuki Association (Waseda University, December 10). Through exchanges with other projects, we have been reexamining this project with a view to developing it further.

Transcribing kuzushiji (maruhon)

Transcribing kuzushiji (banzuke)
See here for the kuzushiji viewer