The ideas of inclusiveness and “language sensitivity” [Footnote 1] in digital humanities (DH) are gradually moving to the forefront of DH-related discussions, both by giving more visibility to the voices of scholars working on non-Anglophone sources and by underscoring the importance of creating multilingually accessible digital platforms and tools. In the case of East Asian studies specifically, this trend has translated into the publication of multiple state-of-the-field pieces that shed light on the developments and current challenges of “going digital” in the context of East Asian languages (Cha, 2018; Horvath, 2022; Nagasaki, 2019; Vierthaler, 2020). These reflections showcase how scholars in the field are aiming to overcome the additional issues that processing non-Latin scripts often entails. They do so by building area-specific tools, such as (K-)MARKUS, LoGaRT, or KuroNet, and by contributing to (or even initiating) collaborative long-term projects and workshops, such as the NEH-funded New Languages for NLP: Building Linguistic Diversity in the Digital Humanities institute at Princeton, the Impact of the Digital on Japanese studies symposium and its redux event at the University of Chicago, the DHAsia Summit Meeting at Stanford, the Dream Lab project at the University of Pennsylvania, as well as other Asia-specific conferences and workshops. [Footnote 2] In addition, the recognition of the importance of digital methods in the study of Asia has resulted in the incorporation of special panels dedicated to digital technology in the conference programs of the Association for Asian Studies. These panels complement the existing and entirely DH-focused events of other academic organizations, such as the Japanese Association for Digital Humanities.
However, what makes the state of the field thought-provoking is that, in parallel with serious developments that clearly help place DH projects, achievements, and innovations in the spotlight, most scholars are still wrestling with often fundamental challenges: the inaccuracy of OCR (Optical Character Recognition), the lack of multilingually accessible tools, and the additional problems that arise when attempting to create models for non-Latin scripts in NLP (Natural Language Processing). Taking both the results and the existing challenges of digital Asian studies into account, this essay will focus specifically on the case of digital Japanese studies, which seems to have hitherto received somewhat less attention than the role of DH in Chinese studies. Echoing the overarching concept of the current issue, I will structure the paper by juxtaposing projects that either engage in discussions about the digital, examine various research questions with the digital, or intertwine these aspects. It should also be noted that, as much as I consider inclusivity a priority, a detailed analysis of all existing publications and initiatives would exceed the limits of this piece. Instead, I aim to demonstrate the remarkably creative utilization of the digital in the relatively small microcosm of Japanese studies, which can be of interest to scholars working with sources in other languages and scripts as well, non-Latin and otherwise. [Footnote 3]

1 Building and curating tools and databases

A hallmark of DH in a Japanese context has been its major focus on tool development. To a large extent, this keen interest in the “technical” side of digital humanities seems to stem from the complexity of the Japanese writing system, which necessitates specialized methods, for example for OCR purposes. Recent technical developments in digital Japanese studies through tools such as KuroNet and miwo for kuzushiji (崩し字, handwritten classical Japanese) recognition (Clanuwat & Kitamoto, 2018, 2019, 2020; Clanuwat & Kitamoto, 2021; Le et al., 2019; Yamamoto & Osawa, 2016; Hashimoto, 2021, 2022; Yamada, 2021) now facilitate the handling of pre- and early modern Japanese sources to some extent. Dealing with Kanbun (“Chinese writing” or literary Sinitic), however, continues to be a major challenge. More specifically, the variety and complexity of Chinese characters and the layout of Kanbun texts with small “reading aids” still make this language severely under-resourced in DH, which makes Kanbun-based digital research exceedingly challenging. [Footnote 4] That said, considering the importance of kuzushiji in Japanese textual heritage, the open access KuroNet platform, developed at the Center for Open Data in the Humanities, and its accompanying smartphone application, miwo, which won the 2022 Good Design Award (miwo: App for AI Kuzushiji Recognition), both have the capacity not only to recognize classical texts for scholarly purposes, but also to contribute to the preservation and popularization of premodern Japanese cultural heritage among a broader audience. The latter perspective became even more prominent with the launch of the Kaggle Kuzushiji Recognition Competition, a form of crowdsourcing with the final goal of using community input to refine KuroNet. [Footnote 5]

Another aspect of DH in Japanese studies worth mentioning here is the importance of collaboration, for example in database building. Akin to the well-established China Biographical Database (CBDB) [Footnote 6], currently a joint project of Harvard University, Peking University, and Academia Sinica, the Japan Biographical Database (JBDB), coordinated by Bettina Gramlich-Oka at Sophia University, provides an elaborate collection of information on mainly early modern (18th–19th century) individuals, with multiple plugins leading the user to external platforms for further details. A key highlight of this ongoing collaborative project is its network-focused visualization feature, which can also be transferred to maps. [Footnote 7] The Digital Literary Map of Japan (日本デジタル文学地図, Nihon dejitaru bungaku chizu) [Footnote 8], coordinated by Judit Árokay at Heidelberg University and funded by the National Institute of Japanese Literature and Osaka University, provides a similarly abundant dataset, but here the focus is on combining spatial humanities and literature through the annotation of “poetic places” (名所, meisho or 歌枕, utamakura) in premodern Japanese written documents. What makes this tool special is its sensitivity to the needs and nature of working with sources in a Japanese context: it provides a tailored filter that allows searching in kanji (Chinese characters), furigana (kana transcription), and romanization, and, in the case of administrative units, kuni- (domain-) based exploration is also available. The results also move beyond displaying basic information and provide a brief history of the places, intertwined with quotes from relevant literary works in Japanese, accompanied by their transcribed version, often with an English translation and an IIIF-based image window plugin showcasing the original form for a multimodal experience.

The abovementioned tools and databases can be considered the means and basis of further investigation. However, as I will discuss in the following section, the past few years have seen the emergence of projects that also utilize the digital as a platform for publishing critical analysis, which can be, but is not necessarily, based on digitized sources. Either way, these pieces represent another layer of how the digital can impact the way we think about Japan and East Asia.

2 Re-reading Japan and East Asia through the digital

The idea of publishing online is not necessarily new, but in an academic context it often simply means making articles accessible in HTML or PDF format. An increasing number of projects, on the other hand, aim to make more substantial use of what technology has to offer by creating elaborate born-digital narrative experiences. By integrating interactive components, they tell stories that would be challenging to present the same way in printed books due to constraints regarding, for instance, the inclusion of dynamic visualizations. Such born-digital projects can take diverse shapes, such as digital exhibitions, textbooks, or even monographs, among others. The University of Manchester’s Travels in Tokugawa Japan (1603–1868): a Virtual Journey exhibit, for example, revolves around maps to explore the sociocultural and urban history of early modern Japan through the concept of traveling (Introduction (The University of Manchester, 2019–2021)). Besides the importance of cultural preservation and (open) knowledge dissemination, for which digitization is often employed, an advantage of showcasing content entirely digitally lies in its capability to offer high-quality and enlargeable images (in this case of maps) for an enhanced user experience. The emphasis here is clearly on the visual components, which can serve as a useful supplement to other (text-based) scholarly outputs.

Another elaborate iteration of interactive born-digital initiatives is the well-established Bodies and Structures: Deep Mapping Modern East Asian History project (Bodies and Structures 2.0., 2021), which combines spatial historical considerations with pedagogical purposes. Hosted on a University of Southern California-based Scalar [Footnote 9] website, this project by David Ambaras and Kate McDonald offers an immersive, non-linear reading experience with the goal of “decentering cartographic space” in order to delineate a transnational mosaic of individual spatial experiences and spatial realities, mainly in the context of imperial Japan and other territories relevant to it (Ambaras & Fletcher, 2019a). To some extent, this project resembles the openly accessible born-digital volume The Chinese Deathscape: Land Reform in Modern China, edited by Thomas Mullaney, which also builds upon the concept of space by mapping the shifting geographical distribution and characteristics of burial sites in China (Mullaney, 2019). The interactive map function, which accompanies the textual narrative, gives the reader more flexibility to explore the data presented in the visualizations than would likely be possible in a print format. However, from a DH perspective, Bodies and Structures goes a step further by using the digital not only as a means for more fluid, rather than prescriptive, data sharing, but also to reconsider the very meaning of reading and learning. As the authors explain in a three-part article series, this translates to an experience with four “entry points”, namely “list of modules, visualized tag index, grid visualization of the entire site, and a geotagged map” (Ambaras & Fletcher, 2019a).
The myriad trajectories create a non-linear reading experience which also harmonizes with the expressly multilayered approach of the platform and even encourages the user to ponder some fundamental questions, such as the most suitable term to describe the genre of the work. One could additionally point to the mechanisms of the “entry points”, because the extent of freedom they provide, particularly in the visualizations, can also make them more challenging and less intuitive to use. The nodes in the visualizations often refer to relatively broad and generic concepts. While these can help the reader find topics of particular interest (thus essentially serving as a substitute for the index in a traditional book) and trace potential connections between these themes, the terms appearing here may not be concrete or informative enough to guide the reader through the labyrinthine structure of the work. For example, clicking on the node with the label “gender” in one module can eventually transport the user to a different part of the “book”, but it is not always immediately clear where exactly the newly displayed section belongs. This can result in a somewhat puzzling experience which hinders the ability of the reader to see the content more holistically; advance guidance may therefore be necessary when the platform is used in a classroom setting. However, having a “bird’s eye view” may not even be a goal, because the mosaic-like structure of the project may indirectly convey the message of incompleteness and the impossibility of knowing everything. Thus, in a sense, Bodies and Structures can be used to show the gradual progression of scholarly work, the variability of topical connections, as well as the inherent subjectivity of juxtapositions and interpretations.
Coupled with the analytical sections, and complemented by digitized primary sources and detailed metadata, this “research environment”, as the authors name it (Ambaras & Fletcher, 2019b), constitutes both a useful introduction for students to how research projects come into existence and a thought-provoking resource for a broader audience to reconsider the meaning of scholarly work in the humanities, digital and otherwise.

3 Situating the role of numbers through digital Japanese studies

My approach to analyzing the characteristics of Bodies and Structures revealed a number of aspects to consider beyond the relatively narrow boundaries of digital Japanese studies, which can apply to scholars in other areas as well. However, while I treated the Bodies and Structures project as an example of how the digital can be utilized to re-read Japanese and East Asian history, some novel pieces in the existing scholarship, such as Hoyt Long’s recently published book, The Values in Numbers: Reading Japanese Literature in a Global Information Age, represent at least as much a case study of the opposite. After multiple publications on the integration of digital methods into the study of Japanese literature in a global context and on comparative approaches to the contributions of close and distant reading to the examination of English haiku poems, here Long wrestles with the concept and significance of numbers in modern literature. The title suggests a focus on studying literature in the context of digital technology, but this multilayered book is simultaneously concerned with the question of what non-Western literary studies can tell us about the role of numbers in humanistic inquiry. In so doing, the book places the case of modern Japanese literature in the spotlight, thereby also strengthening the representation of non-Western voices in DH.

As with the Bodies and Structures project, determining the genre of The Values in Numbers is not without complexity. The author himself refers to it as a “primer”, a label which, to a large extent, works effectively, since each chapter revolves around one or more terms the author considers key in DH methodologies. In order to illustrate his points, Long uses diverse case studies from Japanese literary studies with implications for a broader audience. However, rather than offering a didactic introductory textbook, Long uses discussions of the notions of sampling, statistical modeling, inverse progression, word embeddings, repetition and redundancy, “algorithmic competence”, and scale, among others, to explore broader methodological and theoretical questions: the problem of (in)completeness in the context of source collections, the challenges and realities of building digital archives, and the question of why quantitative methods in a humanistic context came to be considered problematic in the first place. In doing so, he (in)directly makes a case for the values in numbers and for DH methods as a whole, while remaining consistently critical of both “close” and “distant” reading.

More specifically, the book seeks “an analytical space in which the conventions of literary study can be renegotiated in tandem with the statistical facts and models applied to texts in our current global information age” (Long, 2021: 6). To this end, Long explores how well-known Japanese authors, such as Natsume Sōseki, used numbers to shape existing perceptions about literature and to reconsider the question of value while aiming to handle Western literary influences in a Japanese milieu.

However, confining this rich and skillful analysis to the relatively narrow category of primers would not do justice to the broader relevance and implications of this monograph or to its potential future role in digital and literary studies (Japanese or otherwise). This can be seen, for example, in the breadth of the content, which even extends to a meditation on the incompleteness, and thus the inherently biased and “false sense of the archive” (69). The concept of the “imagined totality” (71) of archives can also resonate with those less familiar with Japanese literature per se, since it echoes the claims of the existing scholarship (see for example Posner, 2016) about the importance of being mindful of the logic behind the composition of archives (or statistical datasets), which can influence and skew the conclusions one draws from the sources included in these human-made collections.

From a literary studies perspective, the conclusions may not be entirely new; Long also acknowledges that part of his contribution, for example on racial discourse, serves more as a starting point for further discussion. Nevertheless, the book’s goal of not only using quantitative methods to interrogate questions related to Japanese literature, but also building upon its case studies to talk about the “values in numbers”, elevates the significance of DH as a broader “phenomenon” which may be difficult to pin down in the context of disciplinary categories.

This book is not a “primer” in the sense that it does not show the reader how to write code or how exactly the visualizations embedded in it have been created. On the other hand, it is a primer of another kind, one which can arguably serve its purpose more effectively for those with at least some degree of prior knowledge about Japanese literature (and language, because some figures only include terms in Japanese). For them, the book can offer a thoughtful introduction to some recurring and pressing questions of literary studies (digital or otherwise) by showing what the digital can contribute to these discussions. That said, while the case studies are expressly Japan-specific, the book weaves Western analogies into the flow of the analysis, which can help expand its relevance by raising awareness of the peculiar challenges that “doing DH” in the context of non-Latin scripts entails, thereby contributing to the study of the global history of DH. Compared to born-digital publications, this book creates the impression of a hybrid experience, since it includes a link to the code and dataset used by the author for added transparency. All this makes for a print-based inquiry which can stand on its own but also offers the reader the opportunity to continue the journey online. This dual (print and online) platform, or “analytical space,” embodies how “traditional” and statistical methods can co-exist in practice and critically engage with each other.

As the projects presented above show, one of the current key strengths of digital Japanese studies lies in its versatility, which extends from using technology for innovative tool development tailored to the characteristics of the language, to the building of abundant and open databases and visual collections, to critical engagements with the digital for pedagogical purposes and to refine our understanding of certain segments of Japanese studies and the nature of scholarly work. Meanwhile, further projects that integrate the digital into their inquiry are on the rise, for example as pedagogical resources in the form of the Meiji at 150 initiative [Footnote 10] or the new, primarily game-based JapanLab [Footnote 11], which could constitute the subject of future comparative discussions. But the field already exhibits a promising and visible repertoire of meaningful projects, which strengthens the representation of non-Western voices in DH and shows what it means to “go digital” beyond the Anglophone realm.