Vol. 13 No. 1 (2022): The Bibliographic Control in the Digital Ecosystem

DREAM. A project about non-Latin script data

Antonella Fallerini
Sapienza Università di Roma, Biblioteca Dipartimento ISO
Agnese Galeffi
Sapienza Università di Roma, Sistema Bibliotecario
Andrea Ribichini
Sapienza Università di Roma, Dipartimento DIAG (Ingegneria informatica, automatica e gestionale)
Mario Santanché
Sapienza Università di Roma, Sistema Bibliotecario
Mattia Vallania
Sapienza Università di Roma, Sistema Bibliotecario

Published 2022-01-13


  • Romanization
  • MARC records
  • Cataloguing
  • Transliteration

How to Cite

Fallerini, Antonella, Agnese Galeffi, Andrea Ribichini, Mario Santanché, and Mattia Vallania. 2022. “DREAM. A Project about Non-Latin Script Data.” JLIS.it 13 (1): 347-55. https://doi.org/10.4403/jlis.it-12727.


The DREAM project is a large research project funded by Sapienza University of Rome that deals with bibliographic data in non-Latin scripts. Because the National Bibliographic Service catalogue (SBN) does not yet manage data in non-Latin scripts, DREAM aims to offer researchers a catalogue searchable in original scripts (such as Arabic, Chinese, and Cyrillic). One of the project's most notable features is an ILS-independent working environment in which the cataloguer can find and retrieve original-script data from authoritative catalogues, starting from the existing romanized records. From a technical standpoint, the ever-increasing Unicode support offered by modern operating systems, DBMSs, and indexing engines makes rapid development of the relevant software tools a concrete possibility. This in turn shifts the scientific focus towards the (often subtle) record linkage operations between different data sources. The authors hope that the DREAM project will attract other Italian libraries that perceive the same needs. Furthermore, as soon as SBN supports the management of data in non-Latin scripts, the DREAM project partners will be able to contribute their data.
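The record linkage step mentioned in the abstract can be illustrated with a minimal sketch. It is not the DREAM implementation: it assumes that a local romanized title is compared with the romanized titles of candidate records from an authoritative catalogue, using Unicode normalization and a normalized edit distance. The record fields (`romanized`, `original`), the function names, the sample data, and the 0.85 threshold are all hypothetical and chosen only for illustration.

```python
import unicodedata


def normalize(text: str) -> str:
    """Lowercase, strip diacritics and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop combining marks so that romanization variants with and without
    # diacritics compare as equal.
    kept = [ch for ch in decomposed if not unicodedata.combining(ch)]
    cleaned = "".join(ch if ch.isalnum() else " " for ch in kept)
    return " ".join(cleaned.lower().split())


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two strings after normalization."""
    a, b = normalize(a), normalize(b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def best_match(local_title, candidates, threshold=0.85):
    """Return the candidate most similar to local_title, or None if no
    candidate reaches the (assumed) threshold."""
    best, best_score = None, threshold
    for cand in candidates:
        score = similarity(local_title, cand["romanized"])
        if score >= best_score:
            best, best_score = cand, score
    return best


# Hypothetical candidate records from an external catalogue, each pairing
# a romanized title with its original-script form.
CANDIDATES = [
    {"romanized": "Prestuplenie i nakazanie", "original": "Преступление и наказание"},
    {"romanized": "Voĭna i mir", "original": "Война и мир"},
]

match = best_match("Voina i mir", CANDIDATES)
if match is not None:
    print(match["original"])
```

In practice a single title comparison is rarely sufficient; a real workflow would presumably combine several fields (author, date, identifiers) before accepting a link, which is where the subtlety noted in the abstract arises.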



