The Early Stage in the Methodology of Automatic Toponym Recognition on Geographic Maps and in Texts

Authors

  • Alexander V. Dmitriev, Peter the Great St. Petersburg Polytechnic University

DOI:

https://doi.org/10.52575/2712-7443-2025-49-1-128-145

Keywords:

historical maps, toponyms, automatic recognition, computer vision, machine learning

Abstract

The article examines the evolution of methodological approaches to automatic toponym recognition in cartographic materials and texts from the late 1980s to the mid-2010s. It analyzes key research works of the 2000s that laid the foundation for modern computer vision and machine learning methods in this field. Special attention is paid to the challenges of toponym recognition on historical maps, including homonymy, temporal ambiguity, and multilingualism. The study investigates the mathematical methods and algorithmic solutions proposed by various research groups and evaluates their effectiveness and limitations in handling complex cartographic materials. The works of Smith and Crane, Gelbukh and Levachkine, Pouderoux, and other researchers who contributed significantly to the methodology are analyzed in detail. Particular emphasis is placed on the three-phase approach to semi-automatic toponym recognition and on the specific challenges encountered when processing historical maps. The research demonstrates methodological continuity in the development of this field and highlights the enduring relevance of classical approaches in the context of modern artificial intelligence technologies, suggesting ways to integrate traditional methods with contemporary neural network architectures.

Author Biography

Alexander V. Dmitriev, Peter the Great St. Petersburg Polytechnic University

Candidate of Philological Sciences, Associate Professor, Doctoral Candidate, Associate Professor of the Humanities Institute

E-mail: avd84@list.ru

References

Vitsentiy A.V., Dikovitskiy V.V., Shishayev M.G. 2020. The Technology of Extraction and Visualization of Spatial Data Obtained by Texts Analysis. Kola Science Centre Publisher, 11(8–11): 115–119 (in Russian). https://doi.org/10.37614/2307-5252.2020.8.11.012

Vitsentiy A.V., Shishayev M.G. 2021. The Geoattributed Entity Extraction Technology for Visual Representation of Objects Spatial Relations Based on Automated Schematic Map Generation. Kola Science Centre Publisher, 12(5): 35–49 (in Russian). https://doi.org/10.37614/2307-5252.2021.5.12.003

Herzen A.A., Herzen O.A., Gordova Yu.Yu., Kostovska S.K., Kostovska St.K., Khropov A.G. 2023. Aspects of Cartography and Toponymy of Religious Buildings in the Historical-Geographical Perspective. InterCarto. InterGIS, 29: 180–203 (in Russian). https://doi.org/10.35595/2414-9179-2023-2-29-180-203

Gordova Yu.Yu., Herzen O.A., Herzen A.A., Kostovska S.K. 2021. Usage of Cartographic Methods in Place-Name Study (History of the Problem and Actual Research). InterCarto. InterGIS, 27(4): 520–536 (in Russian). https://doi.org/10.35595/2414-9179-2021-4-27-520-536

Kolesnikov A.A., Kikin P.M., Niko D., Komissarova E.V. 2020. Natural Language Processing Systems for Data Extraction and Mapping on the Basis of Unstructured Text Blocks. InterCarto. InterGIS, 26(1): 375–384 (in Russian). https://doi.org/10.35595/2414-9179-2020-1-26-375-384

Krassowski A.P. 2024. Big Data as a Research Tool in Toponymy and the History of Land Surveying. InterCarto. InterGIS, 30(1): 321–341 (in Russian). https://doi.org/10.35595/2414-9179-2024-1-30-321-341

Kreydlin L.G. 2006. Program for Extraction of Russian Individualized Name Groups TagLite. In: Computational Linguistics and Intellectual Technologies. Proceedings of the International Conference Dialogue, Moscow, 1–4 June 2016. Moscow, Publ. Rossiyskiy gosudarstvennyy gumanitarnyy universitet: 292–297 (in Russian).

Pospelov E.M. 1971. Toponimika i kartografiya [Toponymy and Cartography]. Moscow, Publ. Mysl, 256 p. (in Russian).

Fletcher L.A., Kasturi R. 1988. A Robust Algorithm for Text String Separation from Mixed Text/Graphics Images. IEEE transactions on pattern analysis and machine intelligence, 10(6): 910–918. https://doi.org/10.1109/34.9112

Gelbukh A., Levachkine S. 2002. Error Detection and Correction in Toponym Recognition in Cartographic Maps. IASTED International Conference Geopro-2002: 1–7.

Gelbukh A., Levachkine S., Han S.Y. 2003. Resolving Ambiguities in Toponym Recognition in Cartographic Maps. In: Graphics Recognition. Recent Advances and Perspectives. GREC 2003. Lecture Notes in Computer Science. Ed. by Lladós J., Kwon Y.B. Springer, Berlin, Heidelberg: 75–86.

Gllavata J., Ewerth R., Freisleben B. 2003. A Robust Algorithm for Text Detection in Images. In: Proceedings of the 3rd International Symposium on Image and Signal Processing and Analysis, Rome, Italy: 611–616.

Leidner J.L. 2008. Toponym resolution in text: Annotation, evaluation and applications of spatial grounding of place names. USA, Florida, Universal-Publishers, 261 p.

Lenc L., Martínek J., Baloun J., Prantl M., Král P. 2022. Historical Map Toponym Extraction for Efficient Information Retrieval. In: Document Analysis Systems. DAS 2022. Lecture Notes in Computer Science. Springer: 171–183. https://doi.org/10.1007/978-3-031-06555-2_12

Levachkine S. 2003. Raster to Vector Conversion of Color Cartographic Maps. In: Graphics Recognition. Recent Advances and Perspectives. GREC 2003. Lecture Notes in Computer Science. Berlin, Heidelberg, Springer: 50–62. https://doi.org/10.1007/978-3-540-25977-0_5

Levachkine S., Velázquez A., Alexandrov V., Kharinov M. 2002. Semantic Analysis and Recognition of Raster-Scanned Color Cartographic Images. In: Graphics Recognition Algorithms and Applications. GREC 2001. Lecture Notes in Computer Science. Berlin, Heidelberg, Springer-Verlag: 178–189.

Milleville K., Verstockt S., Van de Weghe N. 2020. Improving Toponym Recognition Accuracy of Historical Topographic Maps. Automatic Vectorisation of Historical Maps, Proceedings of the International Workshop on Automatic Vectorisation of Historical Maps, Budapest, Hungary, 13: 63–72.

Olson R., Kim J., Chiang Y.Y. 2024. Automatic Search of Multiword Place Names on Historical Maps. Proceedings of the 3rd ACM SIGSPATIAL International Workshop on Searching and Mining Large Collections of Geospatial Data, 9–12. https://doi.org/10.1145/3681769.3698577

Peters M., Neumann M., Iyyer M., Gardner M., Clark Chr., Lee K., Zettlemoyer L. 2018. Deep Contextualized Word Representations. In: Human Language Technologies. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics, New Orleans, Association for Computational Linguistics, Vol. 1: 2227–2237.

Pierrot-Deseilligny M., Men H.L., Stamon G. 1995. Characters String Recognition on Maps, a Method for High Level Reconstruction. Montreal, QC, Canada, Proceedings of ICDAR: 249–252.

Pouderoux J., Gonzato J.-C., Pereira A., Guitton P. 2007. Toponym Recognition in Scanned Color Topographic Maps. Ninth International Conference on Document Analysis and Recognition (ICDAR 2007). IEEE, 1: 531–535.

Schlegel I. 2021. Automated Extraction of Labels from Large-Scale Historical Maps. AGILE: GIScience Series, 2: 12. https://doi.org/10.5194/agile-giss-2-12-2021

Simon R., Pilgerstorfer P., Isaksen L., Barker E. 2014. Towards Semi-Automatic Annotation of Toponyms on Old Maps. e-Perimetron, 9(3): 105–128.

Smith D.A., Crane G.R. 2001. Disambiguating Geographic Names in a Historical Digital Library. In: Research and Advanced Technology for Digital Libraries. ECDL 2001. Lecture Notes in Computer Science, Springer, Berlin: 127–136.

Velázquez A., Levachkine S. 2003. Text/graphics separation and recognition in raster-scanned color cartographic maps. Proceedings: Automated geographic indexing of text documents. Journal of the American Society for Information Science, 45(9): 645–655.

Zhou B., Zou L., Hu Y., Qiang Y., Goldberg D. 2023. TopoBERT: a Plug and Play Toponym Recognition Module Harnessing Fine-Tuned BERT. International Journal of Digital Earth, 16(1): 3045–3064.


Published

2025-03-28

How to Cite

Dmitriev, A. V. (2025). The Early Stage in the Methodology of Automatic Toponym Recognition on Geographic Maps and in Texts. Regional Geosystems, 49(1), 128-145. https://doi.org/10.52575/2712-7443-2025-49-1-128-145

Section

Earth Sciences