Languages Supported by Rossum

Rossum lets you process documents in many different languages. Here’s a complete list of the supported languages and the features available for each one.

Legacy Data Capture

Before Aurora, Rossum could recognize the following languages: Czech, Danish, Dutch, English, Finnish, French, German, Hungarian, Italian, Lithuanian, Norwegian, Polish, Portuguese, Brazilian Portuguese, Romanian, Slovak, Slovenian, Spanish, Catalan, and Swedish. We also provided support for Japanese and Chinese.

With the release of Aurora 1.5, the list of supported languages expanded significantly. We now offer full or partial support for 276 languages. This article provides all the details.

Rossum Aurora – Fully Supported Languages

Document Language Classification is available for every language listed below, meaning Rossum automatically predicts the language of each document. Instant Learning is also supported, which means our AI can learn from annotations in these languages.

Most of these languages support pre-trained fields, meaning the AI can provide predictions for certain fields right away, without needing extra training or previous annotations. You can find the full list of fields here.

For some languages, we support full or partial recognition of handwritten text. Partial handwriting recognition means that some diacritics may not be recognized.

✅ – Available

❌ – Not available

Language

Instant Learning

Document Language Classification

Pre-trained fields

Handwritten text recognition*

Albanian

PARTIAL

Arabic

Azerbaijani

Basque

PARTIAL

Belarusian

Bosnian

Bulgarian

Catalan

PARTIAL

Chinese Simplified

Chinese Traditional

Croatian

PARTIAL

Czech

PARTIAL

Danish

PARTIAL

Dutch

PARTIAL

English

Estonian

PARTIAL

Finnish

PARTIAL

French

German

Greek

Hebrew

Hindi

Hungarian

PARTIAL

Icelandic

PARTIAL

Indonesian

PARTIAL

Italian

Japanese

Kazakh

Korean

Latvian

PARTIAL

Lithuanian

PARTIAL

Malay

Norwegian

PARTIAL

Persian

Polish

PARTIAL

Portuguese

PARTIAL

Romanian

PARTIAL

Russian

Serbian

Slovak

PARTIAL

Slovenian

PARTIAL

Spanish

PARTIAL

Swedish

PARTIAL

Thai

Turkish

PARTIAL

Ukrainian

Vietnamese

PARTIAL

*To enable recognition of handwritten text, the queue locale has to be set to auto (Document regional format -> Detect automatically).
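The locale requirement in the note above can be checked in code before relying on handwriting recognition. The following is only a minimal sketch: it assumes a queue's settings are available as a plain dictionary with a "locale" key, and it takes the value "auto" from the wording of the note. The helper name and the dictionary shape are hypothetical and are not taken from Rossum's API.

```python
def handwriting_recognition_possible(queue: dict) -> bool:
    """Return True if the queue locale is set to automatic detection.

    The "locale" key and the "auto" value follow the wording of the note
    above; the dictionary shape is an assumption made for illustration.
    """
    return queue.get("locale") == "auto"


# Hypothetical queue settings used only to demonstrate the check.
auto_queue = {"name": "Invoices", "locale": "auto"}
fixed_queue = {"name": "Receipts", "locale": "en_GB"}

print(handwriting_recognition_possible(auto_queue))   # True
print(handwriting_recognition_possible(fixed_queue))  # False
```

A queue with a fixed regional format, or with no locale recorded at all, fails the check in this sketch.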

Rossum Aurora – Instant Learning Support

For the languages listed below, we support Instant Learning, meaning our AI will learn from annotations made in these languages. Currently, no additional features are available.

Afrikaans

Abaza

Abkhazian

Achinese

Acoli

Adangme

Adyghe

Akan

Algonquin

Angika

Asturian

Asu

Avaric

Awadhi

Aymara

Bafia

Bagheli

Bambara

Bashkir

Bemba

Bena

Bhojpuri

Bikol

Bini

Bislama

Bodo

Brajbha

Breton

Bundeli

Buryat

Cebuano

Chamling

Chamorro

Chechen

Chhattisgarhi

Chiga

Choctaw

Chukot

Chuvash

Cornish

Corsican

Creek

Crimean Tatar

Crow

Dargwa

Dari

Dhimal

Dogri

Duala

Dungan

Efik

Erzya

Faroese

Fijian

Filipino

Fon

Friulian

Ga

Gagauz

Galician

Ganda

Gayo

Gilbertese

Greenlandic

Guarani

Gurung

Gusii

Haitian Creole

Halbi

Hani

Haryanvi

Hawaiian

Herero

Hiligaynon

Hmong Daw

Ho

Iban

Igbo

Iloko

Inari Sami

Ingush

Interlingua

Irish

Jaunsari

Jola-Fonyi

K’iche’

Kabardian

Kabuverdianu

Kachin

Kalenjin

Kalmyk

Kangri

Kanuri

Kara-Kalpak

Karachay-Balkar

Kashubian

Khakas

Khaling

Khasi

Kikuyu

Kildin Sami

Kinyarwanda

Komi

Kongo

Korku

Koryak

Kosraean

Kpelle

Kuanyama

Kumyk

Kurdish

Kurukh

Kyrgyz

Lak

Lakota

Latin

Lezghian

Lingala

Lower Sorbian

Lozi

Lule Sami

Luo

Luxembourgish

Luyia

Macedonian

Machame

Madurese

Mahasu Pahari

Makhuwa-Meetto

Makonde

Malagasy

Maltese

Malto

Mandinka

Manx

Maori

Mapudungun

Marathi

Mari

Masai

Mende

Meru

Meta’

Minangkabau

Mohawk

Mongondow

Montenegrin

Morisyen

Mundang

Nahuatl

Navajo

Ndonga

Neapolitan

Nepali

Ngomba

Niuean

Nogay

North Ndebele

Northern Sami

Nyanja

Nyankole

Nzima

Occitan

Ossetic

Pampanga

Pangasinan

Papiamento

Pashto

Pedi

Quechua

Ripuarian

Romansh

Rundi

Rwa

Sadri

Sakha

Samburu

Samoan

Sango

Sangu

Sanskrit

Scots

Scottish Gaelic

Sena

Shambala

Shona

Siksika

Sirmauri

Skolt Sami

Soga

Somali

Songhai

South Ndebele

Southern Altai

Southern Sami

Southern Sotho

Swahili

Swati

Tabassaran

Tahitian

Taita

Tajik

Tatar

Teso

Tetum

Thangmi

Tok Pisin

Tongan

Tsonga

Tswana

Turkmen

Tuvan

Udmurt

Uighur

Upper Sorbian

Urdu

Uzbek

Volapük

Vunjo

Walser

Welsh

Western Frisian

Wolof

Xhosa

Yucatec Maya

Zapotec

Zarma

Zhuang

Zulu
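When routing documents, it can help to look up which of the two support tiers described in this article a language falls into. The sketch below is illustrative only: the two sets contain a small sample of the languages listed above rather than the full lists, and the names SupportTier and support_tier are hypothetical helpers, not part of any Rossum library.

```python
from enum import Enum


class SupportTier(Enum):
    FULL = "full"                                    # listed under Fully Supported Languages
    INSTANT_LEARNING_ONLY = "instant_learning_only"  # listed under Instant Learning Support
    UNLISTED = "unlisted"                            # not in either sample set below


# Sample subsets only; extend them with the complete lists from this article.
FULLY_SUPPORTED = {"Czech", "English", "German", "Japanese", "Vietnamese"}
INSTANT_LEARNING_ONLY = {"Afrikaans", "Galician", "Welsh", "Zulu"}

# Case-insensitive lookup tables built once from the sets above.
_FULL = {name.casefold() for name in FULLY_SUPPORTED}
_INSTANT = {name.casefold() for name in INSTANT_LEARNING_ONLY}


def support_tier(language: str) -> SupportTier:
    """Map a language name to its support tier, ignoring case and outer spaces."""
    key = language.strip().casefold()
    if key in _FULL:
        return SupportTier.FULL
    if key in _INSTANT:
        return SupportTier.INSTANT_LEARNING_ONLY
    return SupportTier.UNLISTED


print(support_tier("Czech").value)    # full
print(support_tier(" welsh ").value)  # instant_learning_only
print(support_tier("Klingon").value)  # unlisted
```

Keeping the tiers in an enum rather than as bare strings makes it harder to mistype a tier name when branching on the result.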