Open access platform to save the Odia Indian language

In February 2014, the Government of India declared Odia, a South Asian language, the sixth classical language of India. Odia is one of the 22 scheduled languages of India and has a literary heritage of more than 5,000 years. Documents cover more than 3,500 of those years; the rest is undocumented oral history. Native Odia speakers became hopeful that many language-related projects would be implemented to carry this long literary heritage forward, and that the language would be used and spoken globally, not just in literature but in computer and mobile games, interactive computer applications, and other digital media, reaching the masses as a communicative language.

So far, few federal initiatives have been put in place, and not a single policy-level change has been made to implement a standard as simple as Unicode for easy access to information. There are also very few mobile apps that offer concise, easy-to-digest content. Overall, not much online content is available in a standard format that is easy to search, access, and reproduce.

Wikisource is here to change that and is working to open up a whole new world of online resources for readers.

More than 40 million native Odia speakers live in the Indian state of Odisha, in its neighboring states, and in the diaspora in the rest of the world, primarily in countries like the US, UK, UAE, and many South and East Asian countries. Yet relatively little content in the Odia language has been made available on the Internet. The largest collection is Odia Wikipedia, with 8,441 articles created by October 2014. A bigger problem is that although a few websites have Unicode content, government portals do not publish their content in Unicode, so it is neither searchable nor reusable. A non-profit, Srujanika, with support from two other institutions, has digitized around 740 books under the project Open Access to Oriya Books (OAOB), most of which were published between 1850 and 1950. This remains the largest digital archive for the Odia language so far, yet all of the books are scanned PDFs, which restricts the searchability of the content.

Odia Wikisource is a project that aims to digitize rare books that are out of copyright. The project also allows authors and publishers to donate their copyrighted work by re-licensing it under CC0 or CC BY-SA licenses. The goal is to provide access to large volumes of books and manuscripts and to create more Open Educational Resources (OERs). The single biggest advantage of the Wikisource project at large is that it makes the text of books available in the Unicode standard, which makes it searchable on the web and allows readers to copy it and use it elsewhere. Most conventional archival systems lack this important feature.
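To illustrate why Unicode text matters for search, here is a minimal Python sketch. It assumes only that Odia characters occupy the Unicode block U+0B00 to U+0B7F; the function name `odia_ratio` and the sample strings are made up for illustration and are not part of any Wikisource tool.

```python
# Odia (Oriya) characters occupy the Unicode block U+0B00..U+0B7F.
ODIA_BLOCK = range(0x0B00, 0x0B80)


def odia_ratio(text):
    """Return the share of non-space characters that fall in the Odia block.

    Hypothetical helper: a value near 1.0 suggests real Unicode Odia text,
    while a value near 0.0 suggests Latin text or legacy-font output.
    """
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        return 0.0
    return sum(ord(ch) in ODIA_BLOCK for ch in chars) / len(chars)


# The word "Odia" written in Odia script, spelled out as code points.
sample = "\u0B13\u0B21\u0B3C\u0B3F\u0B06"

print(odia_ratio(sample))      # every character is in the Odia block
print("\u0B06" in sample)      # plain substring search works on Unicode text
```

A scanned PDF offers nothing comparable: the page is an image, so there are no characters to match against.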

Wikisource is run by volunteers and communities who often retype the books or prepare them using Optical Character Recognition (OCR), a technique that converts scanned images of books into text. Participate and contribute to Odia Wikisource by visiting or.wikisource.org. The project is open to all who want to help!

As a Wikimedia project, Odia Wikisource went through a long and thorough approval process of about 1 year and 9 months as an active incubator project, first by the Language Committee and then by the Wikimedia Foundation's Board. During this incubation phase, the project digitized three books completely and one partially, thanks to individual contributors. An educational institution, Kalinga Institute of Social Sciences (KISS), in collaboration with the Wikimedia-funded Centre for Internet and Society's Access To Knowledge (CIS-A2K), is in the process of digitizing 9 books by the author Dr. Jagannath Mohanty that were re-licensed under CC BY-SA 3.0 earlier this year.

Four new Wikisource contributors joined the project in response to a tweet and a Facebook post by the author to digitize The Odia Bhagabata, classic literature compiled in the 14th century. "Content that has already been typed in fonts with various non-Unicode encodings can now be converted by (this), as was done for The Odia Bhagabata, which was typed and available on the community-hosted website Odia.org. New contributors did not face the problem of retyping," says Manoj Sahukar, who along with the author designed a converter that reads the text and transforms it into Unicode for The Odia Bhagabata.
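The article does not describe how that converter works, so the following Python sketch only illustrates the general idea behind legacy-font conversion. The mapping table is hypothetical and does not correspond to any real Odia font or to the converter used for The Odia Bhagabata. It shows two typical steps: replacing each legacy keystroke with a Unicode code point, and moving a vowel sign that legacy fonts type before the consonant to its position after the consonant, where Unicode expects it.

```python
# HYPOTHETICAL legacy-font table: not the mapping of any real Odia font.
LEGACY_TO_UNICODE = {
    "K": "\u0B15",  # ODIA LETTER KA
    "L": "\u0B32",  # ODIA LETTER LA
    "a": "\u0B3E",  # ODIA VOWEL SIGN AA
    "e": "\u0B47",  # ODIA VOWEL SIGN E (drawn to the left of its consonant)
}

# Vowel signs that legacy fonts type BEFORE the consonant,
# but that Unicode stores AFTER it.
PREFIX_SIGNS = {"\u0B47"}


def legacy_to_unicode(text):
    """Convert legacy-encoded text to Unicode using the table above.

    Simplification: a prefix sign is placed after the single next
    character, so consonant clusters are not handled.
    """
    out = []
    pending = None  # prefix sign waiting for its consonant
    for ch in text:
        mapped = LEGACY_TO_UNICODE.get(ch, ch)  # unknown characters pass through
        if mapped in PREFIX_SIGNS:
            pending = mapped
            continue
        out.append(mapped)
        if pending is not None:
            out.append(pending)
            pending = None
    if pending is not None:  # dangling sign at the end of the input
        out.append(pending)
    return "".join(out)


print(legacy_to_unicode("eK"))   # sign E now follows the letter KA
print(legacy_to_unicode("KaL"))  # no reordering needed
```

A real converter would need one such table per legacy font, plus handling for conjuncts and other multi-character sequences.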

Questions for early contributors to Odia Wikisource

Subhashish Panigrahi (SP): You have been with Odia Wikisource since its inception. How do you think it will help other Odias?
Mrutyunjaya Kar, a long-time Wikimedian who proofreads the books on Odia Wikisource: Odias around the globe will have access to a vast number of old as well as new books and manuscripts online at their fingertips. Knowing more about the long and glorious history of Odisha will become easier.

SP: Do you think any particular section of society is going to benefit from this?
Nasim Ali, the oldest active Odia Wikimedian and Wikisource writer: Books contain the gist of all human knowledge. The ease of access to books and their spread are markers of the intellectual status of a society. In this e-age, Wikisource can help not just by providing easy access to a plethora of books under free licenses but also by aiding the spread of basic education in developing economies. Wikisource together with cheaper internet could catalyze a Renaissance of the 21st century.

SP: How does it feel to be one of the few contributors digitizing The Odia Bhagabata? How do you want to get involved in the future?
Nihar Kumar Dalai, a Wikisource writer: This is a proud opportunity for me to be a part of the digitization of such old literature. At times, I wonder whether I could get involved with this full time!

SP: You have digitized almost two books, are the highest contributor to the project, and are also one of the main reasons Odia Wikisource was approved. What are your plans to grow it and take it to the masses?
Pankajmala Sarangi, a Wikisource writer: I would be happy to contribute by typing more books in Odia so that they can be stored and made available to all. We can take this to the masses through social, print, and audiovisual media and by organizing meetings and discussions.