Sunday, January 13, 2013

More Wikipedia news. Automatic index!

In a week of mixed news, I received two pieces of good news: +Kartik Kumar Perisetla and +Anish Mangal prepared a Hindi version of the Wikipedia activity, and +Ignacio Rodríguez worked on a Portuguese version. Kartik wrote a nice post about this too.

I asked Kartik what other languages could be of interest in India, and he replied: "In india, Hindi, Gujrati and Punjabi are most common languages is North and West India; Telugu, Tamil, Malyalam, Kannada, Bengali are common in South and West india". Without a doubt, India is a challenge!

One part of the process of creating an offline Wikipedia activity is tedious right now: creating the list of articles used to start the selection, and preparing the index.html page used as the home page, with links to those articles. We have a good selection of pages in the English version, and it is usually a good idea to translate this selection and later add or remove a few articles. So I decided to write a script that uses the interwiki links in the English articles to create a list of articles and an index page to use as a base.
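To give an idea of how the interwiki lookup can work, here is a minimal sketch in Python. It is not the actual script: it assumes the English titles are sent to the MediaWiki API (`action=query` with `prop=langlinks`), and the function names and the User-Agent string are made up for this example.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"  # English Wikipedia API endpoint


def build_langlinks_url(titles, lang):
    """Build a query URL asking for the `lang` interwiki link of each title."""
    params = {
        "action": "query",
        "prop": "langlinks",
        "lllang": lang,        # only return links to this language
        "lllimit": "max",
        "titles": "|".join(titles),
        "format": "json",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def parse_langlinks(response):
    """Map each English title to its translated title, or None if missing."""
    translations = {}
    for page in response.get("query", {}).get("pages", {}).values():
        links = page.get("langlinks", [])
        # In the default JSON format the translated title is under the "*" key.
        translations[page["title"]] = links[0]["*"] if links else None
    return translations


def fetch_translations(titles, lang):
    """Query the API (needs network access) and return the title mapping."""
    request = urllib.request.Request(
        build_langlinks_url(titles, lang),
        # Hypothetical User-Agent, chosen only for this sketch.
        headers={"User-Agent": "offline-wikipedia-index-sketch/0.1"},
    )
    with urllib.request.urlopen(request) as reply:
        return parse_langlinks(json.load(reply))
```

Calling `fetch_translations(["Water", "Sun"], "it")` would return a dictionary from English titles to Italian titles, with `None` where no interwiki link exists. The API accepts only a limited number of titles per request, so a long article list has to be sent in batches.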

My first experiments can be seen here:

A Farsi version:  

A Guaraní version:

An Italian version:

The script adds a class to the links without a translation and shows them with a red background.
Of course, this does not do the whole job: in a few cases there is garbage, and somebody needs to check the words, find the remaining translations, add or remove articles depending on the target audience, etc. But I think it is a nice improvement.
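The marking of untranslated links can be sketched like this. Again, this is only an illustration and not the real script: the `untranslated` class name, the `/wiki/...` link format and the `render_index` function are assumptions of this example.

```python
import html
import urllib.parse


def render_index(translations, css_class="untranslated"):
    """Render an HTML list of article links from an {english: translated} dict.

    Entries whose translation is None keep the English title and get
    `css_class`, which the inline style paints with a red background.
    """
    items = []
    for english, translated in translations.items():
        if translated is None:
            title, attrs = english, ' class="{}"'.format(css_class)
        else:
            title, attrs = translated, ""
        # Assumed link format: /wiki/Title_with_underscores
        href = "/wiki/" + urllib.parse.quote(title.replace(" ", "_"))
        items.append('<li><a href="{}"{}>{}</a></li>'.format(
            href, attrs, html.escape(title)))
    style = "<style>a.{} {{ background: red; }}</style>".format(css_class)
    return style + "\n<ul>\n" + "\n".join(items) + "\n</ul>"
```

With a dictionary such as `{"Water": "Acqua", "Solar System": None}`, the first link points to the translated article and the second one is highlighted in red, so the person reviewing the page sees at a glance which translations are still missing.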
