Work Completed
This report covers the work completed during the community bonding period and the four weeks that followed.
Francis Tyers assigned me the following work plan: http://wiki.apertium.org/wiki/Hindi_and_English/Work_plan
I started work during the community bonding period, as I had already learnt to use Apertium for the coding challenge. Before the community bonding period, the coverage for the monodix and bidix was 56% and 77% respectively, but those numbers were not accurate because they were obtained from a small corpus.
The community bonding period gave me a chance to learn more about Apertium. I went through the Apertium wiki documentation and started to learn about constraint grammar, writing transfer rules, and the lexical analyzer. I experimented with these and completed the coding challenge story. I started committing my changes to GitHub during this period, and I also added some paradigms for verbs and pronouns. I created a tagset for the Hindi-English language pair (http://wiki.apertium.org/wiki/Hindi_and_English/TagSets). My basic goal for the community bonding period was to get to know my mentors and learn about the Apertium project, and I think I achieved it.
I was assigned goals for each week. For the first week, the goal was to add postpositions to the dictionaries and write testvocs for them, which I completed (http://wiki.apertium.org/wiki/Hindi_and_English/Pending_tests). As I was not able to obtain a suitable corpus during this period, I ran the coverage tests on the corpus I had, which gave a coverage of about 59%. I also wrote some translation rules for verbs during this period.
My city was flooded during this period, so I was not able to commit regularly during this week and the following week.
Francis Tyers assigned me the translation of an article: http://www.bbc.co.uk/hindi/science/2013/05/130527_jupiter_moon_alien_ap.shtml. I started working on it, adding words and some translation rules. I also started adding frequently occurring unknown words.
I was to add testvocs for conjunctions during this period but could not complete this goal because of the floods mentioned above.
This week I had to make up for the time lost in the previous two weeks. I first finished adding conjunctions to the dictionary. I also completed the article translation and added numerals, which were the goals for this week. I obtained a wiki corpus and processed it to get accurate coverage results. The coverage came out to be 56%, well below the desired coverage of 64%. I started adding frequently occurring unknown words and brought the coverage up to 60% by the end of the week.
http://wiki.apertium.org/wiki/Hindi_and_English/Results
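The coverage figures in this report are the share of corpus tokens for which the analyser returns at least one analysis. As a rough illustration only (not the script actually used for these results), the sketch below counts known tokens in analyser output in the Apertium stream format, where each lexical unit looks like `^surface/analysis$` and an unknown word is marked with a leading `*`. The function name `naive_coverage` and the tags in the sample string are assumptions made for this example, and escaped characters inside lexical units are not handled.

```python
import re

# One lexical unit in the Apertium stream format: ^surface/analysis1/analysis2$
LEXICAL_UNIT = re.compile(r"\^([^/$^]*)/([^$]*)\$")


def naive_coverage(analysed_text):
    """Return (known, total, coverage) for analyser output in stream format."""
    known = 0
    total = 0
    for match in LEXICAL_UNIT.finditer(analysed_text):
        total += 1
        first_analysis = match.group(2).split("/")[0]
        # Unknown words are emitted as ^word/*word$
        if not first_analysis.startswith("*"):
            known += 1
    coverage = known / total if total else 0.0
    return known, total, coverage


# Illustrative sample; the tags here are placeholders, not the pair's real tagset.
sample = "^घर/घर<n><m><sg>$ ^में/में<post>$ ^xyz/*xyz$ ^है/है<vbser><pres>$"
known, total, coverage = naive_coverage(sample)
print(f"{known}/{total} tokens known, coverage = {coverage:.1%}")
```

On the sample above, three of the four tokens have an analysis, so the reported coverage is 75.0%.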
I started exploring ways to automate the addition of words to the dictionary. I obtained some online bilingual dictionaries and wrote Python scripts to automate adding words from them. I added about 3500 words to the bilingual dictionary, mostly nouns and adjectives, and increased the coverage to 68%, which is now almost in sync with the work plan. I also added determiners to the dictionary. This week was a high point for me, as I was able to regain the time I had lost in the previous weeks.
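The scripts themselves are not reproduced in this report. As a minimal sketch of the idea, assuming the word pairs are available as tab-separated lines of the form `hindi<TAB>english<TAB>pos` (a hypothetical input format), the following generates bilingual dictionary entries in the usual `<e><p><l>…</l><r>…</r></p></e>` layout. The helper names `bidix_entry` and `entries_from_tsv` are invented for this example, and the part-of-speech value is assumed to be a plain tag name such as `n` or `adj`.

```python
from xml.sax.saxutils import escape


def bidix_entry(source, target, pos):
    """Build one bidix entry with the same part-of-speech tag on both sides."""
    # Spaces inside a lemma are written as <b/> in Apertium dictionaries.
    left = escape(source).replace(" ", "<b/>")
    right = escape(target).replace(" ", "<b/>")
    tag = f'<s n="{pos}"/>'
    return f"<e><p><l>{left}{tag}</l><r>{right}{tag}</r></p></e>"


def entries_from_tsv(lines):
    """Turn 'source<TAB>target<TAB>pos' lines into unique bidix entries."""
    seen = set()
    entries = []
    for line in lines:
        parts = [part.strip() for part in line.rstrip("\n").split("\t")]
        # Skip malformed or incomplete rows instead of guessing.
        if len(parts) != 3 or not all(parts):
            continue
        key = tuple(parts)
        if key in seen:
            continue
        seen.add(key)
        entries.append(bidix_entry(*parts))
    return entries


# Hypothetical input rows, including one duplicate and one malformed line.
rows = ["घर\thouse\tn", "सुंदर\tbeautiful\tadj", "घर\thouse\tn", "broken line"]
for entry in entries_from_tsv(rows):
    print(entry)
```

Entries generated this way still need manual review, since an online dictionary may list senses or parts of speech that do not match the paradigms in the monolingual dictionaries.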
My milestone for this week was to reach the target of 75% dictionary coverage, and I achieved it. I added about 5000 nouns and adjectives and wrote testvocs for them. I also started on disambiguation and on adding rules.