hadithmv committed Jan 9, 2024
1 parent ba45d79 commit e1a85d6
Showing 3 changed files with 439 additions and 20 deletions.
2 changes: 1 addition & 1 deletion books/quranBakurube.html

Large diffs are not rendered by default.

372 changes: 372 additions & 0 deletions notes/scripts/for work/radheef/radheef.txt
@@ -0,0 +1,372 @@
get the base apk and unzip it
the data is in assets\flutter_assets\assets\data
unzip words and meanings to get the json files
convert the json to csv online using
https://www.convertcsv.com/json-to-csv.htm
other converters seem to introduce errors
wait until it finishes converting; the page may look like it is freezing
delete extra columns?
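
As an offline alternative to the online converter, the JSON-to-CSV step could be sketched in Python. This assumes each JSON file is a flat list of objects, which the notes do not confirm; the function name `json_records_to_csv` and the sample records are made up for illustration.

```python
import csv
import io
import json


def json_records_to_csv(json_text):
    """Convert a JSON array of flat objects to CSV text (hypothetical helper)."""
    records = json.loads(json_text)
    # Collect every key in first-seen order so no column is lost
    # when some records lack a field.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Made-up sample, not real radheef data.
sample = '[{"word_id": 1, "word_en": "a"}, {"word_id": 2, "transliteration": "b"}]'
csv_text = json_records_to_csv(sample)
print(csv_text)
```

Missing fields are written as empty cells, so the column count stays the same on every row.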

remove the empty cols from each file
then match rows by word_id and merge
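
A minimal sketch of the "remove empty cols" step, assuming the rows have already been loaded as a list of dicts. Here "empty" means blank or missing in every row; a value of 0 is kept, since some columns use 0 as a code. The name `drop_empty_columns` and the sample rows are illustrative only.

```python
def drop_empty_columns(rows):
    """Return copies of rows without columns that are blank in every row."""

    def is_empty(value):
        return value is None or str(value).strip() == ""

    # Gather all column names in first-seen order.
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)

    # Keep a column if at least one row has a non-blank value in it.
    keep = [c for c in columns if any(not is_empty(r.get(c)) for r in rows)]
    return [{c: row.get(c, "") for c in keep} for row in rows]


# Made-up sample rows: regions and tag_groups are blank everywhere.
sample_rows = [
    {"word_id": "1", "regions": "", "tense": "0", "tag_groups": ""},
    {"word_id": "2", "regions": "", "tense": "110", "tag_groups": None},
]
cleaned = drop_empty_columns(sample_rows)
print(cleaned)
```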

don't need the following cols at all:

meaning_id
dictionaries
sub_class (for now)
variant_template (for now)
photo_adv
item_order
word_en_meaning_not_applicable
tag_groups

...

in words json, each word_id is unique, meaning there is only one entry per word
but in meanings json, more than one meaning can apply to the same word, so the same word_id can repeat
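
The one-to-many relationship described above could be checked with a short sketch like this one. It assumes both files are loaded as lists of dicts with a `word_id` key; the sample records and the helper name `meanings_per_word` are invented for illustration.

```python
from collections import Counter


def meanings_per_word(meanings):
    """Count how many meaning rows share each word_id."""
    return Counter(m["word_id"] for m in meanings)


# Made-up sample data.
words = [{"word_id": 1}, {"word_id": 2}]
meanings = [
    {"word_id": 1, "meaning_text": "first sense"},
    {"word_id": 1, "meaning_text": "second sense"},
    {"word_id": 2, "meaning_text": "only sense"},
]

# words: every word_id should appear exactly once.
word_ids = [w["word_id"] for w in words]
words_are_unique = len(word_ids) == len(set(word_ids))

# meanings: list the word_ids that carry more than one meaning.
counts = meanings_per_word(meanings)
multi_meaning_ids = sorted(wid for wid, n in counts.items() if n > 1)
print(words_are_unique, multi_meaning_ids)
```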

!meaning_id

word id
has this entry: ޓެސްޓު މާނައެއް

meaning_text

!dictionaries
only unique values for this:
142,143,144
142
142,143
(don't think these are of any consequence)
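
Lists of unique values like the one above can be produced per column with a small helper. This is only a sketch, assuming rows are dicts; `unique_values` is a hypothetical name and the sample rows reuse the three values noted above.

```python
def unique_values(rows, column):
    """Return the distinct values of one column, in first-seen order."""
    seen = []
    for row in rows:
        value = row.get(column, "")
        if value not in seen:
            seen.append(value)
    return seen


# Sample rows built from the values listed in the notes.
sample_rows = [
    {"dictionaries": "142,143,144"},
    {"dictionaries": "142"},
    {"dictionaries": "142,143"},
    {"dictionaries": "142"},
]
distinct = unique_values(sample_rows, "dictionaries")
print(distinct)
```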


main_class (mai ginthi)
100
103
104
105
102
107
101
106
155

corresponds to:
މަސްދަރު
ނަން
ނަންއިތުރު
ނަންއިތުރުގެނަން
ކަންއިތުރު
އަކުރު
ކަން
އިތުރު
އަދަބީ ބަސް

!sub_class
0 (equals blank)
108 (has only one entry, and it has no word_id, so there is no way yet to know what it equals)
109 (has 5 entries with a word_id, and equals the value "ގިނަހެރި")

tense (zamaan)
0
110
112
114
113
111
115

corresponds to:

ވާކަން
ވާނޭކަން
ނިމިނިމުނުކަން
ނިމުނުކަން
ވޭވޭހުރިކަން
ވެދާނެކަން

literary_class (adabi ginthi)
0
116
120
118
119
117
121

corresponds to:

މަޖާޒު
މުސްކުޅިބަސް
މިސާލުބަސް
ހަރުބަސް
މަޖާޒީ މިސާލު
އޮޅިބަސް

dialect (bahuruva)
126
149
154
131
0
130
129
127
128

corresponds to:
އާންމު
ބޯދާ
ސަރަހައްދީ 154
މަލިކު
-
ހުވަދޫ
ހައްދުންމަތި
އައްޑޫ
ފުވައްމުލަކު
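
The code lists and their "corresponds to" lists appear to pair up by position (here the third label is annotated with 154, the third code). Under that assumption, a lookup table for dialect could be sketched as follows; the names `DIALECT` and `dialect_label` are illustrative, and unknown codes are passed through unchanged.

```python
# Codes and labels in the order listed in the notes; pairing by position
# is an assumption. The third label is annotated "154" in the notes.
DIALECT_CODES = ["126", "149", "154", "131", "0", "130", "129", "127", "128"]
DIALECT_LABELS = [
    "އާންމު",
    "ބޯދާ",
    "ސަރަހައްދީ",
    "މަލިކު",
    "-",
    "ހުވަދޫ",
    "ހައްދުންމަތި",
    "އައްޑޫ",
    "ފުވައްމުލަކު",
]
DIALECT = dict(zip(DIALECT_CODES, DIALECT_LABELS))


def dialect_label(code):
    """Map a dialect code to its label; return the code itself if unknown."""
    return DIALECT.get(str(code), str(code))


print(dialect_label(154), dialect_label(0), dialect_label(999))
```

The same pattern would apply to main_class, diction_level and the other coded columns, provided their two lists have the same length.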

diction_level (dharaja)
122
124
123
0
125

corresponds to:
އާންމު
އެންމެ މާތް
މާތް
0
ހުތުރ / ބާޒާރީ

subject_area (dhaaira)
0
137
135
132
140
134
136
133
138
141
150

corresponds to:
ކަނބުރުވެރިކަން
ނަކަތްތެރިކަން
މަސްވެރިކަން
މަސައްކަތްތެރިކަން
ބޭސްވެރިކަން
ނިޔަމިކަން
ދަނޑުވެރިކަން
ފަންޑިތަވެރިކަން
މިއުޒިކު
އިންފޮމޭޝަން ޓެކްނޮލޮޖީ

specific_usage_atolls (atoll)
(has 92 unique values with commas, i.e. combinations of atolls, and 23 without)
20
1
2
19
16
17
18
12
13
9
6
23
7
8
10
11
5
3
4
21
15
22
14

corresponds to:
ސ
ހއ
ހދ
ޏ
ލ
ގއ
ގދ
މ
ފ
އއ
ބ
ގ
ޅ
ކ
އދ
ވ
ރ
ށ
ނ
ހ
ތ
އ
ދ
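
Because specific_usage_atolls can hold several comma-separated codes in one cell, decoding it needs a split before the lookup. A rough sketch, assuming the 23 codes and 23 labels above pair up by position; `ATOLLS` and `decode_atolls` are hypothetical names.

```python
# Codes and labels in the order listed in the notes (positional pairing assumed).
ATOLL_CODES = [
    "20", "1", "2", "19", "16", "17", "18", "12", "13", "9", "6", "23",
    "7", "8", "10", "11", "5", "3", "4", "21", "15", "22", "14",
]
ATOLL_LABELS = [
    "ސ", "ހއ", "ހދ", "ޏ", "ލ", "ގއ", "ގދ", "މ", "ފ", "އއ", "ބ", "ގ",
    "ޅ", "ކ", "އދ", "ވ", "ރ", "ށ", "ނ", "ހ", "ތ", "އ", "ދ",
]
ATOLLS = dict(zip(ATOLL_CODES, ATOLL_LABELS))


def decode_atolls(cell):
    """Turn a cell such as "20,1,2" into a list of atoll labels."""
    codes = [part.strip() for part in str(cell).split(",") if part.strip()]
    # Unknown codes are kept as-is rather than dropped.
    return [ATOLLS.get(code, code) for code in codes]


decoded = decode_atolls("20, 1,2")
print(decoded)
```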

usage_example
(has 1397 unique values, bring them as is, no need to match)

!variant_template
(just this one entry below, so dont think its needed)
ދޫމެއް ދޫމަކާއި ދޫމުގެ ދޫމުތައް ދޫމަކީ

!regions
(empty)

!photo_adv
(no need for it)
uploads/2020/07/30826_2_dhoomu_adv.png
uploads/2020/12/32900_1928_kelavaki_adv.jpg

!item_order
(has 23 unique values; don't think it is of consequence, it may have been used to order entries that share the same word_id)

word_en_meaning

word_en_pos
0
1
3
4
5
7

corresponds to:

noun
adjective
verb
adverb
7 (no match exists in wordid)

!word_en_meaning_not_applicable
(this probably tells the app whether to show the english pos)
0
1

!tag_groups
(empty)

... ... ...

PART 2

words.json headers look like this:
1 word_id
2 letter (not needed)
3 word_en
4 approved_word_dv
5 transliteration
6 pronounciation (not needed, empty)
7 morphemes (not needed, empty)

remove columns 2, 6, 7 from words.json

that leaves:
1 word_id
2 word_en
3 approved_word_dv
4 transliteration

meanings.json headers look like this:
1 meaning_id (remove)
2 word_id
3 meaning_text
4 dictionaries (remove)
5 main_class
6 sub_class (remove)
7 tense
8 literary_class
9 dialect
10 diction_level
11 subject_area
12 specific_usage_atolls
13 usage_example
14 variant_template (remove)
15 regions (remove, empty)
16 photo_adv (remove)
17 item_order (remove)
18 word_en_meaning
19 word_en_pos
20 word_en_meaning_not_applicable (remove)
21 tag_groups (remove)

remove columns 1, 4, 6, 14, 15, 16, 17, 20, 21 from meanings.json

that leaves:
1 word_id
2 meaning_text
3 main_class
4 tense
5 literary_class
6 dialect
7 diction_level
8 subject_area
9 specific_usage_atolls
10 usage_example
11 word_en_meaning
12 word_en_pos
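
The column removal for meanings.json can also be done by name instead of by position, which avoids miscounting. A small sketch, assuming rows are dicts keyed by the headers listed above; `drop_columns` is an illustrative helper, not part of any tool used in these notes.

```python
MEANINGS_HEADERS = [
    "meaning_id", "word_id", "meaning_text", "dictionaries", "main_class",
    "sub_class", "tense", "literary_class", "dialect", "diction_level",
    "subject_area", "specific_usage_atolls", "usage_example",
    "variant_template", "regions", "photo_adv", "item_order",
    "word_en_meaning", "word_en_pos", "word_en_meaning_not_applicable",
    "tag_groups",
]

# Columns 1, 4, 6, 14, 15, 16, 17, 20, 21 from the list above.
MEANINGS_DROP = {
    "meaning_id", "dictionaries", "sub_class", "variant_template", "regions",
    "photo_adv", "item_order", "word_en_meaning_not_applicable", "tag_groups",
}


def drop_columns(rows, unwanted):
    """Return copies of rows without the unwanted column names."""
    return [{k: v for k, v in row.items() if k not in unwanted} for row in rows]


# One blank sample row carrying every header.
sample_row = {header: "" for header in MEANINGS_HEADERS}
trimmed = drop_columns([sample_row], MEANINGS_DROP)
print(list(trimmed[0].keys()))
```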

the 4 cols of words and the 12 cols of meanings need to be joined
= 16 cols

dropping word_id from both sides leaves 14

1 word_id
2 word_en
3 approved_word_dv
4 transliteration

5 word_id
6 meaning_text
7 main_class
8 tense
9 literary_class
10 dialect
11 diction_level
12 subject_area
13 specific_usage_atolls
14 usage_example
15 word_en_meaning
16 word_en_pos



final 14 cols, with word_id dropped:

1 word_en
2 approved_word_dv
3 transliteration
4 meaning_text
5 main_class
6 tense
7 literary_class
8 dialect
9 diction_level
10 subject_area
11 specific_usage_atolls
12 usage_example
13 word_en_meaning
14 word_en_pos
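
The join by word_id could be sketched as below: one output row per meaning, with the word columns first and word_id left out of the result. This assumes both inputs are lists of dicts; meanings whose word_id has no matching word are skipped here, which is a choice made for this sketch, not something the notes specify. The sample data is invented and shortened to a few columns.

```python
def merge_by_word_id(words, meanings):
    """Join each meaning row to its word row, dropping word_id from the output."""
    words_by_id = {w["word_id"]: w for w in words}
    merged = []
    for meaning in meanings:
        word = words_by_id.get(meaning["word_id"])
        if word is None:
            continue  # no matching word; skipped in this sketch
        # Word columns first, then meaning columns, both without word_id.
        row = {k: v for k, v in word.items() if k != "word_id"}
        row.update({k: v for k, v in meaning.items() if k != "word_id"})
        merged.append(row)
    return merged


# Made-up sample data.
words = [
    {"word_id": 1, "word_en": "example", "approved_word_dv": "dv-word",
     "transliteration": "translit"},
]
meanings = [
    {"word_id": 1, "meaning_text": "first sense", "main_class": "103"},
    {"word_id": 1, "meaning_text": "second sense", "main_class": "103"},
    {"word_id": 99, "meaning_text": "orphan", "main_class": "0"},
]
merged = merge_by_word_id(words, meanings)
print(merged)
```

With the full column sets, each merged row would carry the 3 word cols plus the 11 meaning cols, 14 in total.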

now use
https://www.convertcsv.com/json-to-csv.htm

to convert the merged json back to csv

upload to g docs

delete the 2 extra empty cols at the end