Adding telugu.tsv #82
base: main
LinguList
left a comment
Thanks @alzkuc, I just eyeballed your annotations, and I'd argue that some things would be better corrected. I could do that myself, but I'd first ask you to consider the following and tell me whether my arguments are clear. You have heavy inline changes in the tens (fifty, sixty, etc.), where you always drop the sequence d i of eight and nine, which is absent from the words for eighty and ninety. Do you really think it is this linear? It rather seems to me that d i is some random suffix that one could identify in eight and nine, reflecting some kind of contamination that makes consecutive numerals sound similar (elf, zwölf; some even say drölf in German). In short, I think the analysis currently relies too much on inline alignments to drop entire sequences, and it could be more apt if the core numbers up to ten were not treated as monomorphemic, as they are now.

What also helps is considering the contrast with ordinals: https://www.omniglot.com/language/numbers/telugu.htm

You have enimidi "eight" vs. enimi-dava "eighth", with
Hi @LinguList, thanks for the comment. I agree that 8 and 9 would indeed be natural candidates for a polymorphemic structure, and that a strictly linear interpretation of the alignments might be incorrect. In other Dravidian languages, 9, for example, can clearly be analyzed as polymorphemic: oṉpatu (Tamil), ombattu (Kannada), onpatŭ (Malayalam), i.e. ONE (less than) TEN (e.g. Tamil ONE = onru, TEN = pattu). Telugu is, however, synchronically much less transparent: there is no trace of okati/oka/on (1) in the form for 9, tommidi. We could, however, argue that -(i)di is a suffix and relates to TEN too (padi), resulting in something like NINE as a FRACTIONOFTEN? And the same for 8. What about other core numbers? Mudu (three) vs. muppai (thirty); nalugu (four) vs. nalabai (forty); aidu (five) vs. jabbai (fifty). Would you suggest a polymorphemic analysis here as well, with suffixes -du and -gu? I am not sure that the distribution is systematic enough here to justify segmenting these core numerals.
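
To make the "is it systematic enough?" question a bit more concrete, here is a rough sketch that splits each core numeral and its ten on their longest shared prefix. It only uses the forms quoted in this comment, compares raw transliterated characters rather than proper sound segments, and the helper name `split_on_shared_prefix` is made up for illustration; it is not part of the annotation workflow.

```python
from os.path import commonprefix

# Forms exactly as transliterated in this thread: (core numeral, matching ten).
PAIRS = {
    "three/thirty": ("mudu", "muppai"),
    "four/forty": ("nalugu", "nalabai"),
    "five/fifty": ("aidu", "jabbai"),
}


def split_on_shared_prefix(core, ten):
    """Return (shared prefix, remainder of core, remainder of ten).

    Hypothetical helper: a character-wise comparison on transliterations,
    which is only a crude stand-in for a real morpheme segmentation.
    """
    stem = commonprefix([core, ten])
    return stem, core[len(stem):], ten[len(stem):]


if __name__ == "__main__":
    for label, (core, ten) in PAIRS.items():
        stem, core_rest, ten_rest = split_on_shared_prefix(core, ten)
        print(f"{label}: stem={stem!r} core_rest={core_rest!r} ten_rest={ten_rest!r}")
```

If the leftover part of the core numeral differs from pair to pair, that would speak against a single recurring suffix, which is the worry raised above.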

I checked the data again, and I would prefer to be much more careful in the analysis with respect to the internal cognates it indicates.

Right now, your analysis strongly suggests that e.g. 80 can be derived from 8. I'd suggest being much more careful here.

Maybe we could make this a case for an inter-annotator agreement test, asking somebody else to analyse the data?
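
One possible way to quantify such a test, sketched under assumptions: if two annotators each label every numeral as monomorphemic or polymorphemic, Cohen's kappa gives chance-corrected agreement. The thread does not name a measure, so kappa is just one common choice, and the labels below are invented toy values, not real annotations of telugu.tsv.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Share of items on which both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a.keys() | freq_b.keys()) / (n * n)
    if expected == 1:
        return 1.0  # both annotators used one and the same label throughout
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical toy labels for five numerals; NOT taken from the data.
    annotator_1 = ["poly", "poly", "mono", "mono", "mono"]
    annotator_2 = ["poly", "poly", "poly", "mono", "mono"]
    print(round(cohen_kappa(annotator_1, annotator_2), 3))
```

For morpheme boundaries or partial cognate IDs rather than a two-way label, a different comparison would likely be needed; this only covers the simplest case.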
I think caution is a good idea, and I trust your instinct on this, @LinguList. I'm all in for the inter-annotator agreement test. Who should we ask?

I can also try to do it, but this would then rather be annotation in collaboration. Still, I find that okay at this point in the process. I could then share the results on Thursday in our grad seminar?
Adding the updated telugu.tsv file with numbers till 99.