Saturday, July 6, 2013

The weirdest languages on Earth

After my last blog post I had some discussions about common language structures and phenomena in different languages. Thanks to a comment from CC Bilgin (and other interesting comments), I now know that the repetition of words is called "reduplication" and that it is not solely a Kurdish phenomenon but can be observed in many unrelated languages. So I thought it would be good to have a world map of these language phenomena.

Apparently, I am not the first person with that idea. Instead of creating such a map myself, I am referring to "The World Atlas of Language Structures" (WALS).
WALS Online is a joint effort of the Max Planck Institute for Evolutionary Anthropology and the Max Planck Digital Library. The website has a lot of great maps comparing various grammatical features of languages all over the world.

So the first world map I looked up was, of course, "Reduplication".
Unfortunately, Kurdish is not included in the map for reduplication, but judging by the overall distribution, reduplication is a very common language feature (except in Europe). Not having reduplication in a language is actually "weird".

The next feature I looked up was "Order of Object and Verb" in a sentence.
In Kurdish the verb is at the end of the sentence, so the object comes first. This sentence order is called Object-Verb or simply "OV" (highlighted in blue in the world map).
Most languages have either "OV" or "VO" as their dominant sentence order; very few use both orders depending on the situation and sentence. Languages like Armenian or German don't have a dominant order, which is pretty weird.

Next, I looked up "Gender Distinctions in Independent Personal Pronouns".
In Kurdish there is no gender-specific "he" or "she", just "ew/v" ("o" in Zazaki), so it is a pretty easy grammar rule. Turkish, Basque, and Hungarian are like that as well.
In contrast, Arabic has not only a gender-specific "he" (huwa (هو)) and "she" (hiya (هي)) but also a gender-specific 2nd Person Singular "you": masculine anta (أنت) and feminine anti (أنت). Moreover, Arabic has a gender-specific 2nd Person Plural "you": masculine antum (أنتم) and feminine antunna (أنتنّ), and a gender-specific 3rd Person Plural "they": masculine hum (هم) and feminine hunna (هنّ). These are not such easy grammar rules. Arabic also has a 2nd Person Dual "you two" antumā (أنتما) and a 3rd Person Dual humā (هما). Based on the map, it seems that Spanish picked up some of these weird gender rules from Arabic (from the Moors?).

These are just a few of the language features covered there. The list of maps on WALS Online is long, and everyone who is interested in languages should definitely take a look at the other maps as well.

The website Idibon ("Language Technologies for a Connected World") went through the provided data of the WALS Online website and determined the "Weirdness-Index" of each language.

Here is a quote from the Idibon website on how they defined weirdness:

This is odd. Is this odd? One of the features that distinguishes languages is how they ask yes/no questions. The vast majority of languages have a special question particle that they tack on somewhere (like the ka at the end of a Japanese question). Of 954 languages coded for this in WALS, 584 of them have question particles. The word order switching that we do in English only happens in 1.4% of the languages. That’s 13 languages total and most of them come from Europe: German, Czech, Dutch, Swedish, Norwegian, Frisian, English, Danish, and Spanish.
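The figures in that quote can be checked with a quick calculation. This is just a minimal sketch in Python using the numbers from the quote; the variable names are my own and not anything from Idibon or WALS.

```python
# Numbers taken from the Idibon quote above.
total_languages = 954          # languages coded for yes/no questions in WALS
with_question_particle = 584   # languages that use a question particle
inversion_share = 0.014        # share that switches word order (1.4%)

# Share of languages with a question particle.
particle_share = with_question_particle / total_languages

# Number of languages implied by the 1.4% figure, rounded to whole languages.
inversion_count = round(total_languages * inversion_share)

print(f"{particle_share:.1%} of the coded languages use a question particle")
print(f"{inversion_count} languages switch the word order")
```

So question particles cover roughly three fifths of the coded languages, and the 1.4% figure does correspond to the 13 languages mentioned in the quote.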

A total of 21 "weirdness" features were compared. Only languages that had data for at least 14 features were included in the final ranking; however, they calculated the "Weirdness-Index" for the other languages as well. The Kurdish "Weirdness-Index" is 0.781, but Kurdish was excluded from the ranking (only 12 features could be compared). Otherwise Kurdish would have been at position #26, so Kurdish is pretty weird.

One of the weirdest language families seems to be the Germanic language family.

Some of the better-known languages, with their ranking and Weirdness-Index:
#9 Armenian (0.861),
#10 German (0.858),
#11 Abkhaz (0.844),
#12 Dutch (0.844),
#16 Norwegian (0.828),
#21 Czech (0.791),
#23 Spanish (0.790),
#25 Mandarin (0.789),
#33 English (0.756),
#40 Japanese (0.736),
#57 Greek (0.669),
#75 Persian (0.649),
#77 Hebrew (0.639),
#111 Polish (0.564),
#159 Finnish (0.466),
#181 Russian (0.401),
#217 Lithuanian (0.257),
#226 Turkish (0.214),
#230 Basque (0.189),
#235 Hungarian (0.132),
#239 Hindi (0.087).
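
As a rough sanity check on the "#26" claim for Kurdish, here is a minimal Python sketch that uses only the numbers listed above. The names LISTED, is_ranked and neighbours are my own illustrative choices and not anything from Idibon or WALS. Since the list is only an excerpt of the full ranking, the sketch can only show which of the listed languages Kurdish would fall between.

```python
# (rank, language, Weirdness-Index) exactly as listed in this post.
LISTED = [
    (9, "Armenian", 0.861), (10, "German", 0.858), (11, "Abkhaz", 0.844),
    (12, "Dutch", 0.844), (16, "Norwegian", 0.828), (21, "Czech", 0.791),
    (23, "Spanish", 0.790), (25, "Mandarin", 0.789), (33, "English", 0.756),
    (40, "Japanese", 0.736), (57, "Greek", 0.669), (75, "Persian", 0.649),
    (77, "Hebrew", 0.639), (111, "Polish", 0.564), (159, "Finnish", 0.466),
    (181, "Russian", 0.401), (217, "Lithuanian", 0.257),
    (226, "Turkish", 0.214), (230, "Basque", 0.189),
    (235, "Hungarian", 0.132), (239, "Hindi", 0.087),
]

MIN_FEATURES = 14  # inclusion rule described in the post


def is_ranked(features_coded, min_features=MIN_FEATURES):
    """Return True if a language has enough coded features to be ranked."""
    return features_coded >= min_features


def neighbours(score, listed=LISTED):
    """Return the closest listed entries scoring above and below `score`."""
    above = min((e for e in listed if e[2] > score), key=lambda e: e[2])
    below = max((e for e in listed if e[2] < score), key=lambda e: e[2])
    return above, below


# Kurdish: 12 features compared, Weirdness-Index 0.781.
kurdish_ranked = is_ranked(12)
above, below = neighbours(0.781)

print("Kurdish included in ranking:", kurdish_ranked)
print("Closest listed language above:", above)
print("Closest listed language below:", below)
```

With the values above, Kurdish lands between Mandarin (#25) and English (#33), which is consistent with a hypothetical position of #26.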

  1. Was ergativity also studied?

    1. Yes, in chapters 98-100 and in chapter 28, but in most cases they did not include Kurdish in the maps.

      Chapter 28:

      Chapter 98:

      Chapter 99:

      Chapter 100:

  2. Spanish doesn't even change the word order for questions, just the intonation. Example:

    · affirmation: "tienes las llaves" (you have the keys - flat intonation)
    · question: "¿tienes las llaves?" (do you have the keys? - high to low declining intonation)

    ... maybe that's the reason they write inverted question marks at the beginning of questions, unlike most other languages.

    On the other hand, Basque does have a gender distinction, but only in the second person singular (and in transitive verbs only): 'duk' (thou, male, hath) vs. 'dun' (thou, female, hath). However, this 'hi' (thou) form is almost extinct because it was replaced by the ancient plural 'zu' (you), which in turn was replaced in the plural by a doubly pluralized 'zuek' (you pl.). It is anyhow the only case where Basque makes a male/female distinction, so I guess it's not too important.

  3. We have a question particle in Kurmanji: 'ma'. It comes before the sentence. Example: Tu derî. (You are going.) Ma tu derî? (Are you going?) But you don't have to use it.

    1. That also happens in Basque with the "al" particle, but it works a bit differently: etxean dago (he/she is at home/in the house), etxean al dago? (is he/she at home/in the house?). Sometimes this is combined with a change in word order, but this is more a matter of emphasis, as "does JOHN own a car?" is not the same as "does John own A CAR?" In Basque this works as follows:
      → Kotxe bat Jonek al dauka?
      → Jonek kotxe bat al dauka?

      Whatever is near the verb ("(al) dauka") is the matter of the question.

    2. The Kurmanji "ma" is related to Sorani "Magar" (مەگەر) and Farsi "Magar" (مگر) and Farsi "mey" (می). In most cases you could translate it with "unless".

      "Raste" - "Is that right?"
      "Magar raste?" - "That’s not right, is it?" or "Unless that is right?"

      "Raste" - "Is that right?"
      "Ma raste?" - "That’s not right, is it?" or "Unless that is right?"

      "Dazâni?" - "Do you know?"
      "Magar dazâni?" - "You don’t know, do you?" or "Unless you know".

      (1) "Magar" introduces an affirmative question to which a negative answer is expected: (Examples above);
      (2) "Magar" introduces a negative question to which an affirmative answer is expected:
      "Magar namgut?" - "Didn’t I say so?" or "Unless I did not say so?"

      (3) unless:
      "Magar bimirim, danâ daykam." - "Unless I die—otherwise I’ll do it.".

      "Magar" also exists in:

      Hindi: मगर (magar)
      Punjabi: ਮਗਰ/مگر (magar)
      Sindhi: مَگَر (magar)
      Urdu: مگر (magar)