Just HOW good is modern Greek for NLP software, Automatic Translation and Sense Disambiguation?

For a long time now, I have been extremely skeptical of the claims of certain Greek… supremacists that «the Greek language is the best for computing», in the sense that it is somehow better, more expressive or more precise (etc.) for certain computing tasks vaguely associated with Artificial Intelligence.

Unfortunately my knowledge of ancient Greek is limited, but my guess is that ancient Greek is a much more appropriate candidate for such claims of quality. In this post, however, I will only discuss Modern Greek, leaving ancient Greek aside (or for another posting, if there is enough time and interest).

My own experience of Modern Greek in Natural Language Processing (NLP) has been quite positive, but by no means as extreme as the ideas of some biased Greek nationalists, who persistently undervalue other languages and also believe e.g. that Bill Gates is a secret lover of Greek, and so on. According to these ignorant nationalists, writing code in today’s programming languages must be a sheer waste of time; all we need to do is replace them all with a globally revived version of ancient Greek, ending the current proliferation of computer languages! From my point of view, these ideas are pure bullshit: There is simply no way to replace programming languages with (any) human language, no matter how perfect.

Nevertheless, if Greek has some advantages from a computational linguistics and NLP point of view, I’d like to discuss them with you in detail, so that we can arrive at an unbiased opinion, with some positive practical results (if possible) for A.I. and computing.

There are 2 advantages of modern Greek in NLP software that I have stumbled upon repeatedly in the past:

1) Greek spelling carries additional information that makes it easy to tell apart (and disambiguate) similar-sounding but different words. There are very few ambiguous cases (like «love» in English, which can be a verb but also a noun, or «party», which can be either a friendly gathering or a political organisation). This point was also discussed in another Greek blog (of my colleague j95).

2) Greek morphology is similarly full of useful surface-level characteristics, in a wide variety of easily identifiable forms (word-endings, etc.). These make the automatic disambiguation of Greek grammar, as well as the detection of many aspects of unknown/new words, a lot easier than the disambiguation of English (for example). My very poor knowledge of other human languages (apart from English, Greek and some French) does not allow me to say whether certain other languages have similar or better computational advantages (please do help, if you have some direct relevant knowledge here).
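
To make point 2 a bit more concrete, here is a minimal Python sketch of guessing the grammatical category of an unknown Greek word from its ending alone. It is only an illustration and is not taken from any real system: the table `SUFFIX_RULES`, the function `guess_category` and the handful of endings in it are hypothetical, hand-picked assumptions, and some of these endings can be misleading in practice (e.g. words in «-ικός» are usually adjectives but are sometimes used as nouns).

```python
import unicodedata


def _nfc(text):
    """Normalise to NFC so accented Greek letters compare reliably."""
    return unicodedata.normalize("NFC", text)


# Hypothetical, hand-picked endings (an assumption for illustration only).
# A real system would need a far larger table plus a lexicon of exceptions.
SUFFIX_RULES = {
    _nfc(suffix): category
    for suffix, category in {
        "άρω": "verb",       # e.g. παρκάρω, γκουγκλάρω
        "ίζω": "verb",       # e.g. ελπίζω
        "ώνω": "verb",       # e.g. απλώνω
        "οποίηση": "noun",   # e.g. ψηφιοποίηση
        "ότητα": "noun",     # e.g. ποιότητα
        "ισμός": "noun",     # e.g. εθνικισμός
        "ικός": "adjective", # e.g. τεχνολογικός
    }.items()
}


def guess_category(word):
    """Return a guessed category for `word`, or None if no ending matches."""
    normalized = _nfc(word.strip().lower())
    # Try the longest endings first, so a specific ending wins over a short one.
    for suffix in sorted(SUFFIX_RULES, key=len, reverse=True):
        if normalized.endswith(suffix):
            return SUFFIX_RULES[suffix]
    return None


if __name__ == "__main__":
    for w in ["γκουγκλάρω", "ψηφιοποίηση", "τεχνολογικός", "love"]:
        print(w, "->", guess_category(w))
```

The new colloquial verb «γκουγκλάρω» («to google») is classified as a verb purely from its ending, without appearing in any dictionary, whereas the English «love» gives no such surface clue and the sketch returns `None` for it.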

In the last 20 years I have written a lot of Prolog code for Natural Language Understanding, as well as experimental Automatic Translation. At some point in the past I used some NLP to produce the first Greek hypertext-based dictionary software, «HyperLEX». Through this work, the task of disambiguating Greek certainly proved considerably easier than similar efforts in English, but not as miraculously easier as some advocates of Greek Cultural Supremacism believe.

Well, your opinions and experiences are welcome! Feel free to comment (preferably) in English, but (if you insist or can’t help it) you may also comment in Greek. (If necessary, I might translate some Greek comments into English for the benefit of people who might need this, since the reverse English-to-Greek translation is unlikely to be needed).

UPDATE: A few weeks later, I was intrigued to discover more evidence that the nationalist pseudo-scientific myths about Greek being supposedly superior to other languages (even… computer languages) are crap; e.g. (if you can read Greek, links may be added here).

 

3 comments

  1. «There is simply no way to replace programming languages with (any) human language, no matter how perfect.»

    You are right for the present. But what about the future? I have no idea about programming languages, but couldn't a human language (not necessarily Greek, of course) be used in the field of artificial intelligence?

  2. Paris,

    Yes, you came… very close. In fact, programming can ALREADY be done using subsets of human language, but with some restrictions, which will surely be lifted in the future!

    In the 90s I wrote Prolog code myself that understood simple human commands and converted them into program code; e.g. if you wrote down the Pythagorean theorem, it converted it into code that computed the hypotenuse as the square root of the sum of the squares of the two other sides of a right triangle.

    However, this capability may, on the one hand, have nothing to do with Greek (necessarily) and, on the other hand, remain limited, as better and better programming languages become more efficient.

    Of course, the ultimate evolution of Artificial Intelligence is intelligent machines that write their own code, based on human definitions or specifications. This distant future, however, is not necessarily connected with the Greek language. Artificial human languages like Lojban (formerly Loglan) are probably more efficient than Greek for «talking» to machines, even today.
