Semantic Hackers need NLP (Natural Language Processing), PROLOG and Assembly Language…

Image: Datasets in the Linking Open Data project, as of September 2007 (via Wikipedia)

Semantic Hacker is a cool blog (and site) about the Semantic Web, and I’ve added it to my Web 3.0 blogroll today. They offer an API (Application Programming Interface, in geek-lingo) that can be integrated into Semantic Web applications using «Concept-Spaces», which are based on weighted keywords.
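To make "weighted keywords" concrete, here is a minimal sketch of how a text could be scored against keyword-based categories. Everything in it is an assumption for illustration: the category names, the keywords and the weights are invented, and this is not Semantic Hacker's actual API or algorithm.

```python
import re
from collections import Counter

# Hypothetical concept spaces: category -> {keyword: weight}.
# Names and weights are invented for this sketch.
CONCEPT_SPACES = {
    "Online_Forums": {"moderator": 3.0, "community": 2.0, "forum": 3.0},
    "Logic_Paradoxes": {"paradox": 3.0, "logic": 3.0, "contradiction": 2.0},
}


def score(text):
    """Rank categories by the weighted count of their keywords in the text."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = {
        category: sum(weight * counts[word] for word, weight in weights.items())
        for category, weights in CONCEPT_SPACES.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


text = "Suppose there is an Internet Community with just one moderator."
ranking = score(text)
print(ranking[0][0])  # Online_Forums
```

A scorer of this kind only sees which words occur, not how they are combined, so a text that never uses the word "paradox" scores zero for the logic category no matter what it actually says.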

However, an attempt I made to run their on-line test showed immediately that their innovative API badly needs (Prolog-based?) Natural Language Processing (NLP) methods for parsing and understanding human text. Without such methods, I’m afraid that the Semantic Hacker’s API… sucks! (This failure is only occasional, though; most of the time the API test is quite successful.) You can check this out by looking at the text I submitted to their on-line test, together with their (rather off-topic) results, copied and pasted from their test-form page:

Input Text

Suppose there is an Internet Community with just one moderator; and that every person in the community keeps himself moderated: some by moderating themselves, some by obeying the (one and only) moderator. It seems reasonable to imagine that the moderator obeys the following rule: He moderates all and only those members of the community who do not moderate themselves. Under this scenario, we can ask the following question: Does the moderator moderate himself?

Simplified Semantic Signature®

…/EverQuest_Games/EverQuest/Server_Specific 38
…/Space_Combat/Star_Wars_Games/Star_Wars_Galaxies/Clans_and_Guilds 26
Society/Philosophy/Chats_and_Forums 24
Home/Family/Pregnancy/Chats_and_Forums 24
Arts/Literature/Authors/Blake,_William 23

Related Wikipedia Articles

(end of test-results copied-&-pasted from Semantic Hacker)

Well, one can easily verify that… absolutely NONE of these results has any connection to the original text submitted. It was a text about moderation (aka private censorship), paraphrasing the well-known logical paradox (attributed to Bertrand Russell) called «The barber’s paradox», as explained in my blog-post «The Moderator’s Paradox» (a barber’s paraphrase).
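For readers who have not met the paradox, it can be checked mechanically. The sketch below is only an illustration (the function name `rule_holds` is made up for it): it tries both possible answers to "Does the moderator moderate himself?" against the rule stated in the submitted text.

```python
def rule_holds(moderator_moderates_self: bool) -> bool:
    """Check the moderator's rule when it is applied to the moderator himself.

    Rule: the moderator moderates a member if and only if that member
    does not moderate himself. For the moderator, "the moderator
    moderates him" and "he moderates himself" are the same statement.
    """
    moderated_by_moderator = moderator_moderates_self
    moderates_himself = moderator_moderates_self
    return moderated_by_moderator == (not moderates_himself)


# Try both possible answers to the question.
consistent_answers = [answer for answer in (True, False) if rule_holds(answer)]
print(consistent_answers)  # [] : neither answer satisfies the rule
```

Neither "yes" nor "no" is consistent with the rule, which is exactly what makes it a paradox, and why the text is about logic rather than about games or forums.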

Of course, if the Semantic Hacker’s API could understand the word «moderate» as a verb (instead of an adjective) then perhaps it could come up with more useful results. However, I am reluctant to accept the efficacy of methods based on dictionary-terms and concepts alone, without proper parsing (even the partial parsing of noun phrases as a minimum prerequisite).
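The verb-versus-adjective point can be illustrated with a toy sketch. It assumes a tiny, hand-picked list of cue words and looks only at the word immediately before «moderate»; it is not a real parser and not how any particular NLP system works, just the smallest example of context changing the reading of a word.

```python
import re

# Hypothetical cue words, hand-picked for this sketch only.
VERB_CUES = {"to", "not", "moderator", "he", "she", "they", "who", "does"}
ADJECTIVE_CUES = {"a", "an", "the", "very", "fairly"}


def tag_moderate(sentence):
    """Tag each occurrence of 'moderate' using the word that precedes it."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    tags = []
    for i, token in enumerate(tokens):
        if token != "moderate":
            continue
        previous = tokens[i - 1] if i > 0 else ""
        if previous in VERB_CUES:
            tags.append("verb")
        elif previous in ADJECTIVE_CUES:
            tags.append("adjective")
        else:
            tags.append("unknown")
    return tags


print(tag_moderate("Does the moderator moderate himself?"))  # ['verb']
print(tag_moderate("members who do not moderate themselves"))  # ['verb']
print(tag_moderate("a moderate opinion"))  # ['adjective']
```

Even this crude look at one neighbouring word separates the two readings in these sentences; proper parsing of noun phrases would do the same job far more reliably.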
NOTE: Parsing can also be done in Assembly Language, linked to Prolog or Java (or another programming language). Assembly Language code for parsing and data-mining is a necessity, despite the high speed of modern CPUs. Sadly, of course, few people write Assembly code today (I am one of them, e.g. here).
  • Assembly Language is highly beneficial because of the huge amount of data that needs to be processed: the entirety of the Web, in fact. Even the fastest CPUs and the fastest programs are simply not enough!
  • PROLOG (and NLP) is also (even more) necessary, for the reasons explained previously (and elsewhere).
  • PROLOG turbo-enhanced with pure Assembly Language has been my personal choice for most programming over the last 18 years or so (e.g. as in http://omadeon.com/alc).