>> People are still trying to "infer rules" and do logical, rather than
probabilistic, reasoning. I get why that is.
That's because it's very hard to collect statistics on something you can't
really quantify: meaning, in this case.
There was a thread on HN a couple of days ago about a blog post where someone
was experimenting with, among other things, training an LSTM network to generate Java programs [1].
In one example, the LSTM did really well in reproducing the structure of a
Java program, with import declarations, followed by a class implementing an
interface with a few methods with structured comments and throws declarations
and everything- and even a test!
On the other hand, the program was completely useless. From a cursory glance
it would probably not even compile (e.g. it referred to undeclared variables).
There was one method named "numericalMean()" that took a single double and
returned an (undeclared) variable "sum". The class had a nonsensical name,
"SinoutionIntegrator". The test was testing something called "Cosise",
presumably a method, but not one defined in the class. In short: a mess.
That might sound a bit harsh, but I think it's a very good example of why
statistical NLP is really bad at capturing meaning: there is not a shred of
meaning in the data we use to train statistical models of language, i.e. text.
Because, you see, the relation between meaning and text (and even spoken
language) is completely arbitrary. To put it another way, there are
potentially infinitely many valid mappings between structure and meaning, of
which we humans, by convention or some other mechanism, have agreed to use
just one. And even though the various forms language entities take
(inflections etc.) exist exactly to convey meaning, the rules of how meaning
varies with structure are, again, completely independent of structure itself.
Now, we have done very well at modelling structure from examples of it (which
is what text is). But it's completely unreasonable to expect our algorithms to
extract meaning from it as well.
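To make the point concrete, here's a toy sketch (my own example, not from the
linked post) of the simplest possible statistical language model: a
character-level Markov chain. It learns only which character tends to follow
which, so its output is locally "structure-shaped" but carries no meaning at
all — a miniature version of what the LSTM was doing at a much larger scale:

```python
import random

def train_bigrams(text):
    """For each character, collect the characters observed to follow it."""
    model = {}
    for a, b in zip(text, text[1:]):
        model.setdefault(a, []).append(b)
    return model

def generate(model, start, length, seed=0):
    """Sample text by repeatedly picking a statistically plausible next char."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        followers = model.get(out[-1])
        if not followers:
            break  # dead end: no observed continuation
        out.append(rng.choice(followers))
    return "".join(out)

# Tiny "corpus" of Java-ish text; a real experiment would use megabytes.
corpus = "public void test() { int sum = 0; return sum; }"
model = train_bigrams(corpus)
print(generate(model, "p", 40))
```

The output will contain only character sequences the model has seen, so it
looks vaguely like the training text, yet nothing in the model represents what
any of it means.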
And that is why people are still trying to write down the rules of meaning by
hand: because that's the only way we currently know of to process meaning
automatically.
________
[1] https://news.ycombinator.com/item?id=14526305