Friday, November 22, 2013

Deep Parsing

 This week we will read a paper on deep models for parsing:
There is a lot going on in this paper.  By Sunday night at 5pm, post a question about something in the paper.  By Monday night at 5pm, post an answer to somebody else's question. 

There is no class on Thursday.  Happy Thanksgiving!

40 comments:

  1. Since the paper doesn’t specifically mention robotics, my question is: how would you apply sentiment analysis to robotics? I think one way would be to include sentiment analysis when interpreting natural language, to know whether a human collaborator is happy or angry and therefore whether the robot should continue what it’s doing or switch actions.

    1. This comment has been removed by the author.

    2. As Stefanie and Kurt mentioned in the last class, one application of natural language for robots is having a robot help us respond to email or messages on social networks. Using sentiment analysis, the robot can tell whether a message is sad or happy and, based on that information, compose an appropriate response.

    3. This would be really helpful for natural-sounding responses, but you could also use it for more meaningful understanding of requests. For example, a robot could ask "Would you like some cookies?" and get the response "I'm not very hungry" instead of a plain yes or no. Sentiment analysis would be helpful in determining whether phrases that do not directly answer a question are affirmative or negative, which could probably make interacting with a robot smoother.

  2. When computing the probability of classifying a new phrase into one of the five classes, the equation uses softmax instead of the argmax we usually use in models like POMDPs. Why softmax?

    1. The output of argmax is the _argument_ that produces the max, while softmax behaves more like a smooth version of max. Specifically, softmax exponentiates each score and normalizes by the sum of the exponentials, so the scores become a probability distribution (http://en.wikipedia.org/wiki/Softmax_activation_function, especially the section Smooth Approximation of Maximum).

      Softmax's primary advantage over the max function is its differentiability.
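      To make the difference concrete, here is a minimal sketch (the class scores are made up):

        import numpy as np

        def softmax(scores):
            # Exponentiate and normalize; subtracting the max keeps it numerically stable.
            exps = np.exp(scores - np.max(scores))
            return exps / exps.sum()

        scores = np.array([1.2, 0.3, -0.5, 2.0, 0.1])   # hypothetical scores for the 5 classes
        probs = softmax(scores)          # a full probability distribution over the classes
        winner = int(np.argmax(scores))  # argmax instead returns only the winning index (3 here)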

  3. This comment has been removed by the author.

  4. When initializing the word matrices of the RNN, why do we need to add a small amount of Gaussian noise? Another question: to my understanding, the phrases are obtained in the same way as n-grams. Is that right? Are they the same thing, or are they different?

    1. The matrices are initialized as X = I + epsilon, i.e., the identity plus a small amount of Gaussian noise. Although the MV-RNN paper doesn't discuss it in detail, I think the reason is to keep the values from being exactly zero (or identical), so they aren't meaningless in later calculations.
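      A minimal sketch of that initialization (the noise scale here is a guess, not a value from the paper):

        import numpy as np

        def init_word_matrix(d, noise_std=0.01):
            # X = I + epsilon: the identity plus a small amount of Gaussian noise.
            return np.eye(d) + noise_std * np.random.randn(d, d)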

  5. Q1: Why is this called 'deep' learning?

    Q2: How exactly does the tensor product work? Why does adding this have a big impact? The paper claims that "the tensors can directly relate input vectors", but why does this give the desired effect (answering this question from the paper: "Can a single, more powerful composition function perform better and compose aggregate meaning from smaller constituents more accurately than many input specific ones?") and how does this correspond to the desired composition?

    1. Deep learning is basically neural networks rehashed, except that at each level of the network you do some sort of unsupervised classification (clustering the output features at that level for similar inputs somehow). This seems to improve the results at the output node significantly (I do not know why). I can't see where they are doing this unsupervised step in this paper.
      The tensor product itself is this huge function that allows two input words to interact directly at different levels (at V^(1), then at V^(2), ... up to V^(d)). This means you have many new features instead of just the few you had for the MV-RNN. They learn this big V tensor for the pairs of data they see and improve it using backprop on the output. It seems to beat the MV-RNN by using a single huge tensor instead of many small matrices, one per word. The tensor allows more mixing between the words.
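      For concreteness, here is a rough sketch of the composition being described, p = f([a;b]^T V^[1:d] [a;b] + W [a;b]); the shapes follow the paper, but the code and names are mine:

        import numpy as np

        def rntn_compose(a, b, V, W):
            # a, b: child vectors of length d; V: tensor of shape (d, 2d, 2d); W: matrix of shape (d, 2d).
            ab = np.concatenate([a, b])                                    # stacked children, length 2d
            tensor_term = np.array([ab @ V[i] @ ab for i in range(V.shape[0])])
            return np.tanh(tensor_term + W @ ab)                           # parent vector of length d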

    2. So deep learning learns a multi-layer abstraction: lower-level concepts combine to represent higher-level concepts. For example, the Google Brain team learned the high-level concept 'cat' from images in YouTube videos. This paper only uses one layer of representation, and it does unsupervised learning on this layer.

  6. It is nice that the RNTN model can accurately classify the positive/negative sentiment of phrases. The handling of negated positive/negative phrases is especially nice.

    Troubles:
    I don’t really get how the learning is done (what is going on in section 4.4?).
    I am having trouble picturing what a tensor is. It looks like a kind of matrix...

    Softmax:
    http://ufldl.stanford.edu/wiki/index.php/Softmax_Regression has a nice explanation of softmax regression that was used in the paper.

    1. You can look at their tutorial at http://nlp.stanford.edu/courses/NAACL2013/ which shows what the tensor looks like. The learning is based on backprop for sure, but I can't understand equation (2) for E(theta). Using backprop they are learning each parameter within each matrix of the large tensor using simple gradient descent rules.
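      Concretely, the tensor is just a stack of d slices, each an ordinary 2d x 2d matrix; a tiny illustration with made-up sizes:

        import numpy as np

        d = 4                                        # tiny feature size, purely for illustration
        V = 0.01 * np.random.randn(d, 2 * d, 2 * d)  # d slices, each 2d x 2d
        print(V.shape)      # (4, 8, 8)
        print(V[0].shape)   # (8, 8): one slice; every entry is a parameter that backprop updates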

    2. The learning is done using RNNs and deep learning methods; the deep learning methods try to combine the different components of the system. In section 4.4, the researchers assign different values to estimate the probability of each branch and compute the backprop contribution at each diverging node.

  7. Q1: not really answerable by us I think, but I'd like to see examples where the system fails. What are the failure cases?

    Q2: Somewhat similar to Dave's question - I feel a little bit fuzzy on the precise meaning of everything in the tensor pipeline. One specific question: they list a problem with the MV model as the explosion in the number of parameters (it seems to be d x d + d per node, plus its dependence on vocab size). But then V seems to be 2d x 2d x d, and similarly dependent on vocab size.

    (sub Q2): Just confirming my understanding, but isn't d the size of the vocabulary? I wonder what their specific number was. Unless I misunderstand, it seems like those matrices would be really, really large.

    1. |V| is the size of the vocabulary; d is the size of the feature vector you pick. Since a vector of size 2d sits on each side of V^(i), you have 2d x 2d matrices, d of them, making a larger tensor. I have posted my understanding of the other bit as a reply to Dave's question. It is odd that Q1 is not answered in the paper. There is their larger presentation at http://nlp.stanford.edu/courses/NAACL2013/ if you want.
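      To put rough numbers on it (both sizes below are made up, not the paper's values): in the MV-RNN every word carries its own d x d matrix plus a d-vector, while the RNTN shares one tensor across the whole vocabulary.

        d, vocab_size = 30, 20000                       # hypothetical feature size and vocabulary
        mvrnn_word_params = vocab_size * (d * d + d)    # grows with the vocabulary: 18,600,000
        rntn_tensor_params = d * (2 * d) * (2 * d)      # one shared tensor: 108,000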

    2. The comments in the live demo page (http://nlp.stanford.edu:8080/sentiment/rntnDemo.html) contain several failure cases, some of them quite interesting.

  8. Why does the RNTN account so well for the Negated Negative and Negated Positive examples, while the MV-RNN, which is very close to the accuracy of the RNTN on most other examples, fails miserably? It's fairly easy to see why bag of words models would fail on these cases, but I'm not as clear as to why some of the more complex models fail.

    1. The difference seems to be the composition function: the RNTN has the tensor, which the MV-RNN does not. The MV-RNN just uses matrix operations with a final softmax for classification, while the tensor in the RNTN better captures the influence of the children nodes on the parent. The tensor is essentially a richer, matrix-stack representation of how word meanings compose.
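      To make the contrast concrete, here is a rough sketch of the MV-RNN composition as I understand it (each word carries both a vector and a matrix), to compare against the single tensor-based function of the RNTN:

        import numpy as np

        def mvrnn_compose(a, A, b, B, W, W_M):
            # a, b: word vectors (d,); A, B: word matrices (d, d); W, W_M: matrices of shape (d, 2d).
            p = np.tanh(W @ np.concatenate([B @ a, A @ b]))  # parent vector: each matrix modifies its neighbor
            P = W_M @ np.vstack([A, B])                      # parent matrix, used in later compositions
            return p, P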

  9. I am also confused about the accuracy gap between the MV-RNN and the RNTN: what makes this gap so large even though the MV-RNN has higher space complexity? Also, I tested 'the movie is not bad' and 'the movie is not too bad' in the live demo. I am not sure whether the design handles 'too', but I guess not, because the sentiment classification of 'too' is neutral. However, the classifications of the nodes 'too bad' and 'is not too bad' look a little weird: why does 'too bad' tend to be neutral but 'is not too bad' tend to be negative? How does the system do that?

    1. Interestingly, "it is not bad" comes out slightly favorable, and "it is not too bad" comes out slightly negative. These phrases are heavily dependent on context and tone in conversational language, though, and it is doubtful that you would see them by themselves. As I understand it, "it is not too bad" is basically a less severe version of "it is bad". However, these latter two sentences aren't significantly different in their results; they are both just negative. I would expect "it is bad" to come out very negative.

  10. Is this algorithm useful for differentiating other aspects of language? Sarcasm or stubbornness are too contextual for short phrases, but perhaps the algorithm would be able to detect bitterness or humor? I also wonder how this could expand into categorizing sounds around language, like pauses, which could be added into the parse trees.

    1. Yeah I think that's a nice extension of the algorithm - it certainly appears as though other basic qualitative statements about utterances should also be inferable.

      I'm curious about the general problem of incorporating phonetic emphasis into textual natural language via some annotations/symbolic markup (like trop/cantillation in singing the Torah). With this information (and things like pauses, as you suggest), I think we could start to infer a pretty long list of qualities of textual utterances.

    2. I am working on the irony-detection project. Currently we are building a corpus for the project. It seems that you need contextual information along with the phrases to detect irony; I would say that no matter what model you are using, you need contextual information to detect irony.

    3. Humor might be interesting to think about, but it's so much more subtle than sentiment that I'm not sure the same approach would be useful. We can assign individual words some sentiment values and build up, but humor doesn't really emerge until you look at everything. I'm taking a class on Twain, so we talk a lot about where the humor comes from in some of his writing, and a lot of it has to do with how it's delivered and what background you have coming in. I feel like humor recognition would require some kind of model of the user's understanding, because humor usually requires a jolt from what you expect to happen in a given context to what actually ends up being said; the fact that things don't line up with your expectation is a source of humor. So it's not really just a feature of words, and I don't think this approach would be very applicable.

  11. 1) How is Eq 2 working? It does not look like it comes directly from the KL divergence but has some other form, and the parameters lambda and theta have not been discussed.
    2) Most of the backprop section talks about improving the model itself; where does it talk about improving the word vectors, which were chosen randomly? Or am I understanding this completely wrong?

    1. 2) The inference step finds "optimal" values for the parameters, theta = (V, W, W_s, L). L, the matrix of word vectors, is set to "optimal" values during the inference. I cannot tell you more than this because I don't really understand the inference section.
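      For what it's worth, the error being minimized has the usual regularized cross-entropy shape, summed over all labeled nodes; a sketch (the lambda value is a placeholder, not from the paper):

        import numpy as np

        def error(targets, predictions, theta_flat, lam=1e-4):
            # targets, predictions: lists of 5-class distributions, one pair per labeled node.
            cross_entropy = -sum(np.dot(t, np.log(y)) for t, y in zip(targets, predictions))
            return cross_entropy + lam * np.dot(theta_flat, theta_flat)   # L2 penalty on all parameters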

    2. They mention learning W and V but not L, which was promised at the start of the paper.
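      For reference, L is just the word-embedding matrix, one column per vocabulary word; a sketch with made-up sizes (since L is part of theta, backprop would also update the column used for each word):

        import numpy as np

        d, vocab_size = 30, 20000                   # hypothetical sizes
        L = 0.01 * np.random.randn(d, vocab_size)   # one d-dimensional column per word
        word_index = 42                             # hypothetical index of some word in the vocabulary
        a = L[:, word_index]                        # the word's vector, fed into the tree compositions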

  12. Q1: I'm also confused about the tensor. What are the slices inside a tensor? The paper also says the composition function for all nodes is tensor-based and is the same. So are the slices in the tensor the same too?

    Q2: I'm confused about the tensor product equation (h_i). The first matrix of h_i is 1 x 2d, the second matrix (V^[i]) of h_i is d x d, and the third matrix is 2d x 1. How could they multiply each other? The dimensions of the matrices don't match.

    Q3: Why does the RNTN perform so well on negating sentences? How does the tensor impact these types of sentences?

  13. I am also confused about the parameter d. It seems to be the number of features, but what is an appropriate value for it? Does it have something to do with the size of the available data set? And how would the value of d affect the performance of the model?

    1. Good question, I went back through the paper to try and find out, but I don't see where they give a value for d either. They define d as the dimension of the feature vector, but do not give a value.

    2. With the given input, the optimal performance was achieved at word vector sizes between 25 and 35 dimensions.

  14. The approach seems very interesting since it goes beyond just using a bag-of-words model and actually uses sentence structure to make sentiment judgements.
    My question is also similar to the questions above: how can positively or negatively labeled sentences help human-robot interaction? To me, it seems more helpful for long-term interaction than for short-term interaction.
    Like many people above, I am also confused about the tensor and how it can be separated, which seems important since in the learning phase the slices are learned separately.

  15. I really want to know how well humans do on the same task, and how much agreement they saw between the 3 judges. 85% seems reasonably good, but are the 15% really ambiguous? Or does a human always get them right?

    1. I was wondering this also, especially considering the amount of training time (I think they said 3 to 5 hours) and the complexity of the system involved (I can't say I really grok most of the mathematics involved here...).

  16. Not really a direct question, but this is a pretty complicated system, and in the best case we end up about 10% better than a simple-minded vector-averaging approach, so was it all worth it?

    Sentiment analysis is useful but a pretty specific problem with a binary or single-dimensional answer. Does this approach generalize to other language questions like ones relevant to grounding meaning beyond vague concepts like sentiment?

    1. You sort of have to look at it the opposite way: how many cases are they not getting correct? As state-of-the-art systems get closer to 100%, the gains become more incremental, but they are still important because they can significantly reduce the rate of failure. Increasing effectiveness from 80% to 85% only gets you about 6% more successes, but it decreases the number of failures by 25%.
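      A quick check of that arithmetic:

        old_acc, new_acc = 0.80, 0.85
        more_successes = new_acc / old_acc - 1               # ~0.06, i.e., about 6% more successes
        fewer_failures = 1 - (1 - new_acc) / (1 - old_acc)   # 0.25, i.e., 25% fewer failures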

      I think it would be difficult to get generalized actions and meanings out of it. It required a lot of training data that solely mapped sentences to positive and negative sentiment. What kinds of labels could you put on training data, with enough examples, for a system like this to map to generalized meanings?

    2. I was thinking along the same lines: why go to all the trouble of building a large corpus exclusively on movie reviews? I suppose it must generalize to some extent, I just don't see any reason not to throw in some other kinds of text. I'm probably missing something.

  17. I'm not sure if this is the right thing to post -- I just had a terrible time trying to walk myself through the mathematics in this paper. I found that I was able to understand the general idea. The system uses a very fine-grained dataset of correctly parsed sentences annotated with parse trees and sentiment information in order to classify new statements. I'm unfamiliar with the motivation and general idea of "neural" systems and I'm wondering if anyone has some good resources with which to start?
