CSCI 2951K: Topics in Grounded Language for Robotics: 2013

Monday, December 2, 2013

Final Project Presentations

Here is the schedule for final project presentations. As we discussed in class, the presentation will be 15 minutes followed by five minutes of questions. Although we discussed doing auto-advance in class,with the longer talks, I'd like to try without auto advance, because I think 15 minutes is too long for auto-advance to work well. Thus you can use whatever presentation tool and format that you'd like.

Your presentation should include:

Research question
Significance (why is it an important problem?)
Related Work (who else has addressed it, and how is your approach different?)
Methodology (what you did to address the problem)
Results (what new things have you learned about the research question?)
Contributions (What have you contributed to the state of the art through this project? What is the answer to your research question?)

You should also prepare a written final project document which should cover these areas, directed at a reader unfamiliar with the project. It should be written in the style of the papers we have been reading, with enough detail that an interested reader could implement your algorithm to reproduce your results. This document is due to me and Eugene by email on Thursday 12/12, the last day of classes.

Thursday 12/5:

Jun Ki Lee, Zhiqiang Sui. Learning Natural Language Commands for Robots in Home Environment Situations.
Miles Eldon and Kurt Spindler. Comparing Inference Algorithms for Grounding Trajectories.
Do Kook Choe. Navigation via Machine Translation.
Stephen Brawner. Task-based User Modeling in Shared Autonomy.

Tuesday 12/9:

David Abel and Gabrial Barth-Maron
Lauren Bilsky. Machine Translation using Grounded Language and Topic Modeling.
Andrew Kovacs and Sam Birch. Webtalk.
Xiaolu Li , Zhe Zhao. Automatic Turtle Graphics.

Thursday 12/12:

Izaak Baker and Nakul Gopalan. Athena.
Charles Yeh and Bowei Wang. Application of SHRDLU in Minecraft.
Yujie Wan, Lixing Lian. Learning Semantic Parser from Question-Answer Pairs.
Tom Sgouros. SHRDLU updated: parsing with ambiguity and without rules

Tuesday, November 26, 2013

Tree Substitution Grammars

The reading for Tuesday 12/3 will cover tree substitution grammars:

Cohn, Trevor, Sharon Goldwater, and Phil Blunsom. Inducing compact but accurate tree-substitution grammars. Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, 2009.

By Sunday 12/1 at 7pm, please post a question about the paper. By 7pm on Monday 12/2, post an answer to somebody else's question.

Friday, November 22, 2013

Deep Parsing

This week we will read a paper on deep models for parsing:

Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank, Richard Socher, Alex Perelygin, Jean Wu, Jason Chuang, Chris Manning, Andrew Ng and Chris Potts. Conference on Empirical Methods in Natural Language Processing.

There is a lot going on in this paper. By Sunday night at 5pm, post a question about something in the paper. By Monday night at 5pm, post an answer to somebody else's question.

There is no class on Thursday. Happy Thanksgiving!

Friday, November 15, 2013

Applications

This week the class will focus on applications. We will read three papers, one from the supply/warehouse environment, one from a factory floor environment, and one from a household environment. Each paper describes a robot designed for the corresponding environment. These papers give a sense of the complexity of an end-to-end robotic system, and also some visions of the ways in which robots will impact the world.

11/19/13 (Tues.)

S. Teller, Matthew R. Walter, M. Antone, A. Correa, R. Davis, L. Fletcher, E. Frazzoli, J. Glass, J.P. How, A.S. Huang, J.H. Jeon, S. Karaman, B. Luders, N. Roy, and T. Sainath. A voice-commandable robotic forklift working alongside humans in minimally-prepared outdoor environments. In Proc. IEEE Int’l Conf. on Robotics and Automation (ICRA), pages 526–533, 2010.
R. A. Knepper, T. Layton, J. Romanishin, and D. Rus. IkeaBot: An autonomous multi-robot coordinated furniture assembly system. In Proc. IEEE Int’l Conf. on Robotics and Automation (ICRA), Karlsruhe, Germany, May 2013.

11/21/13 (Thurs.)

M. Bollini, S. Tellex, T. Thompson, N. Roy, and D. Rus. Interpreting and executing recipes with a cooking robot. In 13th International Symposium on Experimental Robotics, 2012.
Jeremy Maitin-Shepard, Marco Cusumano-Towner, Jinna Lei and Pieter Abbeel. Cloth Grasp Point Detection based on Multiple-View Geometric Cues with Application to Robotic Towel Folding. In the proceedings of the International Conference on Robotics and Automation (ICRA), 2010.

By Sunday night at 5pm, post a comment discussing how language can fit into these systems. Try to identify a problem that language input or output could solve. Then discuss a technique you might use to solve it.

By Monday night at 5pm, post a reply to someone else's comment. Ask a question; expand on an idea; suggest a related citation.

Tuesday, November 12, 2013

Language and Gesture

There is a lot going on in the reading for this week. As you read the paper, focus on understanding what is given to the algorithm and what is being inferred by the algorithm. What is the input and what is the output? What are the analogs to the input and output in the language understanding domain?

Write a blog comment of about 200 words answering these questions. Please post it by 7am on Thursday morning.

Friday, November 8, 2013

Generating Communicative Actions

This week we will read two papers focused on generating communicative actions. The first one is currently in submission at a conference, and the second two papers were published last year at RSS and HRI by Sidd Srinivasa's group at CMU. There is a deep mathematical connection between the two approaches, which we will discuss in class.

11/12/13 (Tues.) Generating Language

Stefanie Tellex, Ross Knepper, Adrian Li, Thomas Howard, Daniela Rus, and Nicholas Roy. Asking for help using inverse semantics. (in preparation), 2014.

11/14/13 (Thurs.) Generating Motion

A.D. Dragan, K.T. Lee and S.S. Srinivasa, "Legibility and Predictability of Robot Motion". International Conference on Human-Robot Interaction (HRI), 2013.
A.D. Dragan and S.S. Srinivasa, "Generating Legible Motion". Robotics: Science and Systems (R:SS), 2013.

For Tuesday, please post on the blog suggestions for improving the Tellex et al. paper. I'm looking for about 200 words. This paper is not yet in its final version, so your comments will help improve it. It will also give you practice critically evaluating research which you can apply to your own work as you finish up your final projects.

Saturday, November 2, 2013

Midterm Presentations

Here are the instructions for your midterm presentations next week. Put your presentation slides in a Google Presentation in this directory:
https://drive.google.com/?pli=1&authuser=0#folders/0B3LSuLTwkM-_b3VCVVl5M0M3Tk0

Your slides must be set to auto advance. Instructions for doing that are here:
https://support.google.com/drive/answer/1696787?hl=en

You can make the timings be whatever you like, but they must automatically advance to the next slide. You have five minutes to present, and there will be a hard cutoff. Following your five minute talk, there will be ten minutes for discussion, questions, and comments. The way to be successful with this presentation format is to practice your talk. You should practice it out loud, from start to finish, at least three times, in order to do it smoothly and get the transitions right.

Your presentation should include a table of results, with at least some results. It's okay if the results aren't very good, but you should have at least run some algorithm on the dataset you plan to use and assessed its effectiveness. You should also discuss your plans for the rest of the semester.

Here is the presentation schedule:

Tuesday 11/5:

Izaak Baker and Nakul Gopalan. Athena.
Xiaolu Li , Zhe Zhao. Automatic Turtle Graphics.
Miles Eldon and Kurt Spindler. Comparing Inference Algorithms for Grounding Trajectories.
Yujie Wan, Lixing Lian. Learning Semantic Parser from Question-Answer Pairs.
Tom Sgouros. SHRDLU updated: parsing with ambiguity and without rules

Thursday 11/7:

Charles Yeh and Bowei Wang. Application of SHRDLU in Minecraft.
Lauren Bilsky. Machine Translation using Grounded Language and Topic Modeling.
Stephen Brawner. Task-based User Modeling in Shared Autonomy.
Do Kook Choe. Navigation via Machine Translation.
Jun Ki Lee, Zhiqiang Sui. Learning Natural Language Commands for Robots in Home Environment Situations.

Tuesday 11/12

David Abel and Gabriel Barth-Maron
Andrew Kovacs and Sam Birch. Webtalk.

Saturday, October 26, 2013

Machine Translation and Semantic Parsing

This Tuesday, we will cover machine translation. Please read chapter 2 of the textbook from Eugene's class:

Eugene Charniak and Mark Johnson. Introduction to Computational Linguistics. Chapter 2: Machine Translation.

Thursday's class will be a guest lecture from Yoav Artsi. There are two readings:

Yoav Artzi and Luke Zettlemoyer. Weakly Supervised Learning of Semantic Parsers for Mapping Instructions to Actions. Transactions of the Association for Computational Linguistics (TACL), 2013.

If you find that one too dense, then this paper might unpack things:

Luke Zettlemoyer and Michael Collins. Learning to Map Sentences to Logical Form: Structured Classification with Probabilistic Categorial Grammars.
In Proceedings of the Twenty First Conference on Uncertainty in Artificial Intelligence (UAI), 2005.

Friday, October 18, 2013

Dialog

Our previous readings have largely focused on one-off or one-sided interactions. This week we will think about dialogue, where the agent and human interact continuously using language. Our first reading will be a philosophical paper by Paul Grice. The reading for Tuesday's class will be a review paper covering the POMDP approach to dialogue systems, along with a philosophical paper that has influenced a lot of the thinking about dialogue.

Young, S., Gašić, M., Thomson, B., & Williams, J. D. (2013). POMDP-based statistical spoken dialog systems: A review. Proceedings of IEEE. 101:5. pages 1160-1179.
Grice, H. Paul. Logic and conversation.(1975): 41-58.

For Thursday's class, we will read Adam Vogel's recent work about the emergence of Gricean maxims from DEC-POMDP dialogue models:

Adam Vogel, Max Bodoia, Christopher Potts, and Dan Jurafsky. Emergence of Gricean Maxims from Multi-Agent Decision Theory. NAACL 2013
Adam Vogel, Christopher Potts, and Dan Jurafsky. Implicatures and Nested Beliefs in Approximate Decentralized-POMDPs.

Please post on the blog a comment of about 250 words pointing out one advantage and one disadvantage of the POMDP approach to dialogue. Post this response by 5pm on Sunday evening. Then by 7am Tuesday morning, post a response to someone else's comment on the blog.

Tuesday, October 15, 2013

Pick one of the other approaches for understanding commands (Chen & Mooney, Winograd, Matuszek et al., or Branavan et al.,) and describe how it could be integrated with SLAM. What are the strengths and weaknesses of these representations for understanding language, compared to the approach described in this week's paper?

Thursday, October 10, 2013

Semantic Mapping

On Tuesday we will study SLAM, without worrying about language. SLAM (Simultaneous Localization and Mapping) is an important problem in robotics.

Tuesday 10/14:

Durrant-Whyte, Hugh, and Tim Bailey. Simultaneous localization and mapping: part I. Robotics & Automation Magazine, IEEE 13.2 (2006): 99-110.

Then, on Thursday 10/17 we will study one attempt to integrate slam with language:

Ananth Ranganathan and Frank Dellaert. Online Probabilistic Topological Mapping. IJRR 2011.
Matthew R. Walter, S. Hemachandra, B. Homberg, S. Tellex, and S. Teller. Learning semantic maps from natural language descriptions. In Proceedings of Robotics: Science and Systems (RSS), Berlin, Germany, June 2013 (accepted, to appear).

Since you are working on your project proposals there is no required blog post assignment. But feel free to post a question or comment if you'd like!

Monday, October 7, 2013

Project Proposal Presentations

Here is the schedule for project proposal presentations. Each team will have 15 minutes. You should plan on talking for five minutes, so that most of the time will be used for discussion and feedback.

Let me know as soon as possible if you are taking the class but don't appear on the schedule.

Tuesday 10/8:

Lauren Bilsky. Machine Translation using Grounded Language and Topic Modeling.
Stephen Brawner. Task-based User Modeling in Shared Autonomy.
Do Kook Choe. Navigation via Machine Translation.
Jun Ki Lee, Zhiqiang Sui. Learning Natural Language Commands for Robots in Home Environment Situations.
Andrew Kovacs and Sam Birch. Webtalk.

Thursday 10/10:

Izaak Baker and Nakul Gopalan. Athena.
Xiaolu Li , Zhe Zhao. Automatic Turtle Graphics.
Miles Eldon and Kurt Spindler. Comparing Inference Algorithms for Grounding Trajectories.
Yujie Wan, Lixing Lian. Learning Semantic Parser from Question-Answer Pairs.

Tuesday 10/1:

David Abel and Gabrial Barth-Maron
Tom Sgouros. SHRDLU updated: parsing with ambiguity and without rules
Charles Yeh and Bowei Wang. Application of SHRDLU in Minecraft.

Tuesday, October 1, 2013

Probabilistic Grounding Models (III)

Probabilistic Grounding Models (III)

On Thursday, October 3rd, we will turn back to probabilistic grounding models, looking at one that takes a reinforcement learning approach.

S. R. K Branavan, Harr Chen, Luke S Zettlemoyer, and Regina Barzilay. Reinforcement learning for mapping instructions to actions. In Proceedings of the Association for Computational Linguistics, page 82–90, 2009.

This paper won Best Paper at ACL 2009. Post a response on the blog of about 500 words comparing this paper to the three previous papers in this area. (Chen & Mooney; Tellex et al., and Matuszek et al.) How are word meanings represented? How are they learned? What training data is required? What else is given to the system as input?

Friday, September 27, 2013

Planning Under Uncertainty

This week we will read a pair of papers about POMDPs, Partially Observable Markov Decision Processes. This framework has been used for robot planning and perception, as well as spoken dialogue systems.

Leslie P. Kaelbling, Michael L. Littman, and Anthony R. Cassandra. Planning and acting in partially observable stochastic domains. Artificial Intelligence, 101(1-2), 1998.
Hoey, Jesse, et al. Automated handwashing assistance for persons with dementia using video and a partially observable Markov decision process. Computer Vision and Image Understanding 114.5 (2010): 503-519. Focus on section 3, the POMDP model.

Post by 5pm on Sunday about 25 words answering the following question:

What are the challenges in using a POMDP model to drive a language-using agent? How could that challenge be overcome?

By 5pm on Monday, post a reply to someone else's blog post. Suggest an alternate solution, ask a clarifying question, or point out something they might find useful.

Tuesday, September 24, 2013

Probabilistic Grounding Models (II)

This week we will continue our investigation into probabilistic grounding models. We will read a paper about a language understanding system developed in collaboration between computational linguists and roboticists at the University of Washington.

Cynthia Matuszek, Nicholas FitzGerald, Luke Zettlemoyer, Liefeng Bo, and Dieter Fox. A Joint Model of Language and Perception for Grounded Attribute Learning. In Proceedings of the International Conference on Machine Learning (ICML), 2012.

This paper was written after the Chen and Mooney paper (2011) and the Tellex et al. paper. (Even though the Tellex et al. paper hasn't come out yet, the original contributions appeared in conferences in 2010 and 2011.) In some ways it unifies ideas from the two approaches: it jointly learns a semantic parsing model as well as attribute classifiers for colors and shapes.

Please post on the blog by 5pm on Wednesday a roughly 500 word answer to the following question:

Compare and contrast how this paper represents word meanings with the previous week's readings. (Chen and Mooney and Tellex et al.) What is being learned? What is given to the system? What tradeoffs are being made by the this approach, compared to the other two?

Friday, September 20, 2013

Dialogue

This Tuesday's lecture will be a guest lecture by Scott AnderBois about dialogue. The reading is here:

Ginzburg, J. The Dynamics and Semantics of Dialogue. Logic, language and computation 1 (1996).

He won't be talking about the specific formalism, so you can mostly ignore the situation and infon-specific details. He'll focus on the general mindset the paper embodies and about linguistic phenomena claimed to be sensitive to QUDs (Question Under Discussion). There is no specific assignment for this week. Instead I encourage you to begin working on your final project proposal.

UPDATE: Scott's slides are here: http://cs.brown.edu/courses/csci2951-k/papers/QUD_Lecture_Robotics.pdf.

Tuesday, September 17, 2013

Probabilistic Grounding Models (I)

There were some questions in the comments about why we've been studying the readings so far, and also about how it relates to robotics. As a result I've modified the schedule slightly, so that next week we jump into two papers about enabling robots to follow natural language commands.

The first paper describes an end-to-end system that my collaborators and I created for enabling robots to follow natural language commands about movement and manipulation. The paper is currently in submission to a journal and not currently published, so please don't distribute it:

Thomas Kollar, Stefanie Tellex, Matthew R. Walter, Albert Huang, Abraham Bachrach, Sachi Hemachandra, Emma Brunskill, Ashis Banerjee, Deb Roy, Seth Teller, Nicholas Roy. Generalized Grounding Graphs: A Probabilistic Framework for Understanding Grounded Language. (in submission.) Read carefully through section 4, paying special attention to the math in Section 3. Skim Section 5 (Route directions)

The second paper describes a different approach to following natural language commands by mapping between natural language and a symbolic language:

David Chen and Raymond Mooney. Learning to Interpret Natural Language Navigation Instructions from Observations. AAAI 2011.

Post a short (200 word) response to the following question: Compare and contrast how these two papers represent word meanings. What is being learned? What is given to the system? What tradeoffs are being made by the two different approaches?

Friday, September 13, 2013

Tuesday, September 17: Grounded Semantics

This week we will read a paper from cognitive science describing the connection between spatial language and spatial cognition. Specifically they study what geometric and perceptual features people appear to use to map between spatial language and objects in the external world.

Barbara Landau and Ray Jackendoff. “What” and “where” in spatial language and spatial cognition. Behavioral and Brain Sciences, 16:217–265, 1993.

For the assignment, let's try a different format to encourage more discussion.

By Sunday night at 5pm, please post a comment on the blog of about 200 words about any of the things we have discussed so far. You might describe a possible project, ask a question and present some possible answers, or compare and contrast ideas in what we've read so far and suggest areas you'd like to investigate more closely
By Monday night at 5pm, please reply to at least one other comment. Give them feedback about their ideas, try to answer their question, or expand on a point that you agree with. I'm looking for about 200 words total; it could be spread across several different comments.

Tuesday, September 10, 2013

Thursday, September 12: Semantics

This week we will read about semantics and natural language. My goal is for you to understand how words combine through syntax to create meaning. In the following weeks we will study computational approaches, but this week we will focus on linguistic representations.

Read:

I. Heim and A. Kratzer. Semantics in generative grammar, volume 13. Blackwell Oxford, 1998. Chapter 1, 2.1, 2.2, 2.3, 2.5, 4.4.

Questions:

Complete the following mini-problem set: http://cs.brown.edu/courses/csci2951-k/psets/2013-09-10-semantics/. Download the latex file and fill in the entries in the table of word meanings using the notation from Heim and Kratzer. Email me the tex and pdf file when you are done. If you have never used latex before, there is more information about it at the Brown CS LaTeX page.
Post an answer to the following question on the blog: How should a robot represent word meanings? What is good about the Heim and Kratzer approach? What is missing?

Thursday, September 5, 2013

Tuesday, September 10, 2013: Language Grounding and Artificial Intelligence

This Tuesday we will read one of the first systems for understanding natural language:

Terry Winograd. Procedures as a Representation for Data in a Computer Program for Understanding Natural Language. PhD thesis, Massachusetts Institute of Technology, 1971. (pages 1-52)

Terry Winograd's thesis was a system for understanding language about a table top filled with blocks of different shapes and colors. The whole thesis is more than 300 pages long; so only read the first two chapters. Pay particular attention to Section 1.3, the sample dialog. Please post a response of about 500 words¹ to the following questions in the forum:

How is SHRDLU representing word meanings? How is it combining word meanings together to understand entire sentences?
What are the strengths and weaknesses of Winograd's approach for representing word meaning? Especially think about running on a real robot instead of simulation. For additional contex, skim Harnad's paper on the symbol grounding problem:

Stevan Harnad. The symbol grounding problem. Physica D, 43:335–346, 1990.

What techniques and algorithms are necessary to support the types of linguistic behaviour that appear in Section 1.3? Brainstorm!

1. I'm looking for 500 words total, not 500 words for each question.

Wednesday, September 4, 2013

Welcome!

Welcome to Topics in Grounded Language for Robotics! In the home, in the factory, and in the field, robots have been deployed for tasks such as vacuuming, assembling cars, and disarming explosives. As robots become more powerful and more autonomous, it is crucial to develop ways for people to communicate with them. Natural language is an intuitive and flexible way of enabling this type of communication. The aim of this course is to study how to endow robots with the ability to interact with humans using natural language and then build systems!

The course will cover foundational material in artificial intelligence, computational linguistics, and robotics, as well as a survey of recent conference and journal papers. A collaborative final project will provide an opportunity to more deeply engage with the material and provide a jumping-off point for future research.

We will use the course blog to post updates and host discussions about the papers. Stay tuned for more details!

Pages