Machine Learning Methods in Modeling Human Learning (Psy 5993-034)
University of Minnesota, Fall Semester, 2008
http://www.schrater.org
Instructors:
Paul Schrater (schrater@umn.edu)
Adam Johnson
Meeting time: Friday 2-3:30pm
Place: Elliott Hall 204
Recent advances in machine learning provide a powerful set of
new tools to understand human learning.
Understanding the computational principles and fundamental problems
faced by attempts to produce artificial agents provides a framework for
developing models of human abilities. Because of its intrinsic
importance to human behavior, learning is a central problem for
researchers interested in development, neuroscience, cognition and
behavior, and artificial intelligence. We will study three interrelated
issues where cognitive scientists have begun using machine learning
tools to study learning. In particular, we will look at the role of
structure learning, causal analysis, and hierarchy in explaining
difficult-to-model aspects of human learning.
Format: Discussion of journal articles led by seminar members.
Students will prepare a term paper or term project on a related topic.
SEPT 12th
Planning and Acting in Partially Observable Stochastic Domains
http://www.eecs.harvard.edu/~avi/CS281r/F06/Papers/kaelbling-et-al-pomdp.ps
POMDP for dummies
and/or Chapter 3 of Sutton and Barto's book Reinforcement Learning: An Introduction
http://www.cs.ualberta.ca/%7Esutton/book/ebook/node27.html
SEPT 19th
SEPT 26th
OCT 3rd REPRESENTATION OF REWARD
OCT 10th REPRESENTATION OF ACTION
OCT 17th GENERALIZATION OF VALUE
Steve Damer will lead
Mahadevan, S. and Maggioni, M. "Proto-Value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes", Journal of Machine Learning Research, pp. 2169-2231, vol. 8, 2007, MIT Press.
http://www.cs.umass.edu/~mahadeva/papers/06-35.pdf
Arsen Bagyan will lead
Mannor, S., Menache, I., Hoze, A., and Klein, U. 2004. Dynamic abstraction in reinforcement learning via clustering.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1.5975&rep=rep1&type=pdf
OCT 23rd REPRESENTATION OF ENVIRONMENT
Brett Hemes will lead
Boutilier, C., Dearden, R., Goldszmidt, M. (1995). Exploiting Structure in Policy Construction. In IJCAI 1104-1113.
http://www.isi.edu/~blythe/cs541/Readings/spi.pdf
Guestrin, C., Koller, D., Parr, R., and Venkataraman, S. (2003) Efficient Solution Algorithms for Factored MDPs, Journal of Artificial Intelligence Research 19, 399-468.
http://www.cs.cmu.edu/afs/cs/project/jair/pub/volume19/guestrin03a.pdf
OCT 31st
NOV 7th
NOV 14th
Paul Schrater will lecture on Causal models
J. Pearl, "Graphs, Causality, and Structural Equation Models", UCLA Cognitive Systems Laboratory, Technical Report (R-253), June 1998. Sociological Methods and Research, Vol. 27, No. 2, 226-284, November 1998.
http://ftp.cs.ucla.edu/pub/stat_ser/R253.pdf
Glymour, C., Learning, prediction and causal Bayes nets, Trends Cogn. Sci. 7 (2003), pp. 43–48. pdf
NOV 21st
NOV 28th THANKSGIVING!
DEC 5th
Adam Steiner will lead
Model uncertainty in classical conditioning. A. Courville, N.D. Daw, G. Gordon, and D.S. Touretzky. Advances in Neural Information Processing Systems 16, MIT Press, Cambridge, MA, 2005.
http://www.cns.nyu.edu/~daw/cdgt03.pdf
Adam Johnson will lead
Tse D, Langston RF, Kakeyama M, Bethus I, Spooner PA, Wood ER, Witter MP, Morris RGM (2007) Schemas and memory consolidation. Science 316(5821):76–82.
http://www.sciencemag.org/cgi/content/abstract/316/5821/76
DEC 12th
Arsen Bagyan & Paul will lead
Kemp, C., & Tenenbaum, J. B. (2008). The discovery of structural form. Proceedings of the National Academy of Sciences, 105(31), 10687-10692.
http://www.psy.cmu.edu/~ckemp/papers/kempt08.pdf
Chris Kallie will lead, if time:
Lu, H., Rojas, R., Beckers, T., & Yuille, A. (2008). Sequential causal learning in humans and rats. Proceedings of the Twenty-ninth Annual Conference of the Cognitive Science Society. [PDF]
Background reading
Human learning - concepts and key results
Causal learning - theoretical framework
J. Pearl, "Graphs, Causality, and Structural Equation Models", UCLA Cognitive Systems Laboratory, Technical Report (R-253), June 1998. Sociological Methods and Research, Vol. 27, No. 2, 226-284, November 1998.
http://ftp.cs.ucla.edu/pub/stat_ser/R253.pdf
JUDEA PEARL - CAUSALITY http://bayes.cs.ucla.edu/BOOK-2K/
Sequential decision making - theoretical framework
Planning and Acting in Partially Observable Stochastic Domains
http://www.eecs.harvard.edu/~avi/CS281r/F06/Papers/kaelbling-et-al-pomdp.ps
POMDP for dummies
Recent Advances in Hierarchical Reinforcement Learning
Hierarchical Reinforcement Learning with the MAXQ Value Function
http://www.jair.org/media/639/live-639-1834-jair.pdf
Tentative Reading List
Bray S, Rangel A, Shimojo S, Balleine B, O’Doherty JP (2008) The neural mechanisms underlying the influence of pavlovian cues on human decision making. J Neurosci 28(22):5861–5866. URL http://dx.doi.org/10.1523/JNEUROSCI.0897-08.2008.
Boutilier, C., Dearden, R., Goldszmidt, M. (1995). Exploiting Structure in Policy Construction. In IJCAI 1104-1113. http://www.isi.edu/~blythe/cs541/Readings/spi.pdf
Cohen JD, McClure SM, Yu AJ (2007) Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration. Philos Trans R Soc Lond B Biol Sci 362(1481):933–942. URL http://dx.doi.org/10.1098/rstb.2007.2098.
Colwill RM, Rescorla RA (1990) Evidence for the hierarchical structure of instrumental learning. Animal Learning and Behavior 18(1):71–82. pdf
Daw ND, Niv Y, Dayan P (2005) Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nature Neuroscience 8(12):1704–1711. pdf
Daw ND, O’Doherty JP, Dayan P, Seymour B, Dolan RJ (2006) Cortical substrates for exploratory decisions in humans. Nature 441(7095):876–879. http://www.nature.com/nature/journal/v441/n7095/abs/nature04766.html
Glymour, C., Learning, prediction and causal Bayes nets, Trends Cogn. Sci. 7 (2003), pp. 43–48. pdf
Guestrin, C., Koller, D., Parr, R., and Venkataraman, S. (2003) Efficient Solution Algorithms for Factored MDPs, Journal of Artificial Intelligence Research 19, 399-468. http://www.cs.cmu.edu/afs/cs/project/jair/pub/volume19/guestrin03a.pdf
Hagmayer, Y. et al. Causal reasoning through intervention. In Causal Learning: Psychology, Philosophy, and Computation (Gopnik, A. and Schulz, L., eds), Oxford University Press. http://else.econ.ucl.ac.uk/papers/uploaded/199.pdf
Kakade S, Dayan P (2002) Dopamine: generalization and bonuses. Neural Networks 15:549–599. http://ttic.uchicago.edu/~sham/papers/neuro/nn_da.pdf
Kemp, C., & Tenenbaum, J. B. (2008). The discovery of structural form. Proceedings of the National Academy of Sciences, 105(31), 10687-10692. http://www.psy.cmu.edu/~ckemp/papers/kempt08.pdf
Kemp C, Perfors A, Tenenbaum JB (2007) Learning overhypotheses with hierarchical Bayesian models. Dev Sci 10(3):307–321. http://web.mit.edu/cocosci/Papers/devsci07_kempetal.pdf
Lu H, Yuille AL, Liljeholm M, Cheng PW, Holyoak KJ (2008) Bayesian generic priors for causal learning. Psychol Rev, in press. http://www.stat.ucla.edu/~yuille/pubs/ucla/C10_hjlu_PsychRev2007.pdf
Lu, H., Rojas, R., Beckers, T., & Yuille, A. (2008). Sequential causal learning in humans and rats. Proceedings of the Twenty-ninth Annual Conference of the Cognitive Science Society. [PDF]
Mahadevan, S. and Maggioni, M. "Proto-Value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes", Journal of Machine Learning Research, pp. 2169-2231, vol. 8, 2007, MIT Press. http://www.cs.umass.edu/~mahadeva/papers/06-35.pdf
Mannor, S., Menache, I., Hoze, A., and Klein, U. 2004. Dynamic abstraction in reinforcement learning via clustering. In Proceedings of the Twenty-First International Conference on Machine Learning (Banff, Alberta, Canada, July 04 - 08, 2004). ICML '04, vol. 69.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.1.5975&rep=rep1&type=pdf
Mehta, N., Ray, S., Tadepalli, P., Dietterich, T. (2008). Automatic Discovery and Transfer of MAXQ Hierarchies. International Conference on Machine Learning (ICML-2008) http://pages.cs.wisc.edu/~sray/papers/maxq.icml08.pdf
Niv Y., Joel D., Meilijson I. and Ruppin E. (2002) Evolution of Reinforcement Learning in Uncertain Environments: A Simple Explanation for Complex Foraging Behaviors. pdf
Niv Y, Daw ND, Dayan P (2005) How fast to work: Response vigor, motivation and tonic dopamine. In: Advances in Neural Information Processing Systems 18. Cambridge, MA: MIT Press. http://www.cns.nyu.edu/~daw/ndd05.pdf
Niv Y, Joel D, Dayan P (2006) A normative perspective on motivation. Trends Cogn Sci 10(8):375–381. http://www.gatsby.ucl.ac.uk/~dayan/papers/njd2006.pdf
Sloman, S., Hagmayer, Y. (2006) The causal psycho-logic of choice, Trends in Cognitive Sciences, Volume 10, Issue 9, Pages 407-412.
(http://www.sciencedirect.com/science/article/B6VH9-4KKNNHN-2/2/2e28a448e7044c8e93c064f7d9908c5e)
Steyvers et al., Inferring causal networks from observations and interventions, Cogn. Sci. 27 (2003), pp. 453–489.
http://web.mit.edu/cocosci/Papers/steyvers-etal-2003.pdf
Talmi D, Seymour B, Dayan P, Dolan RJ (2008) Human pavlovian-instrumental transfer. J Neurosci 28(2):360–368. URL http://dx.doi.org/10.1523/JNEUROSCI.4028-07.2008.
Tanaka SC, Balleine BW, O’Doherty JP (2008) Calculating consequences: brain systems that encode the causal effects of actions. J Neurosci 28(26):6750–6755. URL http://dx.doi.org/10.1523/JNEUROSCI.1808-08.2008.
Tolman EC (1939) Prediction of vicarious trial and error by means of the schematic sowbug. Psychological Review 46:318–336.
http://www.cc.gatech.edu/ai/robot-lab/research/eBug/
Tse D, Langston RF, Kakeyama M, Bethus I, Spooner PA, Wood ER, Witter MP, Morris RGM (2007) Schemas and memory consolidation. Science 316(5821):76–82. http://www.sciencemag.org/cgi/content/abstract/316/5821/76
Wang, G., Mahadevan, S. "Hierarchical Optimization of Policy-Coupled Semi-Markov Decision Processes", Proceedings of the 16th International Conference on Machine Learning (ICML '99), Bled, Slovenia, June 27-30, 1999.
http://www.cs.umass.edu/~mahadeva/papers/icml99.ps.gz
Yu AJ, Dayan P (2005) Uncertainty, neuromodulation, and attention. Neuron 46(4):681–692. URL http://dx.doi.org/10.1016/j.neuron.2005.04.026.