An Inverse Risk-Sensitive Reinforcement Learning Approach to Learning Agent Preferences
Eric Mazumdar

Citation
Eric Mazumdar. "An Inverse Risk-Sensitive Reinforcement Learning Approach to Learning Agent Preferences". Talk or presentation, 24, August, 2017.

Abstract
The modeling and learning of human behaviors is becoming increasingly important as critical systems begin to rely more on automation and artificial intelligence. Yet, in this task we face a number of challenges, not least of which is the fact that humans are known to behave in ways that are not completely rational. We address the problem of learning a risk-sensitive agent's value function in a Markov decision process (MDP) through a gradient-based inverse risk-sensitive reinforcement learning approach. We model risk-sensitivity in a reinforcement learning framework by making use of models of human decision-making having their origins in behavioral psychology, behavioral economics, and neuroscience. We validate the approach on the classical Grid World MDP and an MDP constructed from ride-sharing data.
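The flavor of the approach can be conveyed with a small, hypothetical sketch — this is not the talk's actual algorithm, and all names, the toy data, and the one-step decision setup below are illustrative assumptions. It combines a prospect-theory-style utility (a standard behavioral-economics model of risk sensitivity, concave over gains and loss-averse over losses) with a gradient-based fit of the loss-aversion parameter from observed choices:

```python
import numpy as np

# Hypothetical sketch, not the presented algorithm: fit the loss-aversion
# parameter `lam` of a prospect-theory-style utility to observed choices,
# using gradient descent on a negative log-likelihood.

def utility(r, lam, alpha=0.88):
    """Prospect-theoretic value: r^alpha for gains, -lam * (-r)^alpha for losses."""
    return np.where(r >= 0, np.abs(r) ** alpha, -lam * np.abs(r) ** alpha)

def choice_probs(rewards, lam):
    """Softmax policy over the risk-sensitive utilities of each action."""
    u = utility(rewards, lam)
    e = np.exp(u - u.max())
    return e / e.sum()

def neg_log_likelihood(lam, rewards, counts):
    """How poorly a given lam explains the observed action counts."""
    return -np.sum(counts * np.log(choice_probs(rewards, lam)))

# Toy one-step decision: a unit gain vs. a unit loss, with the gain chosen
# 90 times out of 100 -- behavior consistent with loss aversion (lam > 1).
rewards = np.array([1.0, -1.0])
counts = np.array([90.0, 10.0])

# Gradient descent on lam with a central finite-difference gradient.
lam, lr, eps = 1.0, 0.05, 1e-5
for _ in range(500):
    g = (neg_log_likelihood(lam + eps, rewards, counts)
         - neg_log_likelihood(lam - eps, rewards, counts)) / (2 * eps)
    lam = max(lam - lr * g, 1e-3)  # lam converges to about 1.2 here
```

In the actual work the object being recovered is the agent's value function in an MDP rather than a single bandit parameter, but the same ingredients appear: a behavioral model of risk sensitivity, a likelihood of observed behavior, and gradient-based optimization.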

Electronic downloads


Internal. This publication has been marked by the author for FORCES-only distribution, so electronic downloads are not available without logging in.
Citation formats  
  • HTML
    Eric Mazumdar. <a
    href="http://www.cps-forces.org/pubs/282.html"
    ><i>An Inverse Risk-Sensitive Reinforcement
    Learning Approach to Learning Agent
    Preferences</i></a>, Talk or presentation,  24,
    August, 2017.
  • Plain text
    Eric Mazumdar. "An Inverse Risk-Sensitive Reinforcement
    Learning Approach to Learning Agent Preferences". Talk
    or presentation,  24, August, 2017.
  • BibTeX
    @presentation{Mazumdar17_InverseRiskSensitiveReinforcementLearningApproachTo,
        author = {Eric Mazumdar},
        title = {An Inverse Risk-Sensitive Reinforcement Learning
                  Approach to Learning Agent Preferences},
        day = {24},
        month = {August},
        year = {2017},
        abstract = {The modeling and learning of human behaviors is
                  becoming increasingly important as critical
                  systems begin to rely more on automation and
                  artificial intelligence. Yet, in this task we face
                  a number of challenges, not least of which is the
                  fact that humans are known to behave in ways that
                  are not completely rational. We address the
                  problem of learning a risk-sensitive agent's value
                  function in a Markov decision process (MDP)
                  through a gradient-based inverse risk-sensitive
                  reinforcement learning approach. We model
                  risk-sensitivity in a reinforcement learning
                  framework by making use of models of human
                  decision-making having their origins in behavioral
                  psychology, behavioral economics, and
                  neuroscience. We validate the approach on the
classical Grid World MDP and an MDP constructed
                  from ride-sharing data.},
        URL = {http://cps-forces.org/pubs/282.html}
    }
    

Posted by Carolyn Winter on 24 Aug 2017.
Groups: forces
For additional information, see the Publications FAQ or contact webmaster at cps-forces org.

Notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright.