Reinforcement Learning Explained

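  • Offered by: Institute for Information Industry (III) - Microsoft Professional Program (MPP)
  • Platform: Institute for Information Industry (III) Digital Learning Platform
  • Instructors: Jonathan Sanito; Roland Fernandez; Adith Swaminathan; Kenneth Tran; Katja Hofmann; Matthew Hausknecht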

    Jonathan Sanito

    Senior Content Developer, Microsoft

    Jonathan works as a content developer and project manager for Microsoft, focusing on Data and Analytics online training. He has worked on training for developer and IT pro audiences, from Microsoft Dynamics NAV to Windows Active Directory.

    Before coming to Microsoft, Jonathan worked as a consultant for a Microsoft partner, implementing Microsoft Dynamics NAV solutions.

    Roland Fernandez

    Senior Researcher and AI School Instructor, Deep Learning Technology Center, Microsoft Research AI

    Roland works as a researcher and AI School instructor in the Deep Learning Technology Center of Microsoft Research AI. His interests include reinforcement learning, autonomous multitask learning, symbolic representation, AI education, information visualization, and HCI. Before coming to the DLTC, Roland worked in the VIBE group of MSR doing visualization and HCI projects, most notably the SandDance project. Before MSR, Roland worked (at Microsoft and other companies) in the areas of Natural User Interfaces, Activity Based Computing, Advanced Prototyping, Programmer Tools, Operating Systems, and Databases.

    Adith Swaminathan

    Researcher, Microsoft Research AI

    Adith is a researcher at the Deep Learning Technology Center at Microsoft Research. He studies principles and algorithms that can improve human-centered systems using machine learning. Adith spent the 2015-16 academic year visiting the Information and Language Processing Systems group at the University of Amsterdam. He interned with the Machine Learning group at Microsoft Research NYC during the summer of 2015, the Computer Human Interactive Learning group (now called the Machine Teaching Group) at Microsoft Research Redmond during the summer of 2013, and Search Labs at Microsoft Research during the summer of 2012. He also worked as a strategist with Tower Research Capital for 14 months, from June 2010 to July 2011.

    Kenneth Tran

    Principal Research Engineer, Microsoft Research AI

    Kenneth is a Principal Research Engineer at the Deep Learning Technology Center. He has wide-ranging interests in machine learning, spanning optimization algorithms to distributed systems. His current main research pursuit is deep reinforcement learning, with a focus on off-policy learning and sample-efficient methods, safe exploration, reverse reinforcement learning, and real-world optimal control applications such as drone control, data center energy optimization, and indoor farming optimization.

    Katja Hofmann

    Researcher, Microsoft Research AI

    Katja is a researcher in the Machine Intelligence and Perception group at Microsoft Research Cambridge. She is the research lead of Project Malmo, which uses the popular game Minecraft as an experimentation platform for developing intelligent technology. Her long-term goal is to develop AI systems that learn to collaborate with people, to empower their users and help solve complex real-world problems. Outside of Project Malmo, Katja works on online evaluation and interactive learning for information retrieval, which means understanding how machine learning and artificial intelligence can be applied to develop more intelligent search and recommendation systems.

    Matthew Hausknecht

    Researcher, Microsoft Research AI

    Matthew is a researcher at Microsoft Research. His interests involve expanding the capabilities of intelligent agents. His main research is at the intersection of Reinforcement Learning and Deep Learning. Matthew received his PhD from the University of Texas at Austin under the supervision of Peter Stone.


    About this course

    Reinforcement Learning (RL) is an area of machine learning in which an agent learns to achieve a goal by interacting with its environment.

    In this course, you will be introduced to the world of reinforcement learning. You will learn how to frame reinforcement learning problems and start tackling classic examples like news recommendation, learning to navigate in a grid-world, and balancing a cart-pole.
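
    To give a flavor of what "learning to navigate in a grid-world" can look like, here is a minimal sketch. It is not taken from the course materials: the five-cell corridor, the reward of 1.0 at the goal, the hyperparameters, and the choice of tabular Q-learning (one temporal difference method) are all assumptions made for illustration.

```python
import random

# Hypothetical layout: a corridor of cells 0..4, with cell 4 as the goal.
N_STATES = 5
ACTIONS = (-1, +1)  # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # illustrative hyperparameters


def step(state, action):
    """Environment: apply the move, clip at the walls, reward 1.0 at the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, seed=0):
    """Tabular Q-learning; returns the table q[state][action_index]."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit, sometimes explore; ties are broken randomly.
            if rng.random() < EPSILON or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # TD target: immediate reward plus discounted best value of the next state.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Preferred move for each non-goal cell: -1 (left) or +1 (right)."""
    return [ACTIONS[0 if row[0] > row[1] else 1] for row in q[:-1]]
```

    Calling greedy_policy(train()) returns one move per non-goal cell. In this toy setup the agent should learn to move right from every cell, since that is the shortest route to the goal.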

    You will explore the basic algorithms, from multi-armed bandits through dynamic programming and TD (temporal difference) learning, and progress towards larger state spaces using function approximation, in particular deep learning. You will also learn about algorithms that focus on searching for the best policy, using policy gradient and actor-critic methods. Along the way, you will be introduced to Project Malmo, a platform for artificial intelligence experimentation and research built on top of the Minecraft game.
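
    As a small taste of the first topic in that list, the sketch below shows one common way to approach a multi-armed bandit: the epsilon-greedy strategy with incremental sample-average estimates. It is illustrative only and not taken from the course. The function name run_bandit, the Gaussian reward model with unit noise, and the default parameters are assumptions.

```python
import random


def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy on a Gaussian bandit; returns (value estimates, pull counts)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running average reward per arm
    counts = [0] * n_arms       # number of times each arm was pulled
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda i: estimates[i])
        # Assumed reward model: true mean plus unit-variance Gaussian noise.
        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # Incremental update of the sample average.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts
```

    For example, run_bandit([0.2, 0.5, 1.5]) simulates three hypothetical arms. With a gap this large, the agent should end up pulling the last arm far more often than the others, while the epsilon parameter keeps a small share of pulls for exploration.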



    What you'll learn
    • Reinforcement Learning Problem
    • Markov Decision Process
    • Dynamic Programming
    • Temporal Difference Learning
    • Approximate Solution Methods
    • Policy Gradient and Actor-Critic
    • RL that Works