RL Online Resources

Reinforcement learning online resources

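  1. OpenAI Spinning Up: simple derivations and hands-on practice
  2. Berkeley CS285 reinforcement learning course, mainly the policy gradient (PG) school
  3. Hung-yi Lee's RL course: clearly structured and relatively easy
  4. pymarl, a multi-agent RL framework, including QMIX and COMA implementations under the CTDE approach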

Open-source implementations

  1. An RL acceleration framework that surpasses IMPALA and SEED RL
  2. PyTorch implementation of IMPALA
  3. DRL frameworks: garage, RLlib, Catalyst, rlpyt
  4. SMAC: StarCraft Multi-Agent Challenge
  5. Fully Cooperative Multiagent Object Transportation Problems (CMOTPs)
    The Apprentice Firemen Game
    Pommerman
    StarCraft Multi-Agent Challenge
    The Multi-Agent Reinforcement Learning in Malmo (MARLO)
    Hanabi is a cooperative multiplayer card game (two to five players)
    Arena
    MuJoCo Multiagent Soccer
    Neural MMO

Game theory mechanism experiments

  1. Keynesian beauty contest
  2. Auction
  3. Rock-paper-scissors
  4. StarCraft II
  5. two didactic
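
As a small illustration of the first experiment in the list above, here is a minimal sketch of level-k reasoning in a beauty contest. It assumes the common "guess p times the average" formulation with p = 2/3 and a level-0 guess of 50; the notes do not specify either, and the function names are hypothetical.

```python
def level_k_guess(k, p=2/3, level0_guess=50.0):
    """Guess of a level-k player who assumes everyone else reasons at level k-1.

    Assumption: a level-0 player guesses level0_guess, and each higher level
    best-responds by multiplying the previous level's guess by p (the player's
    own effect on the average is ignored).
    """
    guess = level0_guess
    for _ in range(k):
        guess = p * guess
    return guess


def winner(guesses, p=2/3):
    """Index of the guess closest to p times the mean of all guesses."""
    target = p * sum(guesses) / len(guesses)
    return min(range(len(guesses)), key=lambda i: abs(guesses[i] - target))


# One player at each reasoning level 0..3.
guesses = [level_k_guess(k) for k in range(4)]
best = winner(guesses)
```

With one player per level, the winner is not the deepest reasoner: the best guess depends on how deep the other players actually reason, which is the point of the experiment.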

Groups and researchers to follow

Anuj Mahajan -- OATML

Chongjie Zhang -- Tsinghua University


Current challenges in RL

  1. Scalability: CTDE (Centralized Training and Decentralized Execution)
  2. Credit assignment: each agent's contribution to the team
  3. Uncertainty (non-stationarity): partial and noisy observations; communication is used to resolve uncertainty coming from the environment, and the agents influence one another
  4. Heterogeneity: requiring diverse behaviors of agents / role-based
  5. Hierarchy: hierarchical agents, and the models that multi-level agents face
  6. Coordination
  7. Generality: generalization of RL; off-policy learning, and data at inference time differing from training data
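
To make items 1 and 2 concrete, here is a minimal sketch of CTDE with an additive value decomposition (in the style of VDN, a simpler relative of QMIX, not QMIX itself). The one-step two-agent game, the reward function, and all names are assumptions made for illustration. Training sweeps over every joint action instead of exploring.

```python
from itertools import product

N_AGENTS = 2
N_ACTIONS = 2


def team_reward(joint_action):
    # Hypothetical cooperative game: the team scores only if both agents pick action 1.
    return 1.0 if joint_action == (1, 1) else 0.0


def train(sweeps=500, lr=0.1):
    """Centralized training: per-agent utilities are fitted to the shared team reward."""
    q = [[0.0] * N_ACTIONS for _ in range(N_AGENTS)]
    for _ in range(sweeps):
        for joint_action in product(range(N_ACTIONS), repeat=N_AGENTS):
            # Additive mixing: Q_tot is the sum of the chosen per-agent utilities.
            q_tot = sum(q[i][a] for i, a in enumerate(joint_action))
            td_error = team_reward(joint_action) - q_tot
            # The same team-level error is pushed back to every agent's chosen action.
            for i, a in enumerate(joint_action):
                q[i][a] += lr * td_error
    return q


def greedy_action(q_i):
    """Decentralized execution: an agent consults only its own utilities."""
    return max(range(len(q_i)), key=q_i.__getitem__)


q = train()
joint = tuple(greedy_action(q_i) for q_i in q)
```

After training, each agent prefers action 1 using only its own table, even though the reward was only ever observed at the team level. The additive form cannot represent this reward table exactly, but it still recovers the right greedy joint action here.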