I recently released a new version of the paper The Problem of Social Cost in Multi-Agent General Reinforcement Learning: Survey and Synthesis on arXiv; it can be found at
https://arxiv.org/abs/2412.02091
The new version has
- a more comprehensive description of the agent valuation functions in Section 4.1, as well as some variations and their problems in Appendix A.1;
- a description in Section 5.2 of Bayesian reinforcement learning agents adapted from Dynamic Hedge AIXI that, as a group, will converge to a Nash equilibrium in general history-based environments, as long as a certain “grain-of-truth” condition is satisfied;
- a description in Section 5.3 of how the latest advances in Swap Regret Minimisation algorithms can be used to design agents that collectively converge to a correlated equilibrium, which includes Nash equilibrium as a special case, even when the “grain-of-truth” condition is not satisfied; and
- a new Guaranteed Utility Mechanism as an alternative to the VCG-based Mechanism in Section 4.1.
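The swap-regret result mentioned above has a classical finite-game analogue that is easy to demonstrate: in Hart and Mas-Colell's conditional regret-matching procedure, every player's average internal (swap) regret vanishes over time, and the empirical joint play converges to the set of correlated equilibria. Below is a minimal sketch in a toy 2x2 Chicken game; the payoff matrix, constants, and function name are illustrative assumptions of mine, not constructs from the paper, whose setting is far more general.

```python
import numpy as np

# Toy symmetric 2x2 "Chicken" game, chosen purely for illustration.
# U[a, b] is a player's payoff for playing action a against an
# opponent playing action b.
U = np.array([[0.0, 7.0],
              [2.0, 6.0]])

def regret_matching(T=50_000, mu=20.0, seed=0):
    """Hart & Mas-Colell conditional (internal) regret matching.

    Each player keeps, for every action pair (k, j), the average gain she
    would have made by playing j at the rounds where she actually played k.
    Switching probabilities are proportional to the positive part of these
    regrets, which drives all average internal regrets to zero, so the
    empirical joint play approaches the set of correlated equilibria.
    """
    rng = np.random.default_rng(seed)
    n = U.shape[0]
    R = [np.zeros((n, n)), np.zeros((n, n))]  # cumulative conditional regrets
    last = [0, 0]                             # each player's previous action
    joint = np.zeros((n, n))                  # empirical joint play counts

    for t in range(1, T + 1):
        acts = []
        for i in range(2):
            k = last[i]
            p = np.maximum(R[i][k] / t, 0.0) / mu  # prob. of switching k -> j
            p[k] = 0.0
            p[k] = max(0.0, 1.0 - p.sum())         # leftover mass: stay at k
            acts.append(rng.choice(n, p=p / p.sum()))
        a, b = acts
        joint[a, b] += 1
        for j in range(n):  # "what if I had played j at this round instead?"
            R[0][a, j] += U[j, b] - U[a, b]
            R[1][b, j] += U[j, a] - U[b, a]  # symmetric game: same matrix
        last = [a, b]

    return joint / T, [R[0] / T, R[1] / T]  # empirical dist., average regrets
```

Calling regret_matching() returns the empirical joint distribution together with each player's average conditional regrets, whose positive parts shrink roughly like 1/sqrt(T); the paper's Section 5.3 concerns the much harder extension of this idea from matrix games to general history-based environments.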
These are all non-trivial extensions of the paper, building on recent results from several different fields, and they are worth a read.
Enjoy.