# Federated Learning Algorithms & Applications II – Invited Special Session

Session Type: Lecture

Session Code: A5L-E

Location: Room 5

Date & Time: Wednesday, March 22, 2023 (15:20-16:20)

Chair: Sanjay Purushotham

Track: 12

Paper ID | Paper Name | Authors | Abstract |
---|---|---|---|
3030 | Distributed Policy Gradient with Heterogeneous Computations for Federated Reinforcement Learning | Ye Zhu, Xiaowen Gong | The rapid advances in federated learning (FL) in the past few years have recently inspired federated reinforcement learning (FRL), where multiple reinforcement learning (RL) agents collaboratively learn a common decision-making policy without exchanging their raw interaction data with their environments. In this paper, we consider a general FRL framework where agents interact with different environments that have identical state and action spaces but different rewards and dynamics. Motivated by the fact that agents often have heterogeneous computation capabilities, we propose a heterogeneous distributed policy gradient (PG) algorithm for FRL, where agents can use different numbers of data trajectories for their local PG algorithms. We characterize the training loss bound of the proposed algorithm, which shows that the algorithm converges at a rate of $O(1/T)$, where $T$ is the number of communication rounds; this matches the performance of existing algorithms for distributed RL. The result also shows the impact of the trajectory numbers on the training loss. The theoretical results are verified empirically for various RL benchmark tasks. |
3133 | Teaching Reinforcement Learning Agents via Reinforcement Learning | Kun Yang, Chengshuai Shi, Cong Shen | In many real-world reinforcement learning (RL) tasks, the agent who takes the actions often only has partial observations of the environment. On the other hand, a principal may have a complete, system-level view but cannot directly take actions to interact with the environment. Motivated by this agent-principal mismatch, we study a novel “teaching” problem where the principal attempts to guide the agent’s behavior via implicit adjustments to the agent’s observed rewards. Rather than solving specific instances of this problem, we develop a general RL framework for the principal to teach any RL agent without knowing the optimal action a priori. The key idea is to view the agent as part of the environment, and to directly set the reward adjustments as actions, such that efficient learning and teaching can be simultaneously accomplished at the principal. This framework is fully adaptive to diverse principal and agent settings (such as different agent strategies and adjustment costs), and can adopt a variety of RL algorithms to solve the teaching problem with provable performance guarantees. Extensive experimental results on different RL tasks demonstrate that the proposed framework guarantees stable convergence and achieves the best tradeoff between rewards and costs among various baseline solutions. |
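The setup described in paper 3030 — agents in heterogeneous environments running local policy-gradient steps with different trajectory counts, aggregated by a server over communication rounds — can be illustrated with a toy sketch. Everything below is an assumption for illustration, not the paper's method: a one-step environment with a softmax policy, REINFORCE gradient estimates, two agents with different reward functions and different trajectory budgets (5 vs. 20), and simple unweighted server averaging.

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def local_pg_gradient(theta, reward_fn, n_traj, rng):
    """REINFORCE gradient estimate from n_traj one-step trajectories."""
    grad = np.zeros_like(theta)
    probs = softmax(theta)
    for _ in range(n_traj):
        a = rng.choice(len(theta), p=probs)
        r = reward_fn(a)
        # grad of log pi(a) for a softmax policy: e_a - probs
        g = -probs.copy()
        g[a] += 1.0
        grad += r * g
    return grad / n_traj

def federated_pg(reward_fns, n_trajs, n_actions=3, rounds=200, lr=0.5, seed=0):
    """Server averages heterogeneous local PG estimates each communication round."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(n_actions)
    for _ in range(rounds):
        grads = [local_pg_gradient(theta, f, n, rng)
                 for f, n in zip(reward_fns, n_trajs)]
        theta += lr * np.mean(grads, axis=0)  # unweighted average (an assumption)
    return theta

# Two agents with different reward functions; action 0 is best on average.
fns = [lambda a: [1.0, 0.2, 0.1][a], lambda a: [0.9, 0.0, 0.3][a]]
theta = federated_pg(fns, n_trajs=[5, 20])
print(softmax(theta))
```

The averaged policy should concentrate on the action that is best across both environments, even though the agents contribute gradient estimates of very different quality (5 vs. 20 trajectories per round) — the heterogeneity the paper's analysis quantifies.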
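The key idea of paper 3133 — treat the learning agent as part of the principal's environment and let the principal learn reward adjustments as actions — can also be sketched. All specifics here are hypothetical: a two-action environment whose rewards favor action 1, a principal who wants action 0 and adds a bonus `delta` to it at a cost, and epsilon-greedy bandit learners standing in for both parties.

```python
import numpy as np

def run_agent_episode(delta, steps, rng, eps=0.1):
    """A fresh epsilon-greedy agent learns on bonus-adjusted rewards.

    Hypothetical env rewards: action 0 -> 0.5, action 1 -> 0.8, so the
    untaught agent prefers action 1. The principal's bonus `delta` is added
    to action 0's observed reward. Returns the fraction of steps on which
    the agent chose the principal's target action 0.
    """
    r_env = np.array([0.5, 0.8])
    q = np.zeros(2)
    counts = np.zeros(2)
    hits = 0
    for _ in range(steps):
        a = int(rng.integers(2)) if rng.random() < eps else int(q.argmax())
        r = r_env[a] + (delta if a == 0 else 0.0)  # implicit reward adjustment
        counts[a] += 1
        q[a] += (r - q[a]) / counts[a]             # incremental-average update
        hits += (a == 0)
    return hits / steps

def teach(deltas=(0.0, 0.5, 1.0), cost=0.1, episodes=150, seed=0):
    """The principal treats the learning agent as its environment and runs
    an epsilon-greedy bandit over candidate reward adjustments."""
    rng = np.random.default_rng(seed)
    q = np.zeros(len(deltas))
    counts = np.zeros(len(deltas))
    for _ in range(episodes):
        i = int(rng.integers(len(deltas))) if rng.random() < 0.2 else int(q.argmax())
        frac = run_agent_episode(deltas[i], steps=100, rng=rng)
        reward = frac - cost * deltas[i]           # steering payoff minus cost
        counts[i] += 1
        q[i] += (reward - q[i]) / counts[i]
    return q

q = teach()
print(q.round(2))
```

Because the principal's reward nets the steering payoff against the adjustment cost, it should learn that some nonzero bonus is worthwhile (the agent then prefers the target action) while the zero adjustment is not — the learning-and-teaching tradeoff the abstract describes.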