Joint Optimization of Concave Scalarized Multi-Objective Reinforcement Learning with Policy Gradient Based Algorithm

Qinbo Bai; Mridul Agarwal; Vaneet Aggarwal

doi:10.1613/JAIR.1.13981

Journal article

Open access

Peer reviewed

The Journal of artificial intelligence research, Vol.74, pp.1565-1597

01/01/2022

DOI: https://doi.org/10.1613/JAIR.1.13981

Subjects: Computer Science; Computer Science, Artificial Intelligence; Science & Technology; Technology

Abstract

Many engineering problems have multiple objectives, and the overall aim is to optimize a non-linear function of these objectives. In this paper, we formulate the problem of maximizing a non-linear concave function of multiple long-term objectives. A policy-gradient based model-free algorithm is proposed for the problem. To compute an estimate of the gradient, an asymptotically biased estimator is proposed. The proposed algorithm is shown to converge to within $\epsilon$ of the global optimum after sampling $O\left(\frac{M^4\sigma^2}{(1-\gamma)^8\epsilon^4}\right)$ trajectories, where $\gamma$ is the discount factor and $M$ is the number of agents, thus achieving the same dependence on $\epsilon$ as the policy gradient algorithm for standard reinforcement learning.
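The scaling of the sample-complexity bound in the abstract can be explored numerically. The sketch below assumes the bound reads M^4 sigma^2 / ((1 - gamma)^8 epsilon^4) with all constant factors dropped, so only ratios between calls are meaningful. The function name `trajectory_bound` and the input values are illustrative and do not come from the paper.

```python
def trajectory_bound(M, sigma, gamma, eps):
    """Evaluate M^4 * sigma^2 / ((1 - gamma)^8 * eps^4).

    Assumption: this is the big-O expression from the abstract with
    constants omitted, so the result is a relative scale, not a count.
    """
    if not 0.0 <= gamma < 1.0:
        raise ValueError("gamma must lie in [0, 1)")
    if eps <= 0.0:
        raise ValueError("eps must be positive")
    return (M ** 4) * (sigma ** 2) / (((1.0 - gamma) ** 8) * (eps ** 4))


# Illustrative inputs (hypothetical, chosen as powers of two for exact ratios).
base = trajectory_bound(M=2, sigma=1.0, gamma=0.5, eps=0.5)

# Halving the target accuracy eps scales the bound by 2^4.
half_eps = trajectory_bound(M=2, sigma=1.0, gamma=0.5, eps=0.25)

# Doubling M scales the bound by 2^4.
double_m = trajectory_bound(M=4, sigma=1.0, gamma=0.5, eps=0.5)

# Halving (1 - gamma), here gamma 0.5 -> 0.75, scales the bound by 2^8.
longer_horizon = trajectory_bound(M=2, sigma=1.0, gamma=0.75, eps=0.5)

print(half_eps / base, double_m / base, longer_horizon / base)
```

Under this reading, the effective horizon 1/(1 - gamma) is the most expensive quantity to increase, since it enters with the eighth power, while eps and M each enter with the fourth power.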

Files and links (1)

url

https://doi.org/10.1613/JAIR.1.13981

Published (Version of record) Open

Details

Title: Joint Optimization of Concave Scalarized Multi-Objective Reinforcement Learning with Policy Gradient Based Algorithm
Creators:
  Qinbo Bai - Purdue Univ, W Lafayette, IN 47907 USA
  Mridul Agarwal - Purdue Univ, W Lafayette, IN 47907 USA
  Vaneet Aggarwal - Purdue University System
Publication Details: The Journal of artificial intelligence research, Vol.74, pp.1565-1597
Publisher: AI Access Foundation
Number of pages: 33
Identifiers: 9945022608331
Academic Unit: King Abdulaziz University; King Abdullah University of Science & Technology
Language: English
Resource Type: Journal article