JISE

This paper focuses on push manipulation in agent-based animation. A policy is learned in a training session in which an agent perceives its own internal state and the surrounding environment and determines its actions. At each time step, the agent performs an action and then receives a reward that combines several types of terms, including forward progress, orientation progress, collision avoidance, and finish time. The policy is improved gradually based on the received rewards. We develop a system that controls an agent to transport a box. We investigate the effect of each reward term and study how various inputs affect the agent's performance in environments with obstacles. The inputs include the number of rays used to perceive the environment, the obstacle settings, and the box size. We conducted experiments and analyzed the findings in detail. The experimental results show that the reward terms and the inputs affect several aspects of agent behavior, including movement smoothness, wandering around the box, loss of orientation, sensitivity to collision avoidance, and pushing style.
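As an illustration of how a per-step reward could combine the four terms named above, the following is a minimal sketch and not the formulation used in this paper. It assumes a weighted sum; the names `RewardWeights` and `step_reward`, the weight values, and the use of a constant per-step penalty to represent finish time are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical weights; the paper's actual values are not given here."""
    forward: float = 1.0      # weight on progress of the box toward the goal
    orientation: float = 0.5  # weight on reduction of the heading error
    collision: float = 1.0    # penalty magnitude when a collision occurs
    time: float = 0.01        # constant per-step cost (assumed finish-time term)


def step_reward(prev_dist, dist, prev_angle_err, angle_err, collided,
                w=RewardWeights()):
    """Combine the four reward terms for one time step (assumed weighted sum)."""
    # Forward progress: positive when the box moved closer to the goal.
    forward_progress = prev_dist - dist
    # Orientation progress: positive when the absolute heading error shrank.
    orientation_progress = abs(prev_angle_err) - abs(angle_err)
    # Collision avoidance: fixed penalty on any step that ends in a collision.
    collision_penalty = 1.0 if collided else 0.0
    return (w.forward * forward_progress
            + w.orientation * orientation_progress
            - w.collision * collision_penalty
            - w.time)


if __name__ == "__main__":
    # The box moved 0.5 units closer and the heading error shrank, no collision.
    print(step_reward(2.0, 1.5, 0.4, 0.2, collided=False))
```

Setting one weight to zero removes the corresponding term, which is one simple way to isolate the effect of an individual reward term in such a sketch.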