Chemical Industry and Engineering Progress (化工进展) ›› 2024, Vol. 43 ›› Issue (3): 1167-1177. DOI: 10.16085/j.issn.1000-6613.2023-1498

• Chemical processes and equipment •

Scheduling algorithm for refinery crude oil storage and transportation based on SAC

MA Nan1, LI Hongqi1, LIU Hualin2,3, YANG Lei2,3

  1. School of Information Science and Engineering, China University of Petroleum (Beijing), Beijing 102249, China
  2. PetroChina Planning and Engineering Institute, Beijing 100083, China
  3. Key Laboratory of Oil and Gas Business Chain Optimization, CNPC, Beijing 100083, China
  • Received: 2023-08-28 Revised: 2023-11-18 Online: 2024-03-10 Published: 2024-04-11
  • Contact: LI Hongqi
  • About the author: MA Nan (born 1988), female, doctoral candidate; research interests: computer technology and resource information engineering, intelligent decision-making in the oil and gas field. E-mail: mn2006hotter@126.com
  • Funding: Basic Research and Strategic Reserve Technology Research Fund for Directly Affiliated Institutes (KJ2021-316)

Abstract:

Most existing studies of refinery crude oil storage and transportation scheduling rely on static schedules derived from mathematical programming, which take a long time to solve and cannot be re-optimized efficiently in real time as the environment changes. This paper combines deep reinforcement learning with refinery production constraints to build a dynamic, real-time scheduling decision algorithm for crude oil storage and transportation, together with the corresponding agent interaction environment. The refinery crude oil scheduling problem is first formulated as a Markov decision process. A deep reinforcement learning algorithm based on soft actor-critic (SAC) is then proposed to determine simultaneously the discrete decisions, such as the transfer target, and the continuous decisions, such as the transfer rate. The results show that the learned policy has good feasibility. Compared with the baseline algorithm, it performs well on key indicators including tanker time in port, number of events in the schedule, and processing plan execution rate; solving time is reduced to the millisecond level; and the range over which random events affect the overall decision is effectively contained. The algorithm offers a new approach to rapid decision-making for crude oil storage and transportation scheduling in coastal refineries.

Key words: refinery crude oil storage and transportation, resource scheduling, deep reinforcement learning, soft actor-critic (SAC)
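
The abstract describes a policy that outputs a discrete decision (transfer target) and a continuous decision (transfer rate) at the same time. The sketch below shows one common way such a hybrid action can be sampled; it is not the authors' implementation. The function name `sample_hybrid_action`, its parameters, and the example numbers are all hypothetical, and the tanh-squashed Gaussian is the standard SAC treatment of bounded continuous actions, assumed here rather than taken from the paper.

```python
import math
import random


def sample_hybrid_action(target_logits, rate_mean, rate_log_std, max_rate, rng=random):
    """Sample (target index, transfer rate) from a hypothetical hybrid policy head."""
    # Discrete part: softmax weights over candidate transfer targets (e.g. tanks).
    # Subtracting the max logit keeps exp() numerically stable.
    top = max(target_logits)
    weights = [math.exp(x - top) for x in target_logits]
    target = rng.choices(range(len(target_logits)), weights=weights, k=1)[0]

    # Continuous part: Gaussian sample squashed by tanh into [-1, 1],
    # then rescaled to the assumed physical range [0, max_rate].
    u = rng.gauss(rate_mean, math.exp(rate_log_std))
    rate = 0.5 * (math.tanh(u) + 1.0) * max_rate
    return target, rate


# Illustrative values only: three candidate targets, rate capped at 800 (units assumed).
rng = random.Random(0)
target, rate = sample_hybrid_action([0.2, 1.5, -0.3], 0.0, -1.0, 800.0, rng)
```

In a trained agent the logits, mean, and log standard deviation would come from the actor network evaluated on the current state; here they are fixed inputs so the sampling step can be read in isolation.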
