Scheduling algorithm for refinery crude oil storage and transportation based on SAC

doi:10.16085/j.issn.1000-6613.2023-1498

Abstract

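Abstract (Chinese version, translated):

Most existing studies on refinery crude oil storage and transportation scheduling adopt static scheduling schemes based on mathematical programming, which take a long time to solve and cannot optimize scheduling efficiently in real time as the environment changes. To address this, this paper combines deep reinforcement learning to build a dynamic real-time scheduling decision algorithm for crude oil resources that accounts for refinery production constraints. The algorithm first converts the refinery crude oil resource scheduling problem into a Markov decision process, and then proposes a deep reinforcement learning algorithm based on soft actor-critic (SAC) to determine simultaneously the discrete decisions, such as the transfer target, and the continuous decisions, such as the transfer rate. The results show that the learned policy has good feasibility. Compared with the baseline algorithm, it achieves better results on key indicators such as tanker time in port, number of events in the schedule, and processing plan execution rate, reduces the solution time to the millisecond level, and effectively limits the influence of random events on the overall decision. The algorithm offers a new approach to rapid decision-making for crude oil storage and transportation scheduling in coastal refineries.

Keywords: refinery crude oil storage and transportation, resource scheduling, deep reinforcement learning, soft actor-critic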

Abstract:

Most existing studies of refinery crude oil scheduling adopt static schemes based on mathematical programming, which cannot adjust and re-optimize in real time as the environment changes. This paper establishes a dynamic real-time scheduling decision model subject to refinery production constraints, designs the corresponding agent interaction environment, and solves the model with the soft actor-critic (SAC) algorithm from deep reinforcement learning. The crude oil resource scheduling problem is first transformed into a Markov decision process. A deep reinforcement learning algorithm based on SAC is then proposed to determine simultaneously the discrete decisions, such as the transfer target, and the continuous decisions, such as the transfer rate, in the scheduling process. Extensive experimental results show that the policy learned by the algorithm is feasible and that, compared with the baseline algorithm, it improves decision-making efficiency and limits the influence of random events on the overall decision. The algorithm can provide new ideas for rapid decision-making in crude oil storage and transportation scheduling for coastal refineries.

Key words: refinery crude oil storage and transportation, resource scheduling, deep reinforcement learning, soft actor-critic (SAC)
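
The abstract describes a policy that outputs a discrete decision (the transfer target) and a continuous decision (the transfer rate) together. A minimal sketch of how such a hybrid action could be sampled is given below. It is an illustration only: the function names, the per-target Gaussian parameters, the tanh squashing and the `max_rate` bound are assumptions, not the network design used in the paper.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_hybrid_action(target_logits, rate_mean, rate_log_std, max_rate, rng):
    """Sample (target index, transfer rate) for one scheduling step.

    All inputs are hypothetical policy outputs: one logit per candidate
    target, plus a Gaussian mean and log-std for the rate of each target.
    """
    # Discrete part: draw the transfer target from a categorical distribution.
    probs = softmax(target_logits)
    target = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    # Continuous part: Gaussian sample squashed by tanh into [-1, 1],
    # then rescaled to [0, max_rate].
    raw = rng.gauss(rate_mean[target], math.exp(rate_log_std[target]))
    rate = 0.5 * (math.tanh(raw) + 1.0) * max_rate
    return target, rate


rng = random.Random(0)
target, rate = sample_hybrid_action(
    target_logits=[0.2, 1.5, -0.3],
    rate_mean=[0.0, 0.5, -0.5],
    rate_log_std=[-1.0, -1.0, -1.0],
    max_rate=100.0,
    rng=rng,
)
```

Squashing with tanh keeps the sampled rate inside a fixed physical range, which is one common way to bound continuous actions in SAC-style policies.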

CLC number:

TE624

Cite as: MA Nan, LI Hongqi, LIU Hualin, YANG Lei. Scheduling algorithm for refinery crude oil storage and transportation based on SAC[J]. Chemical Industry and Engineering Progress (化工进展), 2024, 43(3): 1167-1177.

Figures/Tables (13)

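Parameter	Value
Discount factor	0.99
Initial learning rate of the policy network	0.03
Initial learning rate of the value network	0.03
Soft update coefficient	0.005
Batch size	512
Entropy threshold	0.9
Replay buffer size	100000
Optimizer	Adam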
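
To make the role of these hyperparameters concrete, the sketch below collects them in a configuration object and shows the soft (Polyak) target update that the soft update coefficient usually controls in SAC. The class name, field names and `soft_update` helper are hypothetical, and how the entropy threshold is used in training is not specified in this section, so it is only stored here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SACConfig:
    """Hyperparameters from the table above (field names are assumptions)."""
    gamma: float = 0.99              # discount factor
    actor_lr: float = 0.03           # initial learning rate, policy network
    critic_lr: float = 0.03          # initial learning rate, value network
    tau: float = 0.005               # soft update coefficient
    batch_size: int = 512            # sampled batch size
    entropy_threshold: float = 0.9   # usage not specified in this section
    buffer_size: int = 100_000       # replay buffer capacity
    optimizer: str = "Adam"


def soft_update(target_params, online_params, tau):
    """Polyak averaging: target <- (1 - tau) * target + tau * online."""
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]


cfg = SACConfig()
# Toy parameter vectors stand in for network weights.
new_target = soft_update([0.0, 1.0], [1.0, 1.0], cfg.tau)
```

With a coefficient of 0.005 the target parameters move only 0.5% of the way toward the online parameters at each update, which is what keeps the value targets slowly varying.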

Period	Tank level										Processing level
	Commercial storage tanks		Terminal tanks		In-plant tanks						Processing units
	H01	H02	H03	H04	F01	F02	F03	F04	F05	F06	CDU1	CDU2
1	0.67	0.17	0.87	0.06	0.12	0.70	0.84	0.47	0.15	0.48	1.00	0.48
2	0.63	0.34	0.87	0.11	0.12	0.68	0.82	0.47	0.13	0.52	1.00	0.48
3	0.60	0.50	0.87	0.17	0.12	0.65	0.80	0.46	0.11	0.53	1.00	0.48
4	0.57	0.67	0.87	0.22	0.12	0.62	0.78	0.45	0.09	0.55	1.00	0.48
5	0.53	0.83	0.87	0.28	0.12	0.60	0.76	0.44	0.07	0.57	1.00	0.48
6	0.50	1.00	0.87	0.33	0.11	0.57	0.74	0.44	0.07	0.57	1.00	0.48
7	0.67	1.00	0.87	0.33	0.11	0.57	0.75	0.43	0.04	0.58	0.72	0.48
8	0.84	1.00	0.80	0.30	0.11	0.57	0.75	0.42	0.04	0.58	0.72	0.48
9	0.93	1.00	0.73	0.26	0.11	0.57	0.75	0.41	0.04	0.57	0.72	0.48
10	0.93	1.00	0.66	0.22	0.11	0.57	0.75	0.40	0.05	0.56	0.72	0.48
11	0.93	1.00	0.59	0.22	0.10	0.58	0.76	0.40	0.05	0.55	0.72	0.48
12	0.93	1.00	0.52	0.22	0.10	0.58	0.76	0.39	0.05	0.54	0.72	0.48
13	0.93	1.00	0.45	0.22	0.10	0.58	0.76	0.38	0.05	0.53	0.72	0.48
14	0.93	1.00	0.38	0.22	0.10	0.58	0.77	0.37	0.05	0.52	0.72	0.48
15	0.93	1.00	0.31	0.22	0.10	0.58	0.77	0.36	0.05	0.51	0.72	0.48
16	0.93	1.00	0.25	0.22	0.09	0.58	0.77	0.36	0.05	0.50	0.72	0.48
17	0.93	1.00	0.18	0.22	0.09	0.58	0.77	0.35	0.05	0.49	0.72	0.48
18	0.93	1.00	0.11	0.22	0.09	0.59	0.78	0.34	0.05	0.48	0.72	0.48
19	0.93	1.00	0.11	0.22	0.09	0.59	0.78	0.33	0.05	0.48	0.72	0.48
20	0.93	1.00	0.11	0.22	0.09	0.59	0.78	0.32	0.06	0.47	0.72	0.48
21	0.93	1.00	0.11	0.22	0.08	0.59	0.78	0.32	0.06	0.46	0.72	0.48
22	0.93	1.00	0.11	0.22	0.08	0.59	0.79	0.31	0.06	0.45	0.72	0.48
23	0.93	1.00	0.11	0.22	0.08	0.59	0.79	0.30	0.06	0.44	0.72	0.48
24	0.93	1.00	0.11	0.22	0.08	0.59	0.79	0.29	0.06	0.43	0.72	0.48
25	0.93	1.00	0.11	0.22	0.08	0.60	0.79	0.28	0.06	0.42	0.72	0.48
26	0.93	1.00	0.11	0.22	0.07	0.60	0.80	0.28	0.06	0.41	0.72	0.48
27	0.93	1.00	0.11	0.22	0.07	0.60	0.80	0.27	0.06	0.40	0.72	0.48
28	0.93	1.00	0.11	0.22	0.07	0.60	0.80	0.26	0.06	0.39	0.72	0.48
29	0.93	1.00	0.11	0.22	0.07	0.60	0.81	0.25	0.07	0.38	0.72	0.48
30	0.93	1.00	0.11	0.22	0.07	0.60	0.81	0.25	0.07	0.37	0.72	0.48
31	0.93	1.00	0.11	0.22	0.06	0.60	0.81	0.24	0.07	0.37	0.72	0.48
32	0.93	1.00	0.11	0.22	0.06	0.60	0.81	0.23	0.07	0.36	0.72	0.48
33	0.93	1.00	0.11	0.22	0.06	0.61	0.82	0.22	0.07	0.35	0.72	0.48
34	0.93	1.00	0.11	0.22	0.06	0.61	0.82	0.21	0.07	0.34	0.72	0.48
35	0.93	1.00	0.11	0.22	0.06	0.61	0.82	0.21	0.07	0.33	0.72	0.48
36	0.93	0.97	0.11	0.24	0.05	0.61	0.82	0.20	0.07	0.32	0.72	0.48
37	0.93	0.93	0.11	0.30	0.05	0.61	0.83	0.19	0.07	0.31	0.72	0.48
38	0.93	0.93	0.11	0.30	0.05	0.61	0.83	0.18	0.07	0.30	0.72	0.48
39	0.93	0.93	0.11	0.30	0.05	0.61	0.83	0.17	0.08	0.29	0.72	0.48
40	0.93	0.93	0.11	0.30	0.05	0.62	0.84	0.17	0.08	0.28	0.72	0.48
41	0.93	0.93	0.11	0.30	0.04	0.62	0.84	0.16	0.08	0.27	0.72	0.48
42	0.93	0.93	0.11	0.30	0.04	0.62	0.84	0.15	0.08	0.27	0.72	0.48

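Method	Tanker time in port/h	Tank delivery events (count)	Tank crude-type switching events (count)	Unit processing switching events (count)	Processing plan execution rate/%	Average solution time/min
Mathematical programming	36.0	14	1	2	99.96	2.5
Proposed algorithm	34.2	12	1	1	99.98	0.0023
Reduction/improvement	Reduced 5%	Reduced 14.3%	Unchanged	Reduced 50%	Improved 0.02%	Improved 99.9%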
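
The rates in the last row can be reproduced from the two rows above it. The short check below assumes each rate is the relative reduction against the mathematical programming baseline; this is an assumption about how the table was derived, and the dictionary keys are illustrative names. It also converts the average solution time to milliseconds.

```python
def percent_reduction(baseline, proposed):
    """Relative reduction of `proposed` against `baseline`, in percent."""
    return (baseline - proposed) / baseline * 100.0


# Values copied from the comparison table; key names are hypothetical.
baseline = {
    "time_in_port_h": 36.0,
    "delivery_events": 14,
    "crude_switch_events": 1,
    "unit_switch_events": 2,
    "solve_time_min": 2.5,
}
proposed = {
    "time_in_port_h": 34.2,
    "delivery_events": 12,
    "crude_switch_events": 1,
    "unit_switch_events": 1,
    "solve_time_min": 0.0023,
}

reductions = {
    key: round(percent_reduction(baseline[key], proposed[key]), 1)
    for key in baseline
}

# 0.0023 min expressed in milliseconds (about 138 ms).
solve_time_ms = proposed["solve_time_min"] * 60 * 1000
```

Under this assumption the computed values match the reported 5%, 14.3%, 50% and 99.9%, and the average solution time of 0.0023 min corresponds to roughly 138 ms, consistent with the millisecond-level claim in the abstract.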