Tianhu Deng (邓天虎): Data-driven Convex Policy Optimization in an Assemble-to-order System
Published: 2023-10-11

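Time: Friday, October 13, 2023, 14:30-15:30

Venue: Second Academic Lecture Hall, New Building, School of Management

Speaker: Dr. Tianhu Deng (邓天虎)

Affiliation: Tsinghua University

Host: School of Management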

Abstract

This paper investigates the optimization of periodic-review assemble-to-order (ATO) production systems in which multiple products are assembled from multiple components, in a data-driven setting where only historical demand data are available and the demand distributions are unknown. To address this challenge, we propose a semi-model-based fitted Q iteration (S-FQI) algorithmic framework that leverages the known transition dynamics. We prove a statistical convergence rate for the proposed algorithm with respect to the number of iterations, the number of demand samples, and the number of generated trajectories.

Additionally, to tackle practical challenges, we introduce the convex-TD3 (CTD3) algorithm, which exploits the convexity of ATO systems and uses an input convex neural network (ICNN) to improve efficiency and effectiveness.
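To illustrate the ICNN idea mentioned in the abstract, the following is a minimal sketch of a generic two-layer input convex network in plain Python. It is not the speaker's CTD3 implementation; the names `make_icnn` and `icnn_forward`, the layer sizes, and the random initialization are all assumptions made for illustration. The output is convex in the input `x` because the weights applied to hidden activations are non-negative and ReLU is convex and non-decreasing.

```python
import random


def relu(v):
    # Element-wise ReLU: convex and non-decreasing.
    return [max(a, 0.0) for a in v]


def affine(W, x, b):
    # Computes W x + b for a list-of-rows matrix W.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def make_icnn(input_dim, hidden_dim, seed=0):
    # Hypothetical helper: random parameters for a two-layer ICNN.
    rng = random.Random(seed)

    def g():
        return rng.gauss(0.0, 1.0)

    def mat(rows, cols, f):
        return [[f() for _ in range(cols)] for _ in range(rows)]

    return {
        "W0": mat(hidden_dim, input_dim, g),                   # unconstrained
        "b0": [g() for _ in range(hidden_dim)],
        "Wz": mat(hidden_dim, hidden_dim, lambda: abs(g())),   # non-negative
        "Wx": mat(hidden_dim, input_dim, g),                   # skip connection
        "b1": [g() for _ in range(hidden_dim)],
        "wz": [abs(g()) for _ in range(hidden_dim)],           # non-negative
        "wx": [g() for _ in range(input_dim)],                 # skip connection
    }


def icnn_forward(p, x):
    # Layer 1: ReLU of an affine map of x (convex in x).
    z1 = relu(affine(p["W0"], x, p["b0"]))
    # Layer 2: non-negative mix of z1 plus an affine skip term in x.
    zeros = [0.0] * len(p["b1"])
    pre = [a + b for a, b in zip(affine(p["Wz"], z1, p["b1"]),
                                 affine(p["Wx"], x, zeros))]
    z2 = relu(pre)
    # Output: non-negative combination of z2 plus a linear term in x.
    return (sum(w * z for w, z in zip(p["wz"], z2))
            + sum(w * xi for w, xi in zip(p["wx"], x)))
```

In a convex actor-critic setting, a network of this shape could stand in for a cost-to-go approximator that is convex in the inventory state, though how CTD3 actually wires it in is not specified in the abstract.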

About the Speaker

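Tianhu Deng (Ph.D., Associate Professor) is currently with the Department of Industrial Engineering at Tsinghua University. He received his Ph.D. in Industrial Engineering and Operations Research from the University of California, Berkeley, in 2013, and his bachelor's degree from the Department of Industrial Engineering at Tsinghua University in 2008. His current research focuses on smart supply chains. As first or corresponding author, he has published more than 20 papers in international journals and conferences, including Manufacturing & Service Operations Management and Operations Research.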
