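Time: Tuesday, July 22, 2025, 10:00-11:30
Venue: Conference Room 3, 1st Floor, Block A, Science and Education Building, Feicuihu Campus
Speaker: Professor Lu Wang (王璐)
Affiliation: University of Michigan, USA
Host: School of Computer and Information (School of Artificial Intelligence)
Abstract: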
Many real-world problems involve multiple competing priorities, and decision rules differ when trade-offs are present. Correspondingly, there may be more than one feasible decision that leads to empirically sufficient optimization. In this talk, we present the concept of a “tolerant regime,” which provides a set of individualized feasible decision rules under a prespecified tolerance rate. A multi-objective tree-based reinforcement learning (MOT-RL) method is developed to directly estimate the tolerant dynamic treatment regime (tDTR) that optimizes multiple objectives in a multistage, multi-treatment setting. At each stage, MOT-RL constructs an unsupervised decision tree by modeling the counterfactual mean outcome of each objective via semiparametric regression and maximizing a purity measure constructed from the scalarized augmented inverse probability weighted estimators (SAIPWE). The algorithm is implemented by backward induction through the decision stages, and it estimates the optimal dynamic treatment regime (DTR) and the tDTR according to the decision-maker’s preferences. MOT-RL is robust, efficient, easy to interpret, and flexible across different settings.
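As a rough illustration of the ingredients named in the abstract, the sketch below computes an augmented inverse probability weighted (AIPW) estimate of a counterfactual mean, combines several objectives by a weighted sum, and collects the treatments whose scalarized value is close to the best. It is not the speaker's implementation: the function names, the weighted-sum scalarization, and the relative-tolerance rule (keep every arm within a fraction `tolerance` of the best value, assuming larger positive values are better) are assumptions made for this example, and the tree-building and backward-induction steps are omitted.

```python
from typing import Dict, List, Sequence


def aipw_mean(outcomes: Sequence[float], treatments: Sequence[int],
              propensities: Sequence[float], predictions: Sequence[float],
              arm: int) -> float:
    """AIPW estimate of the mean outcome if everyone received `arm`.

    propensities[i] is an estimate of P(treatment == arm | covariates of i);
    predictions[i] is an outcome-model prediction for subject i under `arm`.
    Both are assumed to be supplied by models fitted elsewhere.
    """
    total = 0.0
    for y, a, p, m in zip(outcomes, treatments, propensities, predictions):
        weight = (1.0 if a == arm else 0.0) / p
        # Inverse-probability-weighted outcome plus the augmentation term.
        total += weight * y + (1.0 - weight) * m
    return total / len(outcomes)


def scalarize(objective_values: Sequence[float],
              weights: Sequence[float]) -> float:
    """Weighted-sum scalarization (assumed form) of per-objective estimates."""
    return sum(w * v for w, v in zip(weights, objective_values))


def tolerant_set(values: Dict[str, float], tolerance: float) -> List[str]:
    """Arms whose scalarized value is within `tolerance` (relative) of the best.

    Assumption for this sketch: values are positive and larger is better.
    """
    best = max(values.values())
    threshold = (1.0 - tolerance) * best
    return sorted(arm for arm, v in values.items() if v >= threshold)


# Hypothetical scalarized values for three treatment arms.
arm_values = {"A": 10.0, "B": 9.5, "C": 7.0}
feasible = tolerant_set(arm_values, tolerance=0.1)
```

With a tolerance of zero the set reduces to the single best arm (plus any exact ties), which corresponds to the usual optimal-regime setting; a larger tolerance admits more near-optimal options for the decision-maker to choose among.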
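Speaker bio:
Lu Wang is a Professor in the Department of Biostatistics at the University of Michigan, USA, a Fellow of the American Statistical Association (ASA Fellow), and an Elected Member of the International Statistical Institute (ISI). Wang received a bachelor's degree from Peking University in 2002 and a Ph.D. from Harvard University in 2008. Research interests include statistical methods for evaluating dynamic treatment regimes, personalized medicine, causal inference, nonparametric and semiparametric regression, missing data analysis, and longitudinal (correlated/clustered) data analysis. Wang has published more than 180 papers in journals including JASA, Biometrika, Biometrics, and AoAS, has co-authored a book chapter, and currently serves as an Associate Editor of JASA and Biometrics.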