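Guanghua Forum: Distinguished Figures and Entrepreneurs Forum, Session 6756
Topic: Bayesian Knockoff Filter for False Discovery Control
Speaker: Professor Guosheng Yin, Associate Director, School of Computing and Data Science, University of Hong Kong
Host: Professor Huazhen Lin, School of Statistics and Data Science
Time: June 9, 10:30-11:30
Venue: Conference Room 408, Hongyuan Building, Liulin Campus
Organizers: School of Statistics and Data Science; Office of Scientific Research
About the speaker: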
Guosheng Yin is the Patrick Poon Endowed Chair Professor in the Department of Statistics and Actuarial Science and Associate Director of the School of Computing and Data Science at the University of Hong Kong. After receiving his Ph.D. in Biostatistics from the University of North Carolina at Chapel Hill in 2003, he worked as Assistant/Associate Professor in the Department of Biostatistics at the University of Texas M.D. Anderson Cancer Center, as well as Chair in Statistics in the Department of Mathematics at Imperial College London. He was Head of the Department of Statistics and Actuarial Science at the University of Hong Kong from 2017 to 2023. He is an elected fellow of the American Statistical Association and of the Institute of Mathematical Statistics. He served as an associate editor for the Journal of the American Statistical Association, Bayesian Analysis, Contemporary Clinical Trials, and other journals. He has published over 260 peer-reviewed papers in statistical and medical journals and at AI and machine learning conferences, as well as two books on clinical trial designs.
Abstract:
In many scientific fields, researchers want to identify, from a large number of candidate features, the important ones that have a substantial effect on the response, while controlling the proportion of false discoveries. By incorporating the knockoff procedure into a fully Bayesian framework, we develop the Bayesian knockoff filter (BKF) for selecting features that have an important effect on the response. In contrast to the fixed knockoff variables of a frequentist procedure, we allow the knockoff variables to be updated continuously in each iteration of the Markov chain Monte Carlo (MCMC) algorithm. Based on the posterior samples and a carefully designed greedy selection procedure, our method can distinguish the truly important features from the unimportant ones, and the Bayesian false discovery rate can be controlled at a desired level. Numerical experiments on both synthetic and real data demonstrate the advantages of the BKF over existing knockoff methods and Bayesian variable selection approaches: the BKF achieves higher power and a lower false discovery rate, especially for weak signals.
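The abstract mentions a greedy selection step that keeps the Bayesian false discovery rate at a chosen level but does not spell it out. The following is a minimal sketch of one standard way to do this, not the speaker's actual procedure. It assumes each feature already has a posterior probability of being important (for example, estimated from MCMC draws), and it estimates the Bayesian FDR of a selected set as the average of (1 - probability) over that set. The function name `select_by_bayesian_fdr`, the parameter `target_fdr`, and the example probabilities are all hypothetical.

```python
def select_by_bayesian_fdr(inclusion_probs, target_fdr=0.1):
    """Greedily select features while the estimated Bayesian FDR stays
    at or below target_fdr.

    inclusion_probs[j] is an assumed posterior probability that feature j
    is truly important. The estimated Bayesian FDR of a selected set is
    the mean of (1 - inclusion_probs[j]) over the set.
    """
    # Rank features from most to least likely to be important.
    order = sorted(range(len(inclusion_probs)),
                   key=lambda j: inclusion_probs[j], reverse=True)

    selected = []
    expected_false = 0.0  # running sum of (1 - p_j) over selected features
    for j in order:
        # Adding feature j contributes (1 - p_j) expected false discoveries.
        candidate_false = expected_false + (1.0 - inclusion_probs[j])
        # Because features are ranked, the running mean never decreases,
        # so the first violation ends the search.
        if candidate_false / (len(selected) + 1) > target_fdr:
            break
        selected.append(j)
        expected_false = candidate_false

    fdr = expected_false / len(selected) if selected else 0.0
    return selected, fdr


# Hypothetical posterior probabilities for six features.
probs = [0.99, 0.20, 0.95, 0.60, 0.05, 0.90]
selected, fdr = select_by_bayesian_fdr(probs, target_fdr=0.1)
print(selected, round(fdr, 4))
```

With these made-up inputs, the three most probable features are kept and the fourth is rejected because including it would push the estimated rate above the target. How the probabilities are obtained from the knockoff-augmented posterior is the substance of the talk and is not modeled here.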