26 Papers by School of Data Science Faculty and Students Accepted to the Top International Conference NeurIPS 2023
A total of 26 papers by faculty and students of the School of Data Science (SDS), The Chinese University of Hong Kong, Shenzhen, have been accepted by NeurIPS 2023 (the Conference on Neural Information Processing Systems, abbreviated NeurIPS or NIPS), a top international conference in machine learning and computational neuroscience. The papers come from 18 professors, 1 postdoctoral researcher, 10 PhD students, and 1 master's student of the school. Beyond graduate students, the school's undergraduates also take an active part in research: the author lists include 2 SDS undergraduates. The acceptance rate of NeurIPS 2023 is 26.1%.
2 undergraduate students:
盧藝文、施展

1 master's student:
晏志遠(yuǎn)

10 PhD students:
董婧、李子牛、路舜麟、喬冠仁、孫子恒、王遠(yuǎn)程、魏少魁、楊超、張明達(dá)、朱明麗

1 postdoctoral researcher:
李文浩

18 professors:
丁宏強(qiáng)、樊繼聰、李彤欣、李海洲、李爽、李文燁、李肖、劉桂良、羅智泉、馬晨昊、茅劍鋒、孫若愚、王趵翔、王本友、吳保元、武執(zhí)政、查宏遠(yuǎn)、張瑞茂

About NeurIPS
The Conference on Neural Information Processing Systems (NeurIPS/NIPS) is a top international conference in machine learning and computational neuroscience. In the China Computer Federation (CCF) ranking of international academic conferences, NeurIPS is a Class-A conference in artificial intelligence. Its topics span deep learning, computer vision, large-scale machine learning, learning theory, optimization, sparsity theory, and many other subfields. The conference is held every December and is organized by the NeurIPS Foundation. This year marks its 37th edition, to be held December 10-16 at the New Orleans convention center in the United States.
Source: NeurIPS official website, Baidu Baike

For more student information, see: https://mp.weixin.qq.com/s/fmn4Lxc7bl1EAM17Xf1Zcg
Details of the 26 Papers
1. Federated Spectral Clustering via Secure Similarity Reconstruction
Authors:
Dong Qiao, Chris Ding, Jicong Fan
Abstract:
Federated learning has a significant advantage in protecting data and information privacy. Many scholars have proposed secure learning methods within the framework of federated learning, but the study of secure federated unsupervised learning, especially clustering, remains limited. In this work, we propose a secure kernelized factorization method for federated spectral clustering on distributed data. The method is non-trivial because the kernel or similarity matrix for spectral clustering is computed from data pairs, which violates the principle of privacy protection. Our method implicitly constructs an approximation of the kernel matrix on distributed data such that we can perform spectral clustering under the constraint of privacy protection. We provide a convergence guarantee for the optimization algorithm, a reconstruction error bound for the Gaussian kernel matrix, and a sufficient condition for correct clustering. We also present guarantees of differential privacy. Numerical results on synthetic and real datasets demonstrate that the proposed method is efficient and accurate in comparison to the baselines.
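The centralized pipeline that the federated method emulates can be sketched in a few lines. This is a hedged illustration of the standard spectral-clustering ingredients (Gaussian kernel, graph Laplacian) on invented toy data, not the paper's secure reconstruction; the bandwidth `sigma` and the points are arbitrary:

```python
import numpy as np

# Two well-separated toy clusters of two points each.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])

def gaussian_kernel(X, sigma=1.0):
    # K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2)); note it is computed from
    # data *pairs*, which is exactly what conflicts with privacy in the
    # federated setting the paper addresses.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

K = gaussian_kernel(X)
D = np.diag(K.sum(axis=1))
L = D - K                        # unnormalized graph Laplacian
eigvals = np.linalg.eigvalsh(L)  # ascending eigenvalues
```

The two near-zero Laplacian eigenvalues reflect the two clusters; the paper's contribution is obtaining a usable kernel matrix like `K` without ever gathering the raw data pairs centrally.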
Link:
https://nips.cc/virtual/2023/poster/71656

2. Lovász Principle for Unsupervised Graph Representation Learning
Authors:
Ziheng Sun (SDS PhD student), Chris Ding, Jicong Fan
Abstract:
This paper focuses on graph-level representation learning, which aims to represent graphs as vectors that can be directly utilized in downstream tasks such as graph classification. We propose a novel graph-level representation learning principle called the Lovász principle, motivated by the Lovász number in graph theory. The Lovász number is a real number that upper-bounds the Shannon capacity of a graph and is strongly connected with various global characteristics of the graph. Specifically, we show that the handle vector used to compute the Lovász number is potentially a suitable choice for graph representation, as it captures a graph's global properties, though a direct application of the handle vector is difficult and problematic. We propose to use neural networks to address these problems and hence obtain the Lovász principle. Moreover, we propose an enhanced Lovász principle that is able to exploit subgraph Lovász numbers directly and efficiently. Experiments demonstrate that our Lovász principles achieve competitive performance compared to the baselines in unsupervised and semi-supervised graph-level representation learning tasks.
Link:
https://nips.cc/virtual/2023/poster/73041

3. Graph Convolutional Kernel Machine versus Graph Convolutional Networks
Authors:
Zhihao Wu, Zhao Zhang, Jicong Fan
Abstract:
Graph convolutional networks (GCN) with one or two hidden layers have been widely used in handling graph data that are prevalent in various disciplines. Many studies showed that the gain of making GCNs deeper is tiny or even negative. This implies that the complexity of graph data is often limited and shallow models are often sufficient to extract expressive features for various tasks such as node classification. Therefore, in this work, we present a framework called graph convolutional kernel machine (GCKM) for graph-based machine learning. GCKMs are built upon kernel functions integrated with graph convolution. An example is the graph convolutional kernel support vector machine (GCKSVM) for node classification, for which we analyze the generalization error bound and discuss the impact of the graph structure. Compared to GCNs, GCKMs require much less effort in architecture design, hyperparameter tuning, and optimization. More importantly, GCKMs are guaranteed to obtain globally optimal solutions and have strong generalization ability and high interpretability. GCKMs are composable, can be extended to large-scale data, and are applicable to various tasks (e.g., node or graph classification, clustering, feature extraction, dimensionality reduction). The numerical results on benchmark datasets show that, besides the aforementioned advantages, GCKMs have at least competitive accuracy compared to GCNs.
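As a rough illustration of the building block GCKM shares with GCNs, the sketch below performs a single normalized graph-convolution step S = D^(-1/2)(A + I)D^(-1/2) on an invented path graph; the kernel machine on top of the smoothed features (e.g., an SVM) is omitted here:

```python
import numpy as np

# Path graph on 3 nodes: 0 - 1 - 2, with one scalar feature per node.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
X = np.array([[1.0], [0.0], [1.0]])

A_hat = A + np.eye(3)                       # add self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
S = D_inv_sqrt @ A_hat @ D_inv_sqrt         # symmetric smoothing operator
H = S @ X   # smoothed node features; GCKM applies a kernel k(H_i, H_j) to these
```

One such multiplication by `S` is the "graph convolution" that GCKM composes with a kernel function instead of stacking deep trainable layers.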
Link:
https://nips.cc/virtual/2023/poster/71620

4. Boosting Spectral Clustering on Incomplete Data via Kernel Correction and Affinity Learning
Authors:
Fangchen Yu, Runze Zhao, Zhan Shi (SDS undergraduate), Yiwen Lu (SDS undergraduate), Jicong Fan, Yicheng Zeng, Jianfeng Mao, Wenye Li
Abstract:
Spectral clustering has gained popularity for clustering non-convex data due to its simplicity and effectiveness. It is essential to construct a similarity graph using a high-quality affinity measure that models the local neighborhood relations among data samples. However, incomplete data can lead to inaccurate affinity measures, resulting in degraded clustering performance. To address these issues, we propose an imputation-free framework with two novel approaches to improve spectral clustering on incomplete data. Firstly, we introduce a new kernel correction method that enhances the quality of the kernel matrix estimated on incomplete data with a theoretical guarantee, benefiting classical spectral clustering on pre-defined kernels. Secondly, we develop a series of new affinity learning methods that equip the self-expressive framework with the Lp-norm to construct an intrinsic affinity matrix with adaptive extensions. Our methods outperform existing data imputation and distance calibration techniques on benchmark datasets, offering a promising solution to spectral clustering on incomplete data in various real-world applications.
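One simple, generic repair for a kernel matrix estimated from incomplete data is to project it onto the positive semidefinite cone by clipping negative eigenvalues. This is only a baseline illustration of the problem that the paper's kernel-correction method addresses (with guarantees this naive clip lacks); the matrix below is invented and deliberately indefinite:

```python
import numpy as np

# Hypothetical kernel matrix estimated from incomplete data; it is symmetric
# but not positive semidefinite (its smallest eigenvalue is negative).
K = np.array([[1.0, 0.9, 0.2],
              [0.9, 1.0, 0.9],
              [0.2, 0.9, 1.0]])

w, V = np.linalg.eigh(K)
K_psd = (V * np.clip(w, 0.0, None)) @ V.T   # project onto the PSD cone
```

After the projection, `K_psd` is a valid kernel matrix that downstream spectral clustering can consume.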
Link:
https://nips.cc/virtual/2023/poster/70019

5. Anytime-Constrained Reinforcement Learning with Policy Prior
Authors:
Jianyi Yang, Pengfei Li, Tongxin Li, Adam Wierman, Shaolei Ren
Abstract:
This paper studies the problem of the Anytime-Constrained Markov Decision Process (A-CMDP). Existing works on Constrained Markov Decision Processes (CMDPs) aim to optimize the expected reward while constraining the expected cost over random dynamics, but the cost in a specific episode can still be unsatisfactorily high. In contrast, the goal of A-CMDP is to optimize the expected reward while guaranteeing a bounded cost in each round of any episode against a policy prior. We propose a new algorithm, called Anytime-Constrained Reinforcement Learning (ACRL), which provably guarantees the anytime cost constraints. The regret analysis shows the policy asymptotically matches the optimal reward achievable under anytime constraints. Experiments on the application of carbon-intelligent computing verify the reward performance and cost constraint guarantee of ACRL.
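A toy way to see the "anytime" requirement: the cost must stay within budget at every step, not merely in expectation, so a learned action that would break the budget falls back to the policy prior. The sketch below is purely illustrative and is not the ACRL algorithm; every name and number is invented, and the prior is assumed to respect the budget:

```python
# Fall back to a safe policy prior whenever the learned action would exceed
# the remaining cost budget (toy illustration of an anytime constraint).
def anytime_safe_action(learned_action, prior_action, cost_fn, budget, cost_so_far):
    if cost_so_far + cost_fn(learned_action) > budget:
        return prior_action   # assumed to keep the cumulative cost bounded
    return learned_action

cost = lambda a: abs(a)
learned_actions = [3.0, 2.0, 4.0]   # hypothetical outputs of a learned policy
prior_action = 0.5                  # hypothetical cheap prior action
used, taken = 0.0, []
for a in learned_actions:
    a = anytime_safe_action(a, prior_action, cost, budget=6.0, cost_so_far=used)
    used += cost(a)
    taken.append(a)
# The third learned action (cost 4.0) would push the total past 6.0,
# so the prior action is taken instead.
```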
6. Beyond Black-Box Advice: Learning-Augmented Algorithms for MDPs with Q-Value Predictions
Authors:
Tongxin Li, Yiheng Lin, Shaolei Ren, Adam Wierman
Abstract:
We study the tradeoff between consistency and robustness in the context of a single-trajectory time-varying Markov Decision Process (MDP) with untrusted machine-learned advice. Our work departs from the typical approach of treating advice as coming from black-box sources by instead considering a setting where additional information about how the advice is generated is available. We prove a first-of-its-kind consistency and robustness tradeoff given Q-value advice under a general MDP model that includes both continuous and discrete state/action spaces. Our results highlight that utilizing Q-value advice enables dynamic pursuit of the better of machine-learned advice and a robust baseline, thus resulting in near-optimal performance guarantees, which provably improve on what can be obtained solely with black-box advice.
Link:
https://arxiv.org/abs/2307.10524

7. Disentangling Voice and Content with Self-Supervision for Speaker Recognition
Authors:
Tianchi Liu, Kong Aik Lee, Qiongqiong Wang, Haizhou Li
Abstract:
For speaker recognition, it is difficult to extract an accurate speaker representation from speech because of its mixture of speaker traits and content. This paper proposes a disentanglement framework that simultaneously models speaker traits and content variability in speech. It is realized with the use of three Gaussian inference layers, each consisting of a learnable transition model that extracts distinct speech components. Notably, a strengthened transition model is specifically designed to model complex speech dynamics. We also propose a self-supervision method to dynamically disentangle content without the use of labels other than speaker identities. The efficacy of the proposed framework is validated via experiments conducted on the VoxCeleb and SITW datasets, with 9.56% and 8.24% average reductions in EER and minDCF, respectively. Since neither additional model training nor data is specifically needed, it is easily applicable in practical use.

8. Discovering Intrinsic Spatial-Temporal Logic Rules to Explain Human Actions
Authors:
Chengzhi Cao, Chao Yang (SDS PhD student), Ruimao Zhang, Shuang Li
Abstract:
We propose a logic-informed, knowledge-driven modeling framework for human movements by analyzing their trajectories. Our approach is inspired by the fact that human actions are usually driven by their intentions or desires, and are influenced by environmental factors such as the spatial relationships with surrounding objects. In this paper, we introduce a set of spatial-temporal logic rules as knowledge to explain human actions. These rules are automatically discovered from observational data. To learn the model parameters and the rule content, we design an expectation-maximization (EM) algorithm that treats the rule content as latent variables. The EM algorithm alternates between the E-step and M-step: in the E-step, the posterior distribution over the latent rule content is evaluated; in the M-step, the rule generator and model parameters are jointly optimized by maximizing the current expected log-likelihood. Our model may have a wide range of applications in areas such as sports analytics, robotics, and autonomous cars, where understanding human movements is essential. We demonstrate the model's superior interpretability and prediction performance on pedestrian and NBA basketball player datasets, both achieving promising results.
Link:
https://arxiv.org/pdf/2306.12244

9. ReSync: Riemannian Subgradient-based Robust Rotation Synchronization
Authors:
Huikang Liu, Xiao Li, Anthony Man-Cho So
Abstract:
This work presents ReSync, a Riemannian subgradient-based algorithm for solving the robust rotation synchronization problem, which arises in various engineering applications. ReSync solves a least-unsquared minimization formulation over the rotation group, which is nonsmooth and nonconvex, and aims at recovering the underlying rotations directly. We provide strong theoretical guarantees for ReSync under the random corruption setting. Specifically, we first show that the initialization procedure of ReSync yields a proper initial point that lies in a local region around the ground-truth rotations. We next establish the weak sharpness property of the aforementioned formulation and then utilize this property to derive the local linear convergence of ReSync to the ground-truth rotations. By combining these guarantees, we conclude that ReSync converges linearly to the ground-truth rotations under appropriate conditions. Experiment results demonstrate the effectiveness of ReSync.
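A basic building block in rotation synchronization is mapping an arbitrary matrix back onto the rotation group SO(d). The hedged sketch below does this via the SVD, enforcing det(R) = +1; it is a generic projection step, not the ReSync algorithm itself, and the input matrix is invented:

```python
import numpy as np

def project_to_rotation(M):
    # Nearest rotation matrix to M (in Frobenius norm): R = U V^T from the SVD,
    # with a sign flip on the last singular vector if det would be -1.
    U, _, Vt = np.linalg.svd(M)
    R = U @ Vt
    if np.linalg.det(R) < 0:
        U[:, -1] *= -1
        R = U @ Vt
    return R

M = np.array([[1.2, -0.1],
              [0.2,  0.9]])   # noisy, non-orthogonal estimate
R = project_to_rotation(M)    # a proper 2x2 rotation
```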
Link:
https://arxiv.org/abs/2305.15136

10. An Alternative to Variance: Gini Deviation for Risk-averse Policy Gradient
Authors:
Yudong Luo, Guiliang Liu, Pascal Poupart, Yangchen Pan
Abstract:
Restricting the variance of a policy's return is a popular choice in risk-averse Reinforcement Learning (RL) due to its clear mathematical definition and easy interpretability. Traditional methods directly restrict the total return variance. Recent methods restrict the per-step reward variance as a proxy. We thoroughly examine the limitations of these variance-based methods, such as sensitivity to numerical scale and hindering of policy learning, and propose to use an alternative risk measure, Gini deviation, as a substitute. We study various properties of this new risk measure and derive a policy gradient algorithm to minimize it. Empirical evaluation in domains where risk-aversion can be clearly defined shows that our algorithm can mitigate the limitations of variance-based risk measures and achieve high return with low risk in terms of variance and Gini deviation when others fail to learn a reasonable policy.
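The scale-sensitivity point can be checked numerically. Below, one common form of the Gini deviation (half the mean absolute pairwise difference) scales linearly when returns are multiplied by 10, while the variance scales quadratically; the paper's exact estimator and its policy-gradient derivation are more involved than this:

```python
# Gini deviation (one common form) vs. variance as dispersion measures.
def gini_deviation(xs):
    n = len(xs)
    return sum(abs(a - b) for a in xs for b in xs) / (2 * n * n)

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

returns = [1.0, 1.0, 1.0, 100.0]     # invented episodic returns
scaled = [10 * r for r in returns]   # same policy, rewards rescaled by 10
# gini_deviation grows by 10x, variance by 100x -- one reason variance-based
# risk penalties are sensitive to the numerical scale of rewards.
```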
Link:
https://arxiv.org/pdf/2307.08873.pdf

11. Multi-Modal Inverse Constrained Reinforcement Learning from a Mixture of Demonstrations
Authors:
Guanren Qiao (SDS PhD student), Guiliang Liu, Pascal Poupart, Zhiqiang Xu
Abstract:
Inverse Constraint Reinforcement Learning (ICRL) aims to recover the underlying constraints respected by expert agents in a data-driven manner. Existing ICRL algorithms typically assume that the demonstration data is generated by a single type of expert. However, in practice, demonstrations often comprise a mixture of trajectories collected from various expert agents respecting different constraints, making it challenging to explain expert behaviors with a unified constraint function. To tackle this issue, we propose a Multi-Modal Inverse Constrained Reinforcement Learning (MMICRL) algorithm for simultaneously estimating multiple constraints corresponding to different types of experts. MMICRL constructs a flow-based density estimator that enables unsupervised expert identification from demonstrations, so as to infer the agent-specific constraints. Following these constraints, MMICRL imitates expert policies with a novel multi-modal constrained policy optimization objective that minimizes the agent-conditioned policy entropy and maximizes the unconditioned one. To enhance robustness, we incorporate this objective into the contrastive learning framework. This approach enables imitation policies to capture the diversity of behaviors among expert agents. Extensive experiments in both discrete and continuous environments show that MMICRL outperforms other baselines in terms of constraint recovery and control performance.

12. PAC-Bayesian Spectrally-Normalized Bounds for Adversarially Robust Generalization
Authors:
Jiancong Xiao, Ruoyu Sun, Zhi-Quan Luo
Abstract:
Deep neural networks (DNNs) are vulnerable to adversarial attacks. It has been found empirically that adversarially robust generalization is crucial in establishing defense algorithms against adversarial attacks. Therefore, it is interesting to study the theoretical guarantees of robust generalization. This paper focuses on PAC-Bayes analysis (Neyshabur et al., 2017b). The main challenge lies in extending the key ingredient, which is a weight perturbation bound in standard settings, to the robust settings. Existing attempts heavily rely on additional strong assumptions, leading to loose bounds. In this paper, we address this issue and provide a spectrally-normalized robust generalization bound for DNNs. Our bound is at least as tight as the standard generalization bound, differing only by a factor of the perturbation strength $\epsilon$. In comparison to existing robust generalization bounds, our bound offers two significant advantages: 1) it does not depend on additional assumptions, and 2) it is considerably tighter. We present a framework that enables us to derive more general results. Specifically, we extend the main result to 1) adversarial robustness against general non-$\ell_p$ attacks, and 2) other neural network architectures, such as ResNet.

13. Imitation Learning from Imperfection: Theoretical Justifications and Algorithms
Authors:
Ziniu Li (SDS PhD student), Tian Xu, Zeyu Qin, Yang Yu, Zhi-Quan Luo
Abstract:
Imitation learning (IL) algorithms excel in acquiring high-quality policies from expert data for sequential decision-making tasks. However, their effectiveness is hampered when faced with limited expert data. To tackle this challenge, a novel framework called (offline) IL with supplementary data has emerged, which enhances learning by incorporating an additional yet imperfect dataset obtained inexpensively from sub-optimal policies. Nonetheless, learning becomes challenging due to the potential inclusion of out-of-expert-distribution samples. In this work, we pioneer the mathematical formalization of this framework, uncovering its limitations. Our theoretical analysis reveals that a naive approach, applying the behavioral cloning (BC) algorithm to the combined set of expert and supplementary data, may fall short of vanilla BC, which relies solely on expert data. This deficiency arises due to the distribution shift between the two data sources. To address this issue, we propose a new importance-sampling-based technique for selecting data within the expert distribution. We prove that the proposed method theoretically eliminates the gap of the naive approach, highlighting its efficacy when handling imperfect data. Empirical studies demonstrate that our method outperforms previous state-of-the-art methods in tasks including robot locomotion control, Atari video games, and image classification. Overall, our work underscores the potential of improving IL by leveraging diverse data sources through effective data selection.
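The data-selection idea can be caricatured with importance weights p_expert(x)/p_supp(x), which down-weight (here, zero out) supplementary samples that fall outside the expert distribution. The densities and samples below are invented stand-ins for quantities the paper's method must estimate from data:

```python
# Toy importance weighting of supplementary data against the expert distribution.
def importance_weights(samples, p_expert, p_supp):
    return [p_expert(x) / p_supp(x) for x in samples]

p_expert = lambda x: 1.0 if 0.0 <= x <= 1.0 else 0.0   # uniform on [0, 1]
p_supp = lambda x: 0.5                                  # uniform on [0, 2]
samples = [0.2, 0.8, 1.5]   # the last sample is outside the expert support
w = importance_weights(samples, p_expert, p_supp)
# The out-of-expert-distribution sample at 1.5 receives weight 0, so it no
# longer drags behavioral cloning away from the expert policy.
```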
Link:
https://openreview.net/forum?id=vO04AzsB49

14. Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs
Authors:
Jinyang Li, Binyuan Hui, GE QU, Binhua Li, Jiaxi Yang, Bowen Li, Bailin Wang, Bowen Qin, Ruiying Geng, Nan Huo, Xuanhe Zhou, Chenhao Ma, Guoliang Li, Kevin Chang, Fei Huang, Reynold Cheng, Yongbin Li
Abstract:
Text-to-SQL parsing, which aims at converting natural language instructions into executable SQL, has gained increasing attention in recent years. In particular, Codex and ChatGPT have shown impressive results in this task. However, most of the prevalent benchmarks, e.g., Spider and WikiSQL, focus on database schemas with few rows of database contents, leaving a gap between academic study and real-world applications. To mitigate this gap, we present Bird, a big benchmark for large-scale database grounded text-to-SQL tasks, containing 12,751 pairs of text-to-SQL data and 95 databases with a total size of 33.4 GB, spanning 37 professional domains. Our emphasis on database values highlights the new challenges of dirty database contents, external knowledge between NL questions and database contents, and SQL efficiency, particularly in the context of massive databases. To solve these problems, text-to-SQL models must feature database value comprehension in addition to semantic parsing. The experimental results demonstrate the significance of database values in generating accurate text-to-SQLs for big databases. Furthermore, even the most effective text-to-SQL models, e.g., ChatGPT, achieve only 40.08% execution accuracy, which is still far from the human result of 92.96%, proving that challenges still stand. Besides, we also provide an efficiency analysis to offer insights into generating text-to-efficient-SQLs that are beneficial to industries. We believe that BIRD will contribute to advancing real-world applications of text-to-SQL research.
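Execution accuracy of the kind reported above can be checked by running the predicted and gold SQL and comparing result sets. The sketch below, using Python's built-in sqlite3 module, shows the idea on an invented schema; it is not BIRD's official evaluation script:

```python
import sqlite3

# Tiny invented database for demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("north", 10.0), ("south", 20.0), ("north", 5.0)])

def execution_match(pred_sql, gold_sql):
    # A prediction is counted correct when its result set equals the gold
    # query's result set on the database (order-insensitive comparison).
    pred = set(conn.execute(pred_sql).fetchall())
    gold = set(conn.execute(gold_sql).fetchall())
    return pred == gold

gold = "SELECT region, SUM(amount) FROM sales GROUP BY region"
pred = "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
ok = execution_match(pred, gold)   # differs only in row order, so it matches
```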
Link:
https://arxiv.org/abs/2305.03111

15. Balanced Training for Sparse GANs
Authors:
Yite Wang, Jing Wu, Naira Hovakimyan, Ruoyu Sun
Abstract:
Over the past few years, there has been growing interest in developing larger and deeper neural networks, including deep generative models like generative adversarial networks (GANs). However, GANs typically come with high computational complexity, leading researchers to explore methods for reducing the training and inference costs. One such approach gaining popularity in supervised learning is dynamic sparse training (DST), which maintains good performance while enjoying excellent training efficiency. Despite its potential benefits, applying DST to GANs presents challenges due to the adversarial nature of the training process. In this paper, we propose a novel metric called the balance ratio (BR) to study the balance between the sparse generator and discriminator. We also introduce a new method called balanced dynamic sparse training (ADAPT), which seeks to control the BR during GAN training to achieve a good trade-off between performance and computational cost. Our proposed method shows promising results on multiple datasets, demonstrating its effectiveness.
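A core step in dynamic sparse training generally is magnitude pruning: keep the top-k weights by absolute value and zero the rest. The sketch below shows only this generic step on an invented weight vector; ADAPT's balance-ratio control between generator and discriminator is beyond this toy:

```python
# Magnitude-based pruning to a target density, as used in DST-style methods.
def prune_to_density(weights, density):
    k = max(1, int(len(weights) * density))
    # Indices of the k largest-magnitude weights are kept; the rest are zeroed.
    keep = set(sorted(range(len(weights)), key=lambda i: -abs(weights[i]))[:k])
    return [x if i in keep else 0.0 for i, x in enumerate(weights)]

w = [0.9, -0.1, 0.05, -0.8, 0.3]
sparse = prune_to_density(w, density=0.4)   # keep 2 of 5 weights
```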
Link:
https://neurips.cc/virtual/2023/poster/70078

16. Information Design in Multi-Agent Reinforcement Learning
Authors:
Yue Lin, Wenhao Li (SDS postdoc), Hongyuan Zha, Baoxiang Wang
Summary:
Reinforcement learning (RL) is inspired by how humans and animals interact with their environment. This setting is somewhat idealized, because in real tasks the other agents in the environment have their own goals and adapt their behavior to the ego agent. To perform well in such environments, an agent needs to influence the other agents so that their actions become more helpful and less harmful. Research in computational economics has distilled two ways to influence others directly: by providing tangible goods (mechanism design) and by providing information (information design). This work studies the information design problem for a group of RL agents. The main challenges are two-fold. One is that the information provided immediately affects the transitions of the agent trajectories, which introduces additional non-stationarity. The other is that the information can simply be ignored, so the sender must provide information that the receiver is willing to respect. We formulate the Markov signaling game and develop the notions of the signaling gradient and extended obedience constraints to address these challenges. Our algorithm is efficient on a variety of mixed-motive tasks and provides further insights for computational economics.
Abstract:
Reinforcement learning (RL) is inspired by how humans and animals interact with the environment. The setting is somewhat idealized because, in actual tasks, other agents in the environment have their own goals and behave adaptively to the ego agent. To thrive in those environments, the agent needs to influence other agents so their actions become more helpful and less harmful. Research in computational economics distills two ways to influence others directly: by providing tangible goods (mechanism design) and by providing information (information design). This work investigates information design problems for a group of RL agents. The main challenges are two-fold. One is the information provided will immediately affect the transition of the agent trajectories, which introduces additional non-stationarity. The other is the information can be ignored, so the sender must provide information that the receiver is willing to respect. We formulate the Markov signaling game, and develop the notions of signaling gradient and the extended obedience constraints that address these challenges. Our algorithm is efficient on various mixed-motive tasks and provides further insights into computational economics.
https://github.com/YueLin301/InformationDesignMARL
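To make the information-design setting concrete, here is a toy one-shot Bayesian-persuasion step, a simpler ancestor of the paper's Markov signaling game: the sender commits to a signaling scheme, and the receiver updates a posterior over states given the signal. The scheme and prior are made-up numbers for illustration only:

```python
import numpy as np

# Sender's committed signaling scheme: row i gives P(signal | state i).
signal_scheme = np.array([[0.9, 0.1],   # state 0 mostly emits signal 0
                          [0.2, 0.8]])  # state 1 mostly emits signal 1
prior = np.array([0.5, 0.5])            # receiver's prior over the two states

def posterior(signal):
    """Receiver's Bayesian update after observing a signal."""
    unnorm = prior * signal_scheme[:, signal]
    return unnorm / unnorm.sum()

belief_after_1 = posterior(1)   # signal 1 shifts belief toward state 1
```

An "obedience"-style constraint in this language says the receiver must prefer following the signal's recommendation under this posterior; otherwise the signal is ignored, which is exactly the second challenge the summary describes.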
17. Learning Adversarial Low-rank Markov Decision Processes with Unknown Transition and Full-information Feedback
Authors:
Canzhe Zhao, Ruofeng Yang, Baoxiang Wang, Xuezhou Zhang, Shuai Li
Summary:
We study adversarial low-rank Markov decision processes in the full-information feedback setting. Here, the transition probability function is unknown but admits a low-rank matrix decomposition, while the loss functions may change adversarially and are revealed to the learner at the end of each episode. We propose a policy-optimization-based algorithm, POLO, and prove that it attains a regret bound of $\widetilde{O}\left(\frac{K^{\frac{3}{4}} A^{\frac{1}{2}} d\ln^{\frac{1}{4}}M}{1-\gamma}+\frac{\sqrt{K}}{(1-\gamma)^2}\right)$, where $d$ is the rank of the transition matrix, $A$ is the size of the action space, $M$ is the size of the model class, and $\gamma$ is the discount factor. Notably, our algorithm makes few oracle calls, and its regret bound is independent of the size of the state space. To the best of our knowledge, this is the first algorithm that interleaves representation learning, exploration, and exploitation to achieve a sublinear regret guarantee for RL with nonlinear function approximation and adversarial losses.
Abstract:
In this work, we study the low-rank MDPs with adversarially changed losses in the full-information feedback setting. In particular, the unknown transition probability function admits a low-rank matrix decomposition \citep{REPUCB22}, and the loss functions may change adversarially but are revealed to the learner at the end of each episode. We propose a policy optimization-based algorithm POLO, and we prove that it attains the $\widetilde{O}\left(\frac{K^{\frac{3}{4}} A^{\frac{1}{2}} d\ln^{\frac{1}{4}}M}{1-\gamma}+\frac{\sqrt{K}}{(1-\gamma)^2}\right)$ regret guarantee, where $d$ is the rank of the transition kernel (and hence the dimension of the unknown representations), $A$ is the cardinality of the action space, $M$ is the cardinality of the model class, and $\gamma$ is the discount factor. Notably, our algorithm is oracle-efficient and has a regret guarantee with no dependence on the size of the potentially arbitrarily large state space. To the best of our knowledge, we present the first algorithm that interleaves representation learning, exploration, and exploitation to achieve the sublinear regret guarantee for RL with nonlinear function approximation and adversarial losses.
18. Two Heads are Better Than One: A Simple Exploration Framework for Efficient Multi-Agent Reinforcement Learning
Authors:
Jiahui Li, Kun Kuang, Baoxiang Wang, Xingchen Li, Long Chen, Fei Wu, Jun Xiao
Summary:
Exploration strategies play an important role in reinforcement learning, especially in sparse-reward tasks. In cooperative multi-agent reinforcement learning (MARL), designing a suitable exploration strategy is much more challenging because of the large state space and the complex interactions among agents. Mainstream exploration methods in MARL currently either help explore unfamiliar states, which are large and sparse, or measure the interactions among agents at high computational cost. We observe an interesting phenomenon: different kinds of exploration play different roles in different MARL scenarios, and choosing a suitable one is often more effective than designing an exquisite algorithm. In this paper, we propose an exploration method that combines curiosity-based and influence-based exploration (COIN), which is simple yet effective in a wide range of settings. First, COIN measures each agent's influence on the other agents based on mutual information theory and uses it as an intrinsic reward applied to each individual value function. In addition, COIN computes curiosity-based intrinsic rewards via prediction errors, which are added to the extrinsic reward. To integrate the two kinds of intrinsic rewards, COIN adopts a novel framework in which they complement each other and lead to sufficient and effective exploration on cooperative MARL tasks. We conduct extensive experiments on three challenging benchmarks: StarCraft II, MACO, and Google Football. The results across different scenarios demonstrate the superiority of COIN.
Abstract:
Exploration strategies play an important role in reinforcement learning, especially in sparse-reward tasks. In cooperative multi-agent reinforcement learning (MARL), designing a suitable exploration strategy is much more challenging due to the large state space and the complex interaction among agents. Currently, mainstream exploration methods in MARL either contribute to exploring unfamiliar states, which are large and sparse, or measure the interaction among agents at high computational cost. We found an interesting phenomenon: different kinds of exploration play different roles in different MARL scenarios, and choosing a suitable one is often more effective than designing an exquisite algorithm. In this paper, we propose an exploration method that incorporates curiosity-based and influence-based exploration (COIN), which is simple but effective in various situations. First, COIN measures the influence of each agent on the other agents based on mutual information theory and designs it as intrinsic rewards which are applied to each individual value function. Moreover, COIN computes the curiosity-based intrinsic rewards via prediction errors which are added to the extrinsic reward. To integrate the two kinds of intrinsic rewards, COIN utilizes a novel framework in which they complement each other and lead to sufficient and effective exploration on cooperative MARL tasks. We perform extensive experiments on three challenging benchmarks: StarCraft II, MACO, and Google Football. The results across different scenarios show the superiority of our COIN.
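The curiosity term described above (a prediction-error intrinsic reward added to the extrinsic reward) can be sketched in a few lines. The forward-model output, the vectors, and the 0.1 weighting are illustrative assumptions, not COIN's actual networks or hyperparameters:

```python
import numpy as np

def curiosity_reward(predicted_next, actual_next):
    """ICM-style intrinsic reward: the squared error of a learned forward
    model predicting the next state. Larger error = more novel = more reward."""
    return float(np.sum((predicted_next - actual_next) ** 2))

extrinsic = 1.0
r_curiosity = curiosity_reward(np.array([0.2, 0.0]),  # forward-model prediction
                               np.array([0.0, 0.0]))  # observed next state
r_total = extrinsic + 0.1 * r_curiosity               # added to the extrinsic reward
```

COIN's second, influence-based term is applied to each agent's individual value function rather than added here, which is how the two rewards end up complementing each other.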
19. Med-UniC: Unifying Cross-Lingual Medical Vision-Language Pre-Training by Diminishing Bias
Authors:
Zhongwei Wan, Che Liu, Mi Zhang, Jie Fu, Benyou Wang, Sibo Cheng, Lei Ma, César Quilodrán-Casas, Rossella Arcucci
Summary:
Data scarcity is a critical obstacle to the effectiveness of medical vision-language pre-training (VLP). A potential solution is to combine datasets from different language communities. However, the main challenge comes from the complexity of integrating diverse syntax and semantics, language-specific medical terminology, and culture-specific implicit knowledge. A key aspect to consider is therefore the community bias caused by different languages. This paper presents a novel framework named Unifying Cross-Lingual Medical Vision-Language Pre-Training (Med-UniC), designed to integrate multi-modal medical data from the two most prevalent languages, English and Spanish. Specifically, we propose Cross-lingual Text Alignment Regularization (CTR) to explicitly unify the cross-lingual semantic representations of medical reports originating from different language communities. CTR is optimized through latent language disentanglement, so that our optimization objective does not depend on negative samples, significantly reducing the bias introduced by determining positive-negative sample pairs among similar medical reports. It also ensures that the cross-lingual representation is not biased toward any particular language community. Med-UniC achieves superior performance across 5 medical image tasks and 10 datasets covering more than 30 diseases, offering a versatile framework for unifying multi-modal medical data from different language communities. The experimental results highlight the presence of community bias in cross-lingual VLP: reducing this bias improves performance not only on vision-language tasks but also on uni-modal visual tasks.
Abstract:
The scarcity of data presents a critical obstacle to the efficacy of medical vision-language pre-training (VLP). A potential solution lies in the combination of datasets from various language communities. Nevertheless, the main challenge stems from the complexity of integrating diverse syntax and semantics, language-specific medical terminology, and culture-specific implicit knowledge. Therefore, one crucial aspect to consider is the presence of community bias caused by different languages. This paper presents a novel framework named Unifying Cross-Lingual Medical Vision-Language Pre-Training (Med-UniC), designed to integrate multi-modal medical data from the two most prevalent languages, English and Spanish. Specifically, we propose Cross-lingual Text Alignment Regularization (CTR) to explicitly unify cross-lingual semantic representations of medical reports originating from diverse language communities. CTR is optimized through latent language disentanglement, so that our optimization objective does not depend on negative samples, thereby significantly mitigating the bias from determining positive-negative sample pairs within analogous medical reports. Furthermore, it ensures that the cross-lingual representation is not biased toward any specific language community. Med-UniC reaches superior performance across 5 medical image tasks and 10 datasets encompassing over 30 diseases, offering a versatile framework for unifying multi-modal medical data within diverse linguistic communities. The experimental outcomes highlight the presence of community bias in cross-lingual VLP. Reducing this bias enhances the performance not only in vision-language tasks but also in uni-modal visual tasks.
Link:
https://arxiv.org/abs/2305.19894
20. All in One: A Chinese Multi-Modal Dataset for Multi-Affection Detection in Conversations
Authors:
Yazhou Zhang, Yang Yu, Qing Guo, Benyou Wang, Dongming Zhao, Sagar Uprety, Dawei Song, Jing Qin, Qiuchi Li
Summary:
Human communication is multi-modal and multi-affective in nature. The inter-relatedness of different emotions and sentiments makes it challenging to jointly detect multiple human affections from multi-modal cues. Recent advances in this field adopt multi-task learning paradigms to model the inter-relatedness across tasks, but the scarcity of publicly available resources limits the potential of such work. To fill this gap, we build the first Chinese Multi-modal Multi-Affection conversation (CMMA) dataset, which contains 3,000 multi-party conversations and 21,795 multi-modal utterances collected from TV series of various styles. CMMA contains a wide variety of affection labels, including sentiment, emotion, sarcasm, and humor, as well as novel inter-correlation values between certain pairs of tasks. Moreover, it provides topic and speaker information for each conversation, which facilitates better modeling of conversational context. On this dataset, we empirically analyze the influence of different data modalities and conversational contexts on different affection analysis tasks, and demonstrate the practical benefit of inter-task correlations.
Abstract:
Human communication has a multi-modal and multi-affection nature. The inter-relatedness of different emotions and sentiments poses a challenge to jointly detect multiple human affections with multi-modal clues. Recent advances in this field employed multi-task learning paradigms to render the inter-relatedness across tasks, but the scarcity of publicly available resources limits the potential of such work. To fill this gap, we build the first Chinese Multi-modal Multi-Affection conversation (CMMA) dataset, which contains 3,000 multi-party conversations and 21,795 multi-modal utterances collected from various styles of TV series. CMMA contains a wide variety of affection labels, including sentiment, emotion, sarcasm, and humor, as well as novel inter-correlation values between certain pairs of tasks. Moreover, it provides the topic and speaker information in conversations, which promotes better modeling of conversational context. On the dataset, we empirically analyze the influence of different data modalities and conversational contexts on different affection analysis tasks, and exhibit the practical benefit of inter-task correlations.
Link:
https://neurips.cc/virtual/2023/poster/73481
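To illustrate the kind of record such a dataset carries (multi-modal utterance, conversational context, and multiple affection labels at once), here is a hypothetical sketch of one utterance entry; the field names and values are illustrative only, not the released CMMA schema:

```python
# One multi-modal utterance with the four affection labels the summary
# lists (sentiment, emotion, sarcasm, humor) plus topic/speaker context.
utterance = {
    "conversation_id": 12,
    "speaker": "A",
    "topic": "family dinner",           # conversation-level context
    "text": "你可真是幫了大忙了。",       # textual modality (sarcastic praise)
    "video_clip": "clip_000012_03.mp4", # visual/acoustic modality reference
    "labels": {
        "sentiment": "negative",
        "emotion": "anger",
        "sarcasm": True,
        "humor": False,
    },
}
```

Carrying all four labels on the same utterance is what enables the multi-task analysis of inter-task correlations described above.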
21. DeepfakeBench: A Comprehensive Benchmark of Deepfake Detection
Authors:
Zhiyuan Yan (SDS master's student), Yong Zhang, Xinhang Yuan, Siwei Lyu, Baoyuan Wu
Summary:
A critical yet frequently overlooked challenge in deepfake detection is the lack of a standardized, unified, and comprehensive benchmark, which leads to unfair performance comparisons and potentially misleading results. Specifically, data processing pipelines lack uniformity, so the data fed to each detection model is inconsistent. Moreover, experimental settings differ noticeably across methods, and evaluation strategies and metrics generally lack standardization. To fill this gap, we present DeepfakeBench, the first comprehensive benchmark for deepfake detection, which makes three key contributions: 1) a unified data management system that ensures consistent input across all detectors; 2) an integrated training framework for the latest state-of-the-art methods; and 3) standardized evaluation metrics and protocols that promote transparency and reproducibility. Built on an extensible, modular codebase, DeepfakeBench contains 15 state-of-the-art detection methods, 9 deepfake datasets, a series of deepfake detection evaluation protocols and analysis tools, and comprehensive evaluations. We further conduct extensive analyses of these evaluations from various perspectives (e.g., data augmentation, backbones) and provide new insights. We hope our efforts will facilitate future research and foster innovation in this increasingly important field.
Abstract:
A critical yet frequently overlooked challenge in the field of deepfake detection is the lack of a standardized, unified, comprehensive benchmark. This issue leads to unfair performance comparisons and potentially misleading results. Specifically, there is a lack of uniformity in data processing pipelines, resulting in inconsistent data inputs for detection models. Additionally, there are noticeable differences in experimental settings, and evaluation strategies and metrics lack standardization. To fill this gap, we present the first comprehensive benchmark for deepfake detection, called DeepfakeBench, which offers three key contributions: 1) a unified data management system to ensure consistent input across all detectors, 2) an integrated framework for implementing state-of-the-art methods, and 3) standardized evaluation metrics and protocols to promote transparency and reproducibility. Featuring an extensible, modular-based codebase, DeepfakeBench contains 15 state-of-the-art detection methods, 9 deepfake datasets, a series of deepfake detection evaluation protocols and analysis tools, as well as comprehensive evaluations. Moreover, we provide new insights based on extensive analysis of these evaluations from various perspectives (e.g., data augmentations, backbones). We hope that our efforts could facilitate future research and foster innovation in this increasingly critical domain. All codes, evaluations, and analyses of our benchmark are publicly available at this https URL.
Link:
https://arxiv.org/abs/2307.01426
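The benchmark's central idea (every detector sees identically preprocessed inputs and is scored by one shared protocol) can be sketched in a toy form. The detector, data, and metric here are made-up stand-ins, not DeepfakeBench's actual API:

```python
# Unified pipeline: all detectors receive the same preprocessed input,
# and all are scored by the same evaluation function.
def preprocess(clip):
    return clip.strip().lower()          # stand-in for the shared data pipeline

def evaluate(detector, clips, labels):
    """Shared protocol: accuracy over identically preprocessed clips."""
    preds = [detector(preprocess(c)) for c in clips]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

clips = ["FAKE_clip_01 ", "real_clip_02"]
labels = [True, False]                   # True = fake
detectors = {"keyword_baseline": lambda c: "fake" in c}
scores = {name: evaluate(d, clips, labels) for name, d in detectors.items()}
```

Fixing `preprocess` and `evaluate` across all detectors is precisely what removes the pipeline and metric inconsistencies the summary criticizes.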
22. Shared Adversarial Unlearning: Backdoor Mitigation by Unlearning Shared Adversarial Examples
Authors:
Shaokui Wei (SDS PhD student), Mingda Zhang (SDS PhD student), Hongyuan Zha, Baoyuan Wu
Summary:
Backdoor attacks are a serious security threat to machine learning: an adversary injects samples with triggers into the training set, producing a backdoored model that predicts samples carrying a particular trigger as a particular target class while behaving normally on benign samples. In this paper, we explore the task of purifying a backdoored model with a small clean dataset. By establishing the connection between backdoor risk and adversarial risk, we derive a novel upper bound on the backdoor risk that mainly captures the risk of the shared adversarial examples (SAEs) between the backdoored model and the purified model. This upper bound further suggests a novel bi-level optimization problem for mitigating the backdoor using adversarial training techniques. To solve it, we propose Shared Adversarial Unlearning (SAU). Specifically, SAU first generates SAEs and then unlearns them, so that they are either correctly classified by the purified model or classified differently by the two models, thereby mitigating the backdoor effect in the purified model. Experiments on various benchmark datasets and network architectures show that our method achieves state-of-the-art performance in backdoor defense.
Abstract:
Backdoor attacks are serious security threats to machine learning models where an adversary can inject poisoned samples into the training set, causing a backdoored model which predicts poisoned samples with particular triggers to particular target classes, while behaving normally on benign samples. In this paper, we explore the task of purifying a backdoored model using a small clean dataset. By establishing the connection between backdoor risk and adversarial risk, we derive a novel upper bound for backdoor risk, which mainly captures the risk on the shared adversarial examples (SAEs) between the backdoored model and the purified model. This upper bound further suggests a novel bi-level optimization problem for mitigating backdoor using adversarial training techniques. To solve it, we propose Shared Adversarial Unlearning (SAU). Specifically, SAU first generates SAEs, and then unlearns the generated SAEs so that they are either correctly classified by the purified model or differently classified by the two models, such that the backdoor effect in the backdoored model will be mitigated in the purified model. Experiments on various benchmark datasets and network architectures show that our proposed method achieves state-of-the-art performance for backdoor defense.
Link:
https://arxiv.org/pdf/2307.10562
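The notion of a "shared adversarial example" can be made concrete with a toy pair of linear classifiers: one perturbation inside an eps-ball that flips the prediction of both the backdoored and the purified model. The weights, input, and eps below are arbitrary assumptions; the paper's SAU works on deep networks via bi-level optimization, not this closed form:

```python
import numpy as np

def predict(w, x):
    """Binary linear classifier: class 1 iff w·x > 0."""
    return int(float(w @ x) > 0)

w_backdoored = np.array([1.0, -0.2])
w_purified = np.array([0.9, -0.1])
x = np.array([0.1, 1.0])                 # both models predict class 0 here

eps = 0.3
x_adv = x + eps * np.sign(w_backdoored)  # one FGSM-style step on one model
is_shared = (predict(w_backdoored, x_adv) != predict(w_backdoored, x)
             and predict(w_purified, x_adv) != predict(w_purified, x))
```

SAU's unlearning step then pushes the purified model to classify such shared examples correctly (or at least differently from the backdoored model), which is what shrinks the backdoor-risk upper bound described above.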
23. Neural Polarizer: A Lightweight and Effective Backdoor Defense via Purifying Poisoned Features
Authors:
Mingli Zhu (SDS PhD student), Shaokui Wei (SDS PhD student), Hongyuan Zha, Baoyuan Wu
Summary:
Recent studies have demonstrated the susceptibility of deep neural networks to backdoor attacks. Given a backdoored model, its prediction on a poisoned sample carrying the trigger is dominated by the trigger information, even though trigger information and benign information coexist. Inspired by the mechanism of an optical polarizer, which passes light waves of a particular polarization while filtering out waves of other polarizations, we propose a novel backdoor defense that inserts a learnable neural polarizer into the backdoored model as an intermediate layer, purifying poisoned samples by filtering out trigger information while preserving benign information. The neural polarizer is instantiated as a lightweight linear transformation layer, learned by solving a carefully designed bi-level optimization problem on a limited clean dataset. Compared with other fine-tuning-based defenses, which often adjust all parameters of the backdoored model, the proposed method only needs to learn one additional layer, making it more efficient and less demanding of clean data. Extensive experiments demonstrate the effectiveness and efficiency of our method in removing backdoors across various neural network architectures and datasets, especially when clean data is very limited.
Abstract:
Recent studies have demonstrated the susceptibility of deep neural networks to backdoor attacks. Given a backdoored model, its prediction of a poisoned sample with trigger will be dominated by the trigger information, though trigger information and benign information coexist. Inspired by the mechanism of the optical polarizer, whereby a polarizer passes light waves with particular polarizations while filtering light waves with other polarizations, we propose a novel backdoor defense method by inserting a learnable neural polarizer into the backdoored model as an intermediate layer, in order to purify the poisoned sample via filtering trigger information while maintaining benign information. The neural polarizer is instantiated as one lightweight linear transformation layer, which is learned through solving a well-designed bi-level optimization problem, based on a limited clean dataset. Compared to other fine-tuning-based defense methods which often adjust all parameters of the backdoored model, the proposed method only needs to learn one additional layer, such that it is more efficient and requires less clean data. Extensive experiments demonstrate the effectiveness and efficiency of our method in removing backdoors across various neural network architectures and datasets, especially in the case of very limited clean data.
Link:
https://arxiv.org/pdf/2306.16697.pdf
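The structural idea (the backdoored network stays frozen and a single learnable linear layer is inserted between two of its layers) can be sketched as follows. All shapes, values, and the notion of a "trigger channel" are hypothetical; the real polarizer is learned via the bi-level optimization described above, not hand-set:

```python
import numpy as np

def frozen_front(x):                     # frozen feature extractor (untouched)
    return np.maximum(x, 0.0)

def frozen_head(h, w_head):              # frozen classifier head (untouched)
    return float(h @ w_head)

w_head = np.array([1.0, 1.0])
P_init = np.eye(2)                       # polarizer starts as identity
P_learned = np.diag([1.0, 0.0])          # e.g. learns to filter channel 1,
                                         # which (hypothetically) carries the trigger

x_trigger = np.array([0.5, 0.7])         # feature with trigger info in channel 1
before = frozen_head(frozen_front(x_trigger) @ P_init, w_head)
after = frozen_head(frozen_front(x_trigger) @ P_learned, w_head)
```

Only `P` would be trained; everything else is frozen, which is why the defense needs so little clean data compared with full fine-tuning.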
24. AUDIT: Audio Editing by Following Instructions with Latent Diffusion Models
Authors:
Yuancheng Wang (SDS PhD student), Zeqian Ju, Xu Tan, Lei He, Zhizheng Wu, Jiang Bian, Sheng Zhao
Summary:
Audio editing serves a variety of purposes, such as adding background sound effects, replacing instrumental accompaniment, or repairing damaged audio. Recently, some diffusion-based methods have achieved zero-shot audio editing by running a diffusion-and-denoising process conditioned on a text description of the output audio. However, these methods still have several problems: 1) they are not trained on editing tasks and cannot guarantee good editing results; 2) they can erroneously modify audio segments that do not require editing; and 3) they require a complete description of the output audio, which is not always available or necessary in practice. In this work, we propose AUDIT, an instruction-guided audio editing model based on latent diffusion models. Specifically, AUDIT has three main design features: 1) we construct triplet training data (instruction, input audio, output audio) for different audio editing tasks and train a diffusion model that generates the output (edited) audio conditioned on the instruction and the input (to-be-edited) audio; 2) it automatically learns to modify only the segments that need editing by comparing the difference between the input and output audio; 3) it only requires editing instructions, rather than a full description of the target audio, as text input. AUDIT achieves state-of-the-art results on both objective and subjective metrics for several audio editing tasks (e.g., adding, dropping, replacement, inpainting, super-resolution).
Abstract:
Audio editing is applicable for various purposes, such as adding background sound effects, replacing a musical instrument, and repairing damaged audio. Recently, some diffusion-based methods achieved zero-shot audio editing by using a diffusion and denoising process conditioned on the text description of the output audio. However, these methods still have some problems: 1) they have not been trained on editing tasks and cannot ensure good editing effects; 2) they can erroneously modify audio segments that do not require editing; 3) they need a complete description of the output audio, which is not always available or necessary in practical scenarios. In this work, we propose AUDIT, an instruction-guided audio editing model based on latent diffusion models. Specifically, AUDIT has three main design features: 1) we construct triplet training data (instruction, input audio, output audio) for different audio editing tasks and train a diffusion model using instruction and input (to be edited) audio as conditions and generating output (edited) audio; 2) it can automatically learn to only modify segments that need to be edited by comparing the difference between the input and output audio; 3) it only needs edit instructions instead of full target audio descriptions as text input. AUDIT achieves state-of-the-art results in both objective and subjective metrics for several audio editing tasks (e.g., adding, dropping, replacement, inpainting, super-resolution).
Link:
https://arxiv.org/abs/2304.00830
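The triplet training data described above can be illustrated for an "add background sound" task: the output audio is the input plus the effect, and the instruction names only the edit. The signals and the 0.5 mixing weight are toy assumptions, not AUDIT's actual data pipeline:

```python
import numpy as np

# One hypothetical (instruction, input audio, output audio) triplet.
speech = np.array([0.1, 0.2, 0.1, 0.0])          # toy "input" waveform
rain = np.array([0.05, 0.05, 0.05, 0.05])        # toy background effect

triplet = {
    "instruction": "add rain in the background",  # edit instruction only,
                                                  # not a full target description
    "input_audio": speech,
    "output_audio": speech + 0.5 * rain,          # mixing weight is an assumption
}
```

Because input and output differ only where the effect was added, a model trained on such triplets can learn to leave the untouched segments alone, which addresses problem 2) in the summary.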
25. Motion-X: A Large-scale 3D Expressive Whole-body Human Motion Dataset
Authors:
Jing Lin, Ailing Zeng, Shunlin Lu (SDS PhD student), Yuanhao Cai, Ruimao Zhang, Haoqian Wang, Lei Zhang
Summary:
In this paper, we present Motion-X, a large-scale 3D expressive whole-body motion dataset. Existing motion datasets predominantly contain body-only poses and lack facial expressions, hand gestures, and detailed pose descriptions. Moreover, they are mostly collected in laboratory settings with manually labeled text descriptions, which greatly limits their scalability. To overcome these limitations, we develop a whole-body motion and text annotation pipeline that automatically annotates motion from single- or multi-view videos, providing a comprehensive semantic label for each video and a detailed whole-body pose description for each frame. The pipeline is accurate, cost-effective, and scalable for further research. Based on it, we construct Motion-X, which comprises 13.7M precise 3D whole-body pose annotations (i.e., SMPL-X) covering 96K motion sequences from massive scenes. In addition, Motion-X provides 13.7M frame-level whole-body pose descriptions and 96K sequence-level semantic labels. Comprehensive experiments validate the accuracy of the annotation pipeline and the significant benefit of Motion-X for generating expressive, diverse, and natural motions, as well as for 3D whole-body human mesh recovery.
Abstract:
In this paper, we present Motion-X, a large-scale 3D expressive whole-body motion dataset. Existing motion datasets predominantly contain body-only poses, lacking facial expressions, hand gestures, and fine-grained pose descriptions. Moreover, they are primarily collected from limited laboratory scenes with textual descriptions manually labeled, which greatly limits their scalability. To overcome these limitations, we develop a whole-body motion and text annotation pipeline, which can automatically annotate motion from either single- or multi-view videos and provide comprehensive semantic labels for each video and fine-grained whole-body pose descriptions for each frame. This pipeline is of high precision, cost-effective, and scalable for further research. Based on it, we construct Motion-X, which comprises 13.7M precise 3D whole-body pose annotations (i.e., SMPL-X) covering 96K motion sequences from massive scenes. Besides, Motion-X provides 13.7M frame-level whole-body pose descriptions and 96K sequence-level semantic labels. Comprehensive experiments demonstrate the accuracy of the annotation pipeline and the significant benefit of Motion-X in enhancing expressive, diverse, and natural motion generation, as well as 3D whole-body human mesh recovery.
Link:
https://arxiv.org/pdf/2307.00818.pdf
26. A Batch-to-Online Transformation under Random-Order Model
Authors:
Jing Dong (SDS PhD student), Yuichi Yoshida
Summary:
We introduce a transformation framework that can be used to develop online algorithms with low approximate regret in the random-order model from offline approximation algorithms. We first give a general reduction theorem that transforms an offline approximation algorithm with low average sensitivity into an online algorithm with low approximate regret. We then show that offline approximation algorithms can be converted into low-sensitivity versions using a coreset construction method. To showcase the versatility of our approach, we apply it to various problems, including online clustering, online matrix approximation, and online regression, and achieve polylogarithmic approximate regret for each. Moreover, we show that in all three cases our algorithms also enjoy low inconsistency, which may be desirable in some online applications.
Abstract:
We introduce a transformation framework that can be utilized to develop online algorithms with low approximate regret in the random-order model from offline approximation algorithms. We first give a general reduction theorem that transforms an offline approximation algorithm with low average sensitivity to an online algorithm with low approximate regret. We then demonstrate that offline approximation algorithms can be transformed into a low-sensitivity version using a coreset construction method. To showcase the versatility of our approach, we apply it to various problems, including online clustering, online matrix approximation, and online regression, and successfully achieve polylogarithmic approximate regret for each problem. Moreover, we show that in all three cases, our algorithm also enjoys low inconsistency, which may be desired in some online applications.
https://openreview.net/forum?id=B6HSIgvyJ3
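The reduction's shape (at each step of a randomly ordered stream, re-run an offline algorithm on the prefix seen so far and play its answer) can be sketched in a few lines. The running mean stands in for an arbitrary offline approximation algorithm with low average sensitivity; it is not one of the paper's actual applications:

```python
def offline_alg(prefix):
    """Stand-in offline algorithm: the mean of the data seen so far.
    Low average sensitivity: removing one point barely moves the answer."""
    return sum(prefix) / len(prefix)

def online_from_offline(stream):
    """Batch-to-online wrapper: commit a decision from the current
    prefix before each new element of the (random-order) stream arrives."""
    prefix, plays = [], []
    for x in stream:
        if prefix:                        # decide before seeing x
            plays.append(offline_alg(prefix))
        prefix.append(x)
    return plays

plays = online_from_offline([2.0, 4.0, 6.0])   # plays 2.0, then 3.0
```

Low sensitivity of `offline_alg` is what keeps consecutive plays stable, which is also why the resulting online algorithms enjoy the low inconsistency mentioned above.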
