Publications
The following journal and conference papers are grouped by principal investigator, ordered by English surname.
Prof. Rosanna Yuen-Yan Chan (陳苑茵)
R. Yuen-Yan Chan, “EEG Transformer for Classifying Students’ Epistemic Cognition States in Educational Contexts,” in IEEE Access, vol. 13, pp. 23935-23949, 2025.
R. Yuen-Yan Chan, “Towards Rate-Distortion Analysis in Symbol-Based Assistive Communication,” in ISIT 2024, Jul. 2024.
C. Tong, R. Chan, “Gaze-Based Interaction Adaptation for People with Involuntary Head Movements (Student Abstract),” In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 38, no. 21, pp. 23669-23670, 2024.
R. Y. -Y. Chan, C. M. V. Wong and Y. N. Yum, “Predicting Behavior Change in Students With Special Education Needs Using Multimodal Learning Analytics,” In IEEE Access, vol. 11, pp. 63238-63251, 2023.
C. M. V. Wong, R. Y. Y. Chan, Y. N. Yum, K. Wang. “Internet of Things (IoT)-Enhanced Applied Behavior Analysis (ABA) for Special Education Needs,” Sensors 2021, 21(19), 6693, 16 pages. 2021.
Prof. Shih-Chi Chen (陳世祈)
M. Ye, Y. Lei, S. Gu, Y. Wang, S. Chen, “Colloidal Silver Nanoparticles-assisted High-precision Parallel Laser Writing on Glass and Optical Crystals,” Laser & Photonics Reviews, pp. e00047, 2025.
C. Liu, X. Yan, S. Li, M. Mohammadnejad, B. Deng, N.X. Fang, Y. Habibi, S. Chen, X. Zhao, “A Metre-scale Vertical Origami Hydrogel Panel for Atmospheric Water Harvesting in Death Valley,” Nature Water, Vol. 3, pp. 714-722, 2025.
S. Gu, B. Chen, X. Xu, F. Han, S. Chen, “3D Nanofabrication via Directed Material Assembly: Mechanism, Method, and Future,” Advanced Materials, pp. 2312915, 2025.
J. Arab, P. Sharma, S. Chen, “Fabrication of Micro-holes in PMMA using Micro-ECDM Process: Geometric Characteristics and EC Discharge Behaviour,” Journal of the Electrochemical Society, Vol. 171, No. 5, pp. 053506, 2024.
Y. Wang, Z. Bi, Y. Song, L. Duan, S. C. Chen, “Selective Activation of Photoactivatable Fluorescent Protein based on Binary Holography,” Biomedical Optics Express, Vol.15, no. 5, pp. 3382-3393, 2024.
S. Gu, F. Han, M. Ye, S. C. Chen, “Multi-material 3D Nanofabrication based on Ultrafast Lasers and Swellable Hydrogels,” ASPE 2023, 2023
Z. Deng, S. Gu, Z. Feng, S. C. Chen, “High-throughput Maskless Optical Lithography System based on Digital Holography,” ASPE 2023, 2023
B. Ding, C. Li and S. C. Chen, “Design and Characterization of an XYθZ Nanopositioner,” ASPE 2023, 2023
B. Ding, X. Li, C. Li, Y. Li, S. C. Chen, “A Survey on the Mechanical Design for Piezo-actuated Compliant Micro-positioning Stages,” Review of Scientific Instruments, Vol. 94, pp. 101502, 2023
T. Singh, J. Arab, S. C. Chen, “Improvement on Surface Quality of Inconel-718 Slits via Laser Cutting and Wire Electrochemical Machining Processes,” Optics and Laser Technology, Vol. 167, pp. 109637, 2023
W. Ouyang, X. Xu, W. Lu, N. Zhao, F. Han, and S. Chen, “Ultrafast 3D Nanofabrication via Digital Holography,” Nature Communications, Vol. 14, pp. 1716, 2023.
C. Li and S. Chen, “Design of Compliant Mechanisms based on Compliant Building Elements. Part I: Principles,” Precision Engineering, Vol. 81, pp. 207-220, 2023.
C. Li and S. Chen, “Design of Compliant Mechanisms based on Compliant Building Elements. Part II: Practice,” Precision Engineering, Vol. 81, pp. 8-21, 2023.
F. Han, S. Gu, A. Klimas, N. Zhao, Y. Zhao, and S. Chen, “3D Nanofabrication via Ultrafast Laser Patterning and Kinetically-regulated Material Assembly,” Science, Vol. 378, No. 6626, pp. 1325-1331, 2022.
X. Li, W. Liu, F. Goudail, and S. Chen, “Optimal Nonlinear Stokes-Mueller Polarimetry for Multi-photon Processes,” Optics Letters, Vol. 47, No. 13, pp. 3287-3290, 2022.
X. Li, J. Xu, L. Zhang, H. Hu, and S. Chen, “Underwater Image Restoration via Stokes Decomposition,” Optics Letters, Vol. 47, No. 11, pp. 2854-2857, 2022.
S. Yang, F. Li, M.M. Gong, L. Zhang, Z.W. Zhu, H.B. Shen, S.C. Chen. “Generation of Q-switched and Mode-locked Pulses based on PbS/CdS Saturable Absorber in an Er-doped Fiber Laser,” Journal of Materials Chemistry C, 10: 5956-5961. 2022.
X. Li, W. Lu, X. Xu, Y. Wang, S.C. Chen. “Advanced Optical Methods and Materials for Fabricating 3D Tissue Scaffolds,” Light: Advanced Manufacturing, 3: 26. 2022.
X. Li, F. Goudail, S.C. Chen. “Self-calibration for Mueller Polarimeters based on DoFP Polarization Imagers,” Optics Letters, 47(6): 1415-1418. 2022.
D. Chen, S. Gu, and S. C. Chen, “Study of Optical Modulation based on Binary Masks with Finite Pixels,” Opt. Lasers Eng., vol. 142, no. 106604, 2021.
X. Liu, X. Li, S.C. Chen. “Enhanced Polarization Demosaicking Network via a Precise Angle of Polarization Loss Calculation Method,” Optics Letters, 47(5): 1065-1068. 2021.
M. Ren, W. Lu, Q. Shao, F. Han, W. Ouyang, T. Zhang, C. C.L. Wang, and S.C. Chen. “Aberration-free Large-area Stitch-free 3D Nano-printing based on Binary Holography,” Optics Express, 29(26): 44250-263. 2021.
Prof. Hongsheng Li (李鴻升)
Z. Lin, D. Liu, R. Zhang, P. Gao, L. Qiu, H. Xiao, H. Qiu, W. Shao, K. Chen, J. Han, S. Huang, Y. Zhang, X. He, Y. Qiao, H. Li, “SPHINX: A Mixer of Weights, Visual Embeddings and Image Scales for Multi-modal Large Language Models.” In Computer Vision – ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part LXII. Springer-Verlag, Berlin, Heidelberg, 36–55.
X. Chen, S. Shi, T. Ma, J. Zhou, S. See, K. C. Cheung, H. Li, “M3Net: Multimodal Multi-task Learning for 3D Detection, Segmentation, and Occupancy Prediction in Autonomous Driving,” AAAI, vol. 39, no. 2, pp. 2275-2283, Apr, 2025.
D. Jiang, R. Zhang, Z. Guo, Y. Wu, J. Lei, P. Qiu, P. Lu, Z. Chen, C. Fu, G. Song, P. Gao, Y. Liu, C. Li, H. Li, “MMSearch: Unveiling the Potential of Large Models as Multi-modal Search Engines,” in ICLR 2025, Apr, 2025.
P. Gao, L. Zhuo, D. Liu, R. Du, X. Luo, L. Qiu, Y. Zhang, C. Lin, R. Huang, S. Geng, R. Zhang, J. Xi, W. Shao, Z. Jiang, T. Yang, W. Ye, H. Tong, J. He, Y. Qiao, H. Li, “Lumina-T2X: Scalable Flow-based Large Diffusion Transformer for Flexible Resolution Generation,” in ICLR 2025, Apr, 2025.
X. Ju and H. Li, “DirectTriGS: Triplane-based Gaussian Splatting Field Representation for 3D Generation,” 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 2025, pp. 16229-16239.
W. Bian, Z. Huang, X. Shi, Y. Li, F. -Y. Wang and H. Li, “GS-DiT: Advancing Video Generation with Dynamic 3D Gaussian Fields through Efficient Dense 3D Point Tracking,” 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 2025, pp. 21717-21727.
R. Zhang, X. Wei, D. Jiang, Z. Guo, S. Li, Y. Zhang, C. Tong, J. Liu, A. Zhou, B. Wei, S. Zhang, P. Gao, C. Li, H. Li, “MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine,” in ICLR 2025, Apr, 2025.
X. Chen, L. Huang, T. Ma, R. Fang, S. Shi and H. Li, “SOLVE: Synergy of Language-Vision and End-to-End Networks for Autonomous Driving,” 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 2025, pp. 12068-12077.
X. Yang, L. Xu, H. Li, S. Zhang, “One Leaf Reveals the Season: Occlusion-Based Contrastive Learning with Semantic-Aware Views for Efficient Visual Representation,” ICML 2025, Jul, 2025.
X. Yang, L. Xu, S. Yu, Q. Xia, H. Li and S. Zhang, “Segmentation and Vascular Vectorization for Coronary Artery by Geometry-Based Cascaded Neural Network,” in IEEE Transactions on Medical Imaging, vol. 44, no. 1, pp. 259-269, Jan. 2025.
L. Wei, C. Liu, Z. Zhang, W. Zhang, S. Zhang, H. Li, “VBCD: A Voxel-Based Framework for Personalized Dental Crown Design,” in MICCAI 2025, Sep, 2025.
R. Zhang, J. Han, C. Liu, A. Zhou, P. Lu, Y. Qiao, H. Li, P. Gao, “LLaMA-Adapter: Efficient Fine-tuning of Large Language Models with Zero-initialized Attention,” in ICLR 2024, May, 2024.
H. Shao, Y. Hu, L. Wang, G. Song, S. L. Waslander, Y. Liu, H. Li, “LMDrive: Closed-Loop End-to-End Driving with Large Language Models,” in 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 2024, pp. 15120-15130.
X. Ju, Z. Huang, Y. Li, G. Zhang, Y. Qiao, H. Li, “DiffInDScene: Diffusion-Based High-Quality 3D Indoor Scene Generation,” in 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 2024, pp. 4526-4535.
F. Wang, X. Wu, Z. Huang, X. Shi, D. Shen, G. Song, Y. Liu, H. Li, “Be-Your-Outpainter: Mastering Video Outpainting Through Input-Specific Adaptation.” In Computer Vision – ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part XLIV. Springer-Verlag, Berlin, Heidelberg, 153–168.
L. Huang, R. Fang, A. Zhang, G. Song, S. Liu, Y. Liu, H. Li, “FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis.” In Computer Vision – ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part XII. Springer-Verlag, Berlin, Heidelberg, 196–212.
F. Wang, Z. Huang, A. W. Bergman, D. Shen, P. Gao, M. Lingelbach, K. Sun, W. Bian, G. Song, Y. Liu, X. Wang, H. Li, “Phased Consistency Models.” In Advances in Neural Information Processing Systems 37 (2024), Dec, 2024.
K. Wang, J. Pan, W. Shi, Z. Lu, H. Ren, A. Zhou, M. Zhan, H. Li, “Measuring Multimodal Mathematical Reasoning with MATH-Vision Dataset.” In Advances in Neural Information Processing Systems 37 (2024), Dec, 2024.
D. Jiang, G. Song, X. Wu, R. Zhang, D. Shen, Z. Zong, Y. Liu, H. Li, “CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching.” In Advances in Neural Information Processing Systems 37 (2024), Dec, 2024.
Z. Lu, A. Zhou, K. Wang, H. Ren, W. Shi, J. Pan, M. Zhan, H. Li, “MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code,” in ICLR 2025, Apr, 2025.
F. Wang, Y. Shui, J. Piao, K. Sun, H. Li, “Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generation of Diffusion Models,” in ICLR 2025, Apr, 2025.
F. Wang, L. Yang, Z. Huang, M. Wang, H. Li, “Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow,” in ICLR 2025, Apr, 2025.
W. Lin, X. Wei, R. An, P. Gao, B. Zou, Y. Luo, S. Huang, S. Zhang, H. Li, “Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want,” in ICLR 2025, Apr, 2025.
W. Lin, X. Wei, R. Zhang, L. Zhuo, S. Zhao, S. Huang, J. Xie, P. Gao, H. Li, “PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions,” in ICLR 2025, Apr, 2025.
F. Hong, L. Kong, H. Zhou, X. Zhu, H. Li, Z. Liu, “Unified 3D and 4D Panoptic Segmentation via Dynamic Shifting Networks,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 46, no. 5, pp. 3480-3495, 2024
F. Yan, Q. Da, H. Yi, S. Deng, L. Zhu, Y. Liu, M. Feng, J. Wang, X. Wang, Y. Zhang, W. Zhang, X. Zhang, J. Lin, S. Zhang, C. Wang, “Artificial Intelligence-based Assessment of PD-L1 Expression in Diffuse Large B Cell Lymphoma,” npj Precision Oncology, Vol. no. 76, 2024
R. Zhang, Z. Jiang, Z. Guo, S. Yan, J. Pan, H. Dong, Y. Qiao, P. Gao, H. Li, “Personalize Segment Anything Model with One Shot,” ICLR 2024, 2024
R. Zhang, J. Han, C. Liu, A. Zhou, P. Lu, Y. Qiao, H. Li, P. Gao. “LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-initialized Attention,” ICLR, 2024
K. Sun, S. Wu, N. Zhang, Z. Huang, Q. Wang, H. Li, “CGOF++: Controllable 3D Face Synthesis with Conditional Generative Occupancy Fields,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 46, no. 02, pp. 913-926, 2023
W. Bian, Z. Huang, X. Shi, Y. Dong, Y. Li, H. Li, “Context-PIPs: Persistent Independent Particles Demands Context Features,” NeurIPS 2023, 2023
X. Yang, L. Xu, S. Yu, Q. Xia, H. Li, S. Zhang, “Geometry-Based End-to-End Segmentation of Coronary Artery in Computed Tomography Angiography,” ICLR 2023 Workshop TML4H, 2023
F. Lu, Y. Xu, G. Chen, H. Li, K. Y. Lin, C. Jiang, “Urban Radiance Field Representation with Deformable Neural Mesh Primitives,” ICCV 2023, 2023
S. Fan, J. Piao, C. Qian, H. Li, K. Y. Lin, “Simulating Fluids in Real-world Still Images,” ICCV 2023, 2023
Y. Zhang, X. Shi, D. Li, X. Wang, J. Wang, H. Li, “A Unified Conditional Framework for Diffusion-based Image Restoration,” NeurIPS 2023, 2023
K. Sun, J. Pan, Y. Ge, H. Li, H. Duan, X. Wu, R. Zhang, A. Zhou, Z. Qin, Y. Wang, J. Dai, Y. Qiao, L. Wang, H. Li, “JourneyDB: A Benchmark for Generative Image Understanding,” NeurIPS 2023, 2023
Y. Niu, Y. Pu, Z. Yang, X. Li, T. Zhou, J. Ren, S. Hu, H. Li, Y. Liu, “LightZero: A Unified Benchmark for the Monte Carlo Tree Search in General Sequential Decision Scenarios,” NeurIPS 2023, 2023
L. Huang, K. Lu, G. Song, L. Wang, S. Liu, Y. Liu, H. Li, “Teach-DETR: Better Training DETR with Teachers,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 45, no. 12, pp. 15759-15771, 2023
X. Wu, K. Sun, F. Zhu, R. Zhao, H. Li, “Human Preference Score: Better Aligning Text-to-Image Models with Human Preference,” ICCV, 2023
T. Ma, X. Yang, H. Zhou, X. Li, B. Shi, J. Liu, Y. Yang, Z. Liu, L. He, Y. Qiao, Y. Li, H. Li, “DetZero: Rethinking Offboard 3D Object Detection with Long-term Sequential Point Clouds,” ICCV, 2023
X. Shi, Z. Huang, W. Bian, D. Li, M. Zhang, K. C. Cheung, S. See, H. Qin, J. Dai, H. Li, “Videoflow: Exploiting Temporal Cues for Multi-frame Optical Flow Estimation,” ICCV, 2023
J. Liu, T. Wang, B. Liu, Q. Zhang, Y. Liu, H. Li, “GeoMIM: Towards Better 3D Knowledge Transfer via Masked Image Modeling for Multi-view 3D Understanding,” ICCV, 2023
A. Zhou, Y. Li, Z. Qin, J. Liu, J. Pan, R. Zhang, R. Zhao, P. Gao, H. Li, “SparseMAE: Sparse Training Meets Masked Autoencoders,” ICCV, 2023
X. Chen, S. Shi, C. Zhang, B. Zhu, Q. Wang, K. C. Cheung, S. See, H. Li, “TrajectoryFormer: 3D Object Tracking Transformer with Predictive Trajectory Hypotheses,” ICCV, 2023
M. Zhang, G. Song, Y. Liu, H. Li, “Decoupled DETR: Spatially Disentangling Localization and Classification for Improved End-to-End Object Detection,” ICCV, 2023
J. Yao, C. Li, K. Sun, Y. Cai, H. Li, W. Ouyang, H. Li, “NDC-Scene: Boost Monocular 3D Semantic Scene Completion in Normalized Device Coordinates Space,” ICCV, 2023
R. Zhang, H. Qiu, T. Wang, Z. Guo, Z. Cui, Y. Qiao, P. Gao, H. Li, “MonoDETR: Depth-guided Transformer for Monocular 3D Object Detection.” ICCV, 2023
C. Tao, D. Gu, R. Huang, L. Zhou, Z. Hu, Y. Chen, X. Zhang, “Hippocampus Segmentation after Brain Tumor Resection via Postoperative Region Synthesis,” BMC Medical Imaging, 23(1), 1–142. 2023.
Q. Chang, Z. Yan, M. Zhou, H. Zhang, X. He, H. Zhang, L. Baskaran, S. Al’Aref, H. Li, S. Zhang, D. N. Metaxas, “Mining multi-center heterogeneous medical data with distributed synthetic learning,” Nature Communications, Vol. 14, pp. 5510, 2023.
B. Zhu, Z. Wang, S. Shi, H. Xu, L. Hong, and H. Li, “ConQueR: Query Contrast Voxel-DETR for 3D Object Detection,” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
X. Shi, Z. Huang, D. Li, M. Zhang, K. C. Cheung, S. See, H. Qin, J. Dai, and H. Li, “FlowFormer++: Masked Cost Volume Autoencoding for Pretraining Optical Flow Estimation,” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
R. Zhang, X. Hu, B. Li, S. Huang, H. Deng, H. Li, Y. Qiao, and P. Gao, “Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners,” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
X. Shi, Y. Zhang, K. C. Cheung, S. See, X. Wang, H. Qin, and H. Li, “A Simple Baseline for Video Restoration with Grouped Spatial-temporal Shift,” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
X. Wu, F. Zhu, R. Zhao, and H. Li, “CORA: Adapting CLIP for Open-Vocabulary Detection with Region Prompting and Anchor Pre-Matching,” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
R. Zhang, L. Wang, Y. Qiao, P. Gao, and H. Li, “Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders,” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
R. Zhang, L. Wang, Z. Guo, Y. Wang, P. Gao, H. Li, and J. Shi, “Starting from Non-Parametric Networks for 3D Point Cloud Analysis,” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
J. Liu, X. Huang, J. Zheng, Y. Liu, and H. Li, “MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers,” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
J. Zhou, L. Huang, L. Wang, S. Liu, and H. Li, “Improving Weakly Supervised Temporal Action Localization by Bridging Train-Test Gap in Pseudo Labels,” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
J. Mao, S. Shi, X. Wang, and H. Li, “3D object detection for autonomous driving: A comprehensive survey,” International Journal of Computer Vision, Apr. 2023.
Z. Huang, X. Pan, W. Pan, W. Bian, Y. Xu, K. C. Cheung, G. Zhang, and H. Li, “NeuralMarker: A Framework for Learning General Marker Correspondence,” ACM Transactions on Graphics, vol. 41, no. 6, pp. 1–10, Nov. 2022.
K. Sun, S. Wu, Z. Huang, N. Zhang, Q. Wang, and H. Li, “Controllable 3D Face Synthesis with Conditional Generative Occupancy Fields,” In Advances in Neural Information Processing Systems (NeurIPS), 2022.
R. Zhang, Z. Guo, R. Fang, B. Zhao, D. Wang, Y. Qiao, H. Li, and P. Gao, “Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training,” In Advances in Neural Information Processing Systems (NeurIPS), 2022.
J. Pan, Z. Lin, X. Zhu, J. Shao, and H. Li, “ST-Adapter: Parameter-Efficient Image-to-Video Transfer Learning,” In Advances in Neural Information Processing Systems (NeurIPS), 2022.
Z. Lin, S. Geng, R. Zhang, P. Gao, G. de Melo, X. Wang, J. Dai, Y. Qiao, and H. Li, “Frozen clip models are efficient video learners,” In Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, vol. 13695, pp. 388–404, Nov. 2022.
X. Chen, S. Shi, B. Zhu, K. C. Cheung, H. Xu, and H. Li, “MPPNet: Multi-frame feature intertwining with proxy points for 3D temporal object detection,” In Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, pp. 680–697, 2022.
D. Li, Y. Zhang, K. C. Cheung, X. Wang, H. Qin, and H. Li, “Learning degradation representations for image Deblurring,” In Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, vol. 13678, pp. 736–753, 2022.
D. Li, Y. Zhang, K. L. Law, X. Wang, H. Qin, and H. Li, “Efficient burst raw denoising with variance stabilization and multi-frequency denoising network,” International Journal of Computer Vision, vol. 130, no. 8, pp. 2060–2080, Jun. 2022.
Y. Cai, K.Y. Lin, C. Zhang, Q. Wang, X. Wang, H. Li. “Learning a Structured Latent Space for Unsupervised Point Cloud Completion,” In IEEE Conference on Computer Vision and Pattern Recognition, 2022.
R. Zhang, Z. Guo, W. Zhang, K. Li, X. Miao, B. Cui, Y. Qiao, P. Gao, H. Li. “PointCLIP: Point Cloud Understanding by CLIP,” In IEEE Conference on Computer Vision and Pattern Recognition, 2022.
Y. Zhang, D. Li, K.L. Law, X. Wang, H. Qin, H. Li. “IDR: Self-Supervised Image Denoising via Iterative Data Refinement,” In IEEE Conference on Computer Vision and Pattern Recognition, 2022.
Y. Xu, K.Y. Lin, G. Zhang, X. Wang, H. Li. “RNNPose: Recurrent 6-DoF Object Pose Refinement with Robust Correspondence Field Estimation and Pose Optimization,” In IEEE Conference on Computer Vision and Pattern Recognition, 2022.
L. Huang, L. Wang, H. Li. “Weakly Supervised Temporal Action Localization via Representative Snippet Knowledge Propagation,” In IEEE Conference on Computer Vision and Pattern Recognition, 2022.
X. Zhang, Y. Ge, Y. Qiao, and H. Li, “Refining Pseudo Labels with Clustering Consensus over Generations for Unsupervised Object Re-identification,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 19-25, 2021.
Prof. Dahua Lin (林達華)
Z. Liu, S. Ding, Z. Zhang, X. Dong, P. Zhang, Y. Zang, Y. Cao, D. Lin, J. Wang, “SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation,” Forty-Second International Conference on Machine Learning, 2025.
X. Wei, X. Liu, Y. Zang, X. Dong, P. Zhang, Y. Cao, J. Tong, H. Duan, Q. Guo, J. Wang, X. Qiu, D. Lin, “VideoRoPE: What Makes for Good Video Rotary Position Embedding?” Forty-Second International Conference on Machine Learning, 2025.
L. Chen, X. Wei, J. Li, X. Dong, P. Zhang, Y. Zang, Z. Chen, H. Duan, B. Lin, Z. Tang, L. Yuan, Y. Qiao, D. Lin, F. Zhao, J. Wang, “ShareGPT4Video: Improving Video Understanding and Generation with Better Captions,” In Advances in Neural Information Processing Systems 37 (2024), pp. 19472-19495.
Y. Guo, C. Yang, A. Rao, M. Agrawala, D. Lin, B. Dai, “SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models.” In Computer Vision – ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part XLII. Springer-Verlag, Berlin, Heidelberg, 330–348.
F. Cui, C. Yin, K. Zhou, Y. Xiao, G. Sun, Q. Xu, Q. Guo, D. Song, D. Lin, X. Zhang, Y. Liang, “OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection,” 2024 ACM/IEEE International Conference On Computer Aided Design (ICCAD), Newark, NJ, USA, 2024, pp. 1-9.
J. Duan, X. Li, P. Xu, X. Zhang, S. Yan, Y. Liang, D. Lin, “Proteus: Simulating the Performance of Distributed DNN Training,” in IEEE Transactions on Parallel and Distributed Systems, vol. 35, no. 10, pp. 1867-1878, Oct. 2024.
H. Duanmu, X. Li, Z. Yuan, S. Zheng, J. Duan, X. Zhang, D. Lin, “MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design,” in ICML 2025, 2025.
T. Lu, M. Yu, L. Xu, Y. Xiangli, L. Wang, D. Lin, B. Dai, “Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering,” 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 2024, pp. 20654-20664.
R. Qian, S. Ding, D. Lin, “Rethinking Image-to-Video Adaptation: An Object-Centric Perspective.” In Computer Vision – ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part XLIII. Springer-Verlag, Berlin, Heidelberg, 329–348.
M. Zhang, T. Wu, T. Wang, T. Wang, Z. Liu, D. Lin, “Omni6D: Large-Vocabulary 3D Object Dataset for Category-Level 6D Object Pose Estimation.” In Computer Vision – ECCV 2024: 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part XXV. Springer-Verlag, Berlin, Heidelberg, 216–232.
Z. Wang, Y. Li, Y. Zeng, Y. Fang, Y. Guo, W. Liu, J. Tan, K. Chen, T. Xue, B. Dai, D. Lin, “HumanVid: Demystifying Training Data for Camera-controllable Human Image Animation.” In Advances in Neural Information Processing Systems 37 (2024), Dec, 2024.
X. Dong, P. Zhang, Y. Zang, Y. Cao, B. Wang, L. Ouyang, S. Zhang, H. Duan, W. Zhang, Y. Li, H. Yan, Y. Gao, Z. Chen, X. Zhang, W. Li, J. Li, W. Wang, K. Chen, C. He, X. Zhang, J. Dai, Y. Qiao, D. Lin, J. Wang, “InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD.” In Advances in Neural Information Processing Systems 37 (2024), Dec, 2024.
R. Lyu, J. Lin, T. Wang, S. Yang, X. Mao, Y. Chen, R. Xu, H. Huang, C. Zhu, D. Lin, J. Pang, “MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations.” In Advances in Neural Information Processing Systems 37 (2024), Dec, 2024.
X. Fang, K. Mao, H. Duan, X. Zhao, Y. Li, D. Lin, K. Chen, “MMBench-Video: A Long-Form Multi-Shot Benchmark for Holistic Video Understanding.” In Advances in Neural Information Processing Systems 37 (2024), Dec, 2024.
Z. Wang, J. Wang, Y. Li, D. Lin, B. Dai, “InterControl: Zero-shot Human Interaction Generation by Controlling Every Joint.” In Advances in Neural Information Processing Systems 37 (2024), Dec, 2024.
T. Wu, Y. Xu, R. Po, M. Zhang, G. Yang, J. Wang, Z. Liu, D. Lin, G. Wetzstein, “FiVA: Fine-grained Visual Attribute Dataset for Text-to-Image Diffusion Models.” In Advances in Neural Information Processing Systems 37 (2024), Dec, 2024.
T. Lan, W. Zhang, C. Xu, H. Huang, D. Lin, K. Chen, X. Mao, “CriticEval: Evaluating Large-scale Language Model as Critic.” In Advances in Neural Information Processing Systems 37 (2024), Dec, 2024.
Y. Guo, C. Yang, A. Rao, Z. Liang, Y. Wang, Y. Qiao, M. Agrawala, D. Lin, B. Dai, “AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning,” ICLR 2024, 2024.
R. Qian, S. Ding, X. Liu, D. Lin, “Semantics Meets Temporal Correspondence: Self-supervised Object-centric Learning in Videos,” ICCV 2023, 2023.
Y. Li, J. Zhao, D. Zheng, Z. Y. Hu, Z. Chen, X. Su, Y. Huang, S. Huang, D. Lin, M. R. Lyu, L. Wang, “CLEVA: Chinese Language Models EVAluation Platform,” In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 186-217, 2023
Y. Xiangli, L. Xu, X. Pan, N. Zhao, B. Dai, D. Lin, “AssetField: Assets Mining and Reconfiguration in Ground Feature Plane Representation,” ICCV 2023, 2023
Y. Li, L. Xu, Y. Xiangli, Z. Wang, D. Lin, B. Dai, “MatrixCity: A Large-scale City Dataset for City-scale Neural Rendering and Beyond,” ICCV 2023, 2023
J. Wang, P. Zhang, T. Chu, Y. Cao, Y. Zhou, T. Wu, B. Wang, C. He, D. Lin, “V3Det: Vast Vocabulary Visual Detection Dataset,” ICCV 2023, 2023
A. Rao, X. Jiang, Y. Guo, L. Xu, L. Yang, L. Jin, D. Lin, B. Dai, “Dynamic Storyboard Generation in an Engine-based Virtual Environment for Video Production,” SIGGRAPH 2023, 2023
T. Wu, J. Zhang, X. Fu, Y. Wang, J. Ren, L. Pan, W. Wu, L. Yang, J. Wang, C. Qian, D. Lin, Z. Liu, “OmniObject3D: Large-vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation,” CVPR 2023, 2023
W. Li, Y. Lai, L. Xu, Y. Xiangli, J. Yu, C. He, G. S. Xia, and D. Lin, “OmniCity: Omnipotent City Understanding with Multi-level and Multi-view Images,” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
L. Xu, Y. Xiangli, S. Peng, X. Pan, N. Zhao, C. Theobalt, B. Dai, and D. Lin, “Grid-guided Neural Radiance Fields for Large Urban Scenes,” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Y. Jin, J. Wang, and D. Lin, “Multi-level Logit Distillation,” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
R. Xu, T. Wang, W. Zhang, R. Chen, J. Cao, J. Pang, and D. Lin, “MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training,” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Z. Lyu, J. Wang, Y. An, Y. Zhang, D. Lin, and B. Dai, “Controllable Mesh Generation Through Sparse Latent Point Diffusion Models,” In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
Y. Jin, J. Wang, and D. Lin, “Semi-Supervised Semantic Segmentation via Gentle Teaching Assistant,” In Advances in Neural Information Processing Systems (NeurIPS), 2022.
H. Duan, N. Zhao, K. Chen, D. Lin. “TransRank: Self-supervised Video Representation Learning via Ranking-based Transformation Recognition,” In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022.
J. Wang, S. Yan, B. Dai, D. Lin. “Scene-aware Generative Network for Human Motion Synthesis,” In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021.
L. Xu, Y. Xiangli, A. Rao, N. Zhao, B. Dai, Z. Liu, D. Lin. “BlockPlanner: City Block Generation with Vectorized Graph Representation,” In International Conference on Computer Vision, 2021
T. Wang, X. Zhu, J. Pang, D. Lin. “Probabilistic and Geometric Depth: Detecting Objects in Perspective,” In Conference on Robot Learning, 2021.
X. Zhu, H. Zhou, F. Hong, T. Wang, Y. Ma, W. Li, H. Li, D. Lin. “Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation,” In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021.
T. Wu, L. Pan, J. Zhang, T. Wang, Z. Liu, D. Lin. “Balanced Chamfer Distance as a Comprehensive Metric for Point Cloud Completion,” In Conference on Neural Information Processing Systems, 2021.
Prof. Eric Lo (盧至力)
Z. Lai, F. Cui, H. Fan, E. Lo, W. Zhou, F. Li, “Occam’s Razor for Distributed Protocols.” In Proceedings of the 2024 ACM Symposium on Cloud Computing (SoCC ’24). Association for Computing Machinery, New York, NY, USA, 618–636.
B. Lu, K. Huang, C. M. Liang, T. Wang, E. Lo, “DEX: Scalable Range Indexing on Disaggregated Memory.” Proc. VLDB Endow. 17, 10 (June 2024), 2603–2616.
C. Chang, E. Lo, C. Ye, “Biathlon: Harnessing Model Resilience for Accelerating ML Inference Pipelines.” Proc. VLDB Endow. 17, 10 (June 2024), 2631–2640.
Y. Zhang, J. Xing, B. Xia, S. Liu, B. Peng, X. Tao, P. Wan, E. Lo, J. Jia, “Training-Free Efficient Video Generation via Dynamic Token Carving,” in NeurIPS 2025, Dec, 2025.
Y. Zhang, Y. Y. Liu, B. Xia, B. Peng, Z. Yan, E. Lo, J. Jia, “MagicMirror: ID-Preserved Video Generation in Video Diffusion Transformers,” in ICCV 2025, Hawaii, USA, Oct 2025.
Z. Zhong, C. Wang, Y. Liu, S. Yang, L. Tang, Y. Zhang, J. Li, T. Qu, Y. Li, Y. Chen, S. Yu, S. Wu, E. Lo, S. Liu, J. Jia, “Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition,” in ICCV 2025, Oct 2025.
Y. Zhang, J. Xing, E. Lo, J. Jia, “Real-World Image Variation by Aligning Diffusion Inversion Chain,” NeurIPS 2023, 2023
M. J. Amiri, Z. Lai, L. Patel, B. T. Loo, E. Lo, and W. Zhou, “Saguaro: An Edge Computing-enabled Hierarchical Permissioned Blockchain,” In IEEE International Conference on Data Engineering, Apr. 2023.
Z. Lai, C. Liu, and E. Lo, “When private blockchain meets deterministic database,” In Proceedings of the ACM on Management of Data, vol. 1, no. 1, pp. 1–28, 2023.
Z. Zhong, J. Cui, E. Lo, Z. Li, J. Sun, J. Jia. “Rebalanced Siamese Contrastive Mining for Long-Tailed Recognition,” Computer Vision and Pattern Recognition, arXiv preprint arXiv:2203.11506. 2022 Mar 22.
Z. Lai, C. Han, C. Liu, P. Zhang, E. Lo, and B. Kao, “Finding Interesting Frames in Deep Video Analytics: A Top-K Approach,” in 2021 ACM SIGMOD/PODS International Conference on Management of Data, Xi’an, Shaanxi, China, June 20-25, 2021.
Z. Chen, A.W. Fu, M. Jiang, E. Lo, and P. Zhang. “P2H: Efficient Distance Querying on Road Networks by Projected Vertex Separators,” In Proceedings of the 2021 International Conference on Management of Data. Association for Computing Machinery, New York, NY, USA, 313–325. 2021.
Professor Helen MENG
H. Guo, F. Xie, D. Yang, H. Lu, X. Wu, H. Meng, “Addressing Index Collapse of Large-Codebook Speech Tokenizer with Dual-Decoding Product-Quantized Variational Auto-Encoder.” IEEE Spoken Language Technology Workshop (SLT), Macau, 2-5 December, 2024.
H. Guo, F. Xie, J. Kang, Y. Xiao, X. Wu, H. Meng, “QS-TTS: Towards Semi-Supervised Text-to-Speech Synthesis via Vector-Quantized Self-Supervised Speech Representation Learning,” in IEEE Transactions on Audio, Speech and Language Processing, vol. 33, pp. 3307-3319, 2025.
H. Guo, F. Xie, K. Xie, D Yang, D Guo, X. Wu, H. Meng, “SoCodec: A Semantic-Ordered Multi-Stream Speech Codec for Efficient Language Model Based Text-to-Speech Synthesis.” 2024 IEEE Spoken Language Technology Workshop (SLT), Macau, 2-5 December, 2024.
H. Guo, F. Xie, D. Yang, X. Wu, H. Meng, “Speaking from Coarse to Fine: Improving Neural Codec Language Model via Multi-Scale Speech Coding and Generation,” ICASSP 2025 – 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, 2025, pp. 1-5.
J. Zhou, M. Hu, J. Li, X. Zhang, X. Wu, I. King, H. Meng, “Rethinking Machine Ethics – Can LLMs Perform Moral Reasoning through the Lens of Moral Theories?” In Findings of the Association for Computational Linguistics: NAACL 2024, pages 2227–2242, Mexico City, Mexico. Association for Computational Linguistics.
D. Yang, H. Guo, Y. Wang, R. Huang, X. Li, X. Tan, X. Wu, H. Meng, “UniAudio 1.5: large language model-driven audio codec is a few-shot audio task learner,” In Proceedings of the 38th International Conference on Neural Information Processing Systems (NIPS ’24), Vol. 37. Curran Associates Inc., Red Hook, NY, USA, Article 1809, 56802–56827.
Y. Wang, H. Chen, D. Yang, W. Li, D. Luo, G. Li, S. Yang, Z. Wu, H. Meng, X. Wu, “UniSep: Universal Target Audio Separation with Language Models at Scale”, IEEE ICME 2025, 2025.
X. Chen, D. Yang, D. Wang, X. Wu, Z. Wu, H. Meng, “CoLM-DSR: Leveraging Neural Codec Language Modeling for Multi-Modal Dysarthric Speech Reconstruction.” Proc. Interspeech 2024, 4129-4133.
X. Feng, X. Wu, H. Meng, “Ontology-grounded Automatic Knowledge Graph Construction by LLM under Wikidata schema,” KDD Workshop on Human-Interpretable AI 2024, Barcelona, Spain, August 26, 2024.
D. Yang, D. Wang, H. Guo, X. Chen, X. Wu, H. Meng, “SimpleSpeech: Towards Simple and Efficient Text-to-Speech with Scalar Latent Transformer Diffusion Models.” Proc. Interspeech 2024, 4398-4402.
M. Wu, J. Xu, X. Chen, H. Meng, “Integrating Potential Pronunciations for Enhanced Mispronunciation Detection and Diagnosis Ability in LLMs,” ICASSP 2025 – 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Hyderabad, India, 2025, pp. 1-5.
M. Wu, J. Xu, X. Wu, H. Meng, “Prompting Large Language Models with Mispronunciation Detection and Diagnosis Abilities.” Proc. Interspeech 2024, 2990-2994.
Y. Wang, X. Wu, D. Wang, L. Meng and H. Meng, “UNIT-DSR: Dysarthric Speech Reconstruction System Using Speech Unit Normalization,” ICASSP 2024 – 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Republic of Korea, 2024, pp. 12306-12310.
T. Zhang, J. Ge, H. Luo, Y. Chuang, M. Gao, Y. Gong, X. Wu, Y. Kim, H. Meng, J. Glass, “Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning.” In Findings of the Association for Computational Linguistics: NAACL 2024, pages 4131–4155, Mexico City, Mexico. Association for Computational Linguistics.
T. Zhang, K. Li, H. Luo, X. Wu, J. Glass, H. Meng, “Adaptive Query Rewriting: Aligning Rewriters through Marginal Probability of Conversational Answers.” In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 13444–13461, Miami, Florida, USA. Association for Computational Linguistics.
K. Li, T. Zhang, X. Wu, H. Luo, J. Glass, H. Meng, “Decoding on Graphs: Faithful and Sound Reasoning on Knowledge Graphs through Generation of Well-Formed Chains.” In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 24349–24364, Vienna, Austria. Association for Computational Linguistics.
K. Li, T. Zhang, Y. Li, H. Luo, A. Moustafa, X. Wu, J. Glass, H. Meng. “Generate, Discriminate, Evolve: Enhancing Context Faithfulness via Fine-Grained Sentence-Level Self-Evolution.” In Findings of the Association for Computational Linguistics: ACL 2025, pages 17091–17105, Vienna, Austria. Association for Computational Linguistics.
J. Wu, J. Lian, D. Wang, H. Meng, “SocialCC: Interactive Evaluation for Cultural Competence in Language Agents.” In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 33242–33271, Vienna, Austria. Association for Computational Linguistics.
X. Zhang, B. Peng, Y. Tian, J. Zhou, L. Jin, L. Song, H. Mi, H. Meng, “Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation.” In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 1946–1965, Bangkok, Thailand. Association for Computational Linguistics.
H. Lu, X. Wu, H. Guo, S. Liu, Z. Wu, H. Meng, “Unifying One-Shot Voice Conversion and Cloning with Disentangled Speech Representations,” In ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 11141-11145, 2024
X. Chen, Y. Wang, X. Wu, D. Wang, Z. Wu, X. Liu, H. Meng, “Exploiting Audio-Visual Features with Pretrained AV-HuBERT for Multi-Modal Dysarthric Speech Reconstruction,” In ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 12341-12345, 2024
X. Chen, X. Wang, S. Zhang, L. He, Z. Wu, X. Wu, H. Meng, “Stylespeech: Self-Supervised Style Enhancing with VQ-VAE-Based Pre-Training for Expressive Audiobook Speech Synthesis,” In ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 12316-12320, 2024
H. Lu, X. Wu, Z. Wu, H. Meng, “Speech TripleNet: End-to-End Disentangled Speech Representation Learning for Content, Timbre and Prosody,” In MM’23: the 31st ACM International Conference on Multimedia, pp. 2829-2837, 2023
X. Zhang, B. Peng, K. Li, J. Zhou, H. Meng, “SGP-TOD: Building Task Bots Effortlessly via Schema-Guided LLM Prompting,” In Findings of the Association for Computational Linguistics: EMNLP 2023 (EMNLP Findings 2023), pp. 13348-13369, 2023
H. Luo, T. Zhang, Y. S. Chuang, Y. Gong, Y. Kim, X. Wu, H. Meng, J. Glass, “Search Augmented Instruction Learning,” In Findings of the Association for Computational Linguistics: EMNLP 2023 (EMNLP Findings 2023), pp. 3717-3729, 2023
Y. Li, P. Liu, X. Wu, H. Meng, “PunCantonese: A Benchmark Corpus for Low-Resource Cantonese Punctuation Restoration from Speech Transcripts,” In Interspeech 2023, pp. 2183-2187, 2023
T. Zhang, L. Tang, W. Fang, H. Luo, X. Wu, H. Meng and J. Glass, “ConvRGX: Recognition, Generation, and Extraction for Self-trained Conversational Question Answering,” In DialDoc 2023, 2023
H. Guo, F. Xie, X. Wu, F. K. Soong and H. Meng, “MSMC-TTS: Multi-Stage Multi-Codebook VQ-VAE Based Neural TTS,” in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 1811-1824, 2023.
H. Lu, W. Lam, H. Cheng, and H. Meng, “On controlling fallback responses for grounded dialogue generation,” In Findings of the Association for Computational Linguistics: ACL 2022, pages 2591–2601, Dublin, Ireland. Association for Computational Linguistics, May 2022.
K. Li, T. Zhang, L. Tang, J. Li, H. Lu, X. Wu, and H. Meng, “Grounded dialogue generation with cross-encoding re-ranker, grounding span prediction, and passage dropout,” In Proceedings of the Second DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering, pages 123–129, Dublin, Ireland. Association for Computational Linguistics. May 2022.
H. Guo, H. Lu, X. Wu, and H. Meng, “A multi-scale time-frequency spectrogram discriminator for gan-based non-autoregressive TTS,” In Interspeech 2022, 1566-1570. Sep. 2022.
Y. Deng, W. Zhang, W. Lam, H. Cheng, H. Meng. “User Satisfaction Estimation with Sequential Dialogue Act Modeling in Goal-oriented Conversational Systems,” Computation and Language, 2998-3008. 2022.
D. Wang, S. Liu, X. Wu, H. Lu, L. Sun, X. Liu, H. Meng. “Speaker Identity Preservation in Dysarthric Speech Reconstruction by Adversarial Speaker Adaptation,” In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 6677-6681). IEEE. 2022.
D. Wang, S. Yang, D. Su, X. Liu, D. Yu, H. Meng. “VCVTS: Multi-speaker Video-to-Speech synthesis via cross-modal knowledge transfer from voice conversion,” In ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 7252-7256). IEEE. 2022.
H. Wu, P. Hsu, J. Gao, S. Zhang, S. Huang, J. Kang, Z. Wu, H. Meng, H. Lee. “Adversarial Sample Detection for Speaker Verification by Neural Vocoders,” In ICASSP 2022 – 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, pp. 236-240, doi: 10.1109/ICASSP43922.2022.9746900. 2022.
H. Wu, H. Kuo, N. Zheng, K. Hung, H. Lee, Y. Tsao, H. Wang, H. Meng. “Partially Fake Audio Detection by Self-Attention-Based Fake Span Discovery,” In ICASSP 2022 – 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, pp. 9236-9240, doi: 10.1109/ICASSP43922.2022.9746162. 2022.
H. Wu, B. Zheng, X. Li, X. Wu, H.Y. Lee and H. Meng. “Characterizing the Adversarial Vulnerability of Speech self-Supervised Learning”, In ICASSP 2022 – 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, pp. 3164-3168, doi: 10.1109/ICASSP43922.2022.9747242. 2022.
X. Wu, S. Hu, Z. Wu, X. Liu, H. Meng. “Neural Architecture Search for Speech Emotion Recognition,” In ICASSP 2022 – 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022, pp. 6902-6906, doi: 10.1109/ICASSP43922.2022.9746155. 2022.
M. Chaudhary, B. Dzodzo, S. Huang, C. H. Lo, M. Lyu, L.Y. Nie, J. Xing, T. Zhang, X. Zhang, J. Zhou, H. Cheng, W. Lam, and H. Meng, “Unstructured Knowledge Access in Task-oriented Dialog Modeling using Language Inference, Knowledge Retrieval and Knowledge-Integrative Response Generation,” in 35th AAAI Conference on Artificial Intelligence 2021 Dialog System Technology Challenge Workshop (DSTC9), February 8-9, 2021.
H. Lu, Z. Wu, X. Wu, X. Li, S. Kang, X. Liu, H. Meng. “VAENAR-TTS: Variational Auto-Encoder Based Non-AutoRegressive Text-to-Speech Synthesis,” In Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August – 3 September 2021, 3775–3779. 2021.
M. Wu, K. Li, W.K. Leung, H. Meng. “Transformer Based End-to-End Mispronunciation Detection and Diagnosis,” In Proc. Interspeech 2021, 3954–3958. 2021.
Professor Yeung YAM
X. Chen, C. Dai, B. Sun, C. P. Lam, G. Fang, Y. Yam, C. C. L. Wang, “Weaving-Based Freeform Sensing Interface: Design and Fabrication,” ICRA 2024 Workshop, 2024
Z. Liu, E. L. Doubrovski, J. M. P. Geraedts, W. Wang, Y. Yam, and C. C. L. Wang, “Photogrammetric Reconstruction of a Stolen Statue,” 2023.
Professor Liwei WANG
Z. Hu, Y. Zhong, S. Huang, M. Lyu, L. Wang, “Enhancing Temporal Modeling of Video LLMs via Time Gating.” In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 2845–2856, Miami, Florida, USA. Association for Computational Linguistics.
S. Liang, Y. Zhong, Z. Hu, Y. Tao, L. Wang, “Fine-grained Spatiotemporal Grounding on Egocentric Videos,” in ICCV 2025, Hawaii, USA, Oct 2025
Y. Li, S. Liang, M. R. Lyu, L. Wang, “Making Long-Context Language Models Better Multi-Hop Reasoners,” In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024), Vol. 1: Long Papers, pp. 2462-2475, 2024
D. Zheng, S. Huang, L. Zhao, Y. Zhong, L. Wang, “Towards Learning a Generalist Model for Embodied Navigation,” CVPR 2024, pp. 13624-13634, 2024
S. Huang, J. Zhao, Y. Li, L. Wang, “Learning Preference Model for LLMs via Automatic Preference Data Generation,” In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP 2023), pp. 9187-9199, 2023
Y. Li, J. Zhao, D. Zheng, Z. Y. Hu, Z. Chen, X. Su, Y. Huang, S. Huang, D. Lin, M. R. Lyu, L. Wang, “CLEVA: Chinese Language Models EVAluation Platform,” In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, pp. 186-217, 2023
Y. Huang, Y. Li, Y. Xu, L. Zhang, R. Gan, J. Zhang, L. Wang, “MVP-Tuning: Multi-View Knowledge Retrieval with Prompt Tuning for Commonsense Reasoning,” In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023), Vol. 1: Long Papers, pp. 13417-13432, 2023
Z. Y. Hu, Y. Li, M. R. Lyu, L. Wang, “VL-PET: Vision-and-Language Parameter-Efficient Tuning via Granularity Control,” ICCV 2023, 2023
Y. Li, J. Zhao, M. R. Lyu, and L. Wang, “Eliciting Knowledge from Large Pre-Trained Models for Unsupervised Knowledge-Grounded Conversation,” In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 10551–10564, Abu Dhabi, United Arab Emirates. Association for Computational Linguistics. 2022.
Professor Guoliang XING
S. Shi, N. Ling, Z. Jiang, X. Huang, Y. He, X. Zhao, B. Yang, C. Bian, J. Xia, Z. Yan, R. W. Yeung, G. Xing, “Soar: Design and Deployment of A Smart Roadside Infrastructure System for Autonomous Driving.” In Proceedings of the 30th Annual International Conference on Mobile Computing and Networking (ACM MobiCom ’24). Association for Computing Machinery, New York, NY, USA, 139–154.
J. Cui, Y. He, J. Niu, Z. Ouyang, G. Xing, “ΑLiDAR: An Adaptive High-Resolution Panoramic LiDAR System.” In Proceedings of the 30th Annual International Conference on Mobile Computing and Networking (ACM MobiCom ’24). Association for Computing Machinery, New York, NY, USA, 1515–1529.
J. Cui, S. Shi, Y. He, J. Niu, G. Xing, Z. Ouyang, “VILAM: Infrastructure-assisted 3D Visual Localization and Mapping for Autonomous Driving,” In Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation (NSDI 24), pp. 1831-1845, 2024
X. Song, Y. Hua, Y. Yang, G. Xing, F. Liu, L. Xu, T. Song, “Distributed Resource Allocation with Federated Learning for Delay-Sensitive IoV Services,” In IEEE Transactions on Vehicular Technology, Vol. 73, no. 3, pp. 4326-4336, 2024
S. Shi, J. Cui, Z. Jiang, Z. Yan, G. Xing, J. Niu, and Z. Ouyang, “VIPS: Real-Time Perception Fusion for Infrastructure-Assisted Autonomous Driving,” In Proceedings of the 28th Annual International Conference on Mobile Computing And Networking, Oct. 2022.
X. Shuai, Y. Shen, S. Jiang, Z. Zhao, W. Lan, G. Xing. “BalanceFL: Addressing Class Imbalance in Long-tail Federated Learning,” Accepted by ACM / IEEE International Conference on Information Processing in Sensor Networks (IPSN), 2022.
X. Shuai, Y. Shen, Y. Tang, S. Shi, L. Ji, and G. Xing, “milliEye: A Lightweight mmWave Radar and Camera Fusion System for Robust Object Detection,” in The ACM/IEEE International Conference on Internet of Things Design and Implementation (IoTDI), May 18-21, 2021.
Z. Zhao, K. Wang, N. Ling, and G. Xing, “EdgeML: An AutoML Framework for Real-Time Deep Learning on the Edge,” in The ACM/IEEE International Conference on Internet of Things Design and Implementation (IoTDI), May 18-21, 2021.
Professor Bolei ZHOU
Y. Liu, J. Zhang, L. Fang, Q. Jiang, and B. Zhou, “Multimodal Motion Prediction with Stacked Transformers,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 19-25, 2021.
C. Yang, Z. Wu, B. Zhou, and S. Lin, “Instance Localization for Self-supervised Detection Pretraining,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 19-25, 2021.
J. Sun, L. Yu, P. Dong, B. Lu, and B. Zhou, “Adversarial Inverse Reinforcement Learning with Self-attention Dynamics Model,” IEEE Robot. Autom. Lett., vol. 6, no. 2, pp. 1880-1886, 2021.
J. Sun, L. Yu, P. Dong, B. Lu, and B. Zhou, “HiABP: Hierarchical Initialized ABP for Unsupervised Representation Learning,” in 35th AAAI Conference on Artificial Intelligence, February 2-9, 2021.
Y. Shen and B. Zhou, “Closed-Form Factorization of Latent Semantics in GANs,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 19-25, 2021.
Y. Xu, Y. Shen, J. Zhu, C. Yang, and B. Zhou, “Generative Hierarchical Features from Synthesizing Images,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 19-25, 2021.
Z. Peng, Q. Li, C. Liu, and B. Zhou. “Safe Driving via Expert Guided Policy Optimization,” In 5th Annual Conference on Robot Learning (CoRL), 2021.
Q. Li, Z. Peng, B. Zhou. “Efficient Learning of Safe Driving Policy via Human-AI Copilot Optimization,” In International Conference on Learning Representations (ICLR), 2022.
Z. Peng, Q. Li, K.M. Hui, C. Liu, B. Zhou. “Learning to Simulate Self-driven Particles System with Coordinated Policy Optimization,” In Advances in Neural Information Processing Systems (NeurIPS), 2021.
Q. Li, Z. Peng, L. Feng, Q. Zhang, Z. Xue and B. Zhou, “MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning,” in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 3, pp. 3461-3475.
X. Liu, Q. Wu, H. Zhou, Y. Xu, R. Qian, X. Lin, X. Zhou, W. Wu, B. Dai, B. Zhou. “Learning Hierarchical Cross-Modal Association for Co-Speech Gesture Generation,” In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022.