| 2026 | Contextual Counterfactual Credit Assignment for Multi-Agent Reinforcement Learning in LLM Collaboration. Y Chen, Y Sun, H Wang, X Zhang, X Shen, W Li, W Zhang | arXiv preprint arXiv:2603.06859, 2026 | 0 |
| 2026 | SonicBench: Dissecting the Physical Perception Bottleneck in Large Audio Language Models. Y Sun, Y Chen, X Qiu, G Zhang, H Chen, D Wu, C Li, M Yang, D Zhu, ... | arXiv preprint arXiv:2601.11039, 2026 | 0 |
| 2025 | Reasoning beyond language: A comprehensive survey on latent chain-of-thought reasoning. X Chen, A Zhao, H Xia, X Lu, H Wang, Y Chen, W Zhang, J Wang, W Li, ... | arXiv 2025 | 39 |
| 2025 | Unveiling the key factors for distilling chain-of-thought reasoning. X Chen, Z Sun, G Wenjin, M Zhang, Y Chen, Y Sun, H Su, Y Pan, ... | ACL Findings 2025 | 37 |
| 2025 | Fine-grained and multi-dimensional metrics for document-level machine translation. Y Sun, D Zhu, Y Chen, E Xiao, X Chen, X Shen | NAACL 2025 | 7 |
| 2025 | Integrating Chain-of-Thought for Multimodal Alignment: A Study on 3D Vision-Language Learning. Y Chen, Y Sun, X Chen, J Wang, X Shen, W Li, W Zhang | arXiv 2025 | 4 |
| 2025 | LLaSO: A Foundational Framework for Reproducible Research in Large Language and Speech Model. Y Sun, Y Geng, P Wei, Y Chen, J Yang, R Chen, W Zhang, X Shen | arXiv preprint arXiv:2508.15418, 2025 | 2 |
| 2025 | Breaking the pre-planning barrier: Real-time adaptive coordination of mission and charging UAVs using graph reinforcement learning. Y Hu, Y Sun, Y Chen, X Chen | arXiv preprint arXiv:2501.14488, 2025 | 1 |
| 2025 | PricingLogic: Evaluating LLMs Reasoning on Complex Tourism Pricing Tasks. Y Liu, D Zhu, Z Al-Khalili, D Cheng, Y Chen, D Klakow, W Zhang, X Shen | Proceedings of the 2025 Conference on Empirical Methods in Natural Language …, 2025 | 0 |
| 2025 | MA-ROESL: Motion-aware Rapid Reward Optimization for Efficient Robot Skill Learning from Single Videos. X Wang, X Zhang, Y Chen, X Shen, W Zhang | arXiv preprint arXiv:2505.08367, 2025 | 0 |
| 2024 | The Accuracy Paradox in RLHF: When Better Reward Models Don't Yield Better Language Models. Y Chen, D Zhu, Y Sun, X Chen, W Zhang, X Shen | EMNLP 2024 | 14 |
| 2024 | Rethinking Soft Actor-Critic in High-Dimensional Action Spaces: The Cost of Ignoring Distribution Shift. Y Chen, X Zhang, X Wang, Z Xu, X Shen, W Zhang | arXiv 2024 | 5 |