Publications
List of publications.
In the following, * indicates co-first authors and † indicates the corresponding author or that the work was completed under their supervision. See Google Scholar for the full paper list.
Selected Publications
2025
- ACL
Unintended Harms of Value-Aligned LLMs: Psychological and Empirical Insights. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL), 2025
2024
- NAACL
Value FULCRA: Mapping Large Language Models to the Multidimensional Spectrum of Basic Human Value. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL), 2024
Selected Preprints
2025
- On the Dynamics of Multi-Agent LLM Communities Driven by Value Diversity. arXiv preprint arXiv:2512.10665, 2025 [PDF]
- MMA-ASIA: A Multilingual and Multimodal Alignment Framework for Culturally-Grounded Evaluation. arXiv preprint arXiv:2510.08608, 2025 [PDF]
- The Morality of Probability: How Implicit Moral Biases in LLMs May Shape the Future of Human-AI Symbiosis. arXiv preprint arXiv:2509.10297, 2025 [PDF]
- The Incomplete Bridge: How AI Research (Mis) Engages with Psychology. arXiv preprint arXiv:2507.22847, 2025 [PDF]
- PICACO: Pluralistic In-context Value Alignment of LLMs via Total Correlation Optimization. arXiv preprint arXiv:2507.16679, 2025 [PDF]
- CAReDiO: Cultural Alignment of LLM via Representativeness and Distinctiveness Guided Data Optimization. arXiv preprint arXiv:2504.08820, 2025 [PDF]
- Leveraging Implicit Sentiments: Enhancing Reliability and Validity in Psychological Trait Evaluation of LLMs. arXiv preprint arXiv:2503.20182, 2025 [PDF]
- Research on Superalignment Should Advance Now with Parallel Optimization of Competence and Conformity. arXiv preprint arXiv:2503.07660, 2025 [PDF]
2024
- The Road to Artificial Superintelligence: A Comprehensive Survey of Superalignment. arXiv preprint arXiv:2412.16468, 2024 [PDF]
- Toolnet: Connecting Large Language Models with Massive Tools via Tool Graph. arXiv preprint arXiv:2403.00839, 2024 [PDF]
2023
- From Instructions to Intrinsic Human Values–A Survey of Alignment Goals for Big Models. arXiv preprint arXiv:2308.12014, 2023 [PDF]
Paper List
2026
- AdAEM: An Adaptively and Automated Extensible Measurement of LLMs’ Value Difference [Oral Paper]. The Fourteenth International Conference on Learning Representations (ICLR), 2026 [PDF]
- Generative Personality Simulation via Theory-Informed Structured Interview. Proceedings of the European Chapter of the Association for Computational Linguistics (EACL), 2026 [PDF]
2025
- On the Dynamics of Multi-Agent LLM Communities Driven by Value Diversity. arXiv preprint arXiv:2512.10665, 2025 [PDF]
- MoVa: Towards Generalizable Classification of Human Morals and Values. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025 [PDF]
- MMA-ASIA: A Multilingual and Multimodal Alignment Framework for Culturally-Grounded Evaluation. arXiv preprint arXiv:2510.08608, 2025 [PDF]
- The Morality of Probability: How Implicit Moral Biases in LLMs May Shape the Future of Human-AI Symbiosis. arXiv preprint arXiv:2509.10297, 2025 [PDF]
- Specify Privacy Yourself: Assessing Inference-Time Personalized Privacy Preservation Ability of Large Vision-Language Models. Proceedings of the 33rd ACM International Conference on Multimedia (ACM MM), 2025 [PDF]
- Benchmarking Retrieval-augmented Generation in Multi-modal Contexts. Proceedings of the 33rd ACM International Conference on Multimedia (ACM MM), 2025 [PDF]
- Toward Faithful and Human-Aligned Self-Explanation of Deep Models. npj Artificial Intelligence, 2025 [PDF]
- MoHoBench: Assessing Honesty of Multimodal Large Language Models via Unanswerable Visual Questions [Oral Paper]. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2025 [PDF]
- IROTE: Human-like Traits Elicitation of Large Language Model via In-Context Self-Reflective Optimization. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2025 [PDF]
- The Incomplete Bridge: How AI Research (Mis) Engages with Psychology. arXiv preprint arXiv:2507.22847, 2025 [PDF]
- PICACO: Pluralistic In-context Value Alignment of LLMs via Total Correlation Optimization. arXiv preprint arXiv:2507.16679, 2025 [PDF]
- Beyond Human Norms: Unveiling Unique Values of Large Language Models through Interdisciplinary Approaches [Best Paper]. Proceedings of the 6th International Conference on Social Computing (ICSC), 2025 [PDF]
- ParamMute: Suppressing Knowledge-Critical FFNs for Faithful Retrieval-Augmented Generation. The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS), 2025 [PDF]
- Counterfactual Reasoning for Steerable Pluralistic Value Alignment of Large Language Models. The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS), 2025 [PDF]
- Value Compass Benchmarks: A Comprehensive, Generative and Self-Evolving Platform for LLMs’ Value Evaluation. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL: System Demonstrations), 2025 [PDF]
- Unintended Harms of Value-Aligned LLMs: Psychological and Empirical Insights [Oral + Panel Paper]. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL), 2025 [PDF]
- Towards Better Value Principles for Large Language Model Alignment: A Systematic Evaluation and Enhancement [SAC Highlight Paper]. Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL), 2025 [PDF]
- MotiveBench: How Far Are We From Human-Like Motivational Reasoning in Large Language Models? Findings of the Association for Computational Linguistics: ACL 2025, 2025 [PDF]
- Neural Recommendation Reasoning with Logic Rules. ACM Transactions on Information Systems, 2025 [PDF]
- Raising the Bar: Investigating the Values of Large Language Models via Generative Evolving Testing. Forty-second International Conference on Machine Learning (ICML), 2025 [PDF]
- LegalDuet: Learning Fine-Grained Representations for Legal Judgment Prediction via Dual-View Contrastive Learning [Best Paper]. Advanced Data Mining and Applications: 21st International Conference (ADMA), 2025 [PDF]
- CAReDiO: Cultural Alignment of LLM via Representativeness and Distinctiveness Guided Data Optimization. arXiv preprint arXiv:2504.08820, 2025 [PDF]
- Leveraging Implicit Sentiments: Enhancing Reliability and Validity in Psychological Trait Evaluation of LLMs. arXiv preprint arXiv:2503.20182, 2025 [PDF]
- Research on Superalignment Should Advance Now with Parallel Optimization of Competence and Conformity. arXiv preprint arXiv:2503.07660, 2025 [PDF]
2024
- The Road to Artificial Superintelligence: A Comprehensive Survey of Superalignment. arXiv preprint arXiv:2412.16468, 2024 [PDF]
- CLAVE: An Adaptive Framework for Evaluating Values of LLM Generated Responses. Advances in Neural Information Processing Systems (NeurIPS), 2024 [PDF]
- Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization. Proceedings of the 32nd ACM International Conference on Multimedia (ACM MM), 2024 [PDF]
- CDEval: A Benchmark for Measuring the Cultural Dimensions of Large Language Models. Proceedings of the 2nd Workshop on Cross-Cultural Considerations in NLP (C3NLP), 2024 [PDF]
- DeNEVIL: Towards Deciphering and Navigating the Ethical Values of Large Language Models via Instruction Learning. The Twelfth International Conference on Learning Representations (ICLR), 2024 [PDF]
- Value FULCRA: Mapping Large Language Models to the Multidimensional Spectrum of Basic Human Value. Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL), 2024 [PDF]
- Multi-Evidence Based Fact Verification via A Confidential Graph Neural Network. IEEE Transactions on Big Data, 2024 [PDF]
- On the Essence and Prospect: An Investigation of Alignment Approaches for Big Models. Proceedings of the 33rd International Joint Conference on Artificial Intelligence (IJCAI), 2024 [PDF]
- Toolnet: Connecting Large Language Models with Massive Tools via Tool Graph. arXiv preprint arXiv:2403.00839, 2024 [PDF]
- Negating Negatives: Alignment with Human Negative Samples via Distributional Dispreference Optimization. Findings of the Association for Computational Linguistics: EMNLP 2024, 2024 [PDF]
- A Survey on Evaluation of Large Language Models. ACM Transactions on Intelligent Systems and Technology, 2024 [PDF]
2023
- ToViLaG: Your Visual-Language Generative Model is Also An Evildoer [Oral Paper]. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2023 [PDF]
- Unpacking the Ethical Value Alignment in Big Models. Journal of Computer Research and Development, 2023 [PDF]
- From Instructions to Intrinsic Human Values–A Survey of Alignment Goals for Big Models. arXiv preprint arXiv:2308.12014, 2023 [PDF]
- DuNST: Dual Noisy Self Training for Semi-Supervised Controllable Text Generation. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL), 2023 [PDF]
- KEST: Kernel Distance Based Efficient Self-Training for Improving Controllable Text Generation. Proceedings of the 32nd International Joint Conference on Artificial Intelligence (IJCAI), 2023 [PDF]
- Unified Detoxifying and Debiasing in Language Generation via Inference-time Adaptive Optimization. The Eleventh International Conference on Learning Representations (ICLR), 2023 [PDF]
2022
- Self-Explaining Deep Models with Logic Rule Reasoning. Advances in Neural Information Processing Systems (NeurIPS), 2022 [PDF]
- Evade the Trap of Mediocrity: Promoting Diversity and Novelty in Text Generation via Concentrating Attention. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2022 [PDF]
- Recurrence Boosts Diversity! Revisiting Recurrent Latent Variable in Transformer-Based Variational AutoEncoder for Diverse Text Generation. Findings of the Association for Computational Linguistics: EMNLP 2022, 2022 [PDF]
- Personalized Chit-chat Generation for Recommendation Using External Chat Corpora. Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2022 [PDF]
- Fuse It More Deeply! A Variational Transformer with Layer-Wise Latent Variable Inference for Text Generation. Proceedings of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL), 2022 [PDF]
- Clickbait Detection via Contrastive Variational Modelling of Text and Label. Proceedings of the 31st International Joint Conference on Artificial Intelligence (IJCAI), 2022 [PDF]
2021
- Neural Quality Estimation with Multiple Hypotheses for Grammatical Error Correction. Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL), 2021 [PDF]
2020
- Neural Network-Based Poetry Retrieval. Journal of Chinese Information Processing, 2020 [PDF]
2019
- Sentiment-Controllable Chinese Poetry Generation. Proceedings of the 28th International Joint Conference on Artificial Intelligence (IJCAI), 2019 [PDF]
2018
- Chinese Poetry Generation with a Salient-Clue Mechanism [Oral Paper]. Proceedings of the 22nd Conference on Computational Natural Language Learning (CoNLL), 2018 [PDF]
- Automatic Poetry Generation with Mutual Reinforcement Learning [Oral Paper]. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018 [PDF]
2017
- Generating Chinese Classical Poems with RNN Encoder-Decoder. China National Conference on Chinese Computational Linguistics (CCL), 2017 [PDF]
2016
- Inferring Users’ Emotions for Human-Mobile Voice Dialogue Applications. IEEE International Conference on Multimedia and Expo (ICME), 2016 [PDF]