Zirui Song
I am a third-year undergraduate at the University of Technology Sydney (UTS), majoring in Software Engineering. I have been a member of UTS-NLP since February 2024, where I am fortunate to be advised by Prof. Ling Chen and mentored by Prof. Meng Fang.
I am deeply appreciative of my mentor, Prof. Dayan Guan, who guided me into scientific research.
Previously, I had a wonderful experience working with Miao Fang at NEU.
I am currently seeking an MPhil/Ph.D. position for fall 2025.
Email  / 
CV  / 
Google Scholar  / 
Twitter  / 
GitHub  / 
LinkedIn  / 
Wechat
What's New
2024-07-01: One paper was accepted by ECCV 2024.
2024-02-29: Prof. Ling Chen accepted me as an undergraduate research assistant at the Australian Artificial Intelligence Institute (AAII).
2023-07-01: I was honored to be selected as an international exchange student majoring in Software Engineering at UTS.
2023-05-18: Prof. Dayan Guan accepted me as a remote undergraduate research assistant at ROSE Lab.
2022-04-11: Prof. Miao Fang accepted me as an undergraduate research assistant at NEU-NLP Lab.
2022-02-15: I won a third-class scholarship from Northeastern University.
2021-09-07: I was admitted to the Computer Science and Technology major at Northeastern University (NEU).
Research
My primary research interests lie in Large Multimodal Models, Vision CoT, and Prompt Engineering.
BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models
Rizhao Cai*,
Zirui Song*,
Dayan Guan†,
Zhenhao Chen,
Xing Luo,
Chenyu Yi,
Alex Kot
*Equal contribution, †Corresponding Author
ECCV 2024
Project Page
/
GitHub
/
arXiv
We propose BenchLMM to investigate the cross-style capability of Large Multimodal Models (LMMs).
MMAC-Copilot: Multi-modal Agent Collaboration Operating System Copilot
Zirui Song*,
Yaohang Li*,
Meng Fang†,
Zhenhao Chen,
Zecheng Shi,
Yuan Huang,
Ling Chen
*Equal contribution, †Corresponding Author
arXiv, 2024
arXiv
Autonomous virtual agents are often limited by their singular mode of interaction with real-world environments, restricting their versatility.
To address this, we propose the Multi-Modal Agent Collaboration framework (MMAC-Copilot), a framework that utilizes the collective expertise of diverse agents to enhance interaction ability with operating systems.
The framework introduces a team collaboration chain, enabling each participating agent to contribute insights based on their specific domain knowledge, effectively reducing the hallucination associated with knowledge domain gaps.
University of Technology Sydney, B.S. in Software Engineering, 2023 - Present
WAM: 88.5/100 - First Class Honours.
Northeastern University, B.S. in Computer Science and Technology, 2021 - 2023
GPA: 3.47/4.00
After two years of study, I was honored to be selected as an exchange student at the University of Technology Sydney, where I will complete the final two years of my undergraduate studies.
Hebei Hengshui High School, a top-ranked senior high school in China, 2018 - 2021
I am very grateful to have met my science and innovation mentor here.
Website template mainly borrowed from Here