News
2025-08: I joined UIUC as a PhD student!
2025-07: One paper on Multimodal Model Editing was accepted to COLM 2025.
2025-06: One paper on Low-Rank Adaptation was accepted to ICCV 2025.
2025-01: One paper on Low-Rank Adaptation was accepted to ICLR 2025.
2024-06: I joined Harvard SEAS as a Research Assistant!
2024-05: I graduated from CMU!
2024-05: Two papers were accepted to MICCAI 2024.
|
Selected Publications [ Full List ]
(*Equal Contribution)
|
|
DualEdit: Dual Editing for Knowledge Updating in Vision-Language Models
Zhiyi Shi*, Binjie Wang*, Chongjie Si, Yichen Wu, Junsik Kim, Hanspeter Pfister
Conference on Language Modeling (COLM), 2025
[Paper]
[Arxiv]
[Code]
DualEdit edits both textual and visual modalities in vision-language models at their most sensitive layers. A gating mechanism in the textual modality enables knowledge updates without harming the model's original abilities.
|
|
Unleashing the Power of Task-Specific Directions in Parameter Efficient Fine-tuning
Chongjie Si*, Zhiyi Shi*, Shifan Zhang, Xiaokang Yang, Hanspeter Pfister, Wei Shen
International Conference on Learning Representations (ICLR), 2025
[Paper]
[Arxiv]
[Code]
We propose LoRA-Dash, a framework that leverages task-specific directions (TSDs) to maximize fine-tuning efficiency and improve task performance.
|
|
Generalized Tensor-based Parameter-Efficient Fine-Tuning via Lie Group Transformations
Chongjie Si, Zhiyi Shi, Xuehui Wang, Yichen Xiao, Xiaokang Yang, Wei Shen
International Conference on Computer Vision (ICCV), 2025
[Arxiv]
[Code]
We propose a generalization of matrix-based PEFT to higher-dimensional parameter spaces using Lie group modeling, ensuring structure-preserving updates and demonstrating improved performance across vision and language tasks.
|
|
MoRA: LoRA Guided Multi-modal Disease Diagnosis with Missing Modality
Zhiyi Shi, Junsik Kim, Wanhua Li, Yicong Li, Hanspeter Pfister
Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2024
[Paper]
[Arxiv]
[Code]
MoRA is a computationally efficient method for multi-modal disease diagnosis that adapts to missing modalities using modality-specific low-rank projections.
|
|
Spatial-Temporal Pre-Training for Embryo Viability Prediction Using Time-Lapse Videos
Zhiyi Shi*, Junsik Kim*, Helen Y. Yang, Yonghyun Song, Hyun-Jic Oh, Hanspeter Pfister
[Arxiv]
We propose Spatial-Temporal Pre-Training (STPT), a two-stage self-supervised framework that handles memory constraints and temporal misalignment efficiently, achieving state-of-the-art performance on 23,027 embryo videos with minimal computational resources.
|
Professional Activities
Reviewer, Conference on Neural Information Processing Systems (NeurIPS), 2025.
Reviewer, IEEE Transactions on Artificial Intelligence.
Reviewer, IEEE Transactions on Medical Imaging.
Reviewer, Artificial Intelligence in Medicine.
Reviewer, Medical Physics.