Gene selection in high-dimensional genomic data is essential for understanding disease mechanisms and improving therapeutic outcomes. Traditional feature selection methods effectively identify predictive genes but often ignore complex biological pathways and regulatory networks, leading to unstable and biologically irrelevant signatures. Prior approaches, such as Lasso-based methods and statistical filtering, either focus solely on individual gene-outcome associations or fail to capture pathway-level interactions, presenting a key challenge: how to integrate biological pathway knowledge while maintaining statistical rigor in gene selection? To address this gap, we propose a novel two-stage framework that integrates statistical selection with biological pathway knowledge using multi-agent reinforcement learning (MARL). First, we introduce a pathway-guided pre-filtering strategy that leverages multiple statistical methods alongside KEGG pathway information for initial dimensionality reduction. Next, for refined selection, we model genes as collaborative agents in a MARL framework, where each agent optimizes both predictive power and biological relevance. Our framework incorporates pathway knowledge through Graph Neural Network-based state representations, a reward mechanism combining prediction performance with gene centrality and pathway coverage, and collaborative learning strategies using shared memory and a centralized critic component. Extensive experiments on multiple gene expression datasets demonstrate that our approach significantly improves both prediction accuracy and biological interpretability compared to traditional methods.