Feature selection can efficiently identify the most informative features with respect to the target feature used in training. However, state-of-the-art vector-based methods cannot incorporate the relationships between feature samples into the feature selection process, which leads to significant information loss. To address this problem, we propose a new graph-based structurally interacting elastic net method for feature selection. Specifically, we commence by constructing feature graphs that incorporate the pairwise relationships between samples. With the feature graphs in hand, we propose a new information theoretic criterion to measure the joint relevance of different pairwise feature combinations with respect to the target feature graph representation. This measure is used to obtain a structural interaction matrix whose elements are the values of the proposed information theoretic measure between feature pairs. We then formulate a new optimization model for the feature subset selection problem by combining the structural interaction matrix with an elastic net regression model. This allows us to (a) preserve the information of the original vectorial space, (b) remedy the information loss of the original feature space caused by using the graph representation, and (c) promote a sparse solution while also encouraging correlated features to be selected. Because the proposed optimization problem is non-convex, we develop an efficient alternating direction method of multipliers (ADMM) to locate the optimal solutions. Extensive experiments on various datasets demonstrate the effectiveness of the proposed method.
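
As a rough illustration of how a pairwise interaction matrix and elastic net penalties can sit in a single objective, the sketch below evaluates one such objective for a candidate weight vector. The abstract does not state the exact formulation, so the function name, the sign and weighting of the interaction term, and the parameters `alpha`, `lam1`, and `lam2` are all assumptions made for illustration, not the formulation used in this work.

```python
import numpy as np

def interacting_elastic_net_objective(X, y, U, beta, alpha=1.0, lam1=0.1, lam2=0.1):
    """Evaluate a hypothetical interaction-augmented elastic net objective.

    X    : (n_samples, n_features) data matrix
    y    : (n_samples,) target vector
    U    : (n_features, n_features) pairwise interaction matrix (assumed given)
    beta : (n_features,) candidate feature weights
    """
    residual = y - X @ beta
    fit = 0.5 * residual @ residual          # least-squares data-fit term
    l1 = lam1 * np.sum(np.abs(beta))         # L1 penalty: promotes sparsity
    l2 = lam2 * beta @ beta                  # L2 penalty: favors grouping of correlated features
    interaction = alpha * beta @ U @ beta    # reward for jointly relevant feature pairs (assumed form)
    return fit + l1 + l2 - interaction

# Usage: two features, with a single nonzero entry in the interaction matrix.
X = np.eye(2)
y = np.array([1.0, 0.0])
U = np.array([[1.0, 0.0], [0.0, 0.0]])
beta = np.array([1.0, 0.0])
value = interacting_elastic_net_objective(X, y, U, beta)
```

Under these assumptions, subtracting the quadratic interaction term is what can make such an objective non-convex, which is the kind of setting in which a splitting scheme such as ADMM is commonly applied.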