Electroencephalography (EEG) signals contain rich temporal-spectral structure but are difficult to model due to noise, subject variability, and multi-scale dynamics. Lightweight deep learning models have shown promise, yet many either rely solely on local convolutions or require heavy recurrent modules. This paper presents PaperNet, a compact hybrid architecture that combines temporal convolutions, a channel-wise residual attention module, and a lightweight bidirectional recurrent block which is used for short-window classification. Using the publicly available BEED: Bangalore EEG Epilepsy Dataset, we evaluate PaperNet under a clearly defined subject-independent training protocol and compare it against established and widely used lightweight baselines. The model achieves a macro-F1 of 0.96 on the held-out test set with approximately 0.6M parameters, while maintaining balanced performance across all four classes. An ablation study demonstrates the contribution of temporal convolutions, residual attention, and recurrent aggregation. Channel-wise attention weights further offer insights into electrode relevance. Computational profiling shows that PaperNet remains efficient enough for practical deployment on resource-constrained systems through out the whole process. These results indicate that carefully combining temporal filtering, channel reweighting, and recurrent context modeling can yield strong EEG classification performance without excessive computational cost.