Md Nurul Absar Siddiky

Safety-Oriented Routing Analysis of Mixtral MoE Under Benign and Harmful Prompts

May 22, 2026

Md Nurul Absar Siddiky

Abstract: Sparse mixture-of-experts (MoE) language models activate only a small subset of parameters for each token, making router behavior a central part of model computation. This paper studies the routing behavior of Mixtral 8x7B-Instruct under benign and harmful prompts using two complementary signals: activation-based routing scores derived from expert selection frequencies and gradient-based scores derived from router-gate sensitivities. We analyze expert- and layer-level routing behavior and conduct expert-suppression interventions. The results show that activation-based expert usage is broad and long-tailed, whereas gradient-based importance is concentrated. At the expert level, the benign and harmful prompt groups remain close under both signals, with only modest separation. At the layer level, activation-based routing is most selective around layers 8-15, while gradient-based importance is concentrated in the final layers. Expert classification shows that most experts are shared across benign and harmful prompts, though a limited subset shows a clear group preference. Top-ranked expert sets show stronger benign-malicious overlap under gradient scores than under activation scores, suggesting concentration on a common late-layer expert set. In intervention experiments, suppressing the top five benign-dominant experts from activation scores reduces restricted responses from 24 to 14 over 100 prompts, while suppressing gradient-derived experts reduces them from 34 to 22 with fewer unintended reversals. Overall, safety-relevant routing in Mixtral is subtle, depth-dependent, and distributed rather than dominated by a fixed set of experts.
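To make "expert selection frequencies" concrete, here is a minimal sketch of one way such an activation-based score could be computed for a single layer. It is an illustration only: the abstract does not give the paper's exact score definition, so the function name `expert_selection_frequency`, the input layout (one list of router logits per token), and the normalization by token count are all assumptions. The default `top_k=2` reflects Mixtral's routing of each token to two of eight experts.

```python
from collections import Counter


def expert_selection_frequency(router_logits, top_k=2):
    """Fraction of tokens for which each expert is among the top_k routed experts.

    router_logits: list of per-token lists, one logit per expert (hypothetical layout).
    """
    n_tokens = len(router_logits)
    n_experts = len(router_logits[0])
    counts = Counter()
    for token_logits in router_logits:
        # Rank experts by logit, highest first, and keep the top_k for this token.
        ranked = sorted(range(n_experts), key=lambda e: token_logits[e], reverse=True)
        counts.update(ranked[:top_k])
    return [counts[e] / n_tokens for e in range(n_experts)]


def group_gap(benign_logits, harmful_logits, top_k=2):
    """Per-expert difference in selection frequency (benign minus harmful)."""
    benign = expert_selection_frequency(benign_logits, top_k)
    harmful = expert_selection_frequency(harmful_logits, top_k)
    return [b - h for b, h in zip(benign, harmful)]
```

Under this sketch, an expert with a large positive gap would be a candidate "benign-dominant" expert, and a gap near zero would correspond to an expert shared by both prompt groups.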

Status Updating with Time Stamp Errors

Apr 07, 2025

Md Nurul Absar Siddiky, Ahmed Arafa

Abstract: A status updating system is considered in which multiple processes are sampled and transmitted through a shared channel. Each process has its dedicated server that processes its samples before time stamping them for transmission. Time stamps, however, are prone to errors, and hence the status updates received may not be credible. Our setting models the time stamp error rate as a function of the servers' busy times. Hence, to reduce errors and enhance credibility, servers need to process samples on a relatively prolonged schedule. This, however, deteriorates timeliness, which is captured through the age of information (AoI) metric. An optimization problem is formulated whose goal is to characterize the optimal processes' schedule and sampling instances to achieve the optimal trade-off between timeliness and credibility. The problem is first solved for a single process setting, where it is shown that a threshold-based sleep-wake schedule is optimal, in which the server wakes up and is allowed to process newly incoming samples only if the AoI surpasses a certain threshold that depends on the required timeliness-credibility trade-off. Such insights are then extended to the multi-process setting, where two main scheduling and sleep-wake policies, namely round-robin scheduling with threshold-waiting and asymmetric scheduling with zero-waiting, are introduced and analyzed.

* To appear in the Age and Semantics of Information (ASoI) Workshop in IEEE Infocom 2025
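The timeliness-credibility trade-off behind the threshold-based sleep-wake schedule in the abstract above can be illustrated with a toy slotted simulation. This is a sketch under strong assumptions that are not taken from the paper: time is discrete, a wake-up delivers a fresh sample with zero processing delay, and the function name `simulate_threshold_policy` and its parameters are hypothetical. It does not model the time stamp error rate itself; it only reports the busy fraction that, per the abstract, the error rate depends on.

```python
def simulate_threshold_policy(threshold, horizon):
    """Toy single-process simulation of a threshold-based sleep-wake schedule.

    Returns (time-average age, fraction of slots in which the server is busy).
    """
    age = 0
    total_age = 0
    busy_slots = 0
    for _ in range(horizon):
        total_age += age
        if age >= threshold:
            # Server wakes up and delivers a fresh sample (assumed instantaneous),
            # so the age resets to zero.
            busy_slots += 1
            age = 0
        else:
            # Server sleeps; the latest received update grows one slot older.
            age += 1
    return total_age / horizon, busy_slots / horizon


# Larger thresholds keep the server idle more often but let the age grow.
for tau in (0, 1, 3, 7):
    avg_age, busy = simulate_threshold_policy(tau, horizon=8000)
    print(f"threshold={tau}: average age={avg_age:.2f}, busy fraction={busy:.3f}")
```

In this toy model, raising the threshold lowers the busy fraction while raising the average age, which is the tension the optimization problem in the abstract is formulated to balance.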
