Abstract:We present a federated, multi-modal phishing website detector that supports URL, HTML, and IMAGE inputs without binding clients to a fixed modality at inference: any client can invoke any modality head trained elsewhere. Methodologically, we propose role-aware bucket aggregation on top of FedProx, inspired by Mixture-of-Experts and FedMM. We drop learnable routing and use hard gating (selecting the IMAGE/HTML/URL expert by sample modality), enabling separate aggregation of modality-specific parameters to isolate cross-embedding conflicts and stabilize convergence. On TR-OP, the Fusion head reaches Acc 97.5% with FPR 2.4% across two data types; on the image subset (ablation) it attains Acc 95.5% with FPR 5.9%. For text, we use GraphCodeBERT for URLs and an early three-way embedding for raw, noisy HTML. On WebPhish (HTML) we obtain Acc 96.5% / FPR 1.8%; on TR-OP (raw HTML) we obtain Acc 95.1% / FPR 4.6%. Results indicate that bucket aggregation with hard-gated experts enables stable federated training under strict privacy, while improving the usability and flexibility of multi-modal phishing detection.
Abstract:For several decades now, Bayesian inference techniques have been applied to theories of particle physics, cosmology and astrophysics to obtain the probability density functions of their free parameters. In this study, we review and compare a wide range of Markov Chain Monte Carlo (MCMC) and nested sampling techniques to determine their relative efficacy on functions that resemble those encountered most frequently in the particle astrophysics literature. Our first series of tests explores a series of high-dimensional analytic test functions that exemplify particular challenges, for example highly multimodal posteriors or posteriors with curving degeneracies. We then investigate two real physics examples, the first being a global fit of the $\Lambda$CDM model using cosmic microwave background data from the Planck experiment, and the second being a global fit of the Minimal Supersymmetric Standard Model using a wide variety of collider and astrophysics data. We show that several examples widely thought to be most easily solved using nested sampling approaches can in fact be more efficiently solved using modern MCMC algorithms, but the details of the implementation matter. Furthermore, we also provide a series of useful insights for practitioners of particle astrophysics and cosmology.