Listen to the Experts: Covert Communication Channels in Mixture-of-Experts Models via Expert Selection Patterns
Luc Chartier, George Tourtellot, Amir Nuriyev, Krystal Maughan, Natalia Kokoromyti
Mentored by Gabriel Kulp
Working report from the SPAR program. May not reflect the authors' current views.
Abstract
Mixture-of-Experts (MoE) models have demonstrated remarkable performance and scalability, largely due to their sparse activation of parameters. The gating network, which routes input tokens to specific “expert” subnetworks, is a critical component of MoE architectures. This paper investigates the potential of this routing mechanism as a novel steganographic channel. We explore methods to fine-tune an MoE model, specifically Mixtral-8x7B, to embed hidden information within its expert selection patterns. We conduct two primary experiments: (1) encoding the ID of the token being generated into the expert selections, and (2) attempting to encode the ID of a semantically coherent next token while the model outputs a neutral filler token. Our findings indicate that, in the first scenario, it is feasible to embed information via expert selection patterns with measurable accuracy. However, the second task, simultaneously generating filler tokens and encoding future semantic information, proved challenging, highlighting the delicate balance between linguistic consistency and steganographic goals. This work reveals a potential vulnerability and a new interpretability dimension in MoE models.
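To make the idea of an expert-selection channel concrete, the following is a minimal sketch of one way a token ID could be written into, and read back from, a sequence of top-2 expert choices. It is an illustrative assumption, not the encoding scheme used in the experiments: the base-28 mapping and the helper names `encode_token_id` and `decode_token_id` are hypothetical. It relies only on Mixtral-8x7B routing each token to 2 of 8 experts per MoE layer, which gives 28 distinguishable unordered pairs per layer.

```python
from itertools import combinations

NUM_EXPERTS = 8  # experts per MoE layer in Mixtral-8x7B
TOP_K = 2        # experts selected per token in each layer

# All 28 unordered expert pairs, in a fixed order that sender and
# receiver are assumed to share (hypothetical convention).
PAIRS = list(combinations(range(NUM_EXPERTS), TOP_K))
BASE = len(PAIRS)
PAIR_INDEX = {pair: i for i, pair in enumerate(PAIRS)}


def encode_token_id(token_id, num_layers):
    """Map a token ID to one expert pair per layer (base-28 digits,
    least significant digit first)."""
    if not 0 <= token_id < BASE ** num_layers:
        raise ValueError("token_id does not fit in the given number of layers")
    selections = []
    for _ in range(num_layers):
        token_id, digit = divmod(token_id, BASE)
        selections.append(PAIRS[digit])
    return selections


def decode_token_id(selections):
    """Recover the token ID from observed per-layer expert pairs."""
    value = 0
    for pair in reversed(selections):
        value = value * BASE + PAIR_INDEX[tuple(sorted(pair))]
    return value


if __name__ == "__main__":
    target = 31999  # largest ID in a 32,000-token vocabulary
    pattern = encode_token_id(target, num_layers=4)
    print(pattern, decode_token_id(pattern))
```

Under this assumed scheme, four layers suffice for a 32,000-token vocabulary, since 28^3 = 21,952 is too small and 28^4 = 614,656 is large enough. A fine-tuned model would have to be trained to produce such target selections through its gating network, which is a much harder problem than the bookkeeping shown here.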