sukrucildirr

The paper studies how decentralized GRPO (Group Relative Policy Optimization) used for post-training LLMs can be attacked and how to defend it. In decentralized GRPO, multiple nodes generate completions for prompts, a shared rule-based reward scores them, and each node updates its local model from the pooled completions. Because only strings are exchanged, this setup is attractive but also vulnerable.

Threat model and attacks: The authors introduce the first targeted poisoning/backdoor attack...
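The pooling-and-scoring step described in the summary can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the node names, the toy `rule_based_reward` (an exact-suffix check), and the helper `pool_and_score` are all hypothetical, and the advantage formula is the standard GRPO group normalization (reward minus group mean, divided by group standard deviation).

```python
from statistics import mean, pstdev


def rule_based_reward(completion: str, expected_answer: str) -> float:
    """Hypothetical rule: 1.0 if the completion ends with the expected answer."""
    return 1.0 if completion.strip().endswith(expected_answer) else 0.0


def group_relative_advantages(rewards, eps=1e-8):
    """Standard GRPO-style normalization of rewards within one group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]


def pool_and_score(completions_by_node, expected_answer):
    """Pool the strings shared by all nodes, score them, and compute advantages."""
    pooled = [
        completion
        for node in sorted(completions_by_node)  # fixed order for reproducibility
        for completion in completions_by_node[node]
    ]
    rewards = [rule_based_reward(c, expected_answer) for c in pooled]
    return pooled, rewards, group_relative_advantages(rewards)


# Illustrative data only: two nodes, two completions each, for one prompt.
completions_by_node = {
    "node_a": ["2 + 2 = 4", "2 + 2 = 5"],
    "node_b": ["The answer is 4", "I think it is 3"],
}
pooled, rewards, advantages = pool_and_score(completions_by_node, "4")

for text, reward, advantage in zip(pooled, rewards, advantages):
    print(f"{advantage:+.2f}  reward={reward}  {text!r}")
```

In this sketch the reward sees only the submitted strings, so any completion that satisfies the rule receives a positive advantage regardless of which node sent it or what else the string contains. That is one way to read the summary's remark that exchanging only strings leaves the setup vulnerable.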
