Contextual Multi-armed Bandits for the Prevention of Spam in VoIP Networks
English
Jung, Tobias[Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Réseaux informatiques >]
Martin, Sylvain[Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Réseaux informatiques >]
Ernst, Damien[Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Smart grids >]
Leduc, Guy[Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Réseaux informatiques >]
2012
[en] Learning ; Bandits ; Networking
[en] In this paper we argue that contextual multi-armed bandit algorithms could open avenues for designing self-learning security modules for computer networks and related tasks. The paper has two contributions: a conceptual one and an algorithmical one. The conceptual contribution is to formulate -- as an example -- the real-world problem of preventing SPIT (Spam in VoIP networks), which is currently not satisfyingly addressed by standard techniques, as a sequential learning problem, namely as a contextual multi-armed bandit. Our second contribution is to present CMABFAS, a new algorithm for general contextual multi-armed bandit learning that specifically targets domains with finite actions. We illustrate how CMABFAS could be used to design a fully self-learning SPIT filter that does not rely on feedback from the end-user (i.e., does not require labeled data) and report first simulation results.