Generalization Bounds for Learning Kernels


This paper presents several novel generalization bounds for the problem of learning kernels based on a combinatorial analysis of the Rademacher complexity of the corresponding hypothesis sets. Our bound for learning kernels with a convex combination of p base kernels using L1 regularization admits only a √log p dependency on the number of kernels, which is tight and considerably more favorable than the previous best bound given for the same problem. We also give a novel bound for learning with a non-negative combination of p base kernels with an L2 regularization whose dependency on p is also tight and only in p^(1/4). We present similar results for Lq regularization with other values of q, and outline the relevance of our proof techniques to the analysis of the complexity of the class of linear functions. Experiments with a large number of kernels further validate the behavior of the generalization error as a function of p predicted by our bounds.