Sunday, August 28, 2011

Fastest membership test in Python

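What is the most efficient way to check whether an item belongs to a given collection? In Python, a set (or frozenset) appears to be slightly faster than a dict and far faster than a list. That is what one would expect: sets and dicts are hash tables with average constant-time lookup, whereas a list has to be scanned element by element.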
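A rough way to check this on your own machine is a small timeit script. This is only a sketch: the helper name time_membership, the collection size and the repeat count are arbitrary choices made for illustration, and the absolute numbers will vary with the Python version and hardware.

```python
import timeit


def time_membership(container, probe, repeats=200):
    """Return the total seconds taken by `repeats` tests of `probe in container`."""
    return timeit.timeit(lambda: probe in container, number=repeats)


n = 10_000
items = list(range(n))

# The same items stored in four different container types.
containers = {
    "list": items,
    "dict": dict.fromkeys(items),
    "set": set(items),
    "frozenset": frozenset(items),
}

# Probe for the last element, which is the worst case for the list scan.
probe = n - 1

timings = {name: time_membership(c, probe) for name, c in containers.items()}

# Print the containers from fastest to slowest.
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name:10s} {seconds:.6f} s")
```

Probing for the last element makes the gap between the list and the hash-based containers as visible as possible; probing for an element near the front of the list would narrow it.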

Friday, August 26, 2011

Submodular functions

Intuitively, a submodular function defined on the subsets of a ground set exhibits "diminishing returns": adding an element to a small set helps at least as much as adding it to a larger set that contains the small one. This is related to the concept of marginal utility in economics. Its usefulness for machine learning is well explained and illustrated by the Beyond Convexity tutorial. Andreas Krause has developed a Matlab toolbox for submodular function optimization.
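Formally, f is submodular if f(A ∪ {x}) - f(A) >= f(B ∪ {x}) - f(B) whenever A is a subset of B and x is not in B. A small sketch with a set-coverage function, a standard example of a submodular function, makes this concrete. The sensor names and the areas they cover are made up for illustration, and the code is plain Python, unrelated to the Matlab toolbox.

```python
def coverage(chosen, sensors):
    """Number of distinct areas covered by the chosen sensors."""
    covered = set()
    for name in chosen:
        covered |= sensors[name]
    return len(covered)


def marginal_gain(element, chosen, sensors):
    """Extra coverage obtained by adding `element` to the set `chosen`."""
    return coverage(chosen | {element}, sensors) - coverage(chosen, sensors)


# Hypothetical sensors and the areas each one covers.
sensors = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6},
    "d": {1, 6},
}

small = {"a"}            # A
large = {"a", "b", "c"}  # B, a superset of A

gain_small = marginal_gain("d", small, sensors)
gain_large = marginal_gain("d", large, sensors)

print(gain_small, gain_large)  # → 1 0
```

Adding sensor "d" to the small set covers one new area, while adding it to the large set covers none, because the other sensors already reach everything "d" can see.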

L1 regularisation is efficient for selecting relevant features

Andrew Ng proved in his ICML 2004 paper that, with L2 regularisation (as in logistic regression, support vector machines, and back-propagation neural networks), the sample complexity grows at least linearly in the number of irrelevant features in the worst case, whereas with L1 regularisation (in logistic regression) it grows only logarithmically.