Friday, December 21, 2007


A 32-bit Windows binary of svmsgd, which I compiled with MinGW and linked statically against zlib, is available here.

Saturday, December 15, 2007

The Fastest SVM

The quest for the fastest SVM learning algorithm continues.

Leon Bottou has reported a surprising finding: a classic optimization method, Stochastic Gradient Descent, is amazingly fast for training linear SVMs or CRFs. His program svmsgd runs much faster than SVMperf and LIBLINEAR on very large datasets such as RCV1-v2.
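
To make the idea concrete, here is a minimal sketch of stochastic gradient descent for a linear SVM with hinge loss and L2 regularization. It is not Bottou's svmsgd code: the function names, the dense list representation, the default parameters, and the step-size schedule eta0 / (1 + lam * eta0 * t) are all my own assumptions for illustration.

```python
import random


def train_linear_svm_sgd(examples, lam=0.01, eta0=0.1, epochs=20, seed=0):
    """Train a linear SVM by SGD on the L2-regularized hinge loss.

    examples: list of (features, label) pairs, label in {-1, +1}.
    lam, eta0, epochs are illustrative defaults, not tuned values.
    """
    rng = random.Random(seed)
    dim = len(examples[0][0])
    w = [0.0] * dim
    b = 0.0
    t = 0
    data = list(examples)
    for _ in range(epochs):
        rng.shuffle(data)  # visit the examples in a fresh random order
        for x, y in data:
            # Decreasing step size (an assumed schedule).
            eta = eta0 / (1.0 + lam * eta0 * t)
            t += 1
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            # Gradient step on the regularizer: shrink the weights.
            w = [wi * (1.0 - eta * lam) for wi in w]
            # Gradient step on the hinge loss, active only when margin < 1.
            if margin < 1.0:
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 according to the sign of the linear score."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    toy = [([2.0, 2.0], 1), ([3.0, 2.0], 1),
           ([-2.0, -2.0], -1), ([-3.0, -2.0], -1)]
    w, b = train_linear_svm_sgd(toy)
    print(w, b, [predict(w, b, x) for x, _ in toy])
```

Each update touches a single example, so the cost per step does not depend on the size of the training set. A practical implementation for data like RCV1-v2 would use sparse vectors rather than dense lists.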

Edward Chang's team has released their code of PSVM, a parallel implementation of SVM that can achieve almost linear reduction in both memory use and training time.

Thursday, December 13, 2007

Metaweb and KnowItAll

Metaweb Technologies, a start-up company covered in the news articles below, seems to have an ambition similar to that of the KnowItAll project: both try to turn the Web into a huge database of structured knowledge.

New York Times: Start-Up Aims for Database to Automate Web Searching
The Economist: Sharing what matters

Search Internationalization and Personalization

Some Web search engines employ different relevance (ranking) algorithms for different countries. Leaving language issues aside, I wonder whether the underlying technology of internationalized search is essentially equivalent to that of personalized search (such as Personalized PageRank), since we can consider a whole nation as a virtual user.
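
Here is a rough sketch of what I mean, using Personalized PageRank computed by power iteration. The only thing that differs between "users" is the teleport (preference) vector, so a country can be plugged in exactly like a person. The toy graph, the page names, and the two country preference vectors are hypothetical, and I am not claiming any search engine actually works this way.

```python
def personalized_pagerank(links, teleport, damping=0.85, iterations=100):
    """Personalized PageRank by power iteration.

    links: dict mapping each page to the list of pages it links to
           (every link target must also appear as a key).
    teleport: dict of non-negative preference weights per page;
              pages missing from it get weight 0.
    """
    nodes = sorted(links)
    total = sum(teleport.get(n, 0.0) for n in nodes)
    pref = {n: teleport.get(n, 0.0) / total for n in nodes}
    rank = dict(pref)  # start from the preference distribution
    for _ in range(iterations):
        # With probability 1 - damping, jump to a preferred page.
        new_rank = {n: (1.0 - damping) * pref[n] for n in nodes}
        for n in nodes:
            out = links[n]
            if out:
                # Otherwise follow one of the outgoing links uniformly.
                share = damping * rank[n] / len(out)
                for m in out:
                    new_rank[m] += share
            else:
                # Dangling page: send its mass back to the preferred pages.
                for m in nodes:
                    new_rank[m] += damping * rank[n] * pref[m]
        rank = new_rank
    return rank


# Hypothetical toy Web: two clusters of pages plus a hub linking to both.
toy_links = {
    "a": ["b"], "b": ["a"],
    "c": ["d"], "d": ["c"],
    "hub": ["a", "c"],
}

# Two "virtual users": each country prefers a different starting page.
country_x = {"a": 1.0}
country_y = {"c": 1.0}

if __name__ == "__main__":
    print(personalized_pagerank(toy_links, country_x))
    print(personalized_pagerank(toy_links, country_y))
```

The same link graph yields a different ranking for each preference vector, which is the sense in which a country-specific ranking could be seen as personalization for one very large user.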

Monday, December 10, 2007

Transfer Learning

Hal Daume III recently wrote an insightful blog post on the relationship between Transfer Learning and Domain Adaptation. I think another problem closely related to Transfer Learning is Multi-Task Learning.

The Apex Lab at SJTU recently published a number of papers on this topic and released a dataset.