A 32-bit Windows binary of svmsgd, which I compiled with MinGW and linked statically against zlib, is available here.
Friday, December 21, 2007
Saturday, December 15, 2007
The Fastest SVM
The quest for the fastest SVM learning algorithm continues.
Leon Bottou reported a surprising finding: a classic optimization method, Stochastic Gradient Descent, is amazingly fast for training linear SVMs or CRFs. His program svmsgd runs much faster than SVMperf and LIBLINEAR on very large datasets such as RCV1-v2.
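To make the idea concrete, here is a minimal sketch of stochastic gradient descent for a linear SVM (hinge loss plus an L2 penalty). It is not the svmsgd code: the function names, the step-size schedule eta = 1 / (lam * (t + t0)), the constants lam and t0, and the toy data are all my own assumptions for illustration.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def train_linear_svm_sgd(samples, labels, lam=0.01, t0=100, epochs=20, seed=0):
    """Approximately minimize lam/2 * ||w||^2 + average hinge loss with SGD.

    samples: list of feature lists; labels: list of +1 / -1.
    lam, t0 and the step-size schedule are illustrative assumptions.
    """
    rng = random.Random(seed)
    w = [0.0] * len(samples[0])
    b = 0.0  # bias term, left unregularized in this sketch
    t = 0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the examples in random order
        for i in order:
            t += 1
            eta = 1.0 / (lam * (t + t0))  # decreasing step size
            x, y = samples[i], labels[i]
            margin = y * (dot(w, x) + b)
            # Gradient step on the L2 penalty: shrink the weights.
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Gradient step on the hinge loss, active only inside the margin.
            if margin < 1.0:
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
                b += eta * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 for a single example."""
    return 1 if dot(w, x) + b >= 0 else -1


# Toy, linearly separable data (made up for this example).
X = [[2, 2], [3, 3], [2, 3], [3, 2], [-2, -2], [-3, -3], [-2, -3], [-3, -2]]
Y = [1, 1, 1, 1, -1, -1, -1, -1]
w, b = train_linear_svm_sgd(X, Y)
predictions = [predict(w, b, x) for x in X]
```

Each update touches a single example, so the cost per step does not grow with the size of the training set. That is a plausible reason the approach scales well to very large collections.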
Edward Chang's team has released their code of PSVM, a parallel implementation of SVM that can achieve almost linear reduction in both memory use and training time.
Thursday, December 13, 2007
Metaweb and KnowItAll
Metaweb Technologies, a start-up company covered in the news articles below, seems to have an ambition similar to that of the KnowItAll project. Both try to turn the Web into a huge database of structured knowledge.
New York Times: Start-Up Aims for Database to Automate Web Searching
The Economist: Sharing what matters
Search Internationalization and Personalization
Some Web search engines employ different relevance (ranking) algorithms for different countries. Leaving language issues aside, I wonder whether the underlying technology of internationalized search is essentially equivalent to that of personalized search (such as Personalized PageRank), since we can treat a whole nation as a virtual user.
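As a rough illustration of the "nation as a virtual user" idea, the sketch below runs Personalized PageRank with the teleport distribution concentrated on pages from one country. Everything here is hypothetical: the tiny link graph, the example domain names, the suffix-based country_teleport helper, and the damping factor of 0.85. It is not how any real search engine is known to work.

```python
def personalized_pagerank(links, teleport, damping=0.85, iterations=100):
    """Power iteration for PageRank with a custom teleport distribution.

    links: dict mapping each page to the list of pages it links to
           (every target must also be a key of links).
    teleport: dict mapping pages to probabilities that sum to 1.
    """
    nodes = list(links)
    rank = {n: teleport.get(n, 0.0) for n in nodes}
    for _ in range(iterations):
        # Random jump: restart according to the teleport distribution.
        new_rank = {n: (1.0 - damping) * teleport.get(n, 0.0) for n in nodes}
        for n in nodes:
            out = links[n]
            if out:
                share = damping * rank[n] / len(out)
                for m in out:
                    new_rank[m] += share
            else:
                # Dangling page: hand its mass back to the teleport set.
                for m in nodes:
                    new_rank[m] += damping * rank[n] * teleport.get(m, 0.0)
        rank = new_rank
    return rank


def country_teleport(nodes, suffix):
    """Uniform teleport distribution over pages whose name ends with suffix."""
    local = [n for n in nodes if n.endswith(suffix)]
    return {n: 1.0 / len(local) for n in local}


# A made-up five-page web with two "countries" and one shared page.
links = {
    "news.example.fr": ["wiki.example.org"],
    "shop.example.fr": ["news.example.fr"],
    "news.example.jp": ["wiki.example.org"],
    "shop.example.jp": ["news.example.jp"],
    "wiki.example.org": ["news.example.fr", "news.example.jp"],
}
rank_fr = personalized_pagerank(links, country_teleport(links, ".fr"))
rank_jp = personalized_pagerank(links, country_teleport(links, ".jp"))
```

The two rankings come from the same link graph and the same algorithm. Only the teleport vector changes, which is exactly what changes between two individual users in personalized search.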
Monday, December 10, 2007
Transfer Learning
Hal Daume III recently wrote an insightful blog post on the relationship between Transfer Learning and Domain Adaptation. I think Multi-Task Learning is another problem closely related to Transfer Learning.
The Apex Lab at SJTU recently published a number of papers on this topic and released a dataset.