ParsCit is an open-source reference string parsing package developed by Min-Yen Kan et al. It is built on CRF++, a Conditional Random Fields (CRF) toolkit, and is used by the well-known computer science digital library CiteSeerX.
Monday, April 28, 2008
Tuesday, April 08, 2008
More Data vs. Better Algorithms
The recent blog posts from Anand Rajaraman arguing that more data usually beats better algorithms (part 1, part 2 and part 3) remind me of a talk David Hand gave two years ago: Classifier Technology and the Illusion of Progress. There has also been discussion of this issue in Hal Daume III's blog post about Heuristics.
Laplacian Kernel, Resistance Distance and Commute Time
The Laplacian kernel for a graph is interestingly connected to two quantities defined over the graph: the resistance distance (the effective resistance between two nodes when every edge is treated as a resistor) and the commute time (the expected length of a random walk from one node to the other and back).
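To make the connection concrete, here is a minimal sketch in Python with NumPy. It assumes an undirected, connected graph given as a symmetric adjacency matrix, and it assumes that "Laplacian kernel" means the Moore-Penrose pseudoinverse of the graph Laplacian L = D - A. The function name `resistance_and_commute` is hypothetical and used only for illustration.

```python
import numpy as np

def resistance_and_commute(adjacency):
    """Return (resistance distance, commute time) matrices for a graph.

    Assumes a symmetric adjacency matrix of a connected, undirected graph.
    """
    A = np.asarray(adjacency, dtype=float)
    degrees = A.sum(axis=1)

    # Graph Laplacian L = D - A.
    L = np.diag(degrees) - A

    # Assumption: the kernel is the pseudoinverse of the Laplacian.
    K = np.linalg.pinv(L)

    # Resistance distance: R_ij = K_ii + K_jj - 2 * K_ij.
    diag = np.diag(K)
    R = diag[:, None] + diag[None, :] - 2.0 * K

    # Commute time: C_ij = vol(G) * R_ij, where vol(G) is the sum of degrees.
    C = degrees.sum() * R
    return R, C

# Path graph 0 - 1 - 2 with unit edge weights.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
R, C = resistance_and_commute(path)
```

On this path graph the two end nodes are joined by two unit resistors in series, so their resistance distance is 2. The graph volume is 4, which gives a commute time of 8 between the end nodes, matching a direct calculation of the expected round-trip length of the random walk.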