David Carmel, Elad Yom-Tov, Adam Darlow, Dan Pelleg: What Makes A Query Difficult?, SIGIR 2006: 390-397.
Vishwa Vinay, Ingemar J. Cox, Natasa Milic-Frayling, Ken Wood: On Ranking the Effectiveness of Searches, SIGIR 2006: 398-404.
Yun Zhou, Bruce Croft: Query Performance Prediction in Web Search Environments, SIGIR 2007.
Saturday, March 24, 2007
Estimating Query Hardness
Friday, March 23, 2007
Sociocentric vs. Egocentric
While sociocentric network/graph anlysis is oriented toward the global 'society', egocentric network/graph anlysis is centered around an individual node therefore provides a personalized view of its local neighborhood environment.
Saturday, March 17, 2007
Edge Separator vs. Vertex Separator
An edge separator is a subset of the graph's edges whose removal from the graph makes the graph disconnected. Defferent weakness measures of the edge separator lead to different ways of graph partitioning or clustering, such as normalized cuts, conductance, edge betweenness, modularity, and relative neighborhoods. However, regardless of the weakness measure used, edge separators sometimes fail to capture the cohesion of graphs, especially in the presence of overlapping clusters. While the existence of a weak edge separator in a graph is sufficient to make the graph noncohesive, it is not a necessary condition.
A vertex separator is a subset of the graph's nodes whose removal leaves the graph disconnected. Note that sparse vertex separators subsume weak edge separators: if the graph has a weak edge separator, then it must also have a sparse vertex separator, but the converse is not true.
--- Ziv Bar-Yossef, Ido Guy, Ronny Lempel, Yoelle S. Maarek and Vladimir Soroka, Cluster Ranking with an Application to Mining Mailbox Networks, Proceedings of the 6th IEEE International Conference on Data Mining (ICDM), pp. 63--74, Hong Kong, China, 2006.
Thursday, March 15, 2007
Motzkin-Straus
The Motzkin-Straus theorem (formulation) establishes a remarkable connection between the maximal/maximum cliques of an unweighted undirected graph G and the local/global solutions of a quadratic programming problem. It has motivated several clique-finding techniques. There are generalizations of this result to edge-weighted or vertex-weighted graphs.
Thursday, March 08, 2007
Wednesday, March 07, 2007
10 Challenging Problems in Data Mining Research
On ICDM'05, Qiang Yang and Xindong Wu presented 10 Challenging Problems in Data Mining Research.
1. Developing a Unifying Theory of Data Mining
2. Scaling Up for High Dimensional Data and High Speed Data Streams
3. Mining Sequence Data and Time Series Data
4. Mining Complex Knowledge from Complex Data
5. Data Mining in a Network Setting
6. Distributed Data Mining and Mining Multi-agent Data
7. Data Mining for Biological and Environmental Problems
8. Data-Mining-Process Related Problems
9. Security, Privacy and Data Integrity
10. Dealing with Non-static, Unbalanced and Cost-sensitive Data
I am particularly interested in Problem 2, 4 and 5 which share one common theme: Learning/Mining with Very Large Graphs.
Sunday, March 04, 2007
Free eBook on Matrix Computation
The Matrix Cookbook is a mathematical desktop reference on matrices that is freely available online.