David Carmel, Elad Yom-Tov, Adam Darlow, Dan Pelleg: What Makes A Query Difficult?, SIGIR 2006: 390-397.

Vishwa Vinay, Ingemar J. Cox, Natasa Milic-Frayling, Ken Wood: On Ranking the Effectiveness of Searches, SIGIR 2006: 398-404.

Yun Zhou, Bruce Croft: Query Performance Prediction in Web Search Environments, SIGIR 2007.

## Saturday, March 24, 2007

### Estimating Query Hardness

## Friday, March 23, 2007

### Sociocentric vs. Egocentric

While sociocentric network/graph anlysis is oriented toward the global 'society', egocentric network/graph anlysis is centered around an individual node therefore provides a personalized view of its local neighborhood environment.

## Saturday, March 17, 2007

### Edge Separator vs. Vertex Separator

An edge separator is a subset of the graph's edges whose removal from the graph makes the graph disconnected. Defferent weakness measures of the edge separator lead to different ways of graph partitioning or clustering, such as normalized cuts, conductance, edge betweenness, modularity, and relative neighborhoods. However, regardless of the weakness measure used, edge separators sometimes fail to capture the *cohesion* of graphs, especially in the presence of overlapping clusters. While the existence of a weak edge separator in a graph is sufficient to make the graph noncohesive, it is not a necessary condition.

A vertex separator is a subset of the graph's nodes whose removal leaves the graph disconnected. Note that *sparse* vertex separators subsume *weak* edge separators: if the graph has a weak edge separator, then it must also have a sparse vertex separator, but the converse is not true.

--- Ziv Bar-Yossef, Ido Guy, Ronny Lempel, Yoelle S. Maarek and Vladimir Soroka, Cluster Ranking with an Application to Mining Mailbox Networks, Proceedings of the 6th IEEE International Conference on Data Mining (ICDM), pp. 63--74, Hong Kong, China, 2006.

## Thursday, March 15, 2007

### Motzkin-Straus

The Motzkin-Straus theorem (formulation) establishes a remarkable connection between the maximal/maximum cliques of an unweighted undirected graph G and the local/global solutions of a quadratic programming problem. It has motivated several clique-finding techniques. There are generalizations of this result to edge-weighted or vertex-weighted graphs.

## Thursday, March 08, 2007

## Wednesday, March 07, 2007

### 10 Challenging Problems in Data Mining Research

On ICDM'05, Qiang Yang and Xindong Wu presented 10 Challenging Problems in Data Mining Research.

1. Developing a Unifying Theory of Data Mining

2. Scaling Up for High Dimensional Data and High Speed Data Streams

3. Mining Sequence Data and Time Series Data

4. Mining Complex Knowledge from Complex Data

5. Data Mining in a Network Setting

6. Distributed Data Mining and Mining Multi-agent Data

7. Data Mining for Biological and Environmental Problems

8. Data-Mining-Process Related Problems

9. Security, Privacy and Data Integrity

10. Dealing with Non-static, Unbalanced and Cost-sensitive Data

I am particularly interested in Problem 2, 4 and 5 which share one common theme: **Learning/Mining with Very Large Graphs**.

## Sunday, March 04, 2007

### Free eBook on Matrix Computation

The Matrix Cookbook is a mathematical desktop reference on matrices that is freely available online.