An edge separator is a subset of the graph's edges whose removal from the graph makes the graph disconnected. Defferent weakness measures of the edge separator lead to different ways of graph partitioning or clustering, such as normalized cuts, conductance, edge betweenness, modularity, and relative neighborhoods. However, regardless of the weakness measure used, edge separators sometimes fail to capture the cohesion of graphs, especially in the presence of overlapping clusters. While the existence of a weak edge separator in a graph is sufficient to make the graph noncohesive, it is not a necessary condition.
A vertex separator is a subset of the graph's nodes whose removal leaves the graph disconnected. Note that sparse vertex separators subsume weak edge separators: if the graph has a weak edge separator, then it must also have a sparse vertex separator, but the converse is not true.
--- Ziv Bar-Yossef, Ido Guy, Ronny Lempel, Yoelle S. Maarek and Vladimir Soroka, Cluster Ranking with an Application to Mining Mailbox Networks, Proceedings of the 6th IEEE International Conference on Data Mining (ICDM), pp. 63--74, Hong Kong, China, 2006.
Saturday, March 17, 2007
Edge Separator vs. Vertex Separator
Thursday, March 15, 2007
Motzkin-Straus
The Motzkin-Straus theorem (formulation) establishes a remarkable connection between the maximal/maximum cliques of an unweighted undirected graph G and the local/global solutions of a quadratic programming problem. It has motivated several clique-finding techniques. There are generalizations of this result to edge-weighted or vertex-weighted graphs.
Thursday, March 08, 2007
Wednesday, March 07, 2007
10 Challenging Problems in Data Mining Research
On ICDM'05, Qiang Yang and Xindong Wu presented 10 Challenging Problems in Data Mining Research.
1. Developing a Unifying Theory of Data Mining
2. Scaling Up for High Dimensional Data and High Speed Data Streams
3. Mining Sequence Data and Time Series Data
4. Mining Complex Knowledge from Complex Data
5. Data Mining in a Network Setting
6. Distributed Data Mining and Mining Multi-agent Data
7. Data Mining for Biological and Environmental Problems
8. Data-Mining-Process Related Problems
9. Security, Privacy and Data Integrity
10. Dealing with Non-static, Unbalanced and Cost-sensitive Data
I am particularly interested in Problem 2, 4 and 5 which share one common theme: Learning/Mining with Very Large Graphs.
Sunday, March 04, 2007
Free eBook on Matrix Computation
The Matrix Cookbook is a mathematical desktop reference on matrices that is freely available online.