Research on Search

Saturday, March 17, 2007

Edge Separator vs. Vertex Separator

An edge separator is a subset of a graph's edges whose removal disconnects the graph. Different weakness measures for edge separators lead to different approaches to graph partitioning or clustering, such as normalized cuts, conductance, edge betweenness, modularity, and relative neighborhoods. However, regardless of the weakness measure used, edge separators sometimes fail to capture the cohesion of graphs, especially in the presence of overlapping clusters. The existence of a weak edge separator is sufficient to make a graph noncohesive, but it is not necessary.

A vertex separator is a subset of the graph's nodes whose removal leaves the graph disconnected. Note that sparse vertex separators subsume weak edge separators: if the graph has a weak edge separator, then it must also have a sparse vertex separator, but the converse is not true.
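To make the "converse is not true" part concrete, here is a minimal brute-force sketch on a toy graph of my own choosing (not from the paper): two 4-cliques that overlap in a single node "v". As a simplifying assumption, it measures a separator by its raw size, whereas the paper uses normalized weakness and sparsity measures. The function names are hypothetical, and the exhaustive search is only practical for tiny graphs.

```python
from itertools import combinations


def is_connected(nodes, edges):
    """Return True if the graph induced on `nodes` by `edges` is connected."""
    nodes = set(nodes)
    if not nodes:
        return True
    adj = {n: set() for n in nodes}
    for u, v in edges:
        if u in nodes and v in nodes:
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:  # depth-first search from an arbitrary node
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes


def min_edge_separator(nodes, edges):
    """Smallest set of edges whose removal disconnects the graph (brute force)."""
    for k in range(1, len(edges) + 1):
        for cut in combinations(edges, k):
            remaining = [e for e in edges if e not in cut]
            if not is_connected(nodes, remaining):
                return set(cut)
    return None


def min_vertex_separator(nodes, edges):
    """Smallest set of nodes whose removal disconnects the rest (brute force)."""
    # At least two nodes must remain for the rest to be disconnected.
    for k in range(1, len(nodes) - 1):
        for cut in combinations(nodes, k):
            rest = [n for n in nodes if n not in cut]
            if not is_connected(rest, edges):
                return set(cut)
    return None  # e.g. a complete graph has no vertex separator


# Toy example (an assumption for illustration): two overlapping 4-cliques.
clique_a = ["a", "b", "c", "v"]
clique_b = ["v", "d", "e", "f"]
edges = list(combinations(clique_a, 2)) + list(combinations(clique_b, 2))
nodes = sorted(set(clique_a) | set(clique_b))

vertex_cut = min_vertex_separator(nodes, edges)
edge_cut = min_edge_separator(nodes, edges)
print("vertex separator:", vertex_cut)
print("edge separator size:", len(edge_cut))
```

On this toy graph the single shared node separates the two clusters, while no set of fewer than three edges does, which is the kind of overlapping-cluster situation where an edge-based view misses the structure.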

--- Ziv Bar-Yossef, Ido Guy, Ronny Lempel, Yoelle S. Maarek and Vladimir Soroka, Cluster Ranking with an Application to Mining Mailbox Networks, Proceedings of the 6th IEEE International Conference on Data Mining (ICDM), pp. 63--74, Hong Kong, China, 2006.

Thursday, March 15, 2007

Motzkin-Straus

The Motzkin-Straus theorem establishes a remarkable connection between the maximal and maximum cliques of an unweighted undirected graph G and the local and global solutions, respectively, of a quadratic programming problem. It has motivated several clique-finding techniques. There are generalizations of this result to edge-weighted and vertex-weighted graphs.
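For reference, the global part of the result can be written as follows. The notation is my own choice: A is the adjacency matrix of G, n is the number of vertices, \omega(G) is the clique number, and \Delta_n is the standard simplex.

```latex
\[
  \max_{x \in \Delta_n} x^{\top} A x \;=\; 1 - \frac{1}{\omega(G)},
  \qquad
  \Delta_n = \Bigl\{ x \in \mathbb{R}^n : x_i \ge 0,\ \sum_{i=1}^{n} x_i = 1 \Bigr\}.
\]
% A maximizer is the characteristic vector of a maximum clique C with |C| = k:
\[
  x_i = \begin{cases} 1/k & i \in C \\ 0 & \text{otherwise} \end{cases}
  \quad\Longrightarrow\quad
  x^{\top} A x = \sum_{\substack{i, j \in C \\ i \neq j}} \frac{1}{k^2}
  = \frac{k(k-1)}{k^2} = 1 - \frac{1}{k}.
\]
```

For example, a graph whose largest clique is a triangle gives a maximum of 1 - 1/3 = 2/3.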

Thursday, March 08, 2007

Blog Datasets

TREC-BLOG'06 Dataset

Weblog Data Collection provided by BuzzMetrics (see also AND-2007 Datasets)

Splog (Spam-blog) Dataset

ICWSM'08 Social Media Dataset

Wednesday, March 07, 2007

10 Challenging Problems in Data Mining Research

At ICDM'05, Qiang Yang and Xindong Wu presented 10 Challenging Problems in Data Mining Research:
1. Developing a Unifying Theory of Data Mining
2. Scaling Up for High Dimensional Data and High Speed Data Streams
3. Mining Sequence Data and Time Series Data
4. Mining Complex Knowledge from Complex Data
5. Data Mining in a Network Setting
6. Distributed Data Mining and Mining Multi-agent Data
7. Data Mining for Biological and Environmental Problems
8. Data-Mining-Process Related Problems
9. Security, Privacy and Data Integrity
10. Dealing with Non-static, Unbalanced and Cost-sensitive Data

I am particularly interested in Problems 2, 4, and 5, which share one common theme: Learning/Mining with Very Large Graphs.

Sunday, March 04, 2007

Free eBook on Matrix Computation

The Matrix Cookbook is a mathematical desktop reference on matrices that is freely available online.
