Semantic text features from small world graphs
We present a set of methods for creating a semantic representation from a collection of textual documents. Given a document collection we use a simple algorithm to connect the documents into a tree or a graph. Using the imposed topology we define a feature and document similarity measures. We use the kernel alignment to compare the quality of various similarity measures. Results show that the document similarity defined over the topology gives better alignment than standard cosine similarity measure on a bag of words document representation.
Author: Jure Leskovec, Carnegie Mellon University