Alexandria Digital Research Library

Uncovering Interesting Attributed Anomalies in Large Graphs

Li, Nan
Degree Grantor:
University of California, Santa Barbara. Computer Science
Degree Supervisor:
Xifeng Yan
Place of Publication:
[Santa Barbara, Calif.]
University of California, Santa Barbara
Creation Date:
Issued Date:
Computer Science
Graph Indexing
Graph Querying
Anomaly Detection
Graph Mining
Vertex Classification
Mixture Models
Dissertations, Academic and Online resources
Ph.D.--University of California, Santa Barbara, 2013

Graph is a fundamental model for capturing entities and their relations in a wide range of applications. Examples of real-world graphs include the Web, social networks, communication networks, intrusion networks, collaboration networks, and biological networks. In recent years, with the proliferation of rich information available for real-world graphs, vertices and edges are often associated with attributes that describe their characteristics and properties. This gives rise to a new type of graphs, namely attributed graphs. Anomaly detection has been extensively studied in many research areas, and finds important applications in real-world tasks such as financial fraud detection, spam detection and cyber security. Anomaly detection in large graphs, especially graphs annotated with attributes, is still under explored. Most of existing work in this aspect focuses on the structural information of the graphs. In this thesis, we aim to address the following questions: How do we define anomalies in large graphs annotated with attributive information? How to mine such anomalies efficiently and effectively?

A succinct yet fundamental anomaly definition is introduced: given a graph augmented with vertex attributes, an attributed anomaly refers to a constituent component of the graph, be it a vertex, an edge, or a subgraph, exhibiting abnormal features that deviate from the majority of constituent components of the same nature, in a combined structural and attributive space. For example in a social network, assume there exists a group of people, most of whom share similar taste in movies, whereas the majority of social groups in this network tend to have very diverse interests in movies; or in a collaboration network, there exists a group of closely connected experts that possess a set of required expertise, and such a group occurs scarcely in this network; we consider the groups in both scenarios as "anomalous". Applications of this research topic abound, including target marketing, recommendation systems, and social influence analysis. The goal of this work therefore is to create efficient solutions to effectively uncover interesting anomalous patterns in large attributed graphs.

In service of this goal, we have developed several frameworks using two types of approaches: (1) combinatorial methods based on graph indexing and querying; (2) statistical methods based on probabilistic models and network regularization.

Physical Description:
1 online resource (188 pages)
UCSB electronic theses and dissertations
Catalog System Number:
Inc.icon only.dark In Copyright
Copyright Holder:
Nan Li
Access: This item is restricted to on-campus access only. Please check our FAQs or contact UCSB Library staff if you need additional assistance.