Collective classification in social networks
Classification is one of the most studied problems in machine learning. Most classification methods developed over the last decade account for either structure (interactions, relationships) or attributes (text, numerical values, etc.), but not both. As a result, they miss significant patterns in a dataset that can only be captured by analyzing an item's features together with its interactions. Collective classification methods use both structure and attributes, often by aggregating data from the neighbors of a node and learning a model on the aggregated data. In social networks, the degree distribution follows a power law: a few nodes have many neighbors, while many nodes have very few edges. High-degree nodes also receive incoming links from low-degree nodes of different classes. Hence, using only local structure may lead to poor predictions. In addition, many social networks allow different types of interactions (retweet, reply, like, etc.) that affect classification differently. This article proposes a collective classification method that uses the structure of the network to determine the neighbors of each node. It then presents experiments aimed at detecting jihadi propagandists and malware distributors on social networks.
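The neighbor-aggregation step mentioned above can be illustrated with a minimal sketch. It is not the method proposed in the article: it only shows the generic idea of concatenating a node's own attributes with the mean of its neighbors' attributes, so that a standard classifier could then be trained on the result. The function name `aggregate_neighbors`, the toy graph, the undirected treatment of links, and the choice of the mean as aggregator are all assumptions made for illustration.

```python
from statistics import fmean


def aggregate_neighbors(features, edges):
    """Return, for each node, its own features followed by the mean of its neighbors' features."""
    # Assumption: links are treated as undirected and untyped.
    neighbors = {node: set() for node in features}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    dim = len(next(iter(features.values())))
    aggregated = {}
    for node, own in features.items():
        nbrs = neighbors[node]
        if nbrs:
            # Component-wise mean over the neighbors' feature vectors.
            mean = [fmean(features[n][i] for n in nbrs) for i in range(dim)]
        else:
            # Isolated node: no neighbor information, so pad with zeros.
            mean = [0.0] * dim
        aggregated[node] = list(own) + mean
    return aggregated


# Hypothetical toy network: four nodes with two numeric attributes each.
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0], "d": [0.0, 0.0]}
edges = [("a", "b"), ("a", "c")]
aggregated = aggregate_neighbors(features, edges)
```

In this toy network, node `b` has a single neighbor and node `d` has none, so their aggregated vectors carry little or no neighborhood signal. This is the situation the abstract points to when it notes that many nodes have very few edges and that purely local structure may lead to poor predictions.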