Analysis of the IMDB actors social network

Dataset: IMDB – Internet Movie Database.
The raw files can be found at ftp://ftp.fu-berlin.de/pub/misc/movies/database/

Preprocessing: (using Java)

helpful document

Database System: SQL Server 2008 on

Social network:

A social network is an undirected graph in which nodes represent people and edges represent relationships between them. Edges can carry additional information such as timestamps. We would like to create a social network from the IMDB dataset by treating the actresses and actors as the people in the network. Two actors are related if they appeared in the same movie. Each edge is labelled with the year in which that movie was first screened; this information is available in the dataset.
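The construction described above can be sketched in a few lines of Python. This is only an illustration: the `(title, year, cast)` record layout, the function name `build_costar_graph`, and the toy data are assumptions made for the example, not the format of the raw IMDB files or of the project's Java preprocessing.

```python
from collections import defaultdict
from itertools import combinations


def build_costar_graph(movies):
    """Return {frozenset({a, b}): set of years} for every co-starring pair.

    `movies` is an iterable of (title, year, cast) tuples; this layout is
    an assumption, not the format of the raw IMDB files.
    """
    edges = defaultdict(set)
    for _title, year, cast in movies:
        # Undirected edge: a frozenset ignores the order of the two actors.
        for a, b in combinations(sorted(set(cast)), 2):
            edges[frozenset((a, b))].add(year)
    return dict(edges)


# Hypothetical toy data, not taken from IMDB.
movies = [
    ("Movie X", 1999, ["Ann", "Bob", "Cy"]),
    ("Movie Y", 2001, ["Ann", "Bob"]),
]
graph = build_costar_graph(movies)
print(sorted(graph[frozenset(("Ann", "Bob"))]))  # [1999, 2001]
```

Storing a set of years per pair keeps the graph simple (one edge per pair) while still recording every year in which the two actors shared a movie.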

Project proposal:

Association rules are normally applied to transaction data. In this project, we would like to extend them to social network data, expressing the rules in inductive logic programming (ILP) terminology.
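The proposal does not say how such rules would be scored. One plausible reading, assumed here, is the usual association-rule confidence: the fraction of groundings of the rule body for which the head also holds. The Python sketch below applies that idea to a repeat-collaboration rule of the form Friends(A, B, T) -> Friends(A, B, T+k) with k >= 1. The function name, the optional `max_k` window, and the toy data are hypothetical.

```python
def repeat_rule_confidence(edge_years, max_k=None):
    """Confidence of Friends(A,B,T) -> Friends(A,B,T+k), k >= 1.

    `edge_years` maps an actor pair to the set of years in which the pair
    shared a movie. A body grounding is one (pair, T); it is satisfied if
    the pair also shares a movie in some later year (within `max_k` years
    when `max_k` is given).
    """
    body = satisfied = 0
    for years in edge_years.values():
        for t in years:
            body += 1
            if any(t < u and (max_k is None or u - t <= max_k) for u in years):
                satisfied += 1
    return satisfied / body if body else 0.0


# Hypothetical toy data: three body groundings, one of them satisfied.
edge_years = {
    frozenset(("Ann", "Bob")): {1999, 2001},
    frozenset(("Ann", "Cy")): {1999},
}
confidence = repeat_rule_confidence(edge_years)
```

Note that groundings in the most recent years of the data can never be satisfied, so a real evaluation would probably need to exclude or discount them.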

Examples are:

1. Actor(A), Friends(A, B, T), Actor(B) -> Friends(A, B, T+k), where k >= 1

If two actors played in the same movie, then they are very likely to act together again a few years later.

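2. Actor(A), Actor(B), Actor(C), Friends(A, B, T), Friends(A, C, T) -> Friends(B, C, T+k), where k >= 1

“Closing triangle” example: if A acted with both B and C in year T, then B and C are likely to act together k years later.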

3. Actor(A), Friends(A, B, T), Actor(B), Genre(A, T-l, "comedy"), Genre(A, T, "thriller"), Genre(B, T, "comedy") -> Genre(B, T+k, "thriller"), where k, l >= 1

(The "genre" of an actor in a given year can be determined as the majority genre among the movies in which he or she played that year.)

If your friend moves from the genre "comedy" to the genre "thriller", then it is very likely that you will do the same. Here we can also define a community, instead of just a friendship relationship, by defining Community({A, B, C}, T) := Friends(A, B, T), Friends(C, B, T), Friends(A, C, T)
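The two auxiliary notions used in the third example, an actor's genre in a given year and a three-person community, could be computed roughly as follows. This is a Python sketch under stated assumptions: the `(actor, year, genre)` record layout, the function names, the tie-breaking choice, and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def actor_genre(roles, actor, year):
    """Most frequent genre among the actor's movies in `year`, or None.

    `roles` is an iterable of (actor, year, genre) tuples (assumed layout).
    Ties go to the genre seen first, which is an arbitrary choice.
    """
    counts = Counter(g for a, y, g in roles if a == actor and y == year)
    return counts.most_common(1)[0][0] if counts else None


def communities(edge_years, year):
    """All triples {A, B, C} that are pairwise Friends in `year`."""
    neighbours = {}
    for pair, years in edge_years.items():
        if year in years:
            a, b = tuple(pair)
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)
    found = set()
    for a, nbrs in neighbours.items():
        # A triangle exists when two neighbours of `a` are also connected.
        for b, c in combinations(sorted(nbrs), 2):
            if c in neighbours[b]:
                found.add(frozenset((a, b, c)))
    return found


# Hypothetical toy data.
roles = [
    ("Ann", 2000, "comedy"),
    ("Ann", 2000, "comedy"),
    ("Ann", 2000, "thriller"),
    ("Ann", 2001, "thriller"),
]
edge_years = {
    frozenset(("Ann", "Bob")): {2000},
    frozenset(("Bob", "Cy")): {2000},
    frozenset(("Ann", "Cy")): {2000, 2001},
    frozenset(("Ann", "Dee")): {2000},
}
print(actor_genre(roles, "Ann", 2000))  # comedy
```

With this data, `communities(edge_years, 2000)` contains the single triple Ann, Bob, Cy, while 2001 has no community because only one edge is active in that year.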
