This repository provides datasets and baseline algorithms for community detection.
Before compiling the code, install the following software on your system:
- Matlab
- gcc (for Linux and Mac) or Microsoft Visual Studio (for Windows)
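On Linux or Mac, a quick way to confirm that these tools are installed is to check whether they are on your `PATH`. The following is a minimal sketch, not part of this repository: `check_tool` is a hypothetical helper name, and the executable names `matlab` and `gcc` are assumptions that may differ on your platform.

```shell
# check_tool NAME: report whether an executable called NAME is on PATH.
# Hypothetical helper; the tool names passed below are assumptions.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# "|| true" keeps the script going so that every tool is reported.
check_tool matlab || true
check_tool gcc || true
```

A `missing:` line only means the executable is not on your `PATH`; MATLAB in particular is often installed without being added to it.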
- SBM benchmark generated by stochastic block model
- GN benchmark generated by Girvan and Newman
- LFR benchmark (available at http://sites.google.com/site/santofortunato/inthepress2/)
- From http://snap.stanford.edu/data/index.html#communities you can download the following datasets: Amazon, dblp, youtube and Email-Eu
- From https://linqs.soe.ucsc.edu/data you can download the following datasets: citeseer, cora, WebKB, Pubmed-Diabetes, TerrorAttack and TerroristRel
- From http://math.bu.edu/people/kolaczyk/datasets.html you can download the following datasets: AIDSBlog, Ecoli_microarray, epilepsy, packet_delay, ppi, PPI_function, router_INET, TM-ESTIMATION and zachary
- From http://cb.csail.mit.edu/cb/mna/isobase/ you can download the following dataset: Isobase
- From http://socialcomputing.asu.edu/pages/datasets you can download the following datasets: BlogCatalog and Flickr
- From http://www-personal.umich.edu/~mejn/netdata/ you can download the following datasets: karate, lesmis, adjnoun, football, dolphins, polblogs, polbooks, celegansneural, power, cond-mat, cond-mat-2003, cond-mat-2005, astro-ph, hep-th, netscience and as-22july06
- From https://figshare.com/articles/American_College_Football_Network_Files/93179 you can download the following dataset: footballTSE
Note that zachary and karate are the same dataset; the difference is that zachary provides ground truth while karate does not.
- zachary dataset
  - nodes: 34, edges: 78
  - two ground-truth communities, each of size >= 3
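To sanity-check a downloaded graph against the statistics above, you can count its nodes and edges directly. The sketch below is not part of this repository. It assumes the graph is stored as a whitespace-separated edge list with one undirected edge per line, each edge listed once, and no header or comment lines; `count_graph` is a hypothetical helper name.

```shell
# count_graph FILE: print "nodes: N, edges: M" for an edge-list file.
# Assumptions: one "u v" pair per line, each undirected edge listed once,
# no header or comment lines. Blank lines are skipped.
count_graph() {
    awk 'NF >= 2 { seen[$1] = 1; seen[$2] = 1; edges++ }
         END {
             n = 0
             for (v in seen) n++
             printf "nodes: %d, edges: %d\n", n, edges
         }' "$1"
}
```

For example, `count_graph zachary.txt` (a hypothetical file name) should report the node and edge counts listed above if the file follows the assumed format. Files that list each edge in both directions will report twice the edge count.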
$ cd Baseline_Algorithms/Global_Algorithms/Algorithms/
$ sh complile-all.sh
$ sh run.sh
$ cd processCode
$ matlab
$ getResults
$ cd Baseline_Algorithms/Local_Algorithms/Algorithms/
$ cd LEMON
$ matlab
$ LEMON % run LEMON algorithm
$ cd LOSP
$ matlab
$ LOSP % run LOSP algorithm
$ cd HK
$ matlab
$ mex -largeArrayDims hkgrow_mex.cpp % compile the mex file
$ HK % run HK algorithm
$ cd PR
$ matlab
$ mex -largeArrayDims pprgrow_mex.cc % compile the mex file
$ PR % run PR algorithm
$ cd PGDc_EMc
$ matlab
$ PGDc_EMc % run PGDc_EMc algorithm
$ cd YL
$ matlab
$ YL % run YL algorithm
$ cd MOV
$ matlab
$ MOV % run MOV algorithm
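The steps above start an interactive MATLAB session for each algorithm. If you prefer to run them from the shell, the following sketch may help. It is not part of this repository: `run_local` is a hypothetical helper, and it assumes MATLAB R2019a or later (which provides the `-batch` startup option), that `matlab` is on your `PATH`, and that you start from `Baseline_Algorithms/Local_Algorithms/Algorithms/`.

```shell
# run_local NAME: enter the directory NAME and run the MATLAB script NAME
# without opening the desktop. The subshell keeps the caller's working
# directory unchanged.
# Assumes MATLAB R2019a or later for the -batch option.
run_local() {
    ( cd "$1" && matlab -batch "$1" )
}
```

For example, `run_local LEMON` corresponds to the LEMON steps above. HK and PR still need their mex files compiled first, as shown in their steps.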
Please email panshi2016@gmail.com or open an issue if you run into problems or find bugs.
This project incorporates open-source code from the following websites as baseline algorithms: