Developer notes¶
Overview¶
Inputs¶
Broadly speaking all programs expect an edge list format where nodes have unique numeric IDs. But, each program allows certain details to be passed along with the edgelist, depending on their specific objectives.
mfinder
Expects an edge list of
Source\tTarget\tWeight
The edge list must be TAB delimited
The
Weight
is not active and should default to 1
fanmod_cmd
Expects an edge list of
Source Target
Can also accept ….
NetMODE
Expects an edge list of
Source Target
Listing has to begin with the number of edges expected present in the file.
The edge list must be TAB delimited.
PGD
Expects an edge list of
Source Target
The edge list must be COMMA separated.
Parameters¶
mfinder
¶
Usage : mfinder <Network input file name> -s <motif size> -r <no. of randon networks> [-f <output file name>] [more flags]
-s <motif size> :Motif size to search
-r <rand net num> :Number of random networks to generate
-f <output file name> : Output file name
-nd : Input network is a non-directed network.
-p <num of samples>: run with Sampling method,
-omem : output members list of all subgraphs
-h : help
Additional flags:
Motif criteria flags:
-m <value> : mfactor threshold to use when calculating motifs
-z <value> : Z-score threshold to use when calculating motifs
-u : Uniqueness threshold
-nu : Dont count uniqueness and ignore uniqueness threshold
Random networks randomization flags:
-rs : use stubs method for generating random networks
-rclust : Preserve clustering sequence in random networks
-met :Use Metropolis algorithm to conserve triad-census
in random networks
(for s>3; Default : Do not use Metropolis)
-t0 <(default 0.001)> :Initial temperature (-met option)
-iter <(default 2)> :controls how many steps to perform (-met option)
-eth <(default 0.005)> : energy threshhold (-met option)
-rgrass <colony size>: generate random networks using grassberger
algorithm
-rgrass_max_sz <max ratio>: Limit maximal colony size ratio
-rdm: don't conserve mutual edges in random networks
-rcl <layers num><size1 size2 ..sizem>: conserve layers in random
networks
-nsr : Global Switches number when generating Random networks.
Output files flags:
-oi : output intermediate output file. Defualt :No:
-ospmem <subgraph id>: output members list of a specific subgraphs only
-maxmem <list length>: limit length of members list to 'list length'.
Defualt: 1000
-omat : output matrix format file ('__MAT.txt')
-omet : output metropolis log
-olog : output general log file
-orall : output matrix format of appearances in each random network
-ornet : output random networks files
-otop <no. of top motifs> : No. of top motifs to show
-onodangl: output a list of all nan-dangling detected motifs
Other flags
-ts : <target,source,weight> Old format of input network file
-q :Quiet mode - No output to the screen
-dd : Don't die mode. Wait to user action before terminating the
program
-pold <num of samples>: run sampling method old version
-nor : Dont search Real network. Defualt :No:
-cr : calculate roles statistics
mfinder Version 1.20
fanmod_cmd
¶
Allowed options:
--help produce help message
-i [ --input ] arg input graph file
-o [ --output ] arg output csv file
-d [ --directed ] the graph input is directed
-s [ --motif_size ] arg the size of the searched motifs. default = 4
-r [ --rnd_nets ] arg the amount of random nets to compare the motif
frequency. default = 1000
NetMODE
¶
-k --size = k-node subgraphs (=3,4,5 or 6)
-c --random = # comparison graphs (some number, could be 0, an integer in [0, 2^31))
-b --burnin = burnin = # comparison graphs discarded (some number, could be 0)
-e --method = bidirectional edge random_method (=0, 1, 2, 3(default) or 4)
0: fixed;
1: no regard;
2: global constant;
3: local constant;
4: uniform.
-t --thread = number of threads to run
-v --verbose = interface mode (for interfacing with e.g. R) (=0 (default), 1)
-s --showall = show all subgraph statistics in random network while (k <= 5)
PGD
¶
Usage: ./pgd -f path
pgd options:
=================================================================================
Parallel Parameterized Graphlet Decomposition (PGD) Library
=================================================================================
-f, --file,--graph : Input GRAPH file for computing the graphlets (e.g., matrix market format, simple edge list).
-a, --algorithm : Algorithm for the GRAPHLET DECOMPOSITION. Default: exact, approximate, etc.
---------------------------------------------------------------------------------
-w, --workers : Number of PROCESSING UNITS (workers) for the algorithm to use (default = max).
-b, --block_size : Size of batch (number of jobs) dynamically assigned to the processing unit, that is, 1, 64, 512, etc. Default: -b 64
---------------------------------------------------------------------------------
-o, --ordering : Strategy used to determine the order in which the edge/node graphlets are computed.
Default: -o degree (kcore, rand, natural/off, etc.).
--s2l : Direction of the ordering (default: largest to smallest).
Note to order from smallest to largest, set '--s2l'
-n, --neigh_ordering : Strategy used to order the neighbors of each node. Default: degree (kcore, rand, natural, etc.)
Note only applicable if CSC/CSR is used (-r csc).
--s2l_neigh : Order direction for neighbor/csc ordering strategy
(e.g., --neigh_ordering degree --s2l_neigh, default: largest to smallest)
---------------------------------------------------------------------------------
-c, --counts,--macro : Compute MACRO (GLOBAL) GRAPHLET FEATURES and save counts to file (e.g., --counts name.graphlets)
-m, --micro : Compute MICRO (LOCAL) GRAPHLET FEATURES and save EDGE-by-MOTIF feature matrix (-m name.micro_graphlets)
Default: OFF. NOTE: Turn ON edge graphlet counting by specifying an output file, e.g., '-m name.micro_graphlets'
---------------------------------------------------------------------------------
-v, --verbose : Output additional details to the screen.
-?, -h, --help : Print out this help menu.
REPRESENTATION: Example: ./pgd -r csc (adj, etc.)
=================================================================================
-r, --rep : Graph representation [adj, csc, hybrid, auto, etc].
Default: Auto select optimal.
'adj' - adjacency matrix : dense n by n matrix, O(|V|^2) storage cost
'csc' - comp. sparse col : large sparse graphs, O(|V|+|E|) storage cost
'hybrid' - csc + adj : small sparse and dense graphs, O(|V|^2 + |V| + |E|) storage cost
-l, --adj_limit : Threshold/limit for creating adj representation. Default: '-l 10000' (that is <10000 nodes).
ORDERING TECHNIQUES: Example: ./pgd -o degree (kcore, rand, etc.)
=================================================================================
'degree', 'deg' : O(|V|)
'kcore', : O(|E|)
'rand', 'random' : O(|V|)
'off', 'natural'
Other methods for ordering include:
'kcore_degree', 'kcore_deg' : O(|V|)
'degree_vol', 'deg_vol' : O(|E|)
'kcore_vol', : O(|E|)
'deg_kcore_vol' : O(|E|)
------------------------------------------------------------------
NOTE: Default ordering is kcore (degeneracy order). For natural order, use '-o off' or '-o natural'
Copyright Nesreen K. Ahmed (http://nesreenahmed.com) and Ryan A. Rossi (http://ryanrossi.com).
Website: http://nesreenahmed.com/graphlets for news and updates.
Outputs¶
mfinder
¶
Summary motif results
=====================
mfinder Version 1.20
MOTIF FINDER RESULTS:
Network name: alpha.el
Network type: Directed
Num of Nodes: 510 Num of Edges: 510
Num of Nodes with edges: 510
Maximal out degree (out-hub) : 2
Maximal in degree (in-hub) : 1
Roots num: 0 Leaves num: 256
Single Edges num: 508 Mutual Edges num: 0
Motif size searched 3
Total number of 3-node subgraphs : 762
Number of random networks generated : 100
Random networks generation method: Switches
Num of Switches range: 100.0-200.0, Success switches Ratio:0.988+-0.00
The following motifs were found:
Criteria taken : Nreal Zscore > 2.00
Pval ignored (due to small number of random networks)
Mfactor > 1.10
Uniqueness >= 4
Full list includes 0 motifs
MOTIF NREAL NRAND NREAL NREAL UNIQ CREAL
ID STATS ZSCORE PVAL VAL [MILI]
Full list of subgraphs size 3 ids:
( Total num of different subgraphs size 3 is : 13 )
MOTIF NREAL NRAND NREAL NREAL CREAL UNIQ
ID STATS ZSCORE PVAL [MILI]
6 254 254.0+-0.0 -0.10 1.000 333.33 170
12 504 503.5+-1.3 0.36 0.870 661.42 72
14 0 0.0+-0.0 888888 1.000 0.00 0
36 0 0.0+-0.0 888888 1.000 0.00 0
38 0 0.0+-0.0 888888 1.000 0.00 0
46 0 0.0+-0.0 888888 1.000 0.00 0
74 0 0.0+-0.0 888888 1.000 0.00 0
78 0 0.0+-0.0 888888 1.000 0.00 0
98 0 0.2+-0.4 -0.36 1.000 0.00 0
102 0 0.0+-0.0 888888 1.000 0.00 0
108 0 0.0+-0.0 888888 1.000 0.00 0
110 0 0.0+-0.0 888888 1.000 0.00 0
238 0 0.0+-0.0 888888 1.000 0.00 0
(Application total runtime was: 18.0 seconds.)
(Real network processing runtime was: 0.0 seconds.)
(Single Random network processing runtime was: 0.2 seconds.)
fanmod_cmd
¶
FANMOD 1.1 subgraph enumeration
-------------------------------
Network name: alpha_2.el
Network type: Undirected
Number of nodes: 100
Number of edges: 400
Number of single edges: 0
Number of mutual edges: 400
Algorithm: enumeration
Subgraph size: 3
Generated 1000 random networks
with locally constant number of bidirectional edges,
3 exchanges per edge and 3 tries per edge.
1798 subgraphs were enumerated in the original network.
2817794 subgraphs were enumerated in the random networks.
2819592 subgraphs were enumerated in all networks.
For the random networks: 1608151 tries were made, 1177966 were successful.
Randomization took 0.88589 seconds.
Enumeration took 5.49207 seconds.
Result overview:
ID,Adj-Matrix,Frequency,Mean-Freq,Standard-Dev,Z-Score,p-Value
,,[Original],[Random],[Random]
78,001,71.58%,99.961%,0.00074998,-378.43,1
,001
,110
238,011,28.42%,0.039255%,0.00074998,378.43,0
,101
,110
NetMODE
¶
NetMODE produces both a file with results as well as a file with the adjacency matrices of the motifs it has detected.
calc Z-Score
gID: 6 freq: 80 ave_rand_freq: -nan (sd: -nan) conc: 0.16393 ave_rand_conc: -nan (sd: -nan) f-ZScore: -nan f-pValue: -nan c-ZScore: -nan c-pValue: -nan
gID: 12 freq: 103 ave_rand_freq: -nan (sd: -nan) conc: 0.21107 ave_rand_conc: -nan (sd: -nan) f-ZScore: -nan f-pValue: -nan c-ZScore: -nan c-pValue: -nan
gID: 14 freq: 105 ave_rand_freq: -nan (sd: -nan) conc: 0.21516 ave_rand_conc: -nan (sd: -nan) f-ZScore: -nan f-pValue: -nan c-ZScore: -nan c-pValue: -nan
gID: 36 freq: 61 ave_rand_freq: -nan (sd: -nan) conc: 0.12500 ave_rand_conc: -nan (sd: -nan) f-ZScore: -nan f-pValue: -nan c-ZScore: -nan c-pValue: -nan
gID: 38 freq: 13 ave_rand_freq: -nan (sd: -nan) conc: 0.02664 ave_rand_conc: -nan (sd: -nan) f-ZScore: -nan f-pValue: -nan c-ZScore: -nan c-pValue: -nan
gID: 46 freq: 15 ave_rand_freq: -nan (sd: -nan) conc: 0.03074 ave_rand_conc: -nan (sd: -nan) f-ZScore: -nan f-pValue: -nan c-ZScore: -nan c-pValue: -nan
gID: 74 freq: 69 ave_rand_freq: -nan (sd: -nan) conc: 0.14139 ave_rand_conc: -nan (sd: -nan) f-ZScore: -nan f-pValue: -nan c-ZScore: -nan c-pValue: -nan
gID: 78 freq: 12 ave_rand_freq: -nan (sd: -nan) conc: 0.02459 ave_rand_conc: -nan (sd: -nan) f-ZScore: -nan f-pValue: -nan c-ZScore: -nan c-pValue: -nan
gID: 98 freq: 1 ave_rand_freq: -nan (sd: -nan) conc: 0.00205 ave_rand_conc: -nan (sd: -nan) f-ZScore: -nan f-pValue: -nan c-ZScore: -nan c-pValue: -nan
gID: 102 freq: 5 ave_rand_freq: -nan (sd: -nan) conc: 0.01025 ave_rand_conc: -nan (sd: -nan) f-ZScore: -nan f-pValue: -nan c-ZScore: -nan c-pValue: -nan
gID: 108 freq: 7 ave_rand_freq: -nan (sd: -nan) conc: 0.01434 ave_rand_conc: -nan (sd: -nan) f-ZScore: -nan f-pValue: -nan c-ZScore: -nan c-pValue: -nan
gID: 110 freq: 12 ave_rand_freq: -nan (sd: -nan) conc: 0.02459 ave_rand_conc: -nan (sd: -nan) f-ZScore: -nan f-pValue: -nan c-ZScore: -nan c-pValue: -nan
gID: 238 freq: 5 ave_rand_freq: -nan (sd: -nan) conc: 0.01025 ave_rand_conc: -nan (sd: -nan) f-ZScore: -nan f-pValue: -nan c-ZScore: -nan c-pValue: -nan
graphID = 6
000
000
110
graphID = 12
000
001
100
graphID = 14
000
001
110
graphID = 36
000
100
100
graphID = 38
000
100
110
graphID = 46
000
101
110
graphID = 74
001
001
010
graphID = 78
001
001
110
graphID = 98
001
100
010
graphID = 102
001
100
110
graphID = 108
001
101
100
graphID = 110
001
101
110
graphID = 238
011
101
110
pgd
¶
[reading generic edge list: read_edge_list func] filename: sample_graph.csv
|V|: 5748
|E|: 14267
p: 0.000863783
d_max: 11
d_avg: 4
r = 0.253437
K = 4
************************************************************
total_2_1edge = 14267
total_2_indep = 16502611
----------------------------------------
total_3_tris = 9286
total_2_star = 35397
total_3_1edge = 81879530
total_3_indep = 31553402783
----------------------------------------
total_4_clique = 2116
total_4_chordcycle = 9925
total_4_tailed_tris = 41154
total_4_cycle = 1267
total_3_star = 13838
total_4_path = 88568
----------------------------------------
total_4_1edge = 234712803384
total_4_2edge = 101544802
total_4_2star = 203029889
total_4_tri = 53278602
total_4_indep = 45201167584460
************************************************************
graphlet decomposition time: 0.0163529 sec
--------------------------------------------------------------------------------
Graphlet Frequency Distribution (GFD)
--------------------------------------------------------------------------------
4-clique 4.65708e-11
4-chordal-cycle 2.18438e-10
4-tailed-tri 9.05753e-10
4-cycle 2.78852e-11
3-star 3.04559e-10
4-path 1.94928e-09
4-node-1-tri 1.1726e-06
4-node-2-star 4.46846e-06
4-node-2-edge 2.23489e-06
4-node-1-edge 0.00516576
4-node-indep 0.994826
--------------------------------------------------------------------------------
Connected Motif Graphlet Frequency Distribution (GFD)
--------------------------------------------------------------------------------
4-clique 0.013489
4-chordal-cycle 0.0632698
4-tailed-tri 0.262348
4-cycle 0.00807685
3-star 0.0882143
4-path 0.564602
--------------------------------------------------------------------------------
Disconnected Motif Graphlet Frequency Distribution (GFD)
--------------------------------------------------------------------------------
4-node-1-tri 1.1726e-06
4-node-2-star 4.46846e-06
4-node-2-edge 2.23489e-06
4-node-1-edge 0.00516576
4-node-indep 0.994826
Recompiling the binaries¶
For the moment, please see this document with more information about this.