Functions Related to Clustering
This page documents the ACCORDION-specific functions related to clustering, defined in the markovCluster module.
main clustering function
markovCluster.runMarkovCluster(out_dir, ext_edges, base_model, coef)
This function prepares the inputs for the Markov Clustering algorithm (MCL) and writes the output clusters, together with their interaction information, to a pickle file. It also returns a modified baseline model (without introducing new nodes).
- Parameters:
out_dir (str) – Path of the directory that will include all output files (including intermediate and final results)
ext_edges (set) – Holds the interactions in the reading output file (extracted events). Each interaction is in the form: (regulator element, regulated element, type of interaction (+/-))
base_model (dict) – Dictionary that holds baseline model elements and corresponding regulator elements
coef (int) – The inflation parameter of the Markov Clustering algorithm
- Returns:
res (list) – Each item in this list is a grouped extension (i.e., this indivisible group is one candidate for model extension). It is also stored in the ‘grouped_ext’ file.
new_base_model (dict) – The baseline model elements are the keys of this dict and the values are the corresponding regulator elements (including new edge information from the extracted events).
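The input shapes described above can be illustrated with a minimal sketch. The element names are made up, the element-to-regulator-set layout of base_model is an assumption about its structure, and check_edges is a hypothetical helper that is not part of ACCORDION.

```python
# Hypothetical extracted events: (regulator, regulated, sign) triples.
ext_edges = {
    ("A", "B", "+"),
    ("C", "B", "-"),
    ("B", "D", "+"),
}

# Assumed layout: each baseline element maps to the set of its regulators.
base_model = {"A": set(), "B": {"A"}}


def check_edges(edges):
    """Hypothetical helper: confirm every interaction is a well-formed triple."""
    for edge in edges:
        if len(edge) != 3 or edge[2] not in {"+", "-"}:
            raise ValueError(f"malformed interaction: {edge!r}")
    return True


check_edges(ext_edges)

# With ACCORDION importable, and assuming the two documented return values
# come back as a tuple, the call might look like:
# res, new_base_model = markovCluster.runMarkovCluster(
#     "examples/Output/", ext_edges, base_model, 2)
```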
building extended graph
markovCluster.buildExtGraph(ext_edges, base_model={})
A utility function for runMarkovCluster(). It constructs two graph models: one with the whole extension information (i.e., with both new edges and new nodes), and another for the modified baseline model (i.e., without introducing new nodes).
- Parameters:
ext_edges (set) – Holds the interactions in the reading output file (extracted events). Each interaction is in the form: (regulator element, regulated element, type of interaction (+/-))
base_model (dict) – Dictionary that holds baseline model elements and corresponding regulator elements
- Returns:
ext_model (dict) – This dict contains the elements of both the baseline model and the reading output file. Those elements are the keys and the values are the corresponding regulator elements.
new_base_model (dict) – The baseline model elements are the keys of this dict and the values are the corresponding regulator elements.
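The difference between the two returned models can be sketched as follows. This is not the ACCORDION implementation: build_ext_graph_sketch is a hypothetical name, and the sketch assumes base_model maps each element to a set of regulators and that an extracted edge enters the modified baseline only when both of its endpoints are already baseline elements.

```python
def build_ext_graph_sketch(ext_edges, base_model=None):
    """Illustrative only: return (ext_model, new_base_model) as dicts of sets."""
    base_model = base_model or {}
    # Copy the baseline so the caller's dict is left untouched.
    ext_model = {elem: set(regs) for elem, regs in base_model.items()}
    new_base_model = {elem: set(regs) for elem, regs in base_model.items()}

    for regulator, regulated, _sign in ext_edges:
        # Full extension: both endpoints are added, whether new or not.
        ext_model.setdefault(regulator, set())
        ext_model.setdefault(regulated, set()).add(regulator)
        # Modified baseline: new edges only, never new nodes.
        if regulator in base_model and regulated in base_model:
            new_base_model[regulated].add(regulator)

    return ext_model, new_base_model


ext_model, new_base_model = build_ext_graph_sketch(
    {("A", "B", "+"), ("C", "B", "-")},
    {"A": set(), "B": set()},
)
```

Here "C" appears only in ext_model, because it is not a baseline element. The sketch uses None rather than a mutable {} as the default argument, which avoids sharing one dict across calls.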
calling the MCL algorithm
markovCluster.clusteringAlgo(MCL_result_folder, ext_model, coef)
A utility function for runMarkovCluster(). It runs the Markov Clustering algorithm (MCL), available at https://micans.org/mcl/, and is built on its latest stable release, /mcl/src/mcl-14-137.
- Parameters:
MCL_result_folder (str) – Directory that stores the intermediate and final result files of the MCL algorithm; defaults to ‘examples/Output/’. Inside this folder, the ‘markov_cluster’ file holds the final clustering result, with each row in the file being one cluster.
ext_model (dict) – This dict contains the elements of both the baseline model and the reading output file. Those elements are the keys and the values are the corresponding regulator elements.
coef (int) – The inflation parameter of the Markov Clustering algorithm (MCL)
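How coef might reach the MCL binary can be sketched as below. It is an assumption that clusteringAlgo invokes the mcl command-line tool in this way, and the input file name ext_model.abc is hypothetical. The flags themselves are standard mcl options: --abc reads a label-format edge list, -I sets the inflation value, and -o names the output file.

```python
import shutil
import subprocess


def mcl_command(abc_path, out_path, coef):
    """Build the argument list for one mcl run on a label-format edge file."""
    return ["mcl", abc_path, "--abc", "-I", str(coef), "-o", out_path]


def run_mcl(cmd):
    """Run the command, failing early if the mcl binary is not installed."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("mcl binary not found on PATH")
    subprocess.run(cmd, check=True)


cmd = mcl_command(
    "examples/Output/ext_model.abc",   # hypothetical input file name
    "examples/Output/markov_cluster",  # output file named in the parameter list
    2,
)
# run_mcl(cmd)  # uncomment once mcl is installed and the input file exists
```

Larger inflation values generally produce finer-grained clusterings, so coef controls how aggressively the extended graph is split into candidate groups.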
translating baseline to edges
markovCluster.ModelNetwork(out_dir, base_model)
A utility function for runMarkovCluster(). It translates the baseline model into a file of interaction edges, which serves as an intermediate result file.
- Parameters:
out_dir (str) – Path of the directory that will include the output file
base_model (dict) – Dictionary that holds baseline model elements and corresponding regulator elements
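The dict-to-edges translation can be sketched as follows. The tab-separated layout, the file name base_model_edges.txt, and the element-to-regulator-set structure of base_model are all assumptions made for illustration; the file that ModelNetwork actually writes may differ.

```python
import os
import tempfile


def write_edge_file(base_model, path):
    """Illustrative only: write one 'regulator<TAB>element' line per edge."""
    edges = sorted(
        (regulator, element)
        for element, regulators in base_model.items()
        for regulator in regulators
    )
    with open(path, "w") as fh:
        for regulator, element in edges:
            fh.write(f"{regulator}\t{element}\n")
    return edges


base_model = {"A": set(), "B": {"A"}, "C": {"A", "B"}}
out_dir = tempfile.mkdtemp()
edge_path = os.path.join(out_dir, "base_model_edges.txt")  # hypothetical name
edges = write_edge_file(base_model, edge_path)
```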
generating extension candidates
markovCluster.getGroupedExt(cluster_file, ext_edges)
A utility function for runMarkovCluster(). It combines the clustering result file with the interaction information to generate a list of candidate extensions.
- Parameters:
cluster_file (str) – The path of the markov_cluster file (the result of the MCL algorithm); each row in this file lists the elements of one cluster
ext_edges (set) – Holds the interactions in the reading output file (extracted events). Each interaction is in the form: (regulator element, regulated element, type of interaction (+/-))
- Returns:
res – Each item in this list is a grouped extension (i.e., this indivisible group is one candidate for model extension)
- Return type:
list
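The grouping step can be sketched as below. The membership rule is an assumption: here an interaction joins a group only when both of its endpoints fall in the same cluster, which may not match the rule getGroupedExt applies. The function name grouped_ext_sketch is hypothetical.

```python
import os
import tempfile


def grouped_ext_sketch(cluster_file, ext_edges):
    """Illustrative only: one sorted list of interactions per non-empty cluster."""
    res = []
    with open(cluster_file) as fh:
        for row in fh:
            members = set(row.split())  # whitespace-separated element names
            if not members:
                continue
            group = sorted(
                edge for edge in ext_edges
                if edge[0] in members and edge[1] in members
            )
            if group:
                res.append(group)
    return res


# Two clusters, one per row, as in the markov_cluster file.
cluster_path = os.path.join(tempfile.mkdtemp(), "markov_cluster")
with open(cluster_path, "w") as fh:
    fh.write("A\tB\nC\tD\n")

ext_edges = {("A", "B", "+"), ("C", "D", "-"), ("B", "C", "+")}
res = grouped_ext_sketch(cluster_path, ext_edges)
```

Under this assumed rule, the interaction ("B", "C", "+") spans two clusters and is left out of every group.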