geneplexus.util
Utilities including file and path handling.
- geneplexus.util.check_file(path)[source]
Check existence of a file.
- Parameters:
path (str) – Path to the file.
- Raises:
FileNotFoundError – If the file does not exist.
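The documented behavior can be illustrated with a minimal stand-in. This is a sketch only: `check_file_sketch` is a hypothetical name, and the use of `os.path.isfile` and the wording of the error message are assumptions, not the library's actual implementation.

```python
import os
import tempfile


def check_file_sketch(path: str) -> None:
    """Raise FileNotFoundError if ``path`` is not an existing file (assumed check)."""
    if not os.path.isfile(path):
        raise FileNotFoundError(f"File not found: {path}")


# Usage: an existing file passes silently, a missing one raises.
with tempfile.TemporaryDirectory() as tmp_dir:
    existing = os.path.join(tmp_dir, "present.txt")
    with open(existing, "w") as handle:
        handle.write("data")
    check_file_sketch(existing)  # no exception
```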
- geneplexus.util.check_param(name, value, expected, /)[source]
Check the specified parameter and raise ValueError if its value is unexpected.
- Parameters:
name (str) –
value (Any) –
expected (List[Any]) –
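A rough sketch of this kind of validation follows. `check_param_sketch` is a hypothetical stand-in that mirrors the positional-only signature; the membership test and the message format are assumptions rather than the library's code.

```python
from typing import Any, List


def check_param_sketch(name: str, value: Any, expected: List[Any], /) -> None:
    """Raise ValueError when ``value`` is not one of ``expected`` (assumed rule)."""
    if value not in expected:
        raise ValueError(
            f"Unexpected value for {name}: {value!r}; expected one of {expected!r}"
        )


# Usage: a listed value passes, anything else raises ValueError.
check_param_sketch("net_type", "STRING", ["BioGRID", "STRING", "STRING-EXP", "GIANT-TN"])
```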
- geneplexus.util.format_choices(choices)[source]
Convert list of str to choices format.
- Return type:
str
- Parameters:
choices (List[str]) –
- geneplexus.util.get_all_filenames()[source]
Iterate over filenames.
- Return type:
Generator[str,None,None]
- geneplexus.util.get_all_gscs(file_loc)[source]
Return list of GSCs found in the data directory.
- Return type:
List[str]
- Parameters:
file_loc (str | None) –
Note
Only the full GSC is checked (starts with GSCOriginal), but not the network-specific ones (goodsets and universe).
- geneplexus.util.get_all_net_types(file_loc)[source]
Return list of networks found in the data directory.
- Return type:
List[str]
- Parameters:
file_loc (str | None) –
Note
Only the node ordering files are checked (starts with NodeOrder).
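Both get_all_gscs and get_all_net_types are described as discovering names by file prefix. The sketch below shows one way such a scan could work. The helper `list_prefixed_names` is hypothetical, and the filename layout it assumes (`<prefix>_<name>.<ext>`, for example `GSCOriginal_GO.json`) is an illustration only; the real naming scheme is not stated in this reference.

```python
import os
from typing import List


def list_prefixed_names(data_dir: str, prefix: str) -> List[str]:
    """Collect the name part of files in ``data_dir`` that start with ``prefix``.

    Assumes a hypothetical ``<prefix>_<name>.<ext>`` filename layout.
    """
    names = set()
    for fname in os.listdir(data_dir):
        if fname.startswith(prefix):
            stem = os.path.splitext(fname)[0]  # drop the extension
            names.add(stem[len(prefix):].lstrip("_"))  # drop prefix and separator
    return sorted(names)
```

Under that assumed layout, `list_prefixed_names(data_dir, "GSCOriginal")` would ignore network-specific files, since they do not start with the prefix.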
- geneplexus.util.load_correction_mat(file_loc, gsc, target_set, net_type, features)[source]
Load correction matrix.
- Return type:
ndarray
- Parameters:
file_loc (str) – Location of data files.
gsc (Literal['GO', 'DisGeNet']) – Gene set collection.
target_set (Literal['GO', 'DisGeNet']) – Target gene set collection.
net_type (Literal['BioGRID', 'STRING', 'STRING-EXP', 'GIANT-TN']) – Network used.
features (Literal['Adjacency', 'Embedding', 'Influence']) – Type of features used.
- geneplexus.util.load_correction_order(file_loc, target_set, net_type)[source]
Load correction matrix order.
- Return type:
ndarray
- Parameters:
file_loc (str) – Location of data files.
target_set (Literal['GO', 'DisGeNet']) – Target gene set collection.
net_type (Literal['BioGRID', 'STRING', 'STRING-EXP', 'GIANT-TN']) – Network used.
- geneplexus.util.load_gene_features(file_loc, features, net_type)[source]
Load gene features.
- Return type:
ndarray
- Parameters:
file_loc (str) – Location of data files.
features (Literal['Adjacency', 'Embedding', 'Influence']) – Type of features used.
net_type (Literal['BioGRID', 'STRING', 'STRING-EXP', 'GIANT-TN']) – Network used.
- geneplexus.util.load_geneid_conversion(file_loc, src_id_type, dst_id_type, upper=False)[source]
Load the gene ID conversion mapping.
- Return type:
Dict[str,List[str]]
- Parameters:
file_loc (str) – Directory containing the ID conversion file.
src_id_type (Literal['ENSG', 'ENSP', 'ENST', 'Entrez', 'Symbol']) – Source gene ID type.
dst_id_type (Literal['Entrez', 'ENSG', 'Name', 'Symbol']) – Destination gene ID type.
upper (bool) – If set to True, then convert all keys to upper case.
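The effect of `upper=True` on the returned mapping can be pictured with a small sketch. `upper_keys` is a hypothetical helper operating on an already-loaded dictionary; how the library handles keys that collide after upper-casing is not documented here, so this version simply lets the last one win.

```python
from typing import Dict, List


def upper_keys(mapping: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return a copy of ``mapping`` with every key converted to upper case.

    Keys that become identical after upper-casing overwrite each other
    (an assumption of this sketch, not documented library behavior).
    """
    return {key.upper(): values for key, values in mapping.items()}


# Usage with made-up illustrative data.
converted = upper_keys({"brca1": ["672"], "Tp53": ["7157"]})
```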
- geneplexus.util.load_genes_universe(file_loc, gsc, net_type)[source]
Load the gene universe for a given network and GSC.
- Return type:
ndarray
- Parameters:
file_loc (str) – Location of data files.
gsc (Literal['GO', 'DisGeNet']) – Gene set collection.
net_type (Literal['BioGRID', 'STRING', 'STRING-EXP', 'GIANT-TN']) – Network used.
- geneplexus.util.load_gsc(file_loc, gsc, net_type)[source]
Load gene set collection dictionary.
- Return type:
Dict[str,Dict[Literal['Name','Genes'],Union[str,ndarray]]]
- Parameters:
file_loc (str) – Location of data files.
gsc (Literal['GO', 'DisGeNet']) – Gene set collection.
net_type (Literal['BioGRID', 'STRING', 'STRING-EXP', 'GIANT-TN']) – Network used.
- geneplexus.util.load_node_order(file_loc, net_type)[source]
Load network genes.
- Return type:
ndarray
- Parameters:
file_loc (str) – Location of data files.
net_type (Literal['BioGRID', 'STRING', 'STRING-EXP', 'GIANT-TN']) – Network used.
- geneplexus.util.load_pretrained_weights(file_loc, target_set, net_type, features)[source]
Load pretrained model dictionary.
- Return type:
Dict[str,Dict[Literal['Name','Weights','PosGenes'],Union[str,ndarray]]]
- Parameters:
file_loc (str) – Location of data files.
target_set (Literal['GO', 'DisGeNet']) – Target gene set collection.
net_type (Literal['BioGRID', 'STRING', 'STRING-EXP', 'GIANT-TN']) – Network used.
features (Literal['Adjacency', 'Embedding', 'Influence']) – Type of features used.
- geneplexus.util.mapgene(gene, entrez_to_other)[source]
Map an Entrez gene ID to other representations.
- Parameters:
gene (str) – Entrez gene ID.
entrez_to_other (Dict[str, List[str]]) – Mapping from Entrez to list of other gene representations of interest.
- Returns:
Gene representation corresponding to the gene Entrez ID.
- Return type:
str
Note
Mapping from a single Entrez to multiple representations is allowed and the representations will be separated by ‘/’.
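The joining rule in the note can be sketched as follows. `mapgene_sketch` is a hypothetical stand-in that covers only the documented case where the Entrez ID is present in the mapping; what the library returns for an unmapped ID is not stated here, so this version lets the `KeyError` propagate. The gene data in the usage line is illustrative.

```python
from typing import Dict, List


def mapgene_sketch(gene: str, entrez_to_other: Dict[str, List[str]]) -> str:
    """Join all representations of ``gene`` with '/' (present-key case only)."""
    return "/".join(entrez_to_other[gene])


# Usage: one Entrez ID mapped to two representations.
label = mapgene_sketch("1", {"1": ["A1BG", "A1B"]})
```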
- geneplexus.util.normexpand(path, create=True)[source]
Normalize then expand the path and optionally create the directory.
- Return type:
str
- Parameters:
path (str) –
create (bool) –
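A minimal sketch of the described steps, using only the standard library. `normexpand_sketch` is a hypothetical name, and the specific calls (`os.path.normpath`, `os.path.expanduser`, `os.makedirs`) are assumptions about how "normalize", "expand", and "create dir" might be realized; the library may do more, such as making the path absolute.

```python
import os


def normexpand_sketch(path: str, create: bool = True) -> str:
    """Normalize ``path``, expand a leading '~', and optionally create the directory."""
    new_path = os.path.expanduser(os.path.normpath(path))
    if create:
        # exist_ok avoids an error when the directory is already there
        os.makedirs(new_path, exist_ok=True)
    return new_path
```

With `create=False` the function only returns the cleaned path and leaves the file system untouched.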