geneplexus.custom

Helper functions for setting up custom networks and GSCs.

geneplexus.custom.edgelist_to_matrix(edgelist_loc, data_dir, net_name, features, alpha=0.85, sep='\t', skiplines=0)

Convert edgelist to an adjacency matrix or influence matrix.

Note

The NodeOrder file must be a single-column text file. If you are not supplying a custom GSC, the IDs in the file must be in Entrez ID space.

Parameters:
  • edgelist_loc (str) – Location of the edgelist

  • data_dir (str) – The directory to save the file

  • net_name (str) – The name of the network

  • features (str) – Features to generate for the network (Adjacency, Influence, or All)

  • alpha (float) – Restart parameter (default 0.85).

  • sep (str) – The separator used in the edgelist file (default tab)

  • skiplines (int) – The number of header lines to skip (default 0)
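To make the two feature types concrete, the following is a minimal, self-contained sketch of how an edgelist can be turned into an adjacency matrix and how one column of an influence matrix can be obtained with a random walk with restart. It does not call geneplexus and is not its implementation: the helper names (edgelist_to_adjacency, influence_column), the symmetric treatment of edges, and the column normalization are all assumptions made for illustration. Only the role of alpha as the restart parameter comes from the documentation above.

```python
def edgelist_to_adjacency(edges):
    """Build a symmetric adjacency matrix (list of lists) from
    (node_a, node_b, weight) tuples. Hypothetical helper, not geneplexus."""
    nodes = sorted({n for a, b, _ in edges for n in (a, b)})
    index = {node: i for i, node in enumerate(nodes)}
    n = len(nodes)
    adj = [[0.0] * n for _ in range(n)]
    for a, b, w in edges:
        adj[index[a]][index[b]] = w
        adj[index[b]][index[a]] = w  # assumption: undirected network
    return nodes, adj


def influence_column(adj, seed, alpha=0.85, tol=1e-12, max_iter=1000):
    """Iterate p = alpha * e_seed + (1 - alpha) * W p until convergence,
    where W is the column-normalized adjacency matrix (an assumption)."""
    n = len(adj)
    # Column sums; fall back to 1.0 so isolated nodes do not divide by zero.
    col_sums = [sum(adj[i][j] for i in range(n)) or 1.0 for j in range(n)]
    p = [1.0 if i == seed else 0.0 for i in range(n)]
    for _ in range(max_iter):
        new_p = [
            (alpha if i == seed else 0.0)
            + (1 - alpha) * sum(adj[i][j] / col_sums[j] * p[j] for j in range(n))
            for i in range(n)
        ]
        if max(abs(x - y) for x, y in zip(new_p, p)) < tol:
            return new_p
        p = new_p
    return p


edges = [("1", "2", 1.0), ("2", "3", 0.5)]
nodes, adj = edgelist_to_adjacency(edges)
column = influence_column(adj, seed=0, alpha=0.85)
```

With a large restart parameter such as 0.85, most of the probability mass stays on the seed node, and the remainder spreads to its neighbors in proportion to edge weight.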

geneplexus.custom.edgelist_to_nodeorder(edgelist_loc, data_dir, net_name, sep='\t', skiplines=0)

Convert edgelist to node order.

The node order (NodeOrder) file is used to map gene IDs to rows in the data representation matrix.

Parameters:
  • edgelist_loc (str) – Location of the edgelist

  • data_dir (str) – The directory to save the file

  • net_name (str) – The name of the network

  • sep (str) – The separator used in the edgelist file (default tab)

  • skiplines (int) – The number of header lines to skip (default 0)
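As an illustration of what a node order looks like, the sketch below reads a small in-memory edgelist, skips one header line, and collects the unique node IDs into single-column text. It is a standalone example rather than the library's code: the helper name read_edgelist_nodes is hypothetical, and both the first-seen ordering and the assumption that node IDs occupy the first two columns are choices made here, not documented behavior.

```python
import io


def read_edgelist_nodes(handle, sep="\t", skiplines=0):
    """Return unique node IDs from the first two columns of an edgelist,
    in first-seen order. Hypothetical helper, not part of geneplexus."""
    seen = {}  # dicts preserve insertion order, so this acts as an ordered set
    for line_no, line in enumerate(handle):
        if line_no < skiplines or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.rstrip("\n").split(sep)
        for node in fields[:2]:
            seen.setdefault(node, None)
    return list(seen)


# A tab-separated edgelist with one header line and an edge-weight column.
edgelist = io.StringIO("gene_a\tgene_b\tweight\n5\t7\t0.9\n7\t11\t0.4\n")
nodes = read_edgelist_nodes(edgelist, sep="\t", skiplines=1)

# One ID per line, matching the single-column layout described for NodeOrder.
node_order_text = "\n".join(nodes) + "\n"
```

Each line of the resulting text corresponds to one row of the data representation matrix, which is why the order must stay fixed once the matrix has been built.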

geneplexus.custom.subset_gsc_to_network(data_dir, net_name, gsc_name, max_size=200, min_size=10)

Subset GSC to only include genes in the network.

Note

Use the geneplexus.download.download_select_data() function to get the preprocessed GO and DisGeNet files first.

Parameters:
  • data_dir (str) – The directory to save the file

  • net_name (str) – The name of the network

  • gsc_name (str) – The name of the GSC

  • max_size (int) – Maximum gene set size (default 200).

  • min_size (int) – Minimum gene set size (default 10).
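The subsetting step can be pictured with the short sketch below, which intersects each gene set with the network's genes and then applies the size limits. It is an assumption-laden illustration, not the geneplexus implementation: the function name subset_gsc, the dictionary layout of the GSC, the choice to apply the size filter after the intersection, and the inclusive bounds are all hypothetical.

```python
def subset_gsc(gsc, network_genes, max_size=200, min_size=10):
    """Keep only genes present in the network, then drop gene sets whose
    remaining size falls outside [min_size, max_size] (bounds assumed inclusive).
    Hypothetical helper, not part of geneplexus."""
    network_genes = set(network_genes)
    subset = {}
    for term, genes in gsc.items():
        kept = sorted(set(genes) & network_genes)
        if min_size <= len(kept) <= max_size:
            subset[term] = kept
    return subset


# Toy data with small limits so the filtering is easy to follow.
gsc = {
    "term_a": ["1", "2", "3", "99"],      # 3 genes remain after intersection
    "term_b": ["1", "98"],                # 1 gene remains: below min_size
    "term_c": ["1", "2", "3", "4", "5"],  # 5 genes remain: above max_size
}
network_genes = ["1", "2", "3", "4", "5"]
result = subset_gsc(gsc, network_genes, max_size=4, min_size=2)
```

In this toy example only term_a survives, because it is the only set whose network-restricted size lies between the two limits.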