Functional Class Scoring
This module contains the functional class scoring methods implemented in PathwayForte.
Currently this includes GSEA and ssGSEA.
-
pathway_forte.pathway_enrichment.functional_class.create_cls_file(gene_expression_file, normal_sample_file, tumor_sample_file, data)
Create a categorical (e.g., tumor vs. normal) class file in .cls format for input into GSEA.
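For orientation, a categorical .cls file has three lines: a header with the number of samples, the number of classes, and the constant 1; a line starting with `#` that names the classes; and one class label per sample, in the same order as the expression columns. The following is a minimal sketch of that layout only. The helper names `format_cls` and `write_cls` are hypothetical and are not part of PathwayForte, and this is not necessarily how `create_cls_file` builds its output.

```python
from pathlib import Path


def format_cls(labels):
    """Render per-sample class labels as the text of a categorical .cls file.

    Hypothetical helper for illustration; not part of PathwayForte.
    """
    # Keep classes in first-seen order so the header matches the label line.
    classes = list(dict.fromkeys(labels))
    header = f"{len(labels)} {len(classes)} 1"
    class_line = "# " + " ".join(classes)
    label_line = " ".join(labels)
    return "\n".join([header, class_line, label_line]) + "\n"


def write_cls(labels, path):
    """Write the rendered .cls text to the given path."""
    Path(path).write_text(format_cls(labels))
```

For example, `format_cls(["normal", "normal", "tumor"])` yields a header of `3 2 1`, followed by `# normal tumor` and `normal normal tumor`.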
-
pathway_forte.pathway_enrichment.functional_class.run_gsea(gene_exp, gene_set, phenotype_class, permutations=500, output_dir='/home/docs/checkouts/readthedocs.org/user_builds/pathwayforte/checkouts/latest/data/results/gsea')
Run GSEA on a given dataset with a given gene set.
-
pathway_forte.pathway_enrichment.functional_class.filter_gsea_results(gsea_results_path, source, kegg_manager=None, reactome_manager=None, wikipathways_manager=None, p_value=None, absolute_nes_filter=None, geneset_set_filter_minimum_size=None, geneset_set_filter_maximum_size=None)
Get top and bottom rankings from GSEA results.
- Parameters
gsea_results_path (str) – path to GSEA results in .tsv file format
source –
kegg_manager (Optional[Manager]) – KEGG manager
reactome_manager (Optional[Manager]) – Reactome manager
wikipathways_manager (Optional[Manager]) – WikiPathways manager
absolute_nes_filter (Optional[float]) – filter by magnitude of normalized enrichment scores
geneset_set_filter_minimum_size (Optional[int]) – filter to include a minimum number of genes in a gene set
geneset_set_filter_maximum_size (Optional[int]) – filter to include a maximum number of genes in a gene set
- Return type
DataFrame
- Returns
list of pathways ranked as having the highest and lowest significant enrichment scores
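The optional filters described above can be pictured as a chain of independent threshold checks followed by a ranking on the normalized enrichment score (NES). The sketch below works on plain dictionaries rather than a DataFrame so that it has no dependencies. The function name `filter_results`, the row keys (`nes`, `pval`, `geneset_size`), and the choice of inclusive thresholds are all assumptions made for illustration, not the column names or behavior of `filter_gsea_results`.

```python
def filter_results(rows, p_value=None, absolute_nes_filter=None,
                   min_size=None, max_size=None):
    """Keep rows that pass every filter that was supplied, ranked by NES.

    Hypothetical helper; row keys and inclusive bounds are assumptions.
    """
    kept = []
    for row in rows:
        # A filter left as None is skipped entirely.
        if p_value is not None and row["pval"] > p_value:
            continue
        if absolute_nes_filter is not None and abs(row["nes"]) < absolute_nes_filter:
            continue
        if min_size is not None and row["geneset_size"] < min_size:
            continue
        if max_size is not None and row["geneset_size"] > max_size:
            continue
        kept.append(row)
    # Highest NES first, so the top and bottom of the list are the extremes.
    return sorted(kept, key=lambda row: row["nes"], reverse=True)


# Made-up rows with placeholder pathway names.
example_rows = [
    {"term": "pathway_a", "nes": 2.1, "pval": 0.01, "geneset_size": 40},
    {"term": "pathway_b", "nes": -1.8, "pval": 0.03, "geneset_size": 120},
    {"term": "pathway_c", "nes": 0.4, "pval": 0.60, "geneset_size": 25},
    {"term": "pathway_d", "nes": 1.9, "pval": 0.02, "geneset_size": 5},
]
ranked = filter_results(example_rows, p_value=0.05, absolute_nes_filter=1.0,
                        min_size=15, max_size=500)
```

With these made-up rows, `pathway_c` is dropped by the p-value and NES thresholds and `pathway_d` by the minimum gene set size, leaving `pathway_a` ranked above `pathway_b`.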
-
pathway_forte.pathway_enrichment.functional_class.merge_statistics(merged_pathways_df, dataset)
Get statistics for pathways included in the merged gene sets DataFrame.
These include the proportion of pathways from each of the other databases and the proportion of pathways deriving from two or more primary resources.
- Parameters
merged_pathways_df (DataFrame) – DataFrame containing pathways from multiple databases
- Returns
statistics of contents in merged dataset
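As a rough illustration of the two statistics described above, the sketch below computes the share of merged pathways that draw on each primary resource and the share that derive from two or more resources. It assumes that each merged pathway can be reduced to a list of the resources it derives from; that representation, the function name `resource_proportions`, and the placeholder pathway identifiers are assumptions for illustration and do not reflect the layout of the merged DataFrame.

```python
from collections import Counter


def resource_proportions(pathway_resources):
    """Return (per-resource proportions, proportion with >= 2 resources).

    pathway_resources maps a merged pathway identifier to the resources it
    derives from. Hypothetical helper; not part of PathwayForte.
    """
    total = len(pathway_resources)
    if total == 0:
        return {}, 0.0
    # Count each resource at most once per pathway.
    counts = Counter(
        resource
        for resources in pathway_resources.values()
        for resource in set(resources)
    )
    per_resource = {resource: n / total for resource, n in counts.items()}
    multi = sum(
        1 for resources in pathway_resources.values() if len(set(resources)) >= 2
    )
    return per_resource, multi / total


# Placeholder identifiers, not real pathway IDs.
merged = {
    "p1": ["kegg"],
    "p2": ["kegg", "reactome"],
    "p3": ["wikipathways"],
    "p4": ["reactome", "wikipathways", "kegg"],
}
per_resource, multi_share = resource_proportions(merged)
```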
-
pathway_forte.pathway_enrichment.functional_class.rearrange_df_columns(df)
Rearrange order of columns.
- Return type
DataFrame
-
pathway_forte.pathway_enrichment.functional_class.get_pathway_names(database, pathway_df, kegg_manager=None, reactome_manager=None, wikipathways_manager=None)
Get pathway names from database-specific pathway IDs.
-
pathway_forte.pathway_enrichment.functional_class.pathway_names_to_df(filtered_gsea_results_df, all_pathway_ids, source, kegg_manager=None, reactome_manager=None, wikipathways_manager=None)
Get pathway names.
- Return type
DataFrame
-
pathway_forte.pathway_enrichment.functional_class.gsea_results_to_filtered_df(dataset, kegg_manager=None, reactome_manager=None, wikipathways_manager=None, p_value=None, absolute_nes_filter=None, geneset_set_filter_minimum_size=None, geneset_set_filter_maximum_size=None)
Get filtered GSEA results DataFrames.
-
pathway_forte.pathway_enrichment.functional_class.get_pathways_by_resource(pathways, resource)
Return pathways by resource.
-
pathway_forte.pathway_enrichment.functional_class.get_analogs_comparison_numbers(kegg_reactome_pathway_df, reactome_wikipathways_pathway_df, wikipathways_kegg_pathway_df, *, pathway_column='pathway_id')
Get number of existing versus expected pairwise mappings.
-
pathway_forte.pathway_enrichment.functional_class.get_pairwise_mapping_numbers(kegg_pathway_df, reactome_pathway_df, wikipathways_pathway_df)
Get number of existing versus expected pairwise mappings.
-
pathway_forte.pathway_enrichment.functional_class.get_pairwise_mappings(kegg_pathway_df, reactome_pathway_df, wikipathways_pathway_df)
Get pairwise mappings.
-
pathway_forte.pathway_enrichment.functional_class.compare_database_results(df_1, resource_1, df_2, resource_2, mapping_dict, check_contradiction=False)
Compare pathways in the DataFrames of enrichment results to evaluate the concordance between similar pathways.
-
pathway_forte.pathway_enrichment.functional_class.get_matching_pairs(df_1, resource_1, df_2, resource_2, equivalent_mappings_dict)
Get equivalent pathways and their direction of change.
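One way to picture the comparison of equivalent pathways is to look up each mapped pair in the two result sets and compare the sign of their normalized enrichment scores. The sketch below is a simplified illustration under stated assumptions: results are reduced to plain dictionaries from pathway identifier to NES, mappings are given as a list of identifier pairs rather than the mapping dictionary the functions above accept, and the names `direction` and `matching_pairs` as well as the identifiers are hypothetical placeholders.

```python
def direction(nes):
    """Return 1 for positive enrichment, -1 for negative, 0 for exactly zero."""
    return (nes > 0) - (nes < 0)


def matching_pairs(nes_1, nes_2, equivalent_mappings):
    """List mapped pathway pairs present in both results with their directions.

    Each output tuple is (id_1, id_2, direction_1, direction_2, concordant).
    Hypothetical helper; not part of PathwayForte.
    """
    pairs = []
    for id_1, id_2 in equivalent_mappings:
        # Only pairs found in both result sets can be compared.
        if id_1 in nes_1 and id_2 in nes_2:
            dir_1 = direction(nes_1[id_1])
            dir_2 = direction(nes_2[id_2])
            pairs.append((id_1, id_2, dir_1, dir_2, dir_1 == dir_2))
    return pairs


# Placeholder identifiers and made-up scores.
results_a = {"a:1": 1.7, "a:2": -2.0, "a:3": 1.2}
results_b = {"b:1": 2.2, "b:2": 1.5}
mappings = [("a:1", "b:1"), ("a:2", "b:2"), ("a:3", "b:3")]
pairs = matching_pairs(results_a, results_b, mappings)
```

In this made-up example the first pair agrees in direction, the second pair is contradictory, and the third is skipped because `b:3` is absent from the second result set.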
-
pathway_forte.pathway_enrichment.functional_class.run_ssgsea(filtered_expression_data, gene_set, output_dir='/home/docs/checkouts/readthedocs.org/user_builds/pathwayforte/checkouts/latest/data/results/ssgsea', processes=1, max_size=3000, min_size=15)
Run single-sample GSEA (ssGSEA) on a filtered gene expression data set.
-
pathway_forte.pathway_enrichment.functional_class.filter_gene_exp_data(expression_data, gmt_file)
Filter gene expression data file to include only gene names which are found in the gene set files.
- Parameters
expression_data (DataFrame) – gene expression values for samples
gmt_file (str) – .gmt file containing gene sets
- Returns
Filtered gene expression data with genes with no correspondences in gene sets removed
- Return type
pandas.core.frame.DataFrame
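The filtering step can be sketched in two parts: collect every gene that appears in the .gmt file, then keep only the expression rows for those genes. In a .gmt file each line is tab-separated, with the gene set name first, a description second, and the member genes after that. The sketch below uses a plain dictionary in place of the expression DataFrame so that it stays dependency-free; the names `read_gmt_genes` and `filter_expression`, the gene set names, and the gene symbols are hypothetical placeholders, and this is not the implementation of `filter_gene_exp_data`.

```python
def read_gmt_genes(lines):
    """Collect all genes named in an iterable of .gmt lines.

    Accepts an open file handle or a list of strings.
    Hypothetical helper; not part of PathwayForte.
    """
    genes = set()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Fields 0 and 1 are the gene set name and description.
        genes.update(gene for gene in fields[2:] if gene)
    return genes


def filter_expression(expression, genes):
    """Keep only the genes that occur in at least one gene set."""
    return {gene: values for gene, values in expression.items() if gene in genes}


# Placeholder gene sets and expression values.
gmt_lines = [
    "set_one\tdescription\tGENE_A\tGENE_B\n",
    "set_two\tdescription\tGENE_B\tGENE_C\n",
]
expression = {
    "GENE_A": [1.0, 2.0],
    "GENE_C": [0.5, 0.7],
    "GENE_X": [9.0, 9.5],
}
filtered = filter_expression(expression, read_gmt_genes(gmt_lines))
```

Here `GENE_X` is removed because it does not occur in either placeholder gene set.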