moDiNA

Submodules

modina.context_net_inference module

modina.context_net_inference.calculate_association_scores(ord_data, nom_data, cont_data, bi_data, test_type='nonparametric', num_workers=1, nan_value=-89.0, correction='bh')

Return type:

DataFrame

modina.context_net_inference.compute_context_scores(context_data, meta_file, test_type='nonparametric', correction='bh', num_workers=1, path=None, nan_value=None, name='context1')

Compute association scores for a given context.

Parameters:

context_data (DataFrame) – Raw context data (rows: samples, columns: variables).
meta_file (DataFrame) – Metadata file containing a ‘label’ and ‘type’ column to specify the data type of each variable.
test_type (str) – Type of tests to use for network inference. Defaults to ‘nonparametric’.
correction (str) – Correction method for multiple testing. Defaults to ‘bh’.
num_workers (int) – Number of workers for parallel processing. Defaults to 1.
path (Optional[str]) – Optional path to save the computed scores as a CSV file. Defaults to None.
nan_value (Optional[float]) – Numerical value used for NaN values in the context data. If None, an error will be raised if such values are present. Defaults to None.
name (str) – Name of the context. Used for saving files. Defaults to ‘context1’.

Return type:

DataFrame

Returns:

A pd.DataFrame containing the computed association scores.
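The correction parameter defaults to ‘bh’. Assuming this stands for the Benjamini-Hochberg procedure (the source does not spell it out), the adjustment can be sketched in plain Python as follows. The helper name bh_adjust is hypothetical, and this is an illustration of the general procedure rather than moDiNA's own implementation.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Hypothetical helper for illustration; not part of moDiNA.
    """
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

For example, bh_adjust([0.01, 0.04, 0.03, 0.005]) gives approximately [0.02, 0.04, 0.04, 0.02].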

modina.context_net_inference.napy_bi_cont(cont_phenotypes, bi_phenotypes, test='nonparametric', num_workers=8, nan_value=-89.0)

modina.context_net_inference.napy_bi_nom(nom_phenotypes, bi_phenotypes, num_workers=8, nan_value=-89.0)

modina.context_net_inference.napy_bi_ord(ord_phenotypes, bi_phenotypes, num_workers=8, nan_value=-89.0)

modina.context_net_inference.napy_cont_cont(cont_phenotypes, test='nonparametric', num_workers=8, nan_value=-89.0)

modina.context_net_inference.napy_nom_cont(cont_phenotypes, nom_phenotypes, test='nonparametric', num_workers=8, nan_value=-89.0)

modina.context_net_inference.napy_ord_cont(cont_phenotypes, ord_phenotypes, num_workers=8, nan_value=-89.0)

modina.context_net_inference.napy_ord_nom(ord_phenotypes, nom_phenotypes, num_workers=8, nan_value=-89.0)

modina.context_simulation module

modina.context_simulation.save_gt(groundtruths, path, mode='node')

modina.context_simulation.simulate_copula(path=None, name1='context1', name2='context2', n_bi=50, n_cont=50, n_cat=50, n_samples=500, n_shift_cont=0, n_shift_bi=0, n_shift_cat=0, n_corr_cont_cont=0, n_corr_bi_bi=0, n_corr_cat_cat=0, n_corr_bi_cont=0, n_corr_bi_cat=0, n_corr_cont_cat=0, n_both_cont_cont=0, n_both_bi_bi=0, n_both_cat_cat=0, n_both_bi_cont=0, n_both_bi_cat=0, n_both_cont_cat=0, shift=0.5, corr=0.7)

Simulate two contexts with binary, continuous and categorical nodes using a Gaussian copula.

Parameters:

path – Path to save the simulated contexts, the meta file and the ground truth information. If None, files are not saved.
name1 – Name of the first context.
name2 – Name of the second context.
n_bi – Number of binary nodes to simulate.
n_cont – Number of continuous nodes to simulate.
n_cat – Number of categorical nodes to simulate.
n_samples – Number of samples per context.
n_shift_cont – Number of continuous nodes with an artificially introduced mean shift.
n_shift_bi – Number of binary nodes with an artificially introduced mean shift.
n_shift_cat – Number of categorical nodes with an artificially introduced mean shift.
n_corr_cont_cont – Number of continuous node pairs with an artificially introduced correlation difference.
n_corr_bi_bi – Number of binary node pairs with an artificially introduced correlation difference.
n_corr_cat_cat – Number of categorical node pairs with an artificially introduced correlation difference.
n_corr_bi_cat – Number of binary-categorical node pairs with an artificially introduced correlation difference.
n_corr_cont_cat – Number of continuous-categorical node pairs with an artificially introduced correlation difference.
n_corr_bi_cont – Number of binary-continuous node pairs with an artificially introduced correlation difference.
n_both_cont_cont – Number of continuous node pairs with both an artificially introduced mean shift and correlation difference.
n_both_bi_bi – Number of binary node pairs with both an artificially introduced mean shift and correlation difference.
n_both_cat_cat – Number of categorical node pairs with both an artificially introduced mean shift and correlation difference.
n_both_bi_cat – Number of binary-categorical node pairs with both an artificially introduced mean shift and correlation difference.
n_both_cont_cat – Number of continuous-categorical node pairs with both an artificially introduced mean shift and correlation difference.
n_both_bi_cont – Number of binary-continuous node pairs with both an artificially introduced mean shift and correlation difference.
shift – Magnitude of the mean shift.
corr – Magnitude of the correlation difference (measured as correlation coefficient between 0 and 1).

Returns:

A tuple containing the two simulated contexts, a meta file and the ground truth nodes:
- context1: pd.DataFrame of the first simulated context.
- context2: pd.DataFrame of the second simulated context.
- meta: pd.DataFrame containing the data type for each simulated variable.
- ground_truth: A tuple containing three lists of ground truth nodes: (shift_nodes, corr_nodes, shift_corr_nodes).
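To make the Gaussian copula idea concrete, here is a minimal sketch using only the Python standard library. It draws one continuous and one binary variable whose dependence comes from correlated normal latents, and adds an optional mean shift to the continuous variable. The function name simulate_context, the exponential and Bernoulli(0.5) marginals, and the way the shift is applied are all assumptions made for illustration; simulate_copula itself may work differently.

```python
import math
import random
from statistics import NormalDist


def simulate_context(n_samples=500, corr=0.7, shift=0.0, seed=0):
    """Draw one continuous and one binary variable coupled by a Gaussian copula.

    Hypothetical helper for illustration; not part of moDiNA.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    continuous, binary = [], []
    for _ in range(n_samples):
        # Step 1: two standard-normal latents with correlation `corr`.
        z1 = rng.gauss(0.0, 1.0)
        z2 = corr * z1 + math.sqrt(1.0 - corr ** 2) * rng.gauss(0.0, 1.0)
        # Step 2: map the latents to uniforms with the normal CDF.
        u1 = min(std_normal.cdf(z1), 1.0 - 1e-12)  # guard against log(0)
        u2 = std_normal.cdf(z2)
        # Step 3: apply the chosen marginals via inverse CDFs.
        continuous.append(-math.log1p(-u1) + shift)  # Exponential(1) plus shift
        binary.append(1 if u2 > 0.5 else 0)          # Bernoulli(0.5)
    return continuous, binary
```

Calling simulate_context(shift=0.0, seed=1) and simulate_context(shift=0.5, seed=2) would then give two contexts that differ by a mean shift of 0.5 in the continuous variable.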

modina.diff_net_construction module

modina.diff_net_construction.compute_diff_edges(scores1, scores2, edge_metric, max_path_length=2, path=None)

Compute differential edge scores based on the specified edge metric.

Parameters:

scores1 (DataFrame) – Statistical association scores of Context 1, rescaled and potentially filtered.
scores2 (DataFrame) – Statistical association scores of Context 2, rescaled and potentially filtered.
edge_metric (str) – Edge metric to compute the differential edge scores.
max_path_length (int) – Maximum length of paths to consider in the computation of integrated interaction scores. Defaults to 2.
path (Optional[str]) – Optional path to save the differential edge scores as a CSV file. Defaults to None.

Return type:

DataFrame | None | Series

Returns:

A DataFrame containing the computed differential edge scores.

modina.diff_net_construction.compute_diff_network(scores1, scores2, context1, context2, edge_metric=None, node_metric=None, max_path_length=2, correction='bh', num_workers=1, path=None, format='csv', meta_file=None, test_type='nonparametric', nan_value=None)

Computation of a differential network defined by a node metric and an edge metric.

Parameters:

scores1 (DataFrame) – Statistical association scores of Context 1, rescaled and potentially filtered.
scores2 (DataFrame) – Statistical association scores of Context 2, rescaled and potentially filtered.
context1 (DataFrame) – Observed data of Context 1, potentially filtered.
context2 (DataFrame) – Observed data of Context 2, potentially filtered.
edge_metric (Optional[str]) – Edge metric used to construct the differential network.
node_metric (Optional[str]) – Node metric used to construct the differential network.
max_path_length (int) – Maximum length of paths to consider in the computation of integrated interaction scores. Defaults to 2.
correction (str) – Correction method for multiple testing. Defaults to ‘bh’.
num_workers (int) – Number of workers for parallel computation of STC. Defaults to 1.
path (Optional[str]) – Optional path to save the differential scores as CSV files. Defaults to None.
format (str) – File format to save the differential network. Options are ‘csv’ and ‘graphml’. Defaults to ‘csv’.
meta_file (Optional[DataFrame]) – Meta file containing the node types. Only needed if node_metric is ‘STC’. Defaults to None.
test_type (str) – Test type to use for continuous nodes in STC metric. Defaults to ‘nonparametric’.
nan_value (Optional[int]) – Numerical value used for NaN values in the context data. If None, an error will be raised if such values are present. Defaults to None.

Return type:

Tuple[Series | DataFrame | None, Series | DataFrame | None]

Returns:

A tuple (edges_diff, nodes_diff) containing the computed differential edges and nodes.

modina.diff_net_construction.compute_diff_nodes(scores1, scores2, context1, context2, node_metric, correction='bh', meta_file=None, test_type='nonparametric', nan_value=None, num_workers=1, path=None)

Compute differential node scores based on the specified node metric.

Parameters:

scores1 (DataFrame) – Statistical association scores of Context 1, rescaled and potentially filtered.
scores2 (DataFrame) – Statistical association scores of Context 2, rescaled and potentially filtered.
context1 (DataFrame) – Observed data of Context 1, potentially filtered.
context2 (DataFrame) – Observed data of Context 2, potentially filtered.
node_metric (str) – Node metric to compute the differential node scores.
correction (str) – Correction method for multiple testing. Only needed if node_metric is ‘STC’. Defaults to ‘bh’.
meta_file (Optional[DataFrame]) – Meta file containing the node types. Only needed if node_metric is ‘STC’. Defaults to None.
test_type (str) – Test type to compare continuous variables across contexts for the ‘STC’ node metric. Defaults to ‘nonparametric’.
nan_value (Optional[int]) – Numerical value used for NaN values in the context data. If None, an error will be raised if such values are present. Defaults to None.
num_workers (int) – Number of workers for parallel computation of STC. Only needed if node_metric is ‘STC’. Defaults to 1.
path (Optional[str]) – Optional path to save the differential node scores as a CSV file. Defaults to None.

Return type:

DataFrame | None | Series

Returns:

A DataFrame containing the computed differential node scores.

modina.diff_net_construction.degree_centrality(nodes_diff, scores1, scores2, metric='DC-P')

modina.diff_net_construction.interaction_score(data, max_path_length=3, metric='std-E')

modina.diff_net_construction.pagerank_centrality(nodes_diff, scores1, scores2, metric='PRC-P')

modina.diff_net_construction.stat_test_centrality(context1, context2, meta_file, test_type='nonparametric', correction='bh', nan_value=None, num_workers=1)

modina.edge_filtering module

modina.edge_filtering.filter(scores1, scores2, context1, context2, filter_method=None, filter_param=0.0, filter_metric=None, filter_rule=None, path=None)

Filter association scores and context data based on the specified filtering configurations.

Parameters:

scores1 (DataFrame) – Statistical association scores of Context 1.
scores2 (DataFrame) – Statistical association scores of Context 2.
context1 (DataFrame) – The first context for the differential network analysis.
context2 (DataFrame) – The second context for the differential network analysis.
filter_method (Optional[str]) – Method used for filtering. Defaults to None.
filter_param (float) – Parameter for the specified filtering method. Defaults to 0.0.
filter_metric (Optional[str]) – Edge metric used for filtering. Options include ‘raw-P’ and ‘std-E’. Defaults to None.
filter_rule (Optional[str]) – Rule to integrate the networks during filtering. Defaults to None.
path (Optional[str]) – Optional path to save the filtered scores and context data as CSV files. Defaults to None.

Return type:

Tuple[DataFrame, DataFrame, DataFrame, DataFrame]

Returns:

A tuple containing the filtered scores and context data.

modina.pipeline module

modina.pipeline.diffnet_analysis(context1, context2, meta_file, edge_metric=None, node_metric=None, ranking_alg='PageRank+', filter_method=None, filter_param=0.0, filter_metric=None, filter_rule=None, max_path_length=2, test_type='nonparametric', nan_value=None, correction='bh', num_workers=1, project_path=None, name1='context1', name2='context2')

Wrapper function to perform an end-to-end differential network analysis following the moDiNA pipeline.

Parameters:

context1 (DataFrame) – Observed data of Context 1 (rows: samples, columns: variables).
context2 (DataFrame) – Observed data of Context 2 (rows: samples, columns: variables).
meta_file (DataFrame) – Metadata file containing a ‘label’ and ‘type’ column to specify the data type of each variable.
test_type (str) – Type of statistical tests to use for association score calculation. Defaults to ‘nonparametric’.
nan_value (Optional[int]) – Numerical value used for NaN values in the context data. If None, an error will be raised if such values are present. Defaults to None.
correction (str) – Correction method for multiple testing. Defaults to ‘bh’.
num_workers (int) – Number of workers for parallel processing. Defaults to 1.
filter_method (Optional[str]) – Method used for filtering. Defaults to None.
filter_param (float) – Parameter for the specified filtering method. Defaults to 0.0.
filter_metric (Optional[str]) – Edge metric used for filtering. Defaults to None.
filter_rule (Optional[str]) – Rule to integrate the networks during filtering. Defaults to None.
edge_metric (Optional[str]) – Edge metric used to construct the differential network.
node_metric (Optional[str]) – Node metric used to construct the differential network.
max_path_length (int) – Maximum length of paths to consider in the computation of integrated interaction scores. Defaults to 2.
ranking_alg (str) – Ranking algorithm to compute. Options are ‘PageRank+’, ‘PageRank’, ‘absDimontRank’, ‘DimontRank’, ‘nodeRank’ and ‘edgeRank’. Defaults to ‘PageRank+’.
name1 (str) – Name of Context 1. Used for saving files. Defaults to ‘context1’.
name2 (str) – Name of Context 2. Used for saving files. Defaults to ‘context2’.
project_path (Optional[str]) – Optional path to save results. Defaults to None.

Returns:

A tuple (ranking, edges_diff, nodes_diff, config) containing the computed ranking, differential edges, differential nodes, and configuration parameters.
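A call to the wrapper might look like the sketch below. It uses only keyword names and defaults from the signature above and unpacks the documented 4-tuple. The wrapper name run_diffnet is hypothetical, the module is passed in as an argument so that the sketch runs without moDiNA installed, and the metric strings are placeholders because valid values for edge_metric and node_metric are not listed here.

```python
from types import SimpleNamespace


def run_diffnet(modina_module, context1, context2, meta_file, edge_metric, node_metric):
    """Call diffnet_analysis and unpack the documented 4-tuple.

    Hypothetical convenience wrapper; not part of moDiNA.
    """
    ranking, edges_diff, nodes_diff, config = modina_module.diffnet_analysis(
        context1,
        context2,
        meta_file,
        edge_metric=edge_metric,
        node_metric=node_metric,
        ranking_alg="PageRank+",    # documented default
        test_type="nonparametric",  # documented default
        correction="bh",            # documented default
        num_workers=1,
        project_path=None,          # documented default; no save path given
    )
    return {
        "ranking": ranking,
        "edges_diff": edges_diff,
        "nodes_diff": nodes_diff,
        "config": config,
    }


# Stand-in for the real package (assumption: normally obtained via `import modina`).
stub = SimpleNamespace(
    diffnet_analysis=lambda c1, c2, meta, **kwargs: ("ranking", "edges", "nodes", kwargs)
)
result = run_diffnet(
    stub,
    context1=None,
    context2=None,
    meta_file=None,
    edge_metric="<edge metric>",  # placeholder, not a real metric name
    node_metric="<node metric>",  # placeholder, not a real metric name
)
```

With the real package, the stub would be replaced by the imported module and the None arguments by the two context DataFrames and the metadata DataFrame.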

modina.ranking module

modina.ranking.compute_ranking(nodes_diff, edges_diff, ranking_alg, path=None, meta_file=None)

Compute a ranking based on the specified ranking algorithm.

Parameters:

nodes_diff (Union[DataFrame, Series, None]) – Differential node scores.
edges_diff (Union[DataFrame, Series, None]) – Differential edge scores.
ranking_alg (str) – Ranking algorithm to compute. Options are ‘PageRank+’, ‘PageRank’, ‘absDimontRank’, ‘DimontRank’, ‘nodeRank’ and ‘edgeRank’.
meta_file (Optional[DataFrame]) – Metadata file containing a ‘label’ and ‘type’ column to specify the data type of each variable.
path (Optional[str]) – Optional path to save the ranking as a CSV file.

Return type:

DataFrame | Series

Returns:

A tuple containing the list of ranked nodes and a dictionary with ranked nodes per data type.
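The ranking options include ‘PageRank’ and ‘PageRank+’, and modina.ranking.pagerank exposes a personalization flag. One plausible reading, stated here as an assumption, is that differential edge scores act as edge weights and differential node scores as the personalization (teleport) vector. The sketch below shows generic personalized PageRank by power iteration on an undirected weighted graph; the function name and input layout are hypothetical and it is not moDiNA's implementation.

```python
def personalized_pagerank(edge_weights, node_weights=None, damping=0.85, n_iter=100):
    """Rank nodes of an undirected weighted graph by (personalized) PageRank.

    edge_weights: dict mapping (u, v) pairs to non-negative weights.
    node_weights: optional dict of non-negative teleport weights per node.
    Hypothetical helper for illustration; not part of moDiNA.
    """
    nodes = sorted({n for edge in edge_weights for n in edge})
    # Build a symmetric neighbor map from the edge list.
    neighbors = {n: {} for n in nodes}
    for (u, v), w in edge_weights.items():
        neighbors[u][v] = neighbors[u].get(v, 0.0) + w
        neighbors[v][u] = neighbors[v].get(u, 0.0) + w
    # Teleport distribution: uniform, or proportional to the node weights.
    if node_weights is None:
        teleport = {n: 1.0 / len(nodes) for n in nodes}
    else:
        total = sum(node_weights.get(n, 0.0) for n in nodes)
        teleport = {n: node_weights.get(n, 0.0) / total for n in nodes}
    rank = dict(teleport)
    for _ in range(n_iter):
        new_rank = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for u in nodes:
            out_total = sum(neighbors[u].values())
            if out_total == 0.0:
                # Node with only zero-weight edges: spread its mass by teleport.
                for n in nodes:
                    new_rank[n] += damping * rank[u] * teleport[n]
                continue
            for v, w in neighbors[u].items():
                new_rank[v] += damping * rank[u] * w / out_total
        rank = new_rank
    # Highest score first.
    return sorted(rank.items(), key=lambda item: item[1], reverse=True)
```

On a three-node path a-b-c with unit weights, the middle node b ranks first; putting most of the teleport weight on a raises a's score without changing the edges.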

modina.ranking.dimontrank(edges_diff, edge_metric, mode='abs')

modina.ranking.pagerank(edges_diff, edge_metric, nodes_diff=None, node_metric=None, personalization=True)

modina.statistics_utils module

modina.statistics_utils.cohens_d_to_r(scores1, scores2, n1, n2)

modina.statistics_utils.probit_rescaling(scores1, scores2, metric='probit-E')

modina.statistics_utils.std_rescaling(scores1, scores2, metric='std-E')
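These utilities carry no description here. As background only, the sketch below shows the standard textbook conversion from Cohen's d to a correlation coefficient r for two groups of sizes n1 and n2, and the probit transform (inverse standard-normal CDF). The scalar helper names are hypothetical, and whether cohens_d_to_r and probit_rescaling apply exactly these formulas to the score tables is an assumption.

```python
import math
from statistics import NormalDist


def d_to_r(d, n1, n2):
    """Convert Cohen's d to r using r = d / sqrt(d^2 + a).

    The correction factor a = (n1 + n2)^2 / (n1 * n2) equals 4 for equal
    group sizes. Hypothetical helper; not part of moDiNA.
    """
    a = (n1 + n2) ** 2 / (n1 * n2)
    return d / math.sqrt(d ** 2 + a)


def probit(p):
    """Map a probability in (0, 1) to the standard-normal quantile."""
    return NormalDist().inv_cdf(p)
```

With equal group sizes, d_to_r(1.0, 50, 50) equals 1 / sqrt(5), roughly 0.447.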

Module contents

modina.cohens_d_to_r(scores1, scores2, n1, n2)

modina.compute_context_scores(context_data, meta_file, test_type='nonparametric', correction='bh', num_workers=1, path=None, nan_value=None, name='context1')

Compute association scores for a given context.

Parameters:

context_data (DataFrame) – Raw context data (rows: samples, columns: variables).
meta_file (DataFrame) – Metadata file containing a ‘label’ and ‘type’ column to specify the data type of each variable.
test_type (str) – Type of tests to use for network inference. Defaults to ‘nonparametric’.
correction (str) – Correction method for multiple testing. Defaults to ‘bh’.
num_workers (int) – Number of workers for parallel processing. Defaults to 1.
path (Optional[str]) – Optional path to save the computed scores as a CSV file. Defaults to None.
nan_value (Optional[float]) – Numerical value used for NaN values in the context data. If None, an error will be raised if such values are present. Defaults to None.
name (str) – Name of the context. Used for saving files. Defaults to ‘context1’.

Return type:

DataFrame

Returns:

A pd.DataFrame containing the computed association scores.

modina.compute_diff_edges(scores1, scores2, edge_metric, max_path_length=2, path=None)

Compute differential edge scores based on the specified edge metric.

Parameters:

scores1 (DataFrame) – Statistical association scores of Context 1, rescaled and potentially filtered.
scores2 (DataFrame) – Statistical association scores of Context 2, rescaled and potentially filtered.
edge_metric (str) – Edge metric to compute the differential edge scores.
max_path_length (int) – Maximum length of paths to consider in the computation of integrated interaction scores. Defaults to 2.
path (Optional[str]) – Optional path to save the differential edge scores as a CSV file. Defaults to None.

Return type:

DataFrame | None | Series

Returns:

A DataFrame containing the computed differential edge scores.

modina.compute_diff_network(scores1, scores2, context1, context2, edge_metric=None, node_metric=None, max_path_length=2, correction='bh', num_workers=1, path=None, format='csv', meta_file=None, test_type='nonparametric', nan_value=None)

Computation of a differential network defined by a node metric and an edge metric.

Parameters:

scores1 (DataFrame) – Statistical association scores of Context 1, rescaled and potentially filtered.
scores2 (DataFrame) – Statistical association scores of Context 2, rescaled and potentially filtered.
context1 (DataFrame) – Observed data of Context 1, potentially filtered.
context2 (DataFrame) – Observed data of Context 2, potentially filtered.
edge_metric (Optional[str]) – Edge metric used to construct the differential network.
node_metric (Optional[str]) – Node metric used to construct the differential network.
max_path_length (int) – Maximum length of paths to consider in the computation of integrated interaction scores. Defaults to 2.
correction (str) – Correction method for multiple testing. Defaults to ‘bh’.
num_workers (int) – Number of workers for parallel computation of STC. Defaults to 1.
path (Optional[str]) – Optional path to save the differential scores as CSV files. Defaults to None.
format (str) – File format to save the differential network. Options are ‘csv’ and ‘graphml’. Defaults to ‘csv’.
meta_file (Optional[DataFrame]) – Meta file containing the node types. Only needed if node_metric is ‘STC’. Defaults to None.
test_type (str) – Test type to use for continuous nodes in STC metric. Defaults to ‘nonparametric’.
nan_value (Optional[int]) – Numerical value used for NaN values in the context data. If None, an error will be raised if such values are present. Defaults to None.

Return type:

Tuple[Series | DataFrame | None, Series | DataFrame | None]

Returns:

A tuple (edges_diff, nodes_diff) containing the computed differential edges and nodes.

modina.compute_diff_nodes(scores1, scores2, context1, context2, node_metric, correction='bh', meta_file=None, test_type='nonparametric', nan_value=None, num_workers=1, path=None)

Compute differential node scores based on the specified node metric.

Parameters:

scores1 (DataFrame) – Statistical association scores of Context 1, rescaled and potentially filtered.
scores2 (DataFrame) – Statistical association scores of Context 2, rescaled and potentially filtered.
context1 (DataFrame) – Observed data of Context 1, potentially filtered.
context2 (DataFrame) – Observed data of Context 2, potentially filtered.
node_metric (str) – Node metric to compute the differential node scores.
correction (str) – Correction method for multiple testing. Only needed if node_metric is ‘STC’. Defaults to ‘bh’.
meta_file (Optional[DataFrame]) – Meta file containing the node types. Only needed if node_metric is ‘STC’. Defaults to None.
test_type (str) – Test type to compare continuous variables across contexts for the ‘STC’ node metric. Defaults to ‘nonparametric’.
nan_value (Optional[int]) – Numerical value used for NaN values in the context data. If None, an error will be raised if such values are present. Defaults to None.
num_workers (int) – Number of workers for parallel computation of STC. Only needed if node_metric is ‘STC’. Defaults to 1.
path (Optional[str]) – Optional path to save the differential node scores as a CSV file. Defaults to None.

Return type:

DataFrame | None | Series

Returns:

A DataFrame containing the computed differential node scores.

modina.compute_ranking(nodes_diff, edges_diff, ranking_alg, path=None, meta_file=None)

Compute a ranking based on the specified ranking algorithm.

Parameters:

nodes_diff (Union[DataFrame, Series, None]) – Differential node scores.
edges_diff (Union[DataFrame, Series, None]) – Differential edge scores.
ranking_alg (str) – Ranking algorithm to compute. Options are ‘PageRank+’, ‘PageRank’, ‘absDimontRank’, ‘DimontRank’, ‘nodeRank’ and ‘edgeRank’.
meta_file (Optional[DataFrame]) – Metadata file containing a ‘label’ and ‘type’ column to specify the data type of each variable.
path (Optional[str]) – Optional path to save the ranking as a CSV file.

Return type:

DataFrame | Series

Returns:

A tuple containing the list of ranked nodes and a dictionary with ranked nodes per data type.

modina.diffnet_analysis(context1, context2, meta_file, edge_metric=None, node_metric=None, ranking_alg='PageRank+', filter_method=None, filter_param=0.0, filter_metric=None, filter_rule=None, max_path_length=2, test_type='nonparametric', nan_value=None, correction='bh', num_workers=1, project_path=None, name1='context1', name2='context2')

Wrapper function to perform an end-to-end differential network analysis following the moDiNA pipeline.

Parameters:

context1 (DataFrame) – Observed data of Context 1 (rows: samples, columns: variables).
context2 (DataFrame) – Observed data of Context 2 (rows: samples, columns: variables).
meta_file (DataFrame) – Metadata file containing a ‘label’ and ‘type’ column to specify the data type of each variable.
test_type (str) – Type of statistical tests to use for association score calculation. Defaults to ‘nonparametric’.
nan_value (Optional[int]) – Numerical value used for NaN values in the context data. If None, an error will be raised if such values are present. Defaults to None.
correction (str) – Correction method for multiple testing. Defaults to ‘bh’.
num_workers (int) – Number of workers for parallel processing. Defaults to 1.
filter_method (Optional[str]) – Method used for filtering. Defaults to None.
filter_param (float) – Parameter for the specified filtering method. Defaults to 0.0.
filter_metric (Optional[str]) – Edge metric used for filtering. Defaults to None.
filter_rule (Optional[str]) – Rule to integrate the networks during filtering. Defaults to None.
edge_metric (Optional[str]) – Edge metric used to construct the differential network.
node_metric (Optional[str]) – Node metric used to construct the differential network.
max_path_length (int) – Maximum length of paths to consider in the computation of integrated interaction scores. Defaults to 2.
ranking_alg (str) – Ranking algorithm to compute. Options are ‘PageRank+’, ‘PageRank’, ‘absDimontRank’, ‘DimontRank’, ‘nodeRank’ and ‘edgeRank’. Defaults to ‘PageRank+’.
name1 (str) – Name of Context 1. Used for saving files. Defaults to ‘context1’.
name2 (str) – Name of Context 2. Used for saving files. Defaults to ‘context2’.
project_path (Optional[str]) – Optional path to save results. Defaults to None.

Returns:

A tuple (ranking, edges_diff, nodes_diff, config) containing the computed ranking, differential edges, differential nodes, and configuration parameters.

modina.filter(scores1, scores2, context1, context2, filter_method=None, filter_param=0.0, filter_metric=None, filter_rule=None, path=None)

Filter association scores and context data based on the specified filtering configurations.

Parameters:

scores1 (DataFrame) – Statistical association scores of Context 1.
scores2 (DataFrame) – Statistical association scores of Context 2.
context1 (DataFrame) – The first context for the differential network analysis.
context2 (DataFrame) – The second context for the differential network analysis.
filter_method (Optional[str]) – Method used for filtering. Defaults to None.
filter_param (float) – Parameter for the specified filtering method. Defaults to 0.0.
filter_metric (Optional[str]) – Edge metric used for filtering. Options include ‘raw-P’ and ‘std-E’. Defaults to None.
filter_rule (Optional[str]) – Rule to integrate the networks during filtering. Defaults to None.
path (Optional[str]) – Optional path to save the filtered scores and context data as CSV files. Defaults to None.

Return type:

Tuple[DataFrame, DataFrame, DataFrame, DataFrame]

Returns:

A tuple containing the filtered scores and context data.

modina.probit_rescaling(scores1, scores2, metric='probit-E')

modina.save_gt(groundtruths, path, mode='node')

modina.simulate_copula(path=None, name1='context1', name2='context2', n_bi=50, n_cont=50, n_cat=50, n_samples=500, n_shift_cont=0, n_shift_bi=0, n_shift_cat=0, n_corr_cont_cont=0, n_corr_bi_bi=0, n_corr_cat_cat=0, n_corr_bi_cont=0, n_corr_bi_cat=0, n_corr_cont_cat=0, n_both_cont_cont=0, n_both_bi_bi=0, n_both_cat_cat=0, n_both_bi_cont=0, n_both_bi_cat=0, n_both_cont_cat=0, shift=0.5, corr=0.7)

Simulate two contexts with binary, continuous and categorical nodes using a Gaussian copula.

Parameters:

path – Path to save the simulated contexts, the meta file and the ground truth information. If None, files are not saved.
name1 – Name of the first context.
name2 – Name of the second context.
n_bi – Number of binary nodes to simulate.
n_cont – Number of continuous nodes to simulate.
n_cat – Number of categorical nodes to simulate.
n_samples – Number of samples per context.
n_shift_cont – Number of continuous nodes with an artificially introduced mean shift.
n_shift_bi – Number of binary nodes with an artificially introduced mean shift.
n_shift_cat – Number of categorical nodes with an artificially introduced mean shift.
n_corr_cont_cont – Number of continuous node pairs with an artificially introduced correlation difference.
n_corr_bi_bi – Number of binary node pairs with an artificially introduced correlation difference.
n_corr_cat_cat – Number of categorical node pairs with an artificially introduced correlation difference.
n_corr_bi_cat – Number of binary-categorical node pairs with an artificially introduced correlation difference.
n_corr_cont_cat – Number of continuous-categorical node pairs with an artificially introduced correlation difference.
n_corr_bi_cont – Number of binary-continuous node pairs with an artificially introduced correlation difference.
n_both_cont_cont – Number of continuous node pairs with both an artificially introduced mean shift and correlation difference.
n_both_bi_bi – Number of binary node pairs with both an artificially introduced mean shift and correlation difference.
n_both_cat_cat – Number of categorical node pairs with both an artificially introduced mean shift and correlation difference.
n_both_bi_cat – Number of binary-categorical node pairs with both an artificially introduced mean shift and correlation difference.
n_both_cont_cat – Number of continuous-categorical node pairs with both an artificially introduced mean shift and correlation difference.
n_both_bi_cont – Number of binary-continuous node pairs with both an artificially introduced mean shift and correlation difference.
shift – Magnitude of the mean shift.
corr – Magnitude of the correlation difference (measured as correlation coefficient between 0 and 1).

Returns:

A tuple containing the two simulated contexts, a meta file and the ground truth nodes:
- context1: pd.DataFrame of the first simulated context.
- context2: pd.DataFrame of the second simulated context.
- meta: pd.DataFrame containing the data type for each simulated variable.
- ground_truth: A tuple containing three lists of ground truth nodes: (shift_nodes, corr_nodes, shift_corr_nodes).

modina.std_rescaling(scores1, scores2, metric='std-E')