Helper Module for Deep Learning.
Genome-Wide Association with DL for standing height
Credit: V Frouin
Load the data
Load some data. You may need to change the ‘datasetdir’ parameter.
import os
import sys
from pynet.datasets import DataManager, fetch_height_biobank
from pynet.utils import setup_logging
# This example cannot run in CI: it accesses NS intra filesystems
if "CI_MODE" in os.environ:
    sys.exit(0)
setup_logging(level="info")
data = fetch_height_biobank(datasetdir="/neurospin/tmp/height_bb")
manager = DataManager(
    input_path=data.input_path,
    labels=["Height"],
    metadata_path=data.metadata_path,
    number_of_folds=2,
    batch_size=5,
    test_size=0.2,
    continuous_labels=True)
Basic inspection
import numpy as np
import matplotlib.pyplot as plt
train_dataset = manager["train"][0]
X_train = train_dataset.inputs[train_dataset.indices]
y_train = train_dataset.labels[train_dataset.indices]
test_dataset = manager["test"]
X_test = test_dataset.inputs[test_dataset.indices]
y_test = test_dataset.labels[test_dataset.indices]
print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)
print(" min max mean sd")
print("Train:", y_train.min(), y_train.max(), y_train.mean(),
      np.sqrt(y_train.var()))
print("Test:", y_test.min(), y_test.max(), y_test.mean(),
      np.sqrt(y_test.var()))
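SNP preselection according to a simple GWAS: regress the height on each SNP separately, then select either the N_best most associated SNPs or the SNPs whose P-value is below min_P_value. This step is optional and its result is not used afterwards.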
from scipy import stats
pvals = []
for idx in range(X_train.shape[1]):
    b, intercept, r_value, p_value, std_err = stats.linregress(
        X_train[:, idx], y_train)
    pvals.append(-np.log10(p_value))
pvals = np.array(pvals)
plt.figure()
plt.ylabel("-log10 P-value")
plt.xlabel("SNP")
plt.plot(pvals, marker="o")
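The code above only computes and plots the -log10 P-values; the selection itself is not shown. A minimal sketch of both selection rules could look like the following. The helper select_snps and its parameters n_best and min_p_value are hypothetical names chosen to mirror the text, not part of pynet, and the scores are made-up values for illustration only.

```python
import numpy as np


def select_snps(neg_log10_pvals, n_best=None, min_p_value=None):
    """Return the sorted column indices of the preselected SNPs.

    Hypothetical helper, not part of pynet. Exactly one of `n_best`
    or `min_p_value` must be given.
    """
    scores = np.asarray(neg_log10_pvals, dtype=float)
    if (n_best is None) == (min_p_value is None):
        raise ValueError("Give exactly one of n_best or min_p_value.")
    if n_best is not None:
        # Largest -log10(P) first; a stable sort keeps ties in SNP order.
        order = np.argsort(-scores, kind="stable")
        return np.sort(order[:n_best])
    # P < min_p_value is equivalent to -log10(P) > -log10(min_p_value).
    return np.flatnonzero(scores > -np.log10(min_p_value))


# Made-up -log10 P-values, for illustration only.
example_scores = np.array([0.3, 5.2, 1.1, 7.8, 2.4])
top2 = select_snps(example_scores, n_best=2)
significant = select_snps(example_scores, min_p_value=1e-2)
```

Applied to the arrays computed earlier, X_train[:, select_snps(pvals, n_best=100)] would keep the 100 most associated columns (100 being an arbitrary choice). To avoid leaking information, the same indices, computed on the training set only, should also be applied to X_test.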
© 2019, pynet developers .
Inspired by AZMIND template.