Helper Module for Deep Learning.

Module that provides generative losses.

Code: https://github.com/YannDubs/disentangling-vae

class pynet.losses.generative.BaseLoss(steps_anneal=0, use_mse=False)[source]¶

Base class for losses.

__init__(steps_anneal=0, use_mse=False)[source]¶

Init class.

Parameters

steps_anneal: int, default 0

number of annealing steps over which the regularisation is gradually added.

use_mse: bool, default False

if set, use MSE for the reconstruction loss rather than the log likelihood.

compute_ll(p, data)[source]¶

Compute log likelihood.

Parameters

p: torch.distributions

probabilistic decoder (or likelihood of generating true data sample given the latent code).

data: torch.Tensor

reference data.

static compute_log_alpha(mu, logvar)[source]¶

get_params()[source]¶

Get forward layers outputs.

Returns

q: torch.distributions

probabilistic encoder (or estimated posterior probability function).

z: torch.Tensor

the compressed code learned in the bottleneck layer.

model: nn.Module

the network.

kl_log_uniform(normal)[source]¶

Calculates the KL log uniform divergence.

Section 4.2 of: Variational Dropout Sparsifies Deep Neural Networks, Dmitry Molchanov, Arsenii Ashukha, Dmitry Vetrov, https://arxiv.org/abs/1701.05369

Reference notebook: https://github.com/senya-ashukha/variational-dropout-sparsifies-dnn/blob/master/KL%20approximation.ipynb
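
A minimal sketch of the KL approximation from the paper cited above, written in plain Python on scalars rather than on tensors. The function names and the scalar interface are illustrative assumptions, not the pynet implementation; the constants are those reported in the paper.

```python
import math

# Constants of the approximation reported by Molchanov et al. (2017).
K1, K2, K3 = 0.63576, 1.8732, 1.48695


def log_alpha_from(mu, logvar):
    """log(alpha) with alpha = sigma^2 / mu^2 (hypothetical helper)."""
    return logvar - math.log(mu ** 2)


def kl_log_uniform_approx(log_alpha):
    """Approximate KL between the posterior and the log-uniform prior."""
    sigmoid = 1.0 / (1.0 + math.exp(-(K2 + K3 * log_alpha)))
    neg_kl = K1 * sigmoid - 0.5 * math.log1p(math.exp(-log_alpha)) - K1
    return -neg_kl
```

The penalty shrinks towards zero as log(alpha) grows, which is what drives weights with a large noise-to-signal ratio towards being pruned.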

kl_normal_loss(q)[source]¶

Calculates the KL divergence between a normal distribution with diagonal covariance and a unit normal distribution.

Parameters: q: torch.distributions

probabilistic encoder (or estimated posterior probability function).
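
For reference, the closed form of this divergence can be sketched in plain Python. The helper names and the list-based inputs are assumptions made for illustration; the documented method takes a torch distribution instead.

```python
import math


def kl_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for one sample."""
    return 0.5 * sum(m ** 2 + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, logvar))


def kl_normal_batch(mus, logvars):
    """Average the per-sample KL over a batch (assumed reduction)."""
    kls = [kl_normal(m, lv) for m, lv in zip(mus, logvars)]
    return sum(kls) / len(kls)
```

The divergence is zero exactly when every latent dimension has zero mean and unit variance.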

linear_annealing(init, fin)[source]¶

Linear annealing of a parameter.

Returns: annealed: float

loss factor to gradually add the regularisation.
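
A minimal sketch of linear annealing as a standalone function. The explicit `step` and `steps_anneal` arguments are assumptions made for illustration (the documented method only takes `init` and `fin`, so it presumably reads the step counters from the instance), and the sketch assumes `fin >= init`.

```python
def linear_annealing(init, fin, step, steps_anneal):
    """Linearly move from `init` to `fin` over `steps_anneal` steps."""
    if steps_anneal == 0:
        # Annealing disabled: use the final value immediately.
        return fin
    delta = fin - init
    # Clamp at `fin` once the annealing period is over.
    return min(init + delta * step / steps_anneal, fin)
```

With `init=0` and `fin=1` the returned value can be used directly as a multiplicative factor on the regularisation term.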

reconstruction_loss(p, data)[source]¶

Calculates the per image reconstruction loss for a batch of data (i.e. negative log likelihood).

The distribution of the likelihood on each pixel implicitly defines the loss. A Bernoulli distribution corresponds to binary cross entropy. A Gaussian distribution corresponds to MSE; it is sometimes used but is hard to train because it ends up focusing on only a few pixels that are very wrong. A Laplace distribution corresponds to L1 and partially solves this issue of MSE.

Parameters

p: torch.distributions

probabilistic decoder (or likelihood of generating true data sample given the latent code).

data: torch.Tensor

reference data.

Returns

loss: torch.Tensor

per-image cross entropy (i.e. averaged over the batch but not over pixels and channels).
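
The correspondence between likelihood and loss can be sketched in plain Python on flattened images. All function names are hypothetical, and the Gaussian and Laplace cases assume a fixed scale and drop additive constants.

```python
import math


def bernoulli_nll(probs, target, eps=1e-7):
    """Binary cross entropy summed over the pixels of one image."""
    total = 0.0
    for p, t in zip(probs, target):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total


def squared_error(recon, target):
    """Gaussian likelihood with fixed variance: squared error."""
    return sum((r - t) ** 2 for r, t in zip(recon, target))


def absolute_error(recon, target):
    """Laplace likelihood with fixed scale: L1 error."""
    return sum(abs(r - t) for r, t in zip(recon, target))


def per_image_loss(loss_fn, recon_batch, target_batch):
    """Sum over pixels inside `loss_fn`, then average over the batch."""
    losses = [loss_fn(r, t) for r, t in zip(recon_batch, target_batch)]
    return sum(losses) / len(losses)
```

Because the squared error grows quadratically, a single badly reconstructed pixel dominates the Gaussian loss far more than it does the L1 loss.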

update_train_step(iteration=None)[source]¶

Update the train step.

Parameters: iteration: int, default None

the current iteration.

class pynet.losses.generative.BetaBLoss(C_init=0.0, C_fin=20.0, gamma=100.0, **kwargs)[source]¶

Compute the Beta-VAE loss.

Understanding disentangling in beta-VAE, Burgess, arXiv 2018.

__init__(C_init=0.0, C_fin=20.0, gamma=100.0, **kwargs)[source]¶

Init class.

Parameters

C_init: float, default 0

starting annealed capacity C.

C_fin: float, default 20

final annealed capacity C.

gamma: float, default 100

weight of the KL divergence term.

kwargs: dict

additional arguments for ‘BaseLoss’.
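
A sketch of the capacity-controlled objective from the Burgess paper, on scalar loss values. The function names are hypothetical, and annealing the capacity linearly over `steps_anneal` steps is an assumption about how `C_init` and `C_fin` are used.

```python
def capacity_at(step, steps_anneal, c_init=0.0, c_fin=20.0):
    """Capacity C annealed linearly from `c_init` to `c_fin` (assumed schedule)."""
    if steps_anneal == 0:
        return c_fin
    return min(c_init + (c_fin - c_init) * step / steps_anneal, c_fin)


def beta_b_loss(rec_loss, kl_loss, capacity, gamma=100.0):
    """Reconstruction loss plus gamma * |KL - C|."""
    return rec_loss + gamma * abs(kl_loss - capacity)
```

The absolute value penalises the KL for being either above or below the current capacity, so the latent code is pushed to carry a controlled amount of information.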

class pynet.losses.generative.BetaHLoss(beta=4, **kwargs)[source]¶

Compute the Beta-VAE loss.

beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework, Irina Higgins, ICLR 2017.

__init__(beta=4, **kwargs)[source]¶

Init class.

Parameters

beta: float, default 4

weight of the KL divergence.

kwargs: dict

additional arguments for ‘BaseLoss’.
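
On scalar loss values, the beta-VAE objective of Higgins et al. reduces to a weighted sum. The function below is a hypothetical sketch; the `anneal` factor, standing in for the warm-up controlled by `steps_anneal`, is an assumption.

```python
def beta_h_loss(rec_loss, kl_loss, beta=4.0, anneal=1.0):
    """Reconstruction loss plus the (optionally annealed) beta-weighted KL."""
    return rec_loss + anneal * beta * kl_loss
```

Setting `beta=1` and `anneal=1` recovers the standard VAE objective.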

class pynet.losses.generative.BtcvaeLoss(dataset_size, alpha=1.0, beta=6.0, gamma=1.0, is_mss=True, **kwargs)[source]¶

Compute the decomposed KL loss with either minibatch weighted sampling or minibatch stratified sampling, as described in the reference below.

Isolating sources of disentanglement in variational autoencoders, Tian Qi Chen et al., Advances in Neural Information Processing Systems, 2018.

__init__(dataset_size, alpha=1.0, beta=6.0, gamma=1.0, is_mss=True, **kwargs)[source]¶

Init class.

Parameters

dataset_size: int

number of training images in the dataset.

alpha: float, default 1

weight of the mutual information term.

beta: float, default 6

weight of the total correlation term.

gamma: float, default 1

weight of the dimension-wise KL term.

is_mss: bool, default True

whether to use minibatch stratified sampling instead of minibatch weighted sampling.

kwargs: dict

additional arguments for ‘BaseLoss’.
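
A sketch of how the three weights combine the decomposed KL terms, following the decomposition in the cited paper. The inputs are per-sample log densities that are assumed to have been estimated already; all names are hypothetical.

```python
def btcvae_terms(log_q_zCx, log_qz, log_prod_qzi, log_pz):
    """Return (mutual information, total correlation, dimension-wise KL)."""
    n = len(log_qz)
    mi = sum(a - b for a, b in zip(log_q_zCx, log_qz)) / n
    tc = sum(a - b for a, b in zip(log_qz, log_prod_qzi)) / n
    dw_kl = sum(a - b for a, b in zip(log_prod_qzi, log_pz)) / n
    return mi, tc, dw_kl


def btcvae_loss(rec_loss, mi, tc, dw_kl, alpha=1.0, beta=6.0, gamma=1.0):
    """Weighted sum of the reconstruction loss and the three KL terms."""
    return rec_loss + alpha * mi + beta * tc + gamma * dw_kl
```

With `alpha = beta = gamma = 1` the three terms add up to the sample-based estimate of the usual KL term.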

get_probs(z, q)[source]¶

static log_importance_weight_matrix(batch_size, dataset_size)[source]¶

Calculates a log importance weight matrix.

Parameters

batch_size: int

number of training images in the batch.

dataset_size: int

number of training images in the dataset.

static matrix_log_density_gaussian(x, q)[source]¶

Calculates the log density of a Gaussian for all combinations of batch pairs of ‘x’ and ‘mu’, i.e. returns a tensor of shape (batch_size, batch_size, dim) instead of (batch_size, dim) as in the usual log density.

Parameters

x: torch.Tensor (batch_size, dim)

value at which to compute the density.

q: torch.distributions

probabilistic encoder (or estimated posterior probability function).
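
The pairwise evaluation can be sketched with nested lists in plain Python. The means and log variances are passed explicitly here, which is an assumption made for illustration; the documented method reads them from the distribution `q`.

```python
import math


def log_density_gaussian(x, mu, logvar):
    """Log density of a 1-D Gaussian N(mu, exp(logvar)) at x."""
    norm = -0.5 * (math.log(2.0 * math.pi) + logvar)
    return norm - 0.5 * (x - mu) ** 2 * math.exp(-logvar)


def matrix_log_density(xs, mus, logvars):
    """out[i][j][d] = log N(xs[i][d]; mus[j][d], exp(logvars[j][d]))."""
    return [[[log_density_gaussian(x_d, m_d, lv_d)
              for x_d, m_d, lv_d in zip(x, mu, lv)]
             for mu, lv in zip(mus, logvars)]
            for x in xs]
```

Entry `[i][j]` scores sample `i` under the posterior of sample `j`, which is what the minibatch estimators of the aggregate posterior need.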

class pynet.losses.generative.FactorKLoss(device, gamma=10.0, disc_kwargs={}, optim_kwargs={'betas': (0.5, 0.9), 'lr': 5e-05}, **kwargs)[source]¶

Compute the Factor-VAE loss (algorithm 2).

Disentangling by factorising, Hyunjik Kim and Andriy Mnih, arXiv 2018.

__init__(device, gamma=10.0, disc_kwargs={}, optim_kwargs={'betas': (0.5, 0.9), 'lr': 5e-05}, **kwargs)[source]¶

Init class.

Parameters

device: torch.device

the device.

optimizer: torch.optim

the network optimizer.

gamma: float, default 10

Weight of the TC loss term. gamma in the paper.

disc_kwargs: dict

discriminator arguments.

optim_kwargs: dict

Adam optimizer arguments.

kwargs: dict

additional arguments for ‘BaseLoss’.

class pynet.losses.generative.GMVAELoss[source]¶

GMVAE Loss.

__init__()[source]¶: Init class.

static cluster_acc(y_pred, y, is_logits=False)[source]¶

class pynet.losses.generative.MCVAELoss(n_channels, beta=1.0, enc_channels=None, dec_channels=None, sparse=False, nodecoding=False)[source]¶

MCVAE loss.

Sparse Multi-Channel Variational Autoencoder for the Joint Analysis of Heterogeneous Data, Luigi Antelmi, Nicholas Ayache, Philippe Robert, Marco Lorenzi Proceedings of the 36th International Conference on Machine Learning, PMLR 97:302-311, 2019.

MCVAE consists of two loss functions:

KL divergence loss: how off the distribution over the latent space is from the prior. Given the prior is a standard Gaussian and the inferred distribution is a Gaussian with a diagonal covariance matrix, the KL-divergence becomes analytically solvable.
log-likelihood LL

loss = beta * KL_loss + LL_loss.
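
A sketch of this combination on precomputed scalar terms. The function name is hypothetical; summing the KL over channels and the reconstruction terms over every encoder/decoder pair follows the cited paper, and treating `LL_loss` as a negative log-likelihood is an assumption.

```python
def mcvae_loss(kl_per_channel, nll_pairs, beta=1.0):
    """beta * (sum of per-channel KL) + (sum of pairwise negative LL).

    nll_pairs[i][j] is the negative log-likelihood of channel j
    decoded from the latent code of channel i.
    """
    kl = sum(kl_per_channel)
    nll = sum(sum(row) for row in nll_pairs)
    return beta * kl + nll
```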

__init__(n_channels, beta=1.0, enc_channels=None, dec_channels=None, sparse=False, nodecoding=False)[source]¶

Init class.

Parameters

n_channels: int

the number of channels.

beta: float, default 1

the weight of the KL term, as in beta-VAE.

enc_channels: list of int, default None

encode only these channels (for kl computation).

dec_channels: list of int, default None

decode only these channels (for ll computation).

sparse: bool, default False

use the sparsity constraint.

nodecoding: bool, default False

if set, do not apply the decoding.

compute_kl(q, beta)[source]¶

compute_ll(p, x)[source]¶

compute_log_alpha(mu, logvar)[source]¶

class pynet.losses.generative.MOESimVAELoss(beta=1.0, alpha=1.0, n_components_umap=2, n_neighbors_knn=10, use_similarity_loss=False, use_balancing_loss=True)[source]¶

MoE-Sim-VAE loss.

__init__(beta=1.0, alpha=1.0, n_components_umap=2, n_neighbors_knn=10, use_similarity_loss=False, use_balancing_loss=True)[source]¶

Init class.

Parameters

beta: float, default 1

the weight of KL regularization term.

alpha: float, default 1

the weight of the DEPICT term.

n_components_umap: int, default 2

the desired number of dimensions of the UMAP projection of the data.

n_neighbors_knn: int, default 10

the number of k-nearest-neighbors used to define the adjacency matrix.

use_similarity_loss: bool, default False

activate the similarity loss.

use_balancing_loss: bool, default True

activate the balancing loss.

static balancing(probs)[source]¶: When training this model, the manager can easily degenerate into outputting a constant vector regardless of the input at hand. This results in one VAE specialized in all digits and nine VAEs specialized in nothing. One way to mitigate this is to add a balancing term to the loss. It encourages the outputs of the manager over a batch of inputs to be balanced, i.e. the sum of the probabilities over the batch is close to uniform across the VAEs.
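
One possible form of such a balancing term is the KL divergence between the batch-averaged manager output and the uniform distribution. This is an illustrative assumption, not necessarily the formula used by `balancing`; the function name is hypothetical.

```python
import math


def balancing_penalty(probs, eps=1e-12):
    """KL(mean over the batch of `probs` || uniform over experts)."""
    n_experts = len(probs[0])
    mean = [sum(row[k] for row in probs) / len(probs)
            for k in range(n_experts)]
    # KL(mean || uniform) = sum_k mean_k * log(mean_k * n_experts)
    return sum(m * math.log(max(m, eps) * n_experts) for m in mean)
```

The penalty is zero when every expert receives the same total probability over the batch and reaches log(n_experts) when a single expert receives everything.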

static depict(probs, probs_noisy)[source]¶: The DEPICT loss encourages the model to learn invariant features from the latent representation for clustering with respect to noise.

static get_similarity_matrix(data, n_components_umap=2, n_neighbors_knn=10, random_state=None)[source]¶: The similarity matrix is derived in an unsupervised way (e.g., UMAP projection of the data and k-nearest-neighbors or distance thresholding to define the adjacency matrix for the batch), but can also be used to include weakly-supervised information (e.g., knowledge about diseased vs. non-diseased patients). If labels are available, the model could even be used to derive a latent representation with supervision. The similarity feature in MoE-Sim-VAE thus allows prior knowledge about the best similarity measure on the data to be included.
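
A sketch of the k-nearest-neighbors step only, in plain Python. It assumes the points have already been projected to a low-dimensional space (the UMAP step is omitted), and the function name and the symmetrisation choice are assumptions made for illustration.

```python
def knn_adjacency(points, n_neighbors):
    """Binary, symmetric adjacency matrix from k nearest neighbours."""
    n = len(points)
    adj = [[0] * n for _ in range(n)]
    for i, p in enumerate(points):
        # Squared Euclidean distances to every other point, nearest first.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points) if j != i
        )
        for _, j in dists[:n_neighbors]:
            adj[i][j] = 1
            adj[j][i] = 1  # symmetrise: i~j if either is a neighbour of the other
    return adj
```

Weak supervision could be injected at this point by setting entries of the matrix from known group labels instead of from distances.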

static similarity(probs, similarity)[source]¶: Reconstruct a data-driven similarity loss using the Binary Cross-Entropy.

class pynet.losses.generative.PMVAELoss(beta=1)[source]¶

PMVAE loss.

Compute a global & a local (per pathway) reconstruction loss and a KL divergence regularization loss with beta weighting.

__init__(beta=1)[source]¶

Init class.

Parameters: beta: float, default 1

the weight of the KL regularization term.

class pynet.losses.generative.SparseLoss(beta=4, **kwargs)[source]¶

Compute the Beta-Sparse VAE loss.

Sparse Multi-Channel Variational Autoencoder for the Joint Analysis of Heterogeneous Data, Luigi Antelmi, Nicholas Ayache, Philippe Robert, Marco Lorenzi, PMLR 2019.

__init__(beta=4, **kwargs)[source]¶

Init class.

Parameters

beta: float, default 4

weight of the KL divergence.

kwargs: dict

additional arguments for ‘BaseLoss’.

class pynet.losses.generative.VAEGMPLoss(beta=1.0, reduction='entropy')[source]¶

VAEGMP Loss.

__init__(beta=1.0, reduction='entropy')[source]¶

Init class.

Parameters

beta: float, default 1

the weight of the KL regularization term.

reduction: str, default ‘entropy’

how to reduce the loss.

class pynet.losses.generative.VaDELoss(alpha=1)[source]¶

VaDE loss.

__init__(alpha=1)[source]¶

Init class.

Parameters: alpha: float, default 1

reconstruction loss weight.

pynet.losses.generative.get_vae_loss(loss_name, **kwargs)[source]¶

Return the correct VAE loss function given the input arguments.

The parameters for each loss:

vae: -
betah: beta
betab: C_init, C_fin, gamma
factor: device, gamma, latent_dim, lr_disc
btcvae: dataset_size, alpha, beta, gamma
sparse: beta

Parameters

loss_name: str

the name of the loss.

kwargs: dict

the loss kwargs.

Returns

loss: callable

the loss function.
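
A name-to-class dispatch of this kind can be sketched as follows. The stub classes and the `get_loss` function are hypothetical stand-ins so that the example runs on its own; they are not the pynet classes.

```python
class _StubLoss:
    """Stand-in for a loss class; it only records its keyword arguments."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


class StubBetaH(_StubLoss):
    pass


class StubBetaB(_StubLoss):
    pass


# Hypothetical registry mapping loss names to loss classes.
LOSSES = {"betah": StubBetaH, "betab": StubBetaB}


def get_loss(loss_name, **kwargs):
    """Instantiate the loss registered under `loss_name`."""
    try:
        loss_cls = LOSSES[loss_name]
    except KeyError:
        raise ValueError(f"Unknown loss: {loss_name!r}") from None
    return loss_cls(**kwargs)
```

For example, `get_loss("betab", C_init=0.0, C_fin=20.0, gamma=100.0)` would forward the three keyword arguments listed for `betab` above.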
