pytorch-dni
This is an implementation of Decoupled Neural Interfaces using Synthetic Gradients (Jaderberg et al.).
```bash
pip install pytorch-dni
```

From source:

```bash
git clone https://github.com/ixaxaar/pytorch-dni
cd pytorch-dni
pip install -r ./requirements.txt
pip install -e .
```
Following are the constructor parameters of `DNI`:
Argument | Default | Description |
---|---|---|
network | NA | Network to be optimized |
dni_network | None | DNI network class |
dni_params | {} | Parameters to be passed to the dni_network constructor |
optim | None | optimizer for the network |
grad_optim | 'adam' | DNI module optimizer |
grad_lr | 0.001 | DNI learning rate |
hidden_size | 10 | hidden size of the DNI network |
λ | 0.5 | How much to mix backprop and synthetic gradients (0 = synthetic only, 1 = backprop only) |
recursive | True | whether to optimize leaf modules or treat network as a leaf module |
gpu_id | -1 | GPU ID |
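For illustration, here is a minimal sketch of wrapping a network with these arguments; the keyword names simply mirror the table above, and the small `nn.Sequential` model is an arbitrary placeholder:

```python
import torch.nn as nn
from dni import DNI

# any nn.Module can be wrapped; this tiny MLP is just a placeholder
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

dni_model = DNI(
  model,
  grad_optim='adam',   # optimizer for the DNI modules
  grad_lr=0.001,       # DNI learning rate
  hidden_size=256,     # hidden size of the DNI networks
  λ=0.5,               # mix backprop and synthetic gradients equally
  recursive=True,      # attach a DNI module to every leaf submodule
  gpu_id=0             # place the DNI modules on GPU 0
)
```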
Apply DNI to the whole network (including the last layer):

```python
from dni import DNI

# Parent network, can be anything extending nn.Module
net = WhateverNetwork(**kwargs)
opt = optim.Adam(net.parameters(), lr=0.001)

# use DNI to optimize this network
net = DNI(net, grad_optim='adam', grad_lr=0.0001)

# after that we go about our business as usual
for e in range(epoch):
  opt.zero_grad()
  output = net(input, *args)
  loss = criterion(output, target_output)
  loss.backward()

  # Optional: do this to __also__ update net's weights using backprop
  # opt.step()
```
DNI can be applied to any class extending `nn.Module`. In this example we specify which layers to apply DNI to via the parameter `dni_layers`:
```python
from dni import *

class Net(nn.Module):
  def __init__(self, num_layers=3, hidden_size=256, dni_layers=[]):
    super(Net, self).__init__()

    self.num_layers = num_layers
    self.hidden_size = hidden_size

    # image_size is assumed to be defined elsewhere (e.g. 28 for MNIST)
    self.net = [
      self.dni(self.layer(
        image_size*image_size if l == 0 else hidden_size,
        hidden_size
      )) if l in dni_layers else self.layer(
        image_size*image_size if l == 0 else hidden_size,
        hidden_size
      )
      for l in range(self.num_layers)
    ]
    self.final = self.layer(hidden_size, 10)

    # bind layers to this class (so that they're searchable by pytorch)
    for ctr, n in enumerate(self.net):
      setattr(self, 'layer'+str(ctr), n)

  def layer(self, input_size, hidden_size):
    return nn.Sequential(
      nn.Linear(input_size, hidden_size),
      nn.BatchNorm1d(hidden_size)
    )

  # create a DNI wrapper layer, recursive=False implies treat this layer as a leaf module
  def dni(self, layer):
    d = DNI(
      layer,
      hidden_size=256,
      grad_optim='adam',
      grad_lr=0.0001,
      recursive=False
    )
    return d

  def forward(self, x):
    output = x.view(-1, image_size*image_size)
    for layer in self.net:
      output = F.relu(layer(output))
    output = self.final(output)
    return F.log_softmax(output, dim=-1)

net = Net(num_layers=3, dni_layers=[1, 2, 3])

# use gradient descent to optimize layers not optimized by DNI
opt = optim.Adam(net.final.parameters(), lr=0.001)

# after that we go about our business as usual
for e in range(epoch):
  opt.zero_grad()
  output = net(input)
  loss = criterion(output, target_output)
  loss.backward()
  opt.step()
```
A custom DNI network can also be passed via `dni_network`, with its extra constructor arguments supplied through `dni_params`:

```python
from dni import *

# Custom DNI network
class MyCustomDNI(DNINetwork):
  def __init__(self, input_size, hidden_size, output_size, num_layers=2, bias=True):
    super(MyCustomDNI, self).__init__(input_size, hidden_size, output_size)

    self.input_size = input_size
    self.hidden_size = hidden_size * 4
    self.output_size = output_size
    self.num_layers = num_layers
    self.bias = bias

    self.net = [self.layer(
      input_size if l == 0 else self.hidden_size,
      self.hidden_size
    ) for l in range(self.num_layers)]

    # bind layers to this class (so that they're searchable by pytorch)
    for ctr, n in enumerate(self.net):
      setattr(self, 'layer'+str(ctr), n)

    # final layer (yeah, no kidding)
    self.final = nn.Linear(self.hidden_size, output_size)

  def layer(self, input_size, hidden_size):
    return nn.Linear(input_size, hidden_size)

  def forward(self, input, hidden):
    output = input
    for layer in self.net:
      output = F.relu(layer(output))
    output = self.final(output)
    return output, None

# Custom network, can be anything extending nn.Module
net = WhateverNetwork(**kwargs)
opt = optim.Adam(net.parameters(), lr=0.001)

# use DNI to optimize this network with MyCustomDNI, pass custom params to the DNI nets
net = DNI(net, grad_optim='adam', grad_lr=0.0001, dni_network=MyCustomDNI,
          dni_params={'num_layers': 3, 'bias': True})

# after that we go about our business as usual
for e in range(epoch):
  opt.zero_grad()
  output = net(input, *args)
  loss = criterion(output, target_output)
  loss.backward()
```
This package ships with the following DNI networks:
- `LinearDNI`: (`Linear` -> `ReLU`) * `num_layers` -> `Linear`
- `LinearSigmoidDNI`: (`Linear` -> `ReLU`) * `num_layers` -> `Linear` -> `Sigmoid`
- `LinearBatchNormDNI`: (`Linear` -> `BatchNorm1d` -> `ReLU`) * `num_layers` -> `Linear`
- `RNNDNI`: stacked `LSTM`s, `GRU`s or `RNN`s
- `Conv2dDNI`: (`Conv2d` -> `BatchNorm2d` -> `MaxPool2d` / `AvgPool2d` -> `ReLU`) * `num_layers` -> `Conv2d` -> `AvgPool2d`
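Any of these can be passed to the wrapper via `dni_network`, with extra constructor arguments forwarded through `dni_params`, in the same way as the custom-network example above. A minimal sketch (assuming `LinearBatchNormDNI` is importable from `dni` and accepts `num_layers`):

```python
import torch.nn as nn
from dni import DNI, LinearBatchNormDNI

# placeholder network to be optimized
net = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# use the shipped LinearBatchNormDNI as the synthetic gradient module;
# num_layers is assumed to be a constructor argument of that class
net = DNI(
  net,
  dni_network=LinearBatchNormDNI,
  dni_params={'num_layers': 2},
  grad_optim='adam',
  grad_lr=0.0001
)
```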
Custom DNI nets can be created using the `DNINetwork` interface:
```python
from dni import *

class MyDNI(DNINetwork):
  def __init__(self, input_size, hidden_size, output_size, **kwargs):
    super(MyDNI, self).__init__(input_size, hidden_size, output_size)
    ...

  def forward(self, input, hidden):
    ...
    return output, hidden
```
The tasks included in this project are the same as those in pytorch-dnc, except that they're trained here using DNI.

- MNIST: refer to `tasks/mnist/README.md`
- Word language model: refer to `tasks/word_language_model/README.md`
- Using a linear SG module makes the implicit assumption that the loss is a quadratic function of the activations.
- For best performance the SG module architecture should be adapted to the loss function used. For MSE a linear SG is a reasonable choice; for log loss, however, one should use an architecture that includes a sigmoid applied pointwise to a linear SG.
- Learning rates of the order of 1e-5 with momentum of 0.9 work well for rmsprop; adam works well with a learning rate of 0.001.
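Putting the last two notes together, for a model trained with log loss one might pick the sigmoid-capped SG module and a small rmsprop learning rate. A minimal sketch (assuming `'rmsprop'` is accepted for `grad_optim` and `LinearSigmoidDNI` is importable from `dni`):

```python
import torch.nn as nn
from dni import DNI, LinearSigmoidDNI

# classifier trained with log loss (e.g. nn.NLLLoss on log-softmax outputs)
classifier = nn.Sequential(
  nn.Linear(784, 256), nn.ReLU(),
  nn.Linear(256, 10), nn.LogSoftmax(dim=-1)
)

classifier = DNI(
  classifier,
  dni_network=LinearSigmoidDNI,  # sigmoid applied pointwise to a linear SG
  grad_optim='rmsprop',          # small learning rate for rmsprop, per the notes
  grad_lr=1e-5
)
```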