
Solution: optimizer got an empty parameter list

Recently, while building a custom PyTorch model, I hit a puzzling error, "ValueError: optimizer got an empty parameter list", which cost me a few hours of troubleshooting only to find that the fix is a small trick.

Reproducing the error:
It usually happens when a model defines its own parameters by hand. Consider the following example with only one parameter, E (the snippet is lightly completed here so it runs; E's initialization is illustrative):

 

import torch
import torch.nn as nn
import torch.optim as optim

class PytorchModel(nn.Module):

    def __init__(self, word_dim, hidden_dim=100, bptt_truncate=4):
        super(PytorchModel, self).__init__()

        # Assign instance variables
        self.word_dim = word_dim
        self.hidden_dim = hidden_dim
        self.bptt_truncate = bptt_truncate

        # E: the model's single weight matrix (random init here for
        # illustration)
        E = torch.randn(word_dim, hidden_dim)

        # Plain tensor attribute -- NOT registered as a parameter
        self.E = torch.Tensor(E)

        # SGD / rmsprop: per-parameter accumulator, also a plain tensor
        self.mE = torch.zeros(E.shape)

    def forward(self, x_t, s_t1_prev=None, s_t2_prev=None):
        ## Custom forward function code (elided)
        outputs, params = None, None  # placeholders for the elided logic
        return [outputs, params]

# Forward ENDs

Passing this model's parameters to the SGD constructor then throws the aforementioned error, because model.parameters() comes back empty.

Result:

 

model = PytorchModel(50)
optimizer = optim.SGD(model.parameters(), lr=0.01)

#"ValueError: optimizer got an empty parameter list"

 

Reason:

PyTorch only hands the optimizer tensors that have been registered as module parameters. A plain torch.Tensor assigned as an attribute is invisible to model.parameters(); the registration is done by wrapping the tensor in nn.Parameter(...).
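A minimal sketch of this behavior (the module and attribute names here are made up for illustration): only the nn.Parameter attribute shows up in .parameters().

import torch
import torch.nn as nn

class Demo(nn.Module):
    def __init__(self):
        super().__init__()
        self.a = torch.randn(3)                # plain tensor: ignored
        self.b = nn.Parameter(torch.randn(3))  # registered parameter

m = Demo()
print([name for name, _ in m.named_parameters()])  # ['b']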

Solution:

 

class PytorchModel(nn.Module):

    def __init__(self, word_dim, hidden_dim=100, bptt_truncate=4):
        super(PytorchModel, self).__init__()

        # Assign instance variables
        self.word_dim = word_dim
        self.hidden_dim = hidden_dim
        self.bptt_truncate = bptt_truncate

        # PyTorch: create the shared variable as a registered parameter.
        # nn.Parameter is a special kind of Tensor that is automatically
        # registered as a Module parameter once it is assigned as an
        # attribute. Unregistered tensors won't appear in .parameters()
        # and won't be converted when e.g. .cuda() is called. Non-parameter
        # state can be registered with .register_buffer().
        # nn.Parameter requires gradients by default.

        E = torch.randn(word_dim, hidden_dim)  # illustrative init
        self.E = nn.Parameter(torch.Tensor(E))

        # SGD / rmsprop: per-parameter accumulator (still a plain tensor)
        self.mE = torch.zeros(E.shape)

    def forward(self, x_t, s_t1_prev=None, s_t2_prev=None):
        ## Custom forward function code (elided)
        outputs, params = None, None  # placeholders for the elided logic
        return [outputs, params]

# Forward ENDs
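A side note, not required for the fix: self.mE above is still a plain tensor, so it won't follow the module across .cuda()/.to() calls or appear in the state dict. Since it is accumulator state rather than a learnable weight, one option (a sketch, per the register_buffer() hint in the comment above) is to register it as a buffer:

# Inside __init__, instead of "self.mE = torch.zeros(E.shape)":
self.register_buffer("mE", torch.zeros(word_dim, hidden_dim))
# The accumulator is still accessed as self.mE, but now moves with
# the module and is saved in its state dict.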

With E wrapped in nn.Parameter, model.parameters() is no longer empty and the optimizer works fine.
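A quick check, using the class above:

model = PytorchModel(50)
print(len(list(model.parameters())))  # 1 -- E is now registered
optimizer = optim.SGD(model.parameters(), lr=0.01)  # no ValueError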
