LibtorchArtificialNeuralNet

Overview

This class can be used to generate a simple feedforward artificial neural network (ANN) using the underlying objects imported from libtorch (the C++ API of PyTorch). Note: to use these capabilities, MOOSE must be installed with libtorch support. For more information, visit the installation instructions on the MOOSE website. For a more detailed introduction to neural networks, we refer the reader to Müller et al. (1995). The architecture of a simple feedforward neural network is presented below. The first and last layers, from left to right, are referred to as the input and output layers, while the layers between them are the hidden layers.

Figure 1: The architecture of the simple feedforward neural network in MOOSE-STM.

We see that the outputs ($\textbf{y}$) of the neural net can be expressed as a function of the inputs ($\textbf{x}$) and the corresponding model parameters (weights $w_{i,j}$, organized in weight matrices $\textbf{W}$, and biases $b_i$, organized in the bias vector $\textbf{b}$) in the following nested form:

$$\textbf{y} = \sigma(\textbf{W}^{(3)}\sigma(\textbf{W}^{(2)}\sigma(\textbf{W}^{(1)}\textbf{x}+\textbf{b}^{(1)})+\textbf{b}^{(2)})+\textbf{b}^{(3)}), \tag{1}$$

where $\sigma$ denotes the activation function. At the moment, the MOOSE implementation supports relu, elu, gelu, sigmoid and linear activation functions. In this class, no activation function is applied on the output layer, so the outermost $\sigma$ in Eq. (1) reduces to the identity. The real functional dependence (target function) between the inputs and outputs is approximated by the function in Eq. (1). As in most cases, the error in this approximation depends on the smoothness of the target function and the values of the model parameters. The weights and biases in the function are determined by minimizing the error between the approximate outputs of the neural net and the corresponding reference (training) values over a training set.
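To make the nested form of Eq. (1) concrete, the following is a minimal sketch in plain C++ that does not use libtorch. The names `Vector`, `Matrix`, `activate`, `layer` and `forward` are hypothetical helpers introduced only for this illustration, and the ELU parameter (alpha = 1) and the exact, erf-based form of GELU are assumptions rather than a statement about the MOOSE implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper types for illustration; not part of MOOSE or libtorch.
using Vector = std::vector<double>;
using Matrix = std::vector<Vector>; // one row of weights per output neuron

// Scalar activation functions named as in the text.
// Assumptions: ELU uses alpha = 1 and GELU uses the exact erf-based form.
double
activate(const std::string & name, double z)
{
  if (name == "relu")
    return z > 0.0 ? z : 0.0;
  if (name == "elu")
    return z > 0.0 ? z : std::exp(z) - 1.0;
  if (name == "gelu")
    return 0.5 * z * (1.0 + std::erf(z / std::sqrt(2.0)));
  if (name == "sigmoid")
    return 1.0 / (1.0 + std::exp(-z));
  return z; // "linear": identity
}

// One layer of Eq. (1): sigma(W x + b), with sigma applied to each entry.
Vector
layer(const Matrix & W, const Vector & b, const Vector & x, const std::string & act)
{
  assert(W.size() == b.size());
  Vector out(W.size());
  for (std::size_t i = 0; i < W.size(); ++i)
  {
    assert(W[i].size() == x.size());
    double z = b[i];
    for (std::size_t j = 0; j < x.size(); ++j)
      z += W[i][j] * x[j];
    out[i] = activate(act, z);
  }
  return out;
}

// Nested evaluation of Eq. (1). The last layer is kept linear, following the
// statement that no activation function is applied on the output layer.
Vector
forward(const std::vector<Matrix> & Ws,
        const std::vector<Vector> & bs,
        Vector x,
        const std::string & hidden_act)
{
  assert(Ws.size() == bs.size());
  for (std::size_t l = 0; l < Ws.size(); ++l)
    x = layer(Ws[l], bs[l], x, l + 1 < Ws.size() ? hidden_act : "linear");
  return x;
}
```

Calling `forward` with one hidden layer and relu, for instance, evaluates $\textbf{W}^{(2)}\,\mathrm{relu}(\textbf{W}^{(1)}\textbf{x}+\textbf{b}^{(1)})+\textbf{b}^{(2)}$.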

Example usage

To use this neural network, we have to construct one using a name, the number of expected input and output neurons, the expected hidden-layer structure and the activation functions for the layers. If no activation function is given, relu is used for every hidden neuron:

  // Define neurons per hidden layer: we will have two hidden layers with 4 neurons each
  std::vector<unsigned int> num_neurons_per_layer({4, 4});
  // Create the neural network with name "test", number of input neurons = 3,
  // number of output neurons = 1, and activation functions from the input file.
  std::shared_ptr<Moose::LibtorchArtificialNeuralNet> nn =
      std::make_shared<Moose::LibtorchArtificialNeuralNet>(
          "test",
          3,
          1,
          num_neurons_per_layer,
          getParam<std::vector<std::string>>("activation_functions"));
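As a quick sanity check on the layer structure, the number of trainable parameters can be counted by hand. The sketch below assumes every layer is fully connected with one bias per neuron; `countParameters` is a hypothetical helper written for this illustration and is not part of MOOSE.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of MOOSE): counts the weights and biases of a
// fully connected feedforward net, assuming one bias per neuron.
std::size_t
countParameters(unsigned int num_inputs,
                const std::vector<unsigned int> & num_neurons_per_layer,
                unsigned int num_outputs)
{
  assert(num_inputs > 0 && num_outputs > 0);
  std::size_t total = 0;
  std::size_t fan_in = num_inputs;
  for (const unsigned int width : num_neurons_per_layer)
  {
    total += fan_in * width + width; // weight matrix entries plus biases
    fan_in = width;
  }
  total += fan_in * num_outputs + num_outputs; // output layer
  return total;
}
```

Under these assumptions, the network above (3 inputs, hidden layers of 4 and 4 neurons, 1 output) has (3·4 + 4) + (4·4 + 4) + (4·1 + 1) = 41 parameters.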

To train a neural network, we need to initialize an optimizer (Adam in this case), then supply known input-output combinations for the function to be approximated and let the optimizer set the parameters of the neural network so that the answer supplied by the neural network is as close to the supplied values as possible. One step in this optimization process is shown below:

  // Create an Adam optimizer
  torch::optim::Adam optimizer(nn->parameters(), torch::optim::AdamOptions(0.02));
  // reset the gradients
  optimizer.zero_grad();
  // This is our test input
  torch::Tensor input = at::ones({1, 3}, at::kDouble);
  // This is our test output (we know the result)
  torch::Tensor output = at::ones({1}, at::kDouble);
  // This is our prediction for the test input
  torch::Tensor prediction = nn->forward(input);
  // We save our first prediction
  _nn_values.push_back(prediction.item<double>());
  // We compute the loss
  torch::Tensor loss = torch::mse_loss(prediction, output);
  // We propagate the error back to compute gradient
  loss.backward();
  // We update the weights using the computed gradients
  optimizer.step();
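To illustrate what a single optimizer step does to the parameters, here is a minimal sketch of the standard Adam update rule in plain C++. It is not the libtorch implementation: `AdamState` and `adamStep` are hypothetical names, and the values of `beta1`, `beta2` and `eps` are the commonly used defaults, assumed here for illustration. Only the learning rate of 0.02 is taken from the example above.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical illustration of one Adam update; not the libtorch implementation.
struct AdamState
{
  std::vector<double> m; // running first-moment (mean) estimate of the gradient
  std::vector<double> v; // running second-moment estimate of the gradient
  unsigned int t = 0;    // number of steps taken so far
};

void
adamStep(std::vector<double> & params,
         const std::vector<double> & grad,
         AdamState & state,
         double lr = 0.02,
         double beta1 = 0.9,   // assumed default
         double beta2 = 0.999, // assumed default
         double eps = 1e-8)    // assumed default
{
  assert(params.size() == grad.size());
  state.m.resize(params.size(), 0.0);
  state.v.resize(params.size(), 0.0);
  ++state.t;
  // Bias-correction factors for the zero-initialized moment estimates
  const double c1 = 1.0 - std::pow(beta1, static_cast<double>(state.t));
  const double c2 = 1.0 - std::pow(beta2, static_cast<double>(state.t));
  for (std::size_t i = 0; i < params.size(); ++i)
  {
    state.m[i] = beta1 * state.m[i] + (1.0 - beta1) * grad[i];
    state.v[i] = beta2 * state.v[i] + (1.0 - beta2) * grad[i] * grad[i];
    const double m_hat = state.m[i] / c1;
    const double v_hat = state.v[i] / c2;
    params[i] -= lr * m_hat / (std::sqrt(v_hat) + eps);
  }
}
```

On the very first step the bias-corrected ratio is close to the sign of the gradient, so each parameter moves by roughly the learning rate regardless of the gradient magnitude.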

For more detailed instructions on training a neural network, visit the Stochastic Tools module!

References

Berndt Müller, Joachim Reinhardt, and Michael T Strickland. Neural networks: an introduction. Springer Science & Business Media, 1995.

(moose/test/src/libtorch/vectorpostprocessors/LibtorchArtificialNeuralNetTest.C)

// This file is part of the MOOSE framework
// https://www.mooseframework.org
//
// All rights reserved, see COPYRIGHT for full restrictions
// https://github.com/idaholab/moose/blob/master/COPYRIGHT
//
// Licensed under LGPL 2.1, please see LICENSE for details
// https://www.gnu.org/licenses/lgpl-2.1.html

#ifdef LIBTORCH_ENABLED

#include <torch/torch.h>
#include "LibtorchArtificialNeuralNet.h"
#include "LibtorchArtificialNeuralNetTest.h"

registerMooseObject("MooseTestApp", LibtorchArtificialNeuralNetTest);

InputParameters
LibtorchArtificialNeuralNetTest::validParams()
{
  InputParameters params = GeneralVectorPostprocessor::validParams();

  params.addParam<std::vector<std::string>>(
      "activation_functions", std::vector<std::string>({"relu"}), "Test activation functions");

  return params;
}

LibtorchArtificialNeuralNetTest::LibtorchArtificialNeuralNetTest(const InputParameters & params)
  : GeneralVectorPostprocessor(params), _nn_values(declareVector("nn_values"))
{
  torch::manual_seed(11);

  // Define neurons per hidden layer: we will have two hidden layers with 4 neurons each
  std::vector<unsigned int> num_neurons_per_layer({4, 4});
  // Create the neural network with name "test", number of input neurons = 3,
  // number of output neurons = 1, and activation functions from the input file.
  std::shared_ptr<Moose::LibtorchArtificialNeuralNet> nn =
      std::make_shared<Moose::LibtorchArtificialNeuralNet>(
          "test",
          3,
          1,
          num_neurons_per_layer,
          getParam<std::vector<std::string>>("activation_functions"));

  // Create an Adam optimizer
  torch::optim::Adam optimizer(nn->parameters(), torch::optim::AdamOptions(0.02));
  // reset the gradients
  optimizer.zero_grad();
  // This is our test input
  torch::Tensor input = at::ones({1, 3}, at::kDouble);
  // This is our test output (we know the result)
  torch::Tensor output = at::ones({1}, at::kDouble);
  // This is our prediction for the test input
  torch::Tensor prediction = nn->forward(input);
  // We save our first prediction
  _nn_values.push_back(prediction.item<double>());
  // We compute the loss
  torch::Tensor loss = torch::mse_loss(prediction, output);
  // We propagate the error back to compute gradient
  loss.backward();
  // We update the weights using the computed gradients
  optimizer.step();
  // Obtain another prediction
  prediction = nn->forward(input);
  // We save our second prediction
  _nn_values.push_back(prediction.item<double>());
}

#endif
