Title: | Gaussian Process Models for Scalar and Functional Inputs |
---|---|
Description: | Construction and smart selection of Gaussian process models for analysis of computer experiments with emphasis on treatment of functional inputs that are regularly sampled. This package offers: (i) flexible modeling of functional-input regression problems through the fairly general Gaussian process model; (ii) built-in dimension reduction for functional inputs; (iii) heuristic optimization of the structural parameters of the model (e.g., active inputs, kernel function, type of distance). An in-depth tutorial on the use of funGp is provided in Betancourt et al. (2024) <doi:10.18637/jss.v109.i05> and metamodeling background is provided in Betancourt et al. (2020) <doi:10.1016/j.ress.2020.106870>. The algorithm for structural parameter optimization is described in <https://hal.science/hal-02532713>. |
Authors: | Jose Betancourt [cre, aut], François Bachoc [aut], Thierry Klein [aut], Jeremy Rohmer [aut], Yves Deville [ctb], Deborah Idier [ctb] |
Maintainer: | Jose Betancourt |
License: | GPL-3 |
Version: | 1.0.0 |
Built: | 2025-02-21 03:44:35 UTC |
Source: | https://github.com/djbetancourt-gh/fungp |
Construction and smart selection of Gaussian process models for analysis of computer experiments with emphasis on treatment of functional inputs that are regularly sampled. Smart selection is based on an Ant Colony Optimization (ACO) algorithm.
Main methods
fgpm: creation of funGp regression models
predict,fgpm-method: output estimation at new input points based on a funGp model
simulate,fgpm-method: random sampling from a funGp Gaussian process model
update,fgpm-method: modification of data and hyperparameters of a funGp model
Plotters
plot,fgpm-method: validation plot for a fgpm model
plot.predict.fgpm: plot of predictions based on a fgpm model
plot.simulate.fgpm: plot of simulations based on a fgpm model
Main method
fgpm_factory: structural parameter optimization
Functions for pre-optimization
decay: regularized initial pheromones
decay2probs: normalized initial pheromones
Plotters post-optimization
plot,Xfgpm-method: plot of the evolution of the algorithm with which = "evolution" or of the absolute and relative quality of the optimized model with which = "diag"
Correction post-optimization of input data structures
which_on: indices of active inputs in a model structure delivered by fgpm_factory
get_active_in: extraction of active input data based on a model structure delivered by fgpm_factory
Manual: funGp: An R Package for Gaussian Process Regression with Scalar and Functional Inputs (doi:10.18637/jss.v109.i05)
Paper: Gaussian process metamodeling of functional-input code for coastal flood hazard assessment (doi:10.1016/j.ress.2020.106870)
Tech. report: Ant Colony Based Model Selection for Functional-Input Gaussian Process Regression (https://hal.science/hal-02532713)
José Betancourt, François Bachoc and Thierry Klein
Déborah Idier and Jérémy Rohmer
This package was first developed in the frame of the RISCOPE research project, funded by the French Agence Nationale de la Recherche (ANR) for the period 2017-2021 (ANR, project No. 16CE04-0011, RISCOPE.fr), and certified by SAFE Cluster.
Maintainer: Jose Betancourt
Authors:
François Bachoc
Thierry Klein
Jeremy Rohmer
Other contributors:
Yves Deville [contributor]
Deborah Idier [contributor]
Useful links:
Report bugs at https://github.com/djbetancourt-gh/funGp/issues
Refit a fgpm model in a Xfgpm object
Refit a fgpm model as described in a Xfgpm object.
## S4 method for signature 'Xfgpm'
x[[i]]
x |
A |
i |
An integer giving the index of the model to refit. The
models are in decreasing fit quality as assessed by the
Leave-One-Out |
While the syntax may suggest that the function extracts a fitted fgpm model, this is not true. The fgpm model is refitted using the call that was used when this model was assessed. The refitted fgpm model keeps the same structural parameters as the one assessed (active variables, kernel, ...), but since the optimization uses random initial values, the optimized hyper-parameters may differ from those of the corresponding fgpm in the Xfgpm object x. As a result, the model can be different and show a different LOO performance.
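The sensitivity to initial values described here can be illustrated with a toy example. The sketch below does not use funGp: the objective `toy_objective`, the helper `descend` and the two starting points are hypothetical stand-ins for a likelihood surface with two local optima.

```r
# Toy objective with two local minima (near x = -1 and x = 1).
# It is a stand-in for a negative log-likelihood, not the funGp one.
toy_objective <- function(x) (x^2 - 1)^2 + 0.1 * x
toy_gradient  <- function(x) 4 * x * (x^2 - 1) + 0.1

# Plain gradient descent with a small fixed step (hypothetical helper).
descend <- function(start, step = 0.01, n_iter = 2000) {
  x <- start
  for (i in seq_len(n_iter)) x <- x - step * toy_gradient(x)
  x
}

# Two different initial values, as could happen with random starts.
opt_a <- descend(1.5)
opt_b <- descend(-1.5)

# Each run settles in a different basin, with a different objective value.
c(opt_a = opt_a, opt_b = opt_b)
c(f_a = toy_objective(opt_a), f_b = toy_objective(opt_b))
```

In the same way, a refit that starts from other random hyper-parameter values may end at a different optimum and hence at a different LOO score.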
The slot @model returns the best fgpm as assessed in a Xfgpm model x. So this model can be expected to be close to x[[1]]. Yet due to the refit, the two models x@model and x[[1]] can differ; see the explanations in the Caution section.
The modelDef function extracts the definition of a fgpm model, e.g., to evaluate it using new data sIn, fIn and sOut.
## see `?xm` to see how to recreate the pre-calculated `Xfgpm` object `xm`.
xm[[2]]
Register of model structures and their performance statistics, if available.
sols
Object of class "data.frame". Compendium of model structures arranged by rows. Each column is linked to one structural parameter of the model such as the state of one variable (inactive, active) or the type of kernel function.
args
Object of class "list". Compendium of model structures represented by objects of class "modelCall".
fitness
Object of class "numeric". Performance statistic of each model, if available.
José Betancourt, François Bachoc, Thierry Klein and Jérémy Rohmer
Set of analytic functions that take functional variables as inputs. Since they run quickly, they can be used for testing of funGp functionalities as if they were black box computer models. They cover different situations (number of scalar inputs and complexity of the inputs-output mathematical relationship).
fgp_BB1(sIn, fIn, n.tr)
fgp_BB2(sIn, fIn, n.tr)
fgp_BB3(sIn, fIn, n.tr)
fgp_BB4(sIn, fIn, n.tr)
fgp_BB5(sIn, fIn, n.tr)
fgp_BB6(sIn, fIn, n.tr)
fgp_BB7(sIn, fIn, n.tr)
sIn |
Object with class |
fIn |
Object with class |
n.tr |
Object with class |
For all the functions, the scalar inputs are in the real interval [0, 1] and the functional inputs are defined on the interval [0, 1]. Expressions for the values are as follows.
fgp_BB1
With two scalar inputs (x1, x2) and two functional inputs (f1, f2):
x1 * sin(x2) + x1 * mean(f1) - x2^2 * diff(range(f2))
fgp_BB2
With two scalar inputs and two functional inputs:
x1 * sin(x2) + mean(exp(x1 * t1) * f1) - x2^2 * mean(f2^2 * t2)
fgp_BB3
With two scalar inputs and two functional inputs; this is the first analytical example in Muehlenstaedt et al. (2017):
x1 + 2 * x2 + 4 * mean(t1 * f1) + mean(f2)
fgp_BB4
With two scalar inputs and two functional inputs; this is the second analytical example in the preprint of Muehlenstaedt et al. (2017):
(x2 - (5 / (4 * pi^2)) * x1^2 + (5 / pi) * x1 - 6)^2 + 10 * (1 - (1 / (8 * pi))) * cos(x1) + 10 + (4 / 3) * pi * (42 * mean(f1 * (1 - t1)) + pi * (((x1 + 5) / 5) + 15) * mean(t2 * f2))
fgp_BB5
With two scalar inputs and two functional inputs; inspired by the second analytical example in the final version of Muehlenstaedt et al. (2017):
(x2 - (5 / (4 * pi^2)) * x1^2 + (5 / pi) * x1 - 6)^2 + 10 * (1 - (1 / (8 * pi))) * cos(x1) + 10 + (4 / 3) * pi * (42 * mean(15 * f1 * (1 - t1) - 5) + pi * (((x1 + 5) / 5) + 15) * mean(15 * t2 * f2))
fgp_BB6
With two scalar inputs and two functional inputs; inspired by the analytical example in Nanty et al. (2016):
2 * x1^2 + 2 * mean(f1 + t1) + 2 * mean(f2 + t2) + max(f2) + x2
fgp_BB7
With five scalar inputs (x1, ..., x5) and two functional inputs; inspired by the second analytical example in the final version of Muehlenstaedt et al. (2017):
(x2 + 4 * x3 - (5 / (4 * pi^2)) * x1^2 + (5 / pi) * x1 - 6)^2 + 10 * (1 - (1 / (8 * pi))) * cos(x1) * x2^2 * x5^3 + 10 + (4 / 3) * pi * (42 * sin(x4) * mean(15 * f1 * (1 - t1) - 5) + pi * (((x1 * x5 + 5) / 5) + 15) * mean(15 * t2 * f2))
An object of class "matrix" with the values of the output at the specified input coordinates.
The functions listed above were used to validate the functionality and stability of this package. Several tests involving all main functions, plotters and getters were run for scalar-input, functional-input and hybrid-input models. In all cases, the outputs of the functions were correct from the statistical and programmatic perspectives. For an example of the kind of tests performed, the interested user is referred to the introductory funGp manual (doi:10.18637/jss.v109.i05).
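To make the expressions more concrete, the following sketch evaluates the fgp_BB1 expression row by row in base R. The function name `bb1_sketch` and the small input values are hypothetical; this is an illustration of the formula, not the package's own implementation.

```r
# Hypothetical re-implementation of the fgp_BB1 expression, for illustration.
# sIn: matrix with columns x1, x2; fIn: list of two matrices (one row per point).
bb1_sketch <- function(sIn, fIn) {
  sIn <- as.matrix(sIn)
  x1 <- sIn[, 1]
  x2 <- sIn[, 2]
  f1_mean  <- rowMeans(fIn[[1]])                              # mean(f1) per curve
  f2_range <- apply(fIn[[2]], 1, function(f) diff(range(f)))  # diff(range(f2)) per curve
  matrix(x1 * sin(x2) + x1 * f1_mean - x2^2 * f2_range, ncol = 1)
}

# Two input points, each with two scalars and two short curves.
sIn <- matrix(c(0.5, 0,
                1.0, 1), ncol = 2, byrow = TRUE)
fIn <- list(f1 = matrix(c(0, 1,
                          1, 1), ncol = 2, byrow = TRUE),
            f2 = matrix(c(0.3, 0.3,
                          0.2, 0.7), ncol = 2, byrow = TRUE))
out <- bb1_sketch(sIn, fIn)
out
```

The result is a one-column matrix with one row per input point, matching the return type described above.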
Muehlenstaedt, T., Fruth, J., and Roustant, O. (2017), "Computer experiments with functional inputs and scalar outputs by a norm-based approach". Statistics and Computing, 27, 1083-1097.
Nanty, S., Helbert, C., Marrel, A., Pérot, N., and Prieur, C. (2016), "Sampling, metamodeling, and sensitivity analysis of numerical simulators with functional stochastic inputs". SIAM/ASA Journal on Uncertainty Quantification, 4(1), 636-659. doi:10.1137/15M1033319
This function is intended to aid the selection of the heuristic parameters tao0, delta and dispr in the call to the model selection function fgpm_factory. The values computed by decay are the ones that would be used by the ant colony algorithm as initial pheromone load of the links pointing out to projection on each dimension. For more details, check the technical report explaining the ant colony algorithm implemented in funGp, and the manual of the package (doi:10.18637/jss.v109.i05).
decay(k, pmax = NULL, tao0 = 0.1, delta = 2, dispr = 1.4, doplot = TRUE, deliver = FALSE)
k |
A number indicating the dimension of the functional input under analysis. |
pmax |
An optional number specifying the hypothetical maximum projection dimension of this input. The user will be able to set this value later in the call to fgpm_factory as a constraint. If not specified, it takes the value of k. |
tao0 |
Explained in the description of dispr. |
delta |
Explained in the description of dispr. |
dispr |
The arguments tao0, delta and dispr, are optional numbers specifying the loss function that determines the initial pheromone load on the links pointing out to projection dimensions. Such a function is defined as
with p taking the values of the projection dimensions. The argument tao0 indicates the pheromone
load in the links pointing out to the smallest dimensions; delta specifies how many dimensions
should preserve the maximum pheromone load; dispr determines how fast the pheromone load drops
in dimensions further than |
doplot |
An optional boolean indicating if the pheromone loads should be plotted. Default = TRUE. |
deliver |
An optional boolean indicating if the pheromone loads should be returned. Default = FALSE. |
If deliver is TRUE, an object of class "numeric" containing the initial pheromone values corresponding to the specified projection dimensions. Otherwise, the function plots the pheromones and nothing is returned.
José Betancourt, François Bachoc, Thierry Klein and Jérémy Rohmer
* decay2probs for the function to generate the initial probability load;
* fgpm_factory for heuristic funGp model selection.
# using default decay arguments____________________________________________________________
# input of dimension 15 projected maximum in dimension 15
decay(15)
# input of dimension 15 projected maximum in dimension 8
decay(15, 8)
# playing with decay arguments_____________________________________________________________
# input of dimension 15 projected maximum in dimension 15
decay(15)
# using a larger value of tao0
decay(15, tao0 = .3)
# using a larger value of tao0, keeping it fixed up to higher dimensions
decay(15, tao0 = .3, delta = 5)
# using a larger value of tao0, keeping it fixed up to higher dimensions, with slower decay
decay(15, tao0 = .3, delta = 5, dispr = 5.2)
# requesting pheromone values______________________________________________________________
# input of dimension 15 projected maximum in dimension 15
decay(15, deliver = TRUE)
This function is intended to aid the selection of the heuristic parameters tao0, delta and dispr in the call to the model selection function fgpm_factory. The values computed by decay2probs are the ones that would be used by the ant colony algorithm as probability load of the links pointing out to projection on each dimension. These values result from normalizing the initial pheromone loads delivered by the decay function so that they sum to 1. For more details, check the technical report explaining the ant colony algorithm implemented in funGp, and the manual of the package (doi:10.18637/jss.v109.i05).
decay2probs(k, pmax = NULL, tao0 = 0.1, delta = 2, dispr = 1.4, doplot = TRUE, deliver = FALSE)
k |
A number indicating the dimension of the functional input under analysis. |
pmax |
An optional number specifying the hypothetical maximum projection dimension of this input. The user will be able to set this value later in the call to fgpm_factory as a constraint. If not specified, it takes the value of k. |
tao0 |
Explained in the description of dispr. |
delta |
Explained in the description of dispr. |
dispr |
The arguments tao0, delta and dispr, are optional numbers specifying the loss function that determines the initial pheromone load on the links pointing out to projection dimensions. Such a function is defined as
with p taking the values of the projection dimensions. The argument tao0 indicates the pheromone
load in the links pointing out to the smallest dimensions; delta specifies how many dimensions
should preserve the maximum pheromone load; dispr determines how fast the pheromone load drops
in dimensions further than |
doplot |
An optional boolean indicating if the probability loads should be plotted. Default = TRUE. |
deliver |
An optional boolean indicating if the probability loads should be returned. Default = FALSE. |
If deliver is TRUE, an object of class "numeric" containing the normalized initial pheromone values corresponding to the specified projection dimensions. Otherwise, the function plots the normalized pheromones and nothing is returned.
José Betancourt, François Bachoc, Thierry Klein and Jérémy Rohmer
* decay for the function to generate the initial pheromone load;
* fgpm_factory for heuristic model selection in funGp.
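The normalization step that turns pheromone loads into probability loads can be sketched in a few lines of base R. The pheromone values and the helper `to_probs` below are hypothetical; they are illustrative numbers, not output of decay.

```r
# Hypothetical initial pheromone loads for projection dimensions 1..5
# (illustrative numbers, not values produced by decay()).
pheromones <- c(0.10, 0.10, 0.08, 0.04, 0.01)

# Normalization: divide by the total so that the loads sum to 1.
to_probs <- function(tau) tau / sum(tau)

probs <- to_probs(pheromones)
sum(probs)

# Multiplying every load by the same constant leaves the probabilities unchanged.
all.equal(to_probs(3 * pheromones), probs)
```

Under the assumption that tao0 acts as a common scale factor on all loads, this invariance would explain why changing tao0 alone does not alter the probability loads.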
# using default decay arguments____________________________________________________________
# input of dimension 15 projected maximum in dimension 15
decay(15) # initial pheromone load
decay2probs(15) # initial probability load
# input of dimension 15 projected maximum in dimension 8
decay(15, 8) # initial pheromone load
decay2probs(15, 8) # initial probability load
# playing with decay2probs arguments_______________________________________________________
# varying the initial pheromone load
decay(15) # input of dimension 15 projected maximum in dimension 15
decay(15, tao0 = .3) # larger value of tao0
decay(15, tao0 = .3, delta = 5) # larger tao0 kept to higher dimensions
decay(15, tao0 = .3, delta = 5, dispr = 5.2) # larger tao0 kept to higher dimensions
                                             # and slower decay
# varying the initial probability load
decay2probs(15) # input of dimension 15 projected maximum in dimension 15
decay2probs(15, tao0 = .3) # larger value of tao0 (no effect whatsoever)
decay2probs(15, tao0 = .3, delta = 5) # larger tao0 kept to higher dimensions
decay2probs(15, tao0 = .3, delta = 5, dispr = 5.2) # larger tao0 kept to higher dimensions
                                                   # and slower decay
# requesting probability values____________________________________________________________
# input of dimension 15 projected maximum in dimension 15
decay2probs(15, deliver = TRUE)
User reminder of the fgpm function call.
string
Object of class "character". User call reminder in string format.
José Betancourt, François Bachoc, Thierry Klein and Jérémy Rohmer
fgpm model
This is the formal representation for data structures linked to the kernel of a Gaussian process model within the funGp package.
kerType
Object of class "character". Kernel type. To be set from {"gauss", "matern5_2", "matern3_2"}.
f_disType
Object of class "character". Distance type. To be set from {"L2_bygroup", "L2_byindex"}.
varHyp
Object of class "numeric". Estimated variance parameter.
s_lsHyps
Object of class "numeric". Estimated length-scale parameters for scalar inputs.
f_lsHyps
Object of class "numeric". Estimated length-scale parameters for functional inputs.
f_lsOwners
Object of class "character". Index of functional input variable linked to each element in f_lsHyps.
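As a rough illustration of the three kernel types listed for kerType, the sketch below evaluates the textbook Gaussian and Matérn correlation functions at a scaled distance d. The function name `kernel_corr` and the exact parameterization are assumptions; the scaling used inside funGp may differ.

```r
# Textbook correlation functions of a non-negative scaled distance d.
# Assumed forms for illustration, not taken from the funGp source code.
kernel_corr <- function(d, kerType = c("gauss", "matern5_2", "matern3_2")) {
  kerType <- match.arg(kerType)
  switch(kerType,
         gauss     = exp(-0.5 * d^2),
         matern5_2 = (1 + sqrt(5) * d + 5 / 3 * d^2) * exp(-sqrt(5) * d),
         matern3_2 = (1 + sqrt(3) * d) * exp(-sqrt(3) * d))
}

# Correlation at a few distances, one column per kernel type.
d <- seq(0, 2, by = 0.5)
sapply(c("gauss", "matern5_2", "matern3_2"), function(k) kernel_corr(d, k))
```

All three functions equal 1 at distance 0 and decrease with distance; they differ in how smooth the resulting Gaussian process is.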
José Betancourt, François Bachoc, Thierry Klein and Jérémy Rohmer
This function enables fitting of Gaussian process regression models. The inputs can be either scalar, functional or a combination of both types.
fgpm(
  sIn = NULL,
  fIn = NULL,
  sOut,
  kerType = "matern5_2",
  f_disType = "L2_bygroup",
  f_pdims = 3,
  f_basType = "B-splines",
  var.hyp = NULL,
  ls_s.hyp = NULL,
  ls_f.hyp = NULL,
  nugget = 1e-08,
  n.starts = 1,
  n.presample = 20,
  par.clust = NULL,
  trace = TRUE,
  pbars = TRUE,
  control.optim = list(trace = TRUE),
  ...
)
sIn |
An optional matrix of scalar input values to train the model. Each column must match an input variable and each row a training point. Either scalar input coordinates (sIn), functional input coordinates (fIn), or both must be provided. |
fIn |
An optional list of functional input values to train the model. Each element of the list must be a matrix containing the set of curves corresponding to one functional input. Either scalar input coordinates (sIn), functional input coordinates (fIn), or both must be provided. |
sOut |
A vector (or 1-column matrix) containing the values of the scalar output at the specified input points. |
kerType |
An optional character string specifying the covariance structure to be used. To be chosen between "gauss", "matern5_2" and "matern3_2". Default is "matern5_2". |
f_disType |
An optional array of character strings specifying the distance function to be used for each functional input within the covariance function of the Gaussian process. To be chosen between "L2_bygroup" and "L2_byindex". The L2_bygroup distance considers each curve as a whole and uses a single length-scale parameter per functional input variable. The L2_byindex distance uses as many length-scale parameters per functional input as discretization points it has. For instance, an input discretized as a vector of size 8 will use 8 length-scale parameters when using L2_byindex. If dimension reduction of a functional input is requested, then L2_byindex uses as many length-scale parameters as effective dimensions used to represent the input. A single character string can also be passed as a general selection for all the functional inputs of the model. More details in the reference article (doi:10.1016/j.ress.2020.106870) and the in-depth package manual (doi:10.18637/jss.v109.i05). Default is "L2_bygroup". |
f_pdims |
An optional array with the projection dimension for each functional input. For each input, the projection dimension should be an integer between 0 and its original dimension, with 0 denoting no projection. A single number can also be passed as a general selection for all the functional inputs of the model. Default is 3. |
f_basType |
An optional array of character strings specifying the family of basis functions to be used in the projection of each functional input. To be chosen between "B-splines" and "PCA". A single character string can also be passed as a general selection for all the functional inputs of the model. This argument will be ignored for those inputs for which no projection was requested (i.e., for which f_pdims = 0). Default is "B-splines". |
var.hyp |
An optional number indicating the value that should be used as the variance parameter of the model. If not provided, it is estimated through likelihood maximization. |
ls_s.hyp |
An optional numeric array indicating the values that should be used as length-scale parameters for the scalar inputs. If provided, the size of the array should match the number of scalar inputs. If not provided, these parameters are estimated through likelihood maximization. |
ls_f.hyp |
An optional numeric array indicating the values that should be used as length-scale parameters for the functional inputs. If provided, the size of the array should match the number of effective dimensions. Each input using the "L2_bygroup" distance will count as 1 effective dimension, and each input using the "L2_byindex" distance will count as many effective dimensions as specified by the corresponding element of the f_pdims argument. For instance, two functional inputs of original dimensions 10 and 22, the first one projected onto a space of dimension 5 with "L2_byindex" distance, and the second one not projected with "L2_bygroup" distance, will make up a total of 6 effective dimensions: five for the first functional input and one for the second one. If this argument is not provided, the functional length-scale parameters are estimated through likelihood maximization. |
nugget |
An optional variance value standing for the homogeneous nugget effect. A tiny nugget might help to overcome numerical problems related to the ill-conditioning of the covariance matrix. Default is 1e-8. |
n.starts |
An optional integer indicating the number of initial points to use for the optimization of the hyperparameters. A parallel processing cluster can be exploited in order to speed up the evaluation of multiple initial points. More details in the description of the argument par.clust below. Default is 1. |
n.presample |
An optional integer indicating the number of points to be tested in order to select the
n.starts initial points. The n.presample points will be randomly sampled from the hyper-rectangle defined by: |
par.clust |
An optional parallel processing cluster created with the |
trace |
An optional boolean indicating if control messages native of the funGp package should be printed to
console. Default is TRUE. For complementary control on the display of funGp-native progress bars and
|
pbars |
An optional boolean indicating if progress bars should be displayed. Default is TRUE. |
control.optim |
An optional list to be passed as the |
... |
Extra control parameters. Currently only used internally for some |
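The counting rule for effective dimensions given in the descriptions of f_disType and ls_f.hyp can be sketched as a small helper. The function `count_effective_dims` is hypothetical and not part of funGp; it only restates the rule from the argument descriptions.

```r
# Hypothetical helper: number of functional length-scale parameters expected
# in ls_f.hyp, following the counting rule described for f_disType and ls_f.hyp.
count_effective_dims <- function(orig_dims, f_disType, f_pdims) {
  stopifnot(length(orig_dims) == length(f_disType),
            length(orig_dims) == length(f_pdims))
  # dimension actually used to represent each input (0 means no projection)
  used_dims <- ifelse(f_pdims == 0, orig_dims, f_pdims)
  # L2_bygroup counts once per input, L2_byindex once per used dimension
  per_input <- ifelse(f_disType == "L2_bygroup", 1, used_dims)
  sum(per_input)
}

# Example from the ls_f.hyp description: inputs of dimensions 10 and 22
n_dims_a <- count_effective_dims(orig_dims = c(10, 22),
                                 f_disType = c("L2_byindex", "L2_bygroup"),
                                 f_pdims   = c(5, 0))   # 5 + 1 = 6

# f1 with L2_byindex and no projection, f2 with L2_bygroup projected to 5
n_dims_b <- count_effective_dims(orig_dims = c(10, 22),
                                 f_disType = c("L2_byindex", "L2_bygroup"),
                                 f_pdims   = c(0, 5))   # 10 + 1 = 11
c(n_dims_a, n_dims_b)
```

A user-supplied ls_f.hyp would need to have this many elements.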
An object of class fgpm containing the data structures representing the fitted funGp model.
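The role of the nugget argument can be illustrated without funGp. The sketch below builds a small correlation matrix with a Gaussian kernel (an assumed textbook form) for three hypothetical points, two of which coincide, and shows that a tiny diagonal term makes the Cholesky factorization feasible.

```r
# Gaussian correlation matrix for three 1-D points, two of which coincide.
x <- c(0, 0, 1)
K <- exp(-0.5 * outer(x, x, "-")^2)

# Without a nugget the matrix is singular and the Cholesky factorization fails.
chol_plain <- tryCatch(chol(K), error = function(e) NULL)
is.null(chol_plain)

# Adding a tiny nugget to the diagonal restores positive definiteness.
nugget <- 1e-8
chol_nugget <- chol(K + nugget * diag(length(x)))
dim(chol_nugget)
```

Near-duplicate training points cause the same kind of ill-conditioning in practice, which is the situation the nugget is meant to mitigate.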
José Betancourt, François Bachoc, Thierry Klein and Jérémy Rohmer
Betancourt, J., Bachoc, F., Klein, T., Idier, D., Rohmer, J., and Deville, Y. (2024), "funGp: An R Package for Gaussian Process Regression with Scalar and Functional Inputs". Journal of Statistical Software, 109, 5, 1–51. (doi:10.18637/jss.v109.i05)
Betancourt, J., Bachoc, F., Klein, T., Idier, D., Pedreros, R., and Rohmer, J. (2020), "Gaussian process metamodeling of functional-input code for coastal flood hazard assessment". Reliability Engineering & System Safety, 198, 106870. (doi:10.1016/j.ress.2020.106870) [HAL]
Betancourt, J., Bachoc, F., Klein, T., and Gamboa, F. (2020), Technical Report: "Ant Colony Based Model Selection for Functional-Input Gaussian Process Regression. Ref. D3.b (WP3.2)". RISCOPE project. [HAL]
Betancourt, J., Bachoc, F., and Klein, T. (2020), R Package Manual: "Gaussian Process Regression for Scalar and Functional Inputs with funGp - The in-depth tour". RISCOPE project. [HAL]
* plot,fgpm-method: validation plot for a fgpm model;
* predict,fgpm-method for predictions based on a fgpm model;
* simulate,fgpm-method for simulations based on a fgpm model;
* update,fgpm-method for post-creation updates on a fgpm model;
* fgpm_factory for funGp heuristic model selection.
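The projection requested through f_pdims and f_basType = "PCA" can be pictured with a base R sketch. It assumes an uncentered, SVD-based PCA, which may differ from how funGp builds its basis; the helper `project_pca` and the simulated curves are hypothetical.

```r
set.seed(1)
n <- 25   # number of curves
k <- 10   # original dimension (discretization points per curve)
p <- 3    # projection dimension, playing the role of f_pdims
f1 <- matrix(runif(n * k), ncol = k)

# Uncentered SVD-based PCA (an assumption; funGp may compute its basis differently).
project_pca <- function(f, p) {
  basis <- svd(f)$v[, seq_len(p), drop = FALSE]   # k x p orthonormal basis
  coefs <- f %*% basis                            # n x p projection coefficients
  list(coefs = coefs, basis = basis, projected = coefs %*% t(basis))
}

proj <- project_pca(f1, p)
dim(proj$coefs)      # n x p
dim(proj$projected)  # n x k

# Keeping all k directions reproduces the original curves (up to rounding).
all.equal(project_pca(f1, k)$projected, f1)
```

The product of coefficients and transposed basis is the same operation the examples apply to the f_proj slots of a fitted model to recover the projected functional inputs.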
# creating funGp model using default fgpm arguments________________________________________
# generating input data for training
set.seed(100)
n.tr <- 25
sIn <- expand.grid(x1 = seq(0,1,length = sqrt(n.tr)), x2 = seq(0,1,length = sqrt(n.tr)))
fIn <- list(f1 = matrix(runif(n.tr*10), ncol = 10), f2 = matrix(runif(n.tr*22), ncol = 22))
# generating output data for training
sOut <- fgp_BB3(sIn, fIn, n.tr)
# building a scalar-input funGp model
ms <- fgpm(sIn = sIn, sOut = sOut)
# building a functional-input funGp model
mf <- fgpm(fIn = fIn, sOut = sOut)
# building a hybrid-input funGp model
msf <- fgpm(sIn = sIn, fIn = fIn, sOut = sOut)
# plotting the three models
plot(ms)
plot(mf)
plot(msf)
# printing the three models
summary(ms) # equivalent to show(ms)
summary(mf) # equivalent to show(mf)
summary(msf) # equivalent to show(msf)

# recovering useful information from a funGp model_________________________________________
# building the model
set.seed(100)
n.tr <- 25
sIn <- expand.grid(x1 = seq(0,1,length = sqrt(n.tr)), x2 = seq(0,1,length = sqrt(n.tr)))
fIn <- list(f1 = matrix(runif(n.tr*10), ncol = 10), f2 = matrix(runif(n.tr*22), ncol = 22))
sOut <- fgp_BB3(sIn, fIn, n.tr)
m1 <- fgpm(sIn = sIn, fIn = fIn, sOut = sOut)
# recovering data from model slots
m1@f_proj@coefs # list of projection coefficients for the functional inputs
m1@f_proj@basis # list of projection basis functions for the functional inputs
Map(function(a, b) a %*% t(b), m1@f_proj@coefs, m1@f_proj@basis) # list of projected
                                                                 # functional inputs
tcrossprod(m1@preMats$L) # training auto-covariance matrix

# making predictions based on a funGp model________________________________________________
# building the model
set.seed(100)
n.tr <- 25
sIn <- expand.grid(x1 = seq(0,1,length = sqrt(n.tr)), x2 = seq(0,1,length = sqrt(n.tr)))
fIn <- list(f1 = matrix(runif(n.tr*10), ncol = 10), f2 = matrix(runif(n.tr*22), ncol = 22))
sOut <- fgp_BB3(sIn, fIn, n.tr)
m1 <- fgpm(sIn = sIn, fIn = fIn, sOut = sOut)
# generating input data for prediction
n.pr <- 100
sIn.pr <- as.matrix(expand.grid(x1 = seq(0,1,length = sqrt(n.pr)),
                                x2 = seq(0,1,length = sqrt(n.pr))))
fIn.pr <- list(f1 = matrix(runif(n.pr*10), ncol = 10), matrix(runif(n.pr*22), ncol = 22))
# making predictions
m1.preds <- predict(m1, sIn.pr = sIn.pr, fIn.pr = fIn.pr)
# plotting predictions
plot(m1.preds)

# simulating from a funGp model____________________________________________________________
# building the model
set.seed(100)
n.tr <- 25
sIn <- expand.grid(x1 = seq(0,1,length = sqrt(n.tr)), x2 = seq(0,1,length = sqrt(n.tr)))
fIn <- list(f1 = matrix(runif(n.tr*10), ncol = 10), f2 = matrix(runif(n.tr*22), ncol = 22))
sOut <- fgp_BB3(sIn, fIn, n.tr)
m1 <- fgpm(sIn = sIn, fIn = fIn, sOut = sOut)
# generating input data for simulation
n.sm <- 100
sIn.sm <- as.matrix(expand.grid(x1 = seq(0,1,length = sqrt(n.sm)),
                                x2 = seq(0,1,length = sqrt(n.sm))))
fIn.sm <- list(f1 = matrix(runif(n.sm*10), ncol = 10), matrix(runif(n.sm*22), ncol = 22))
# making simulations
m1.sims <- simulate(m1, nsim = 10, sIn.sm = sIn.sm, fIn.sm = fIn.sm)
# plotting simulations
plot(m1.sims)

# creating funGp model using custom fgpm arguments_________________________________________
# generating input and output data
set.seed(100)
n.tr <- 25
sIn <- expand.grid(x1 = seq(0,1,length = sqrt(n.tr)), x2 = seq(0,1,length = sqrt(n.tr)))
fIn <- list(f1 = matrix(runif(n.tr*10), ncol = 10), f2 = matrix(runif(n.tr*22), ncol = 22))
sOut <- fgp_BB3(sIn, fIn, n.tr)
# original dimensions
# f1: 10
# f2: 22
# building the model with the following structure
# - Kernel: Gaussian
# - f1: L2_byindex distance, no projection -> 10 length-scale parameters
# - f2: L2_bygroup distance, B-spline basis of dimension 5 -> 1 length-scale parameter
m1 <- fgpm(sIn = sIn, fIn = fIn, sOut = sOut,
           kerType = "gauss", f_disType = c("L2_byindex", "L2_bygroup"),
           f_pdims = c(0,5), f_basType = c(NA, "B-splines"))
# plotting the model
plot(m1)
# printing the model
m1 # equivalent to show(m1)

## Not run:
# multistart and parallelization in fgpm___________________________________________________
# generating input and output data
set.seed(100)
n.tr <- 243
sIn <- expand.grid(x1 = seq(0,1,length = n.tr^(1/5)), x2 = seq(0,1,length = n.tr^(1/5)),
                   x3 = seq(0,1,length = n.tr^(1/5)), x4 = seq(0,1,length = n.tr^(1/5)),
                   x5 = seq(0,1,length = n.tr^(1/5)))
fIn <- list(f1 = matrix(runif(n.tr*10), ncol = 10), f2 = matrix(runif(n.tr*22), ncol = 22))
sOut <- fgp_BB7(sIn, fIn, n.tr)
# calling fgpm with multistart in parallel
cl <- parallel::makeCluster(2)
m1 <- fgpm(sIn = sIn, fIn = fIn, sOut = sOut, n.starts = 10, par.clust = cl) # (~14 seconds)
parallel::stopCluster(cl)
# NOTE: in order to provide progress bars for the monitoring of time-consuming processes
# run in parallel, funGp relies on the doFuture and future packages. Parallel processes
# suddenly interrupted by the user tend to leave corrupt connections. This problem
# originates outside funGp, which limits our control over it. In the initial (unpublished)
# version of the funGp manual, we provide a temporary solution to the issue and we remain
# attentive in case a more elegant way to handle or suppress it appears.
#
# funGp original (unpublished) manual: https://hal.science/hal-02536624
## End(Not run)
This function enables the smart exploration of the solution space of potential structural configurations of a funGp model, and the consequent selection of a high quality configuration. funGp currently relies on an ant colony based algorithm to perform this task. The algorithm defines the solution space based on the levels of each structural parameter currently available in the fgpm function, and performs a smart exploration of it. More details on the algorithm are provided in a dedicated technical report. funGp might evolve in the future to include improvements in the current algorithm or alternative solution methods.
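The idea of a pheromone-guided search over the levels of each structural parameter can be illustrated with a toy example. The sketch below is not funGp's algorithm: the fitness function, the update rule, and the values of rho, n.iter and n.pop are assumptions made purely for illustration, and only the level names are taken from the fgpm arguments shown in this manual.

```r
# Toy ant-colony search over a small categorical solution space.
# NOTE: illustrative sketch only; this is NOT the algorithm implemented in funGp.
set.seed(1)

# levels of two structural parameters (names borrowed from the fgpm arguments)
par.levels <- list(kerType   = c("gauss", "matern5_2"),
                   f_disType = c("L2_bygroup", "L2_byindex"))

# hypothetical fitness standing in for the Q2 of a fitted model (higher is better)
toy_fitness <- function(cfg) {
  0.5 * (cfg$kerType == "matern5_2") + 0.4 * (cfg$f_disType == "L2_bygroup")
}

# one pheromone value per level, initialized uniformly
pher <- lapply(par.levels, function(l) setNames(rep(1, length(l)), l))

rho    <- 0.2   # evaporation rate (assumed value)
n.iter <- 15    # number of iterations (assumed value)
n.pop  <- 5     # ants per iteration (assumed value)
best   <- list(fit = -Inf, cfg = NULL)

for (it in seq_len(n.iter)) {
  for (ant in seq_len(n.pop)) {
    # each ant picks one level per parameter, with probability
    # proportional to the pheromone currently on that level
    cfg <- lapply(pher, function(p) sample(names(p), 1, prob = p / sum(p)))
    fit <- toy_fitness(cfg)
    if (fit > best$fit) best <- list(fit = fit, cfg = cfg)
  }
  # evaporate on every level, then reinforce the levels of the best
  # configuration found so far
  for (nm in names(pher)) {
    pher[[nm]] <- (1 - rho) * pher[[nm]]
    pher[[nm]][best$cfg[[nm]]] <- pher[[nm]][best$cfg[[nm]]] + rho * (1 + best$fit)
  }
}

str(best)   # best configuration found and its toy fitness
```

In fgpm_factory the role of toy_fitness is played by the Q² of each candidate model, so every evaluation involves fitting a Gaussian process; that cost is what motivates a guided search over exhaustive enumeration.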
fgpm_factory( sIn = NULL, fIn = NULL, sOut = NULL, ind.vl = NULL, ctraints = list(), setup = list(), time.lim = Inf, nugget = 1e-08, n.starts = 1, n.presample = 20, par.clust = NULL, trace = TRUE, pbars = interactive() )
sIn |
An optional matrix of scalar input values to train the model. Each column must match an input variable and each row a training point. Either scalar input coordinates (sIn), functional input coordinates (fIn), or both must be provided. |
fIn |
An optional list of functional input values to train the model. Each element of the list must be a matrix containing the set of curves corresponding to one functional input. Either scalar input coordinates (sIn), functional input coordinates (fIn), or both must be provided. |
sOut |
A vector (or 1-column matrix) containing the values of the scalar output at the specified input points. |
ind.vl |
An optional numerical matrix specifying which points in the three structures above should be used for training and which for validation. If provided, the optimization is conducted in terms of the hold-out coefficient of determination Q², obtained by training the model on a subset of the points and estimating the prediction error at the remaining points. In that case, each column of ind.vl is interpreted as one validation set, and multiple columns imply replicates. In the simplest case, ind.vl is a one-column matrix or simply an array, meaning that a single replicate is used for each model configuration explored. If not provided, the optimization is conducted in terms of the leave-one-out cross-validation Q², which, for a total of n observations, comes from training the model n times, each time using n-1 points for training and the remaining one for validation. This procedure is typically costly because of the large number of hyperparameter optimizations required. However, fgpm_factory implements the virtual equations introduced by Dubrule (1983) for Gaussian processes, which require a single hyperparameter optimization. See the reference below for more details. |
ctraints |
An optional list specifying the constraints of the structural optimization problem. Valid
entries for this list are: |
setup |
An optional list indicating the value for some parameters of the structural optimization
algorithm. The ant colony optimization algorithm available at this time allows the following entries: |
time.lim |
An optional number specifying a time limit in seconds to be used as stopping condition for the structural optimization. |
nugget |
An optional variance value standing for the homogeneous nugget effect. A tiny nugget might help to overcome numerical problems related to the ill-conditioning of the covariance matrix. Default is 1e-8. |
n.starts |
An optional integer indicating the number of initial points to use for the optimization of the hyperparameters. A parallel processing cluster can be exploited in order to speed up the evaluation of multiple initial points. More details in the description of the argument par.clust below. Default is 1. |
n.presample |
An optional integer indicating the number of points to be tested in order to select the
n.starts initial points. The n.presample points will be randomly sampled from the hyper-rectangle defined by: |
par.clust |
An optional parallel processing cluster created with the |
trace |
An optional boolean indicating if control messages native of the funGp package should be
printed to console. Default is TRUE. For complementary control on the display of funGp-native progress bars, have a look at
the |
pbars |
An optional boolean indicating if progress bars should be displayed. Default is interactive(), that is, TRUE in interactive sessions. |
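The leave-one-out Q² described for ind.vl can be sketched in base R. The example below assumes a zero-mean Gaussian process with a Gaussian kernel and fixed hyperparameters (length-scale 0.2, nugget equal to the noise variance); these are illustrative choices, not funGp's settings. It computes the leave-one-out residuals in closed form, e_i = [K⁻¹y]_i / [K⁻¹]_ii, and compares them with brute-force refitting.

```r
# Virtual leave-one-out residuals for a zero-mean GP with fixed hyperparameters.
# NOTE: a minimal sketch under the assumptions above, not funGp's internal code.
set.seed(42)
n <- 12
x <- sort(runif(n))
y <- sin(2 * pi * x) + rnorm(n, sd = 0.05)

# Gaussian kernel (assumed length-scale 0.2) plus a nugget on the diagonal
K <- exp(-0.5 * (outer(x, x, "-") / 0.2)^2) + diag(0.05^2, n)

# closed-form LOO residuals: a single inverse, no refitting
Kinv <- solve(K)
e.virtual <- as.numeric(Kinv %*% y) / diag(Kinv)

# brute-force LOO residuals: drop point i, predict it from the other n-1 points
e.brute <- sapply(seq_len(n), function(i) {
  y[i] - as.numeric(K[i, -i] %*% solve(K[-i, -i], y[-i]))
})

# both routes give the same residuals up to rounding
all.equal(e.virtual, e.brute, tolerance = 1e-6)

# leave-one-out coefficient of determination
q2.loo <- 1 - sum(e.virtual^2) / sum((y - mean(y))^2)
q2.loo
```

The closed-form route needs one matrix inverse instead of n separate fits, which is why the leave-one-out criterion remains affordable inside a search that evaluates many configurations.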
An object of class Xfgpm containing the data structures linked to the structural optimization of a funGp model. It includes as the main component an object of class fgpm corresponding to the optimized model. It is accessible through the @model slot of the Xfgpm object.
José Betancourt, François Bachoc, Thierry Klein and Jérémy Rohmer
Betancourt, J., Bachoc, F., Klein, T., Idier, D., Rohmer, J., and Deville, Y. (2024), "funGp: An R Package for Gaussian Process Regression with Scalar and Functional Inputs". Journal of Statistical Software, 109, 5, 1–51. (doi:10.18637/jss.v109.i05)
Betancourt, J., Bachoc, F., Klein, T., Idier, D., Pedreros, R., and Rohmer, J. (2020), "Gaussian process metamodeling of functional-input code for coastal flood hazard assessment". Reliability Engineering & System Safety, 198, 106870. (doi:10.1016/j.ress.2020.106870) [HAL]
Betancourt, J., Bachoc, F., Klein, T., and Gamboa, F. (2020), Technical Report: "Ant Colony Based Model Selection for Functional-Input Gaussian Process Regression. Ref. D3.b (WP3.2)". RISCOPE project. [HAL]
Betancourt, J., Bachoc, F., and Klein, T. (2020), R Package Manual: "Gaussian Process Regression for Scalar and Functional Inputs with funGp - The in-depth tour". RISCOPE project. [HAL]
Dubrule, O. (1983), "Cross validation of kriging in a unique neighborhood". Journal of the International Association for Mathematical Geology, 15, 687-699. [MG]
* plot,Xfgpm-method with which = "evolution" for visualizing the evolution of the ACO algorithm, or with which = "diag" for a diagnostic plot;
* get_active_in for post-processing of input data structures following a fgpm_factory call;
* predict,fgpm-method for predictions based on a funGp model;
* simulate,fgpm-method for simulations based on a funGp model;
* update,fgpm-method for post-creation updates on a funGp model.
# construction of a fgpm object
set.seed(100)
n.tr <- 32
x1 <- x2 <- x3 <- x4 <- x5 <- seq(0,1,length = n.tr^(1/5))
sIn <- expand.grid(x1 = x1, x2 = x2, x3 = x3, x4 = x4, x5 = x5)
fIn <- list(f1 = matrix(runif(n.tr * 10), ncol = 10),
            f2 = matrix(runif(n.tr * 22), ncol = 22))
sOut <- fgp_BB7(sIn, fIn, n.tr)

# optimizing the model structure with fgpm_factory (~12 seconds)
## Not run:
xm <- fgpm_factory(sIn = sIn, fIn = fIn, sOut = sOut)
## End(Not run)

# assessing the quality of the model
# in absolute terms and also w.r.t. the other explored models
plot(xm, which = "diag")

# checking the evolution of the algorithm
plot(xm, which = "evol")

# summary of the tested configurations
summary(xm)

# checking the log of crashed iterations
print(xm@log.crashes)

# building the model with the default fgpm arguments to compare
set.seed(100)
n.tr <- 32
x1 <- x2 <- x3 <- x4 <- x5 <- seq(0,1,length = n.tr^(1/5))
sIn <- expand.grid(x1 = x1, x2 = x2, x3 = x3, x4 = x4, x5 = x5)
fIn <- list(f1 = matrix(runif(n.tr * 10), ncol = 10),
            f2 = matrix(runif(n.tr * 22), ncol = 22))
sOut <- fgp_BB7(sIn, fIn, n.tr)
m1 <- fgpm(sIn = sIn, fIn = fIn, sOut = sOut)
plot(m1) # plotting the model

# improving performance with more iterations_______________________________________________
# call to fgpm_factory (~22 seconds)
## Not run:
xm25 <- fgpm_factory(sIn = sIn, fIn = fIn, sOut = sOut,
                     setup = list(n.iter = 25))
## End(Not run)

# assessing evolution and quality
plot(xm25, which = "evol")
plot(xm25, which = "diag")

# custom solution space____________________________________________________________________
myctr <- list(s_keepOn = c(1,2), # keep scalar inputs 1 and 2 always on
              f_keepOn = c(2), # keep f2 always active
              f_disTypes = list("2" = c("L2_byindex")), # only use L2_byindex distance for f2
              f_fixDims = matrix(c(2,4), ncol = 1), # f2 projected in dimension 4
              f_maxDims = matrix(c(1,5), ncol = 1), # f1 projected in dimension max 5
              f_basTypes = list("1" = c("B-splines")), # only use B-splines projection for f1
              kerTypes = c("matern5_2", "gauss")) # test only Matern 5/2 and Gaussian kernels
#
# call to fgpm_factory (~12 seconds)
## Not run:
xmc <- fgpm_factory(sIn = sIn, fIn = fIn, sOut = sOut, ctraints = myctr)
## End(Not run)

# assessing evolution and quality
plot(xmc, which = "evol")
plot(xmc, which = "diag")

# verifying constraints with the log of some successfully built models
summary(xmc)

# custom heuristic parameters______________________________________________________________
mysup <- list(n.iter = 30, n.pop = 12, tao0 = .15, dop.s = 1.2, dop.f = 1.3, delta.f = 4,
              dispr.f = 1.1, q0 = .85, rho.l = .2, u.gbest = TRUE, n.ibest = 2, rho.g = .08)

# call to fgpm_factory (~20 seconds)
## Not run:
xmh <- fgpm_factory(sIn = sIn, fIn = fIn, sOut = sOut, setup = mysup)
## End(Not run)

# verifying heuristic setup through the details of the Xfgpm object
unlist(xmh@details$param)

# stopping condition based on time_________________________________________________________
mysup <- list(n.iter = 2000)
mytlim <- 60

# call to fgpm_factory (~60 seconds)
## Not run:
xms <- fgpm_factory(sIn = sIn, fIn = fIn, sOut = sOut,
                    setup = mysup, time.lim = mytlim)
## End(Not run)
summary(xms)

## Not run:
# parallelization in the model factory_____________________________________________________
# generating input and output data
set.seed(100)
n.tr <- 243
sIn <- expand.grid(x1 = seq(0,1,length = n.tr^(1/5)), x2 = seq(0,1,length = n.tr^(1/5)),
                   x3 = seq(0,1,length = n.tr^(1/5)), x4 = seq(0,1,length = n.tr^(1/5)),
                   x5 = seq(0,1,length = n.tr^(1/5)))
fIn <- list(f1 = matrix(runif(n.tr*10), ncol = 10), f2 = matrix(runif(n.tr*22), ncol = 22))
sOut <- fgp_BB7(sIn, fIn, n.tr)

# calling fgpm_factory in parallel
cl <- parallel::makeCluster(2)
xm.par <- fgpm_factory(sIn = sIn, fIn = fIn, sOut = sOut, par.clust = cl) # (~260 seconds)
parallel::stopCluster(cl)

# NOTE: in order to provide progress bars for the monitoring of time-consuming processes
#       run in parallel, funGp relies on the doFuture and future packages. Parallel processes
#       suddenly interrupted by the user tend to leave corrupt connections. This problem
#       originates outside funGp, which limits our control over it. In the initial (unpublished)
#       version of the funGp manual, we provide a temporary solution to the issue, and we remain
#       attentive in case a more elegant way to handle it, or a manner to suppress it, appears.
#
#       funGp original (unpublished) manual: https://hal.science/hal-02536624
## End(Not run)
This is the formal representation of Gaussian process models within the funGp package. Gaussian process models are useful statistical tools in the modeling of complex input-output relationships.
Main methods
fgpm: creation of funGp regression models
predict,fgpm-method: output estimation at new input points based on a fgpm model
simulate,fgpm-method: random sampling from a fgpm model
update,fgpm-method: modification of data and hyperparameters of a fgpm model
Plotters
plot,fgpm-method: validation plot for a fgpm model
plot.predict.fgpm: plot of predictions based on a fgpm model
plot.simulate.fgpm: plot of simulations based on a fgpm model
howCalled: Object of class "modelCall". User call reminder.
type: Object of class "character". Type of model based on type of inputs. To be set from {"scalar", "functional", "hybrid"}.
ds: Object of class "numeric". Number of scalar inputs.
df: Object of class "numeric". Number of functional inputs.
f_dims: Object of class "numeric". An array with the original dimension of each functional input.
sIn: Object of class "matrix". The scalar input points. Variables are arranged by columns and coordinates by rows.
fIn: Object of class "list". The functional input points. Each element of the list contains a functional input in the form of a matrix. In each matrix, curves representing functional coordinates are arranged by rows.
sOut: Object of class "matrix". The scalar output values at the coordinates specified by sIn and/or fIn.
n.tot: Object of class "integer". Number of observed points used to compute the training-training and training-prediction covariance matrices.
n.tr: Object of class "integer". Among all the points loaded in the model, the number used for training.
f_proj: Object of class "fgpProj". Data structures related to the projection of functional inputs. Check fgpProj for more details.
kern: Object of class "fgpKern". Data structures related to the kernel of the Gaussian process model. Check fgpKern for more details.
nugget: Object of class "numeric". Variance parameter standing for the homogeneous nugget effect.
preMats: Object of class "list". L and LInvY matrices pre-computed for prediction. L is a lower triangular matrix such that $L L^\top$ equals the training auto-covariance matrix. On the other hand, $LInvY = L^{-1} \, sOut$.
convergence: Object of class "numeric". Integer code either confirming convergence or indicating an error. Check the convergence component of the Value returned by optim.
negLogLik: Object of class "numeric". Negated log-likelihood obtained by optim during hyperparameter optimization.
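The roles of the nugget and preMats slots can be illustrated with base R alone. The sketch below assumes a Gaussian kernel with an arbitrary length-scale of 0.3 and toy data; it shows how a small nugget is added to the diagonal, how a lower triangular factor L is obtained, and how a quantity analogous to LInvY is computed by a triangular solve. It is not funGp's internal code.

```r
# Cholesky factor and pre-computed solve for a toy covariance matrix.
# NOTE: illustrative sketch; kernel, length-scale and data are assumptions.
set.seed(7)
n <- 8
x <- matrix(runif(n * 2), ncol = 2)
y <- rowSums(x)

# Gaussian kernel on Euclidean distances (assumed length-scale 0.3)
D2 <- as.matrix(dist(x))^2
K  <- exp(-0.5 * D2 / 0.3^2)

# a tiny nugget on the diagonal guards against ill-conditioning
K.nug <- K + diag(1e-8, n)

# chol() returns the upper factor R with R'R = K.nug, so L = R' is lower triangular
L <- t(chol(K.nug))

# analogue of LInvY: solve L z = y by forward substitution
LInvY <- forwardsolve(L, y)

# L L' recovers the training covariance matrix (up to rounding)
max(abs(tcrossprod(L) - K.nug))

# K.nug^{-1} y then needs only one more triangular solve
alpha <- backsolve(t(L), LInvY)
```

Storing L and LInvY once means that each later prediction needs only triangular solves and never a fresh factorization of the covariance matrix.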
Manual: funGp: An R Package for Gaussian Process Regression with Scalar and Functional Inputs (doi:10.18637/jss.v109.i05)
José Betancourt, François Bachoc, Thierry Klein and Jérémy Rohmer
fgpm model
This is the formal representation for data structures linked to projection of inputs in a Gaussian process model within the funGp package.
pdims: Object of class "numeric". Projection dimension of each input.
basType: Object of class "character". To be chosen from {"PCA", "B-splines"}.
basis: Object of class "list". Projection basis. For functional inputs, each element (fDims_i x fpDims_i) contains the basis functions used for the projection of one functional input.
coefs: Object of class "list". Each element (n x fpDims_i) contains the coefficients used for the projection of one functional input.
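The dimensions of the basis and coefs slots can be made concrete with a small base R sketch. It assumes a PCA-type basis built from an uncentered singular value decomposition, which may differ from the construction used inside funGp; the curve data, n, k and p are made up for illustration.

```r
# Projection of one functional input onto a low-dimensional basis.
# NOTE: illustrative sketch; the uncentered SVD basis is an assumption.
set.seed(3)
n <- 20   # number of curves
k <- 10   # original dimension of the functional input (fDims_i)
p <- 3    # projection dimension (fpDims_i)

# toy curves arranged by rows: two smooth components plus small noise
tt <- seq(0, 1, length.out = k)
f1 <- outer(rnorm(n), sin(2 * pi * tt)) +
      outer(rnorm(n), cos(2 * pi * tt)) +
      matrix(rnorm(n * k, sd = 0.01), n, k)

# basis: k x p matrix with orthonormal columns (leading right singular vectors)
basis <- svd(f1)$v[, 1:p, drop = FALSE]

# coefficients: n x p matrix, one row per curve
coefs <- f1 %*% basis

# projected functional input: n x k, same shape as the original curves
f1.proj <- coefs %*% t(basis)

dim(basis); dim(coefs); dim(f1.proj)
max(abs(f1 - f1.proj))   # projection error, small for these smooth curves
```

The product coefs %*% t(basis) is the same operation that recovers the projected functional inputs from the f_proj slot of a fitted model.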
José Betancourt, François Bachoc, Thierry Klein and Jérémy Rohmer
The fgpm_factory function returns an object of class "Xfgpm" with the function call of all the evaluated models stored in the @log.success@args and @log.crashes@args slots. The get_active_in function interprets the arguments linked to any structural configuration and returns a list with two elements: (i) a matrix of scalar input variables kept active; and (ii) a list of functional input variables kept active.
get_active_in(sIn = NULL, fIn = NULL, args)
sIn |
An optional matrix of scalar input coordinates with all the original scalar input variables. |
fIn |
An optional list of functional input coordinates with all the original functional input variables. |
args |
An object of class |
An object of class "list", containing the following information extracted from the args parameter: (i) a matrix of scalar input variables kept active; and (ii) a list of functional input variables kept active.
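The pruning that get_active_in performs can be pictured with plain subsetting. In the sketch below, prune_inputs, s.on and f.on are hypothetical names introduced only for illustration: the index vectors stand for the kind of information which_on is described as providing, and the returned element names sIn.on and fIn.on mirror those used in the examples of this page.

```r
# Keeping only the active scalar and functional inputs.
# NOTE: prune_inputs, s.on and f.on are hypothetical; this is not funGp code.
s.on <- c(1, 3)   # assumed indices of active scalar inputs
f.on <- 2         # assumed index of the active functional input

sIn <- expand.grid(x1 = 0:1, x2 = 0:1, x3 = 0:1)
n   <- nrow(sIn)
fIn <- list(f1 = matrix(0, n, 10), f2 = matrix(1, n, 22))

prune_inputs <- function(sIn, fIn, s.on, f.on) {
  list(sIn.on = as.matrix(sIn)[, s.on, drop = FALSE],  # matrix of active scalar inputs
       fIn.on = fIn[f.on])                             # list of active functional inputs
}

active <- prune_inputs(sIn, fIn, s.on, f.on)
colnames(active$sIn.on)   # names of the scalar inputs kept
names(active$fIn.on)      # names of the functional inputs kept
```

With a fitted Xfgpm object, the same pruning must be applied to any new prediction or simulation inputs so that they match the inputs the selected model was trained on.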
José Betancourt, François Bachoc, Thierry Klein and Jérémy Rohmer
Betancourt, J., Bachoc, F., Klein, T., Idier, D., Rohmer, J., and Deville, Y. (2024), "funGp: An R Package for Gaussian Process Regression with Scalar and Functional Inputs". Journal of Statistical Software, 109, 5, 1–51. (doi:10.18637/jss.v109.i05)
Betancourt, J., Bachoc, F., and Klein, T. (2020), R Package Manual: "Gaussian Process Regression for Scalar and Functional Inputs with funGp - The in-depth tour". RISCOPE project. [HAL]
* which_on for details on how to obtain only the indices of the active inputs.
* modelCall for details on the args argument.
* fgpm_factory for funGp heuristic model selection.
* Xfgpm for details on object delivered by fgpm_factory.
# Use precalculated Xfgpm object named xm
# indices of active inputs in the best model
xm@log.success@args[[1]] # the full fgpm call
set.seed(100)
n.tr <- 32
sIn <- expand.grid(x1 = seq(0,1,length = n.tr^(1/5)), x2 = seq(0,1,length = n.tr^(1/5)),
                   x3 = seq(0,1,length = n.tr^(1/5)), x4 = seq(0,1,length = n.tr^(1/5)),
                   x5 = seq(0,1,length = n.tr^(1/5)))
fIn <- list(f1 = matrix(runif(n.tr*10), ncol = 10), f2 = matrix(runif(n.tr*22), ncol = 22))
which_on(sIn, fIn, xm@log.success@args[[1]]) # only the indices extracted by which_on

# data structures of active inputs
active <- get_active_in(sIn, fIn, xm@log.success@args[[1]])
active$sIn.on # scalar data structures
active$fIn.on # functional data structures

# identifying selected model and corresponding fgpm arguments
opt.model <- xm@model
opt.args <- xm@log.success@args[[1]]

# generating new input data for prediction
n.pr <- 243
sIn.pr <- expand.grid(x1 = seq(0,1,length = n.pr^(1/5)), x2 = seq(0,1,length = n.pr^(1/5)),
                      x3 = seq(0,1,length = n.pr^(1/5)), x4 = seq(0,1,length = n.pr^(1/5)),
                      x5 = seq(0,1,length = n.pr^(1/5)))
fIn.pr <- list(f1 = matrix(runif(n.pr*10), ncol = 10), f2 = matrix(runif(n.pr*22), ncol = 22))

# pruning data structures for prediction to keep only active inputs!!
active <- get_active_in(sIn.pr, fIn.pr, opt.args)

# making predictions
preds <- predict(opt.model, sIn.pr = active$sIn.on, fIn.pr = active$fIn.on)

# plotting predictions
plot(preds)

# preparing new data for simulation based on inputs kept active____________________________
opt.model <- xm@model
opt.args <- xm@log.success@args[[1]]

# generating new input data for simulation
n.sm <- 243
sIn.sm <- expand.grid(x1 = seq(0,1,length = n.sm^(1/5)), x2 = seq(0,1,length = n.sm^(1/5)),
                      x3 = seq(0,1,length = n.sm^(1/5)), x4 = seq(0,1,length = n.sm^(1/5)),
                      x5 = seq(0,1,length = n.sm^(1/5)))
fIn.sm <- list(f1 = matrix(runif(n.sm*10), ncol = 10), f2 = matrix(runif(n.sm*22), ncol = 22))

# pruning data structures for simulation to keep only active inputs!!
active <- get_active_in(sIn.sm, fIn.sm, opt.args)

# making light simulations
sims_l <- simulate(opt.model, nsim = 10, sIn.sm = active$sIn.on, fIn.sm = active$fIn.on)

# plotting light simulations
plot(sims_l)

## Not run:
# rebuilding of 3 best models using new data_______________________________________________
# NOTE: this example is of higher complexity than the previous ones. We recommend you run
#       the previous examples and understand the @log.success and @log.crashes slots in
#       the Xfgpm object delivered by fgpm_factory.
#
#       In the second example above we showed how to use get_active_in to prune the input
#       data structures for prediction based on the fgpm arguments of the best model found
#       by fgpm_factory. In this new example we generalize that concept by: (i) rebuilding
#       the 3 best models found by fgpm_factory using new data, (ii) pruning the input
#       data structures used for prediction with each of the models, and (iii) plotting
#       the predictions made by the three models. The key ingredient here is that the
#       three best models might have different scalar and functional inputs active. The
#       get_active_in function allows us to process the data structures in order to
#       extract only the inputs required to re-build the model and then to make
#       predictions with each model. Check also the funGp manual for further details.
#
#       funGp manual: https://doi.org/10.18637/jss.v109.i05

# <<<<<<< PART 1: calling fgpm_factory to perform the structural optimization >>>>>>>
#         -------------------------------------------------------------------
# this part is precalculated and loaded via data("precalculated_Xfgpm_objects")
summary(xm)

# <<<<<<< PART 2: re-building the three best models found by fgpm_factory >>>>>>>
#         ---------------------------------------------------------------
# recovering the fgpm arguments of the three best models
argStack <- xm@log.success@args[1:3]

# new data arrived, now we have 243 observations
n.nw <- 243 # more points!
sIn.nw <- expand.grid(x1 = seq(0,1,length = n.nw^(1/5)), x2 = seq(0,1,length = n.nw^(1/5)),
                      x3 = seq(0,1,length = n.nw^(1/5)), x4 = seq(0,1,length = n.nw^(1/5)),
                      x5 = seq(0,1,length = n.nw^(1/5)))
fIn.nw <- list(f1 = matrix(runif(n.nw*10), ncol = 10), f2 = matrix(runif(n.nw*22), ncol = 22))
sOut.nw <- fgp_BB7(sIn.nw, fIn.nw, n.nw)

# the second best model
modelDef(xm,2)

# re-building the three best models based on the new data (compact code with all 3 calls)
newEnv <- list(sIn = sIn.nw, fIn = fIn.nw, sOut = sOut.nw)
modStack <- lapply(1:3, function(i) eval(parse(text = modelDef(xm,i)), envir = newEnv))

# <<<<<<< PART 3: making predictions from the three best models found by fgpm_factory >>>>>>>
#         ---------------------------------------------------------------------------
# generating input data for prediction
n.pr <- 32
sIn.pr <- expand.grid(x1 = seq(0,1,length = n.pr^(1/5)), x2 = seq(0,1,length = n.pr^(1/5)),
                      x3 = seq(0,1,length = n.pr^(1/5)), x4 = seq(0,1,length = n.pr^(1/5)),
                      x5 = seq(0,1,length = n.pr^(1/5)))
fIn.pr <- list(f1 = matrix(runif(n.pr*10), ncol = 10), f2 = matrix(runif(n.pr*22), ncol = 22))

# making predictions based on the three best models (compact code with all 3 calls)
preds <- do.call(cbind, Map(function(model, args) {
  active <- get_active_in(sIn.pr, fIn.pr, args)
  predict(model, sIn.pr = active$sIn.on, fIn.pr = active$fIn.on)$mean
}, modStack, argStack))

# <<<<<<< PART 4: plotting predictions from the three best models found by fgpm_factory >>>>>>>
#         -----------------------------------------------------------------------------
# plotting predictions made by the three models
plot(1, xlim = c(1,nrow(preds)), ylim = range(preds), xaxt = "n",
     xlab = "Prediction point index", ylab = "Output",
     main = "Predictions with best 3 structural configurations")
axis(1, 1:nrow(preds))
for (i in seq_len(n.pr)) {lines(rep(i,2), range(preds[i,1:3]), col = "grey35", lty = 3)}
points(preds[,1], pch = 21, bg = "black")
points(preds[,2], pch = 23, bg = "red")
points(preds[,3], pch = 24, bg = "green")
legend("bottomleft", legend = c("Model 1", "Model 2", "Model 3"),
       pch = c(21, 23, 24), pt.bg = c("black", "red", "green"), inset = c(.02,.08))
## End(Not run)
User reminder of the fgpm function call.
string: Object of class "character". User call reminder in string format.
José Betancourt, François Bachoc, Thierry Klein and Jérémy Rohmer
fgpm from within a Xfgpm object
Retrieve the fgpm model with index (or rank) i from within a Xfgpm
object. By evaluating this code in an environment containing suitable
objects sIn, fIn and sOut we can re-create a fgpm object.
modelDef( object, ind, trace = TRUE, pbars = TRUE, control.optim = list(trace = TRUE) )
object |
A |
ind |
The index (or rank) of the model in |
trace |
An optional boolean indicating whether funGp-native progress
messages should be displayed. Default is TRUE. See the |
pbars |
An optional boolean indicating whether progress bars managed by
|
control.optim |
An optional list to be passed as the control argument to
|
The models are sorted by decreasing quality so i = 1 extracts
the definition of the best model.
Parsed R code defining the fgpm model.
Recall that the models are sorted by decreasing quality, so
i = 1 extracts the definition of the best model.
The [[,Xfgpm-method, which can also be used to re-create a fgpm
object using the same data as that used to create the Xfgpm
object in object.
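The mechanism described above (evaluating a stored definition against whatever objects named sIn, fIn and sOut the chosen environment provides) can be sketched in base R without funGp. In this sketch, model_code is a hypothetical stand-in for the code returned by modelDef; it is not a real fgpm call, and the two data lists are invented for illustration.

```r
# Stand-in for the code returned by modelDef(): a call that refers to
# objects named sIn and sOut (hypothetical, not a real fgpm call).
model_code <- quote(list(n = nrow(sIn), out.mean = mean(sOut)))

# Two data sets that supply objects under the expected names
old_data <- list(sIn = matrix(1:6, ncol = 2), sOut = c(1, 2, 3))
new_data <- list(sIn = matrix(1:10, ncol = 2), sOut = c(2, 4, 6, 8, 10))

# Same code, different data: names are looked up in the list first,
# then in the calling frame (where list, nrow and mean are found)
res_old <- eval(model_code, envir = old_data)
res_new <- eval(model_code, envir = new_data)
```

Because the names are resolved at evaluation time, swapping the list is enough to re-create the result on new data, which is the idea behind evaluating the modelDef output in an environment of your choice.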
## =========================================================================
## Using the pre-calculated object `xm` to save time. See `?xm` to re-create
## this object.
## =========================================================================

## 'xm@model' is the best 'fgpm' model in 'xm'
plot(xm@model)

## see the R code to use to recreate the model
modelDef(xm, i = 1)

## Not run:
## Define new data in a list. Using an environment would also work,
## including the global environment, which is the default in `eval`.
L <- list()
set.seed(341)
n.new <- 3^5
x1 <- x2 <- x3 <- x4 <- x5 <- seq(0, 1, length = n.new^(1/5))

## create the data objects required to fit the model
L$sIn <- as.matrix(expand.grid(x1 = x1, x2 = x2, x3 = x3, x4 = x4, x5 = x5))
L$fIn <- list(f1 = matrix(runif(n.new * 10), ncol = 10),
              f2 = matrix(runif(n.new * 22), ncol = 22))
L$sOut <- fgp_BB7(L$sIn, L$fIn, n.new)

## Now evaluate
fgpm.new <- eval(modelDef(xm, i = 1), envir = L)
plot(fgpm.new, main = "Re-created 'fgpm' model with different data")
plot(xm[[1]], main = "Re-created 'fgpm' model with the same data")
## End(Not run)
"fgpm"
This method provides a diagnostic plot for the validation of regression models. It displays a calibration plot based on the leave-one-out predictions of the output at the points used to train the model.
## S4 method for signature 'fgpm' plot(x, y = NULL, ...)
x |
A |
y |
Not used. |
... |
Graphical parameters. These currently include
|
Plot the Leave-One-Out (LOO) calibration.
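As an illustration of what a leave-one-out calibration involves, the sketch below computes LOO predictions for a toy zero-mean Gaussian process in base R and plots them against the observed output. It is not funGp's implementation: the kernel, the fixed hyperparameters (ls, the 1e-4 nugget) and the toy data are assumptions made for the example. The closed-form shortcut used here, mu_i = y_i - [K^-1 y]_i / [K^-1]_ii, is a standard identity for GPs with fixed hyperparameters, and the block checks it against a brute-force refit.

```r
set.seed(1)
n <- 15
x <- sort(runif(n))
y <- sin(2 * pi * x)

# Gaussian kernel with fixed (not optimized) hyperparameters; assumed values
gauss_kernel <- function(a, b, ls = 0.1) exp(-0.5 * outer(a, b, "-")^2 / ls^2)
K <- gauss_kernel(x, x) + diag(1e-4, n)   # small nugget for conditioning
Kinv <- solve(K)

# closed-form LOO predictions for a zero-mean GP
loo_fast <- y - as.vector(Kinv %*% y) / diag(Kinv)

# brute-force check: drop point i, then predict it from the remaining points
loo_slow <- sapply(seq_len(n), function(i) {
  as.vector(K[i, -i, drop = FALSE] %*% solve(K[-i, -i], y[-i]))
})

# calibration plot: points close to the diagonal indicate good LOO predictions
plot(y, loo_fast, xlab = "Observed", ylab = "LOO prediction")
abline(0, 1)
```

The two vectors agree up to numerical error, so the shortcut avoids refitting the model n times.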
# generating input and output data for training
set.seed(100)
n.tr <- 25
sIn <- expand.grid(x1 = seq(0,1,length = sqrt(n.tr)), x2 = seq(0, 1, length = sqrt(n.tr)))
fIn <- list(f1 = matrix(runif(n.tr*10), ncol = 10), f2 = matrix(runif(n.tr*22), ncol = 22))
sOut <- fgp_BB3(sIn, fIn, n.tr)

# building the model
m1 <- fgpm(sIn = sIn, fIn = fIn, sOut = sOut)

# plotting the model
plot(m1)

# change some graphical parameters if wanted
plot(m1, line = "SpringGreen3" , pch = 21, pt.col = "orangered", pt.bg = "gold",
     main = "LOO cross-validation")
"Xfgpm"
Plot an object with class "Xfgpm" representing a collection of
functional GP models corresponding to different structural parameters.
Two types of graphics can be shown depending on the choice of which.
The choice which = "diag" is used to display diagnostics of the
quality of the optimized model. Two types of diagnostic plots are
shown as sub-plots by default, but each can be discarded if wanted.
The choice which = "evol" is used to assess the quality of the fitted
fgpm models on the basis of Leave-One-Out cross-validation.
The choice which = "diag" (default) provides two plots for
assessing the quality of the output delivered by the model selection
algorithm in the fgpm_factory function. The first one is a
calibration plot similar to the one offered for fgpm objects by
plot,fgpm-method. This plot is used to validate the absolute quality
of the selected model. The second one displays the performance
statistic of all the models successfully evaluated by the model
selection algorithm. This gives a notion of the relative quality of
the selected model with respect to the other models that can be made
using the same data.
The choice which = "evol" displays the evolution of the quality of
the configurations evaluated along the iterations by the model
selection algorithm in the fgpm_factory function. For each iteration,
the performance statistic of all the evaluated models is displayed,
along with the corresponding median of the group. The plot also
includes the global maximum, which corresponds to the best performance
statistic obtained up to the current iteration. In this plot, it is
typical to have some points falling relatively far from the maximum,
even after multiple iterations. This happens mainly because there are
multiple categorical features, whose alteration might change the
performance statistic in a nonsmooth way. On the other hand, the
points that fall below zero usually correspond to models whose
hyperparameters were hard to optimize. This occurs sporadically
during the log-likelihood optimization for Gaussian processes, due
to the non-linearity of the objective function. As long as the
maximum keeps improving and the median remains close to it, neither
of the two aforementioned phenomena is a cause for concern. Both
result from the exploration mechanism implemented in the algorithm,
which allows it to move progressively towards better model
configurations.
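The quantities shown by which = "evol" (per-iteration points, the median of each group, and the best value reached so far) can be reproduced on toy data with base R. The sketch below uses invented performance statistics and hypothetical column names (iter, stat); it does not read anything from an Xfgpm object.

```r
# Hypothetical performance statistics, one row per evaluated model
fitness <- data.frame(
  iter = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  stat = c(0.20, 0.55, -0.10, 0.72, 0.40, 0.58, 0.59, 0.65, 0.61)
)
iters <- sort(unique(fitness$iter))

# median of the group of models evaluated at each iteration
med <- as.vector(tapply(fitness$stat, fitness$iter, median))

# global maximum: best statistic obtained up to each iteration
best <- cummax(as.vector(tapply(fitness$stat, fitness$iter, max)))

plot(fitness$iter, fitness$stat, xlab = "Iteration", ylab = "Performance statistic")
lines(iters, med, lty = 2)   # median per iteration
lines(iters, best, lty = 1)  # running maximum
```

Note that the running maximum never decreases, even when the best model of a later iteration (0.65 at iteration 3 here) is worse than one found earlier (0.72 at iteration 2).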
## S4 method for signature 'Xfgpm' plot( x, y = NULL, which = c("diag", "evol"), calib = TRUE, fitp = TRUE, horiz = FALSE, ... )
x |
The |
y |
Not used. |
which |
Character giving the type of plot wanted. Can take the value
|
calib |
Logical. If |
fitp |
Logical. If |
horiz |
Logical. Used only when |
... |
Other graphical parameters such as |
* fgpm_factory for structural optimization of funGp models.
# generating input and output data
set.seed(100)
n.tr <- 2^5
x1 <- x2 <- x3 <- x4 <- x5 <- seq(0, 1, length = n.tr^(1/5))
sIn <- expand.grid(x1 = x1, x2 = x2, x3 = x3, x4 = x4, x5 = x5)
fIn <- list(f1 = matrix(runif(n.tr * 10), ncol = 10),
            f2 = matrix(runif(n.tr * 22), ncol = 22))
sOut <- fgp_BB7(sIn, fIn, n.tr)

## Not run:
# optimizing the model structure with 'fgpm_factory' (~10 seconds)
xm <- fgpm_factory(sIn = sIn, fIn = fIn, sOut = sOut)

# assessing the quality of the model - absolute and w.r.t. the other
# explored models
plot(xm, which = "evol")

# diagnostics (two subplots)
plot(xm, which = "diag")
plot(xm, which = "diag", horiz = TRUE)

# diagnostics (one plot)
plot(xm, which = "diag", fitp = FALSE)
plot(xm, which = "diag", calib = FALSE)

# customizing some graphical parameters
plot(xm, calib.gpars = list(xlim = c(800,1000), ylim = c(600,1200)),
     fitp.gpars = list(main = "Relative quality", legends = FALSE))
## End(Not run)
fgpm model
This method displays the predicted output values delivered by a funGp Gaussian process model.
## S3 method for class 'predict.fgpm' plot(x, y = NULL, sOut.pr = NULL, calib = TRUE, sortp = TRUE, ...)
x |
An object with S3 class |
y |
An optional vector (or 1-column matrix) containing the true values of the scalar output at the prediction points. If provided, the method will display two figures: (i) a calibration plot with true vs predicted output values, and (ii) a plot including the true and predicted output along with the confidence bands, sorted according to the increasing order of the true output. If not provided, only the second plot will be made, and the predictions will be arranged according to the increasing order of the predicted output. |
sOut.pr |
Alias of |
calib |
An optional boolean indicating if the calibration
plot should be displayed. Ignored if |
sortp |
An optional boolean indicating if the plot of sorted output should be displayed. Default is TRUE. |
... |
Additional arguments affecting the display. Since this method can generate two plots from a single function call, the extra arguments for each plot should be included in a list. For the calibration plot, the list should be called calib.gpars. For the plot of the output in increasing order, the list should be called sortp.gpars. The following typical graphics parameters are valid entries of both lists: xlim, ylim, xlab, ylab, main. The boolean argument legends can also be included in either of the two lists in order to control the display of legends in the corresponding plot. |
José Betancourt, François Bachoc and Thierry Klein
* fgpm for the construction of funGp models;
* plot,fgpm-method for model diagnostic plots;
* simulate,fgpm-method for simulations based on a funGp model;
* plot.simulate.fgpm for simulation plots.
# plotting predictions without the true output values_______________________
# building the model
set.seed(100)
n.tr <- 25
sIn <- expand.grid(x1 = seq(0, 1, length = sqrt(n.tr)), x2 = seq(0, 1, length = sqrt(n.tr)))
fIn <- list(f1 = matrix(runif(n.tr * 10), ncol = 10), f2 = matrix(runif(n.tr * 22), ncol = 22))
sOut <- fgp_BB3(sIn, fIn, n.tr)
m1 <- fgpm(sIn = sIn, fIn = fIn, sOut = sOut)

# making predictions
n.pr <- 100
sIn.pr <- as.matrix(expand.grid(x1 = seq(0,1,length = sqrt(n.pr)),
                                x2 = seq(0,1,length = sqrt(n.pr))))
fIn.pr <- list(f1 = matrix(runif(n.pr * 10), ncol = 10),
               f2 = matrix(runif(n.pr * 22), ncol = 22))
m1.preds <- predict(m1, sIn.pr = sIn.pr, fIn.pr = fIn.pr)

# plotting predictions
plot(m1.preds)

# plotting predictions and true output values_______________________________
# building the model
set.seed(100)
n.tr <- 25
sIn <- expand.grid(x1 = seq(0, 1, length = sqrt(n.tr)), x2 = seq(0, 1, length = sqrt(n.tr)))
fIn <- list(f1 = matrix(runif(n.tr * 10), ncol = 10), f2 = matrix(runif(n.tr * 22), ncol = 22))
sOut <- fgp_BB3(sIn, fIn, n.tr)
m1 <- fgpm(sIn = sIn, fIn = fIn, sOut = sOut)

# making predictions
n.pr <- 100
sIn.pr <- as.matrix(expand.grid(x1 = seq(0,1,length = sqrt(n.pr)),
                                x2 = seq(0,1,length = sqrt(n.pr))))
fIn.pr <- list(f1 = matrix(runif(n.pr*10), ncol = 10), f2 = matrix(runif(n.pr*22), ncol = 22))
m1.preds <- predict(m1, sIn.pr = sIn.pr, fIn.pr = fIn.pr)

# generating output data for validation
sOut.pr <- fgp_BB3(sIn.pr, fIn.pr, n.pr)

# plotting predictions. Note that the 2-nd argument is the output, 'y'
plot(m1.preds, sOut.pr)

# only calibration plot
plot(m1.preds, sOut.pr = sOut.pr, sortp = FALSE)

# only sorted output plot
plot(m1.preds, sOut.pr = sOut.pr, calib = FALSE)
fgpm model
This method displays the simulated output values delivered by a funGp Gaussian process model.
## S3 method for class 'simulate.fgpm' plot(x, y = NULL, detail = NA, ...)
x |
An object with S3 class |
y |
Not used. |
detail |
An optional character string specifying the data
elements that should be included in the plot, to be chosen
between |
... |
Additional arguments affecting the display. The following typical graphics parameters are valid entries: xlim, ylim, xlab, ylab, main. The boolean argument legends can also be included in any of the two lists in order to control the display of legends in the corresponding plot. |
José Betancourt, François Bachoc and Thierry Klein
* fgpm for the construction of funGp models;
* plot,fgpm-method for model diagnostic plots;
* predict,fgpm-method for predictions based on a funGp model;
* plot.predict.fgpm for prediction plots.
# plotting light simulations________________________________________________
# building the model
set.seed(100)
n.tr <- 25
sIn <- expand.grid(x1 = seq(0, 1, length = sqrt(n.tr)), x2 = seq(0, 1, length = sqrt(n.tr)))
fIn <- list(f1 = matrix(runif(n.tr * 10), ncol = 10), f2 = matrix(runif(n.tr * 22), ncol = 22))
sOut <- fgp_BB3(sIn, fIn, n.tr)
m1 <- fgpm(sIn = sIn, fIn = fIn, sOut = sOut)

# making light simulations
n.sm <- 100
sIn.sm <- as.matrix(expand.grid(x1 = seq(0, 1, length = sqrt(n.sm)),
                                x2 = seq(0, 1, length = sqrt(n.sm))))
fIn.sm <- list(f1 = matrix(runif(n.sm * 10), ncol = 10),
               f2 = matrix(runif(n.sm * 22), ncol = 22))
simsl <- simulate(m1, nsim = 10, sIn.sm = sIn.sm, fIn.sm = fIn.sm)

# plotting light simulations
plot(simsl)

# plotting full simulations_________________________________________________
# building the model
set.seed(100)
n.tr <- 25
sIn <- expand.grid(x1 = seq(0, 1, length = sqrt(n.tr)), x2 = seq(0, 1, length = sqrt(n.tr)))
fIn <- list(f1 = matrix(runif(n.tr * 10), ncol = 10), f2 = matrix(runif(n.tr * 22), ncol = 22))
sOut <- fgp_BB3(sIn, fIn, n.tr)
m1 <- fgpm(sIn = sIn, fIn = fIn, sOut = sOut)

# making full simulations
n.sm <- 100
sIn.sm <- as.matrix(expand.grid(x1 = seq(0, 1, length = sqrt(n.sm)),
                                x2 = seq(0, 1 ,length = sqrt(n.sm))))
fIn.sm <- list(f1 = matrix(runif(n.sm * 10), ncol = 10),
               f2 = matrix(runif(n.sm * 22), ncol = 22))
simsf <- simulate(m1, nsim = 10, sIn.sm = sIn.sm, fIn.sm = fIn.sm, detail = "full")

# plotting full simulations in "full" mode
plot(simsf)

# plotting full simulations in "light" mode
plot(simsf, detail = "light")
A dataset containing the results of the application of fgpm_factory
to the fgp_BB7 analytic black-box function. See Examples for details.
Five objects of class "Xfgpm":
With 32 training points and default parameters.
With 32 training points and 25 iterations of the algorithm.
With 32 training points and customized solution space.
With 32 training points and customized heuristic parameters.
With 32 training points and a time budget constraint and large number of iterations.
## Not run:
##################################################################
## Construction of xm object with default parameters (~12 seconds)
##################################################################
set.seed(100)
n.tr <- 32
x1 <- x2 <- x3 <- x4 <- x5 <- seq(0,1,length = n.tr^(1/5))
sIn <- expand.grid(x1 = x1, x2 = x2, x3 = x3, x4 = x4, x5 = x5)
fIn <- list(f1 = matrix(runif(n.tr * 10), ncol = 10),
            f2 = matrix(runif(n.tr * 22), ncol = 22))
sOut <- fgp_BB7(sIn, fIn, n.tr)
xm <- fgpm_factory(sIn = sIn, fIn = fIn, sOut = sOut)

##################################################################
## Construction of xm25 object with 25 iterations (~20 seconds)
##################################################################
xm25 <- fgpm_factory(sIn = sIn, fIn = fIn, sOut = sOut,
                     setup = list(n.iter = 25))

##################################################################
## Construction of xmc object with customized solution space (~12 seconds)
##################################################################
myctr <- list(s_keepOn = c(1,2), # keep scalar inputs 1 and 2 always on
              f_keepOn = c(2), # keep f2 always active
              f_disTypes = list("2" = c("L2_byindex")), # only use L2_byindex distance for f2
              f_fixDims = matrix(c(2,4), ncol = 1), # f2 projected in dimension 4
              f_maxDims = matrix(c(1,5), ncol = 1), # f1 projected in dimension max 5
              f_basTypes = list("1" = c("B-splines")), # only use B-splines projection for f1
              kerTypes = c("matern5_2", "gauss")) # test only Matern 5/2 and Gaussian kernels
xmc <- fgpm_factory(sIn = sIn, fIn = fIn, sOut = sOut, ctraints = myctr)

##################################################################
## Construction of xmh object with customized heuristic parameters (~15 seconds)
##################################################################
mysup <- list(n.iter = 30, n.pop = 12, tao0 = .15, dop.s = 1.2,
              dop.f = 1.3, delta.f = 4, dispr.f = 1.1, q0 = .85,
              rho.l = .2, u.gbest = TRUE, n.ibest = 2, rho.g = .08)
xmh <- fgpm_factory(sIn = sIn, fIn = fIn, sOut = sOut, setup = mysup)

##################################################################
## Construction of xms object with time budget constraint (~60 seconds)
##################################################################
mysup <- list(n.iter = 2000)
mytlim <- 60
xms <- fgpm_factory(sIn = sIn, fIn = fIn, sOut = sOut,
                    setup = mysup, time.lim = mytlim)
## End(Not run)
fgpm Gaussian process model
This method enables prediction based on a fgpm model, at any given set of
points. Check fgpm for information on how to create fgpm models.
## S4 method for signature 'fgpm' predict(object, sIn.pr = NULL, fIn.pr = NULL, detail = c("light", "full"), ...)
object |
An object of class fgpm corresponding to the funGp model that should be used to predict the output. |
sIn.pr |
An optional matrix of scalar input coordinates at which the output values should be predicted. Each column is interpreted as a scalar input variable and each row as a coordinate. Either scalar input coordinates (sIn.pr), functional input coordinates (fIn.pr), or both must be provided. |
fIn.pr |
An optional list of functional input coordinates at which the output values should be predicted. Each element of the list is interpreted as a functional input variable. Every functional input variable should be provided as a matrix with one curve per row. Either scalar input coordinates (sIn.pr), functional input coordinates (fIn.pr), or both must be provided. |
detail |
An optional character specifying the extent of information that should be delivered
by the method, to be chosen between |
... |
Not used. |
An object of class "list" containing the data structures linked to predictions. For
light predictions, the list will include the mean, standard deviation and limits of the 95%
confidence intervals at the prediction points. For full predictions, it will include the same
information, plus the training-prediction cross-covariance matrix and the prediction auto-covariance
matrix.
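To make the contents of such a list concrete, the following base R sketch builds analogous quantities for a toy zero-mean Gaussian process. It is an illustration under stated assumptions rather than funGp's code: the kernel, its length-scale, the jitter, and the use of Gaussian limits mean ± qnorm(0.975) * sd for the 95% intervals are all choices made for this example. The element names mirror those shown by summary() in the Examples.

```r
# Gaussian kernel with an assumed, fixed length-scale
kern <- function(a, b, ls = 0.3) exp(-0.5 * outer(a, b, "-")^2 / ls^2)

x.tr <- seq(0, 1, length.out = 8)    # training points
y.tr <- sin(2 * pi * x.tr)           # training output
x.pr <- seq(0, 1, length.out = 20)   # prediction points

K.tt <- kern(x.tr, x.tr) + diag(1e-6, length(x.tr))  # training covariance + jitter
K.tp <- kern(x.tr, x.pr)   # training-prediction cross-covariance (8 x 20)
K.pp <- kern(x.pr, x.pr)   # auto-covariance at the prediction points (20 x 20)

# conditional (posterior) mean and covariance of a zero-mean GP
mean.pr <- as.vector(crossprod(K.tp, solve(K.tt, y.tr)))
cov.pr  <- K.pp - crossprod(K.tp, solve(K.tt, K.tp))
sd.pr   <- sqrt(pmax(diag(cov.pr), 0))   # guard against tiny negative values

# "light"-style output: mean, sd and 95% limits
preds <- list(mean = mean.pr, sd = sd.pr,
              lower95 = mean.pr - qnorm(0.975) * sd.pr,
              upper95 = mean.pr + qnorm(0.975) * sd.pr)
```

With 25 training and 100 prediction points, matrices shaped like K.tp and K.pp would have 2500 and 10000 entries, which matches the lengths reported for the full predictions in the Examples.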
José Betancourt, François Bachoc, Thierry Klein and Jérémy Rohmer
* plot.predict.fgpm for the prediction plot of a fgpm model;
* simulate,fgpm-method for simulations based on a fgpm model;
* plot.simulate.fgpm for the simulation plot of a fgpm model.
# light predictions________________________________________________________________________
# building the model
set.seed(100)
n.tr <- 25
sIn <- expand.grid(x1 = seq(0,1,length = sqrt(n.tr)), x2 = seq(0,1,length = sqrt(n.tr)))
fIn <- list(f1 = matrix(runif(n.tr*10), ncol = 10), f2 = matrix(runif(n.tr*22), ncol = 22))
sOut <- fgp_BB3(sIn, fIn, n.tr)
m1 <- fgpm(sIn = sIn, fIn = fIn, sOut = sOut)

# generating input data for prediction
n.pr <- 100
sIn.pr <- as.matrix(expand.grid(x1 = seq(0,1,length = sqrt(n.pr)),
                                x2 = seq(0,1,length = sqrt(n.pr))))
fIn.pr <- list(f1 = matrix(runif(n.pr*10), ncol = 10), f2 = matrix(runif(n.pr*22), ncol = 22))

# making predictions
m1.preds <- predict(m1, sIn.pr = sIn.pr, fIn.pr = fIn.pr)

# checking content of the list
summary(m1.preds)

# ~R output:~
#         Length Class  Mode
# mean    100    -none- numeric
# sd      100    -none- numeric
# lower95 100    -none- numeric
# upper95 100    -none- numeric

# plotting predictions
plot(m1.preds)

# comparison against true output___________________________________________________________
# building the model
set.seed(100)
n.tr <- 25
sIn <- expand.grid(x1 = seq(0,1,length = sqrt(n.tr)), x2 = seq(0,1,length = sqrt(n.tr)))
fIn <- list(f1 = matrix(runif(n.tr*10), ncol = 10), f2 = matrix(runif(n.tr*22), ncol = 22))
sOut <- fgp_BB3(sIn, fIn, n.tr)
m1 <- fgpm(sIn = sIn, fIn = fIn, sOut = sOut)

# making predictions
n.pr <- 100
sIn.pr <- as.matrix(expand.grid(x1 = seq(0,1,length = sqrt(n.pr)),
                                x2 = seq(0,1,length = sqrt(n.pr))))
fIn.pr <- list(f1 = matrix(runif(n.pr*10), ncol = 10), f2 = matrix(runif(n.pr*22), ncol = 22))
m1.preds <- predict(m1, sIn.pr = sIn.pr, fIn.pr = fIn.pr)

# generating output data for validation
sOut.pr <- fgp_BB3(sIn.pr, fIn.pr, n.pr)

# plotting predictions along with true output values
plot(m1.preds, sOut.pr)

# full predictions_________________________________________________________________________
# building the model
set.seed(100)
n.tr <- 25
sIn <- expand.grid(x1 = seq(0,1,length = sqrt(n.tr)), x2 = seq(0,1,length = sqrt(n.tr)))
fIn <- list(f1 = matrix(runif(n.tr*10), ncol = 10), f2 = matrix(runif(n.tr*22), ncol = 22))
sOut <- fgp_BB3(sIn, fIn, n.tr)
m1 <- fgpm(sIn = sIn, fIn = fIn, sOut = sOut)

# making full predictions
n.pr <- 100
sIn.pr <- as.matrix(expand.grid(x1 = seq(0,1,length = sqrt(n.pr)),
                                x2 = seq(0,1,length = sqrt(n.pr))))
fIn.pr <- list(f1 = matrix(runif(n.pr*10), ncol = 10), f2 = matrix(runif(n.pr*22), ncol = 22))
m1.preds_f <- predict(m1, sIn.pr = sIn.pr, fIn.pr = fIn.pr, detail = "full")

# checking content of the list
summary(m1.preds_f)

# ~R output:~
#         Length Class  Mode
# mean      100  -none- numeric
# sd        100  -none- numeric
# K.tp     2500  -none- numeric
# K.pp    10000  -none- numeric
# lower95   100  -none- numeric
# upper95   100  -none- numeric

# plotting predictions
plot(m1.preds)
fgpm model
This method enables simulation of Gaussian process values at any given set of points based on a pre-built fgpm model. Check fgpm for information on how to create funGp models.
## S4 method for signature 'fgpm'
simulate(
  object,
  nsim = 1,
  seed = NULL,
  sIn.sm = NULL,
  fIn.sm = NULL,
  nugget.sm = 0,
  detail = c("light", "full"),
  ...
)
object |
An object of class fgpm corresponding to the funGp model from which simulations must be performed. |
nsim |
An optional integer indicating the number of samples to produce. Default is 1. |
seed |
An optional value interpreted as an integer, that will be used as argument of
|
sIn.sm |
An optional matrix of scalar input coordinates at which the output values should be simulated. Each column is interpreted as a scalar input variable and each row as a coordinate. Either scalar input coordinates (sIn.sm), functional input coordinates (fIn.sm), or both must be provided. |
fIn.sm |
An optional list of functional input coordinates at which the output values should be simulated. Each element of the list is interpreted as a functional input variable. Every functional input variable should be provided as a matrix with one curve per row. Either scalar input coordinates (sIn.sm), functional input coordinates (fIn.sm), or both must be provided. |
nugget.sm |
An optional number corresponding to a numerical nugget effect. If provided, this number is added to the main diagonal of the simulation covariance matrix in order to prevent numerical instabilities during Cholesky decomposition. A small number in the order of 1e-8 is often enough. Default is 0. |
detail |
An optional character specifying the extent of information that should be delivered by the method, to be chosen between "light" and "full". |
... |
Not used. |
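The role of nugget.sm can be illustrated outside funGp with a minimal base R sketch. The kernel, its length-scale and the points below are illustrative assumptions, not the covariance structure of any particular fgpm model. Two coincident points make the covariance matrix singular, so the Cholesky factorization fails until a small value is added to the main diagonal.

```r
# Two identical points make the covariance matrix exactly singular
x <- c(0.2, 0.2, 0.7)

# Squared-exponential kernel with unit variance and length-scale 0.3
# (illustrative choice, assumed for this sketch only)
K <- exp(-0.5 * outer(x, x, "-")^2 / 0.3^2)

# Without a nugget, the Cholesky factorization raises an error
no_nugget <- tryCatch(chol(K), error = function(e) NULL)
is.null(no_nugget)

# Adding a small nugget to the main diagonal restores positive definiteness
nugget <- 1e-8
U <- chol(K + diag(nugget, nrow(K)))

# One draw from a zero-mean Gaussian vector with covariance K + nugget * I
set.seed(1)
z <- drop(crossprod(U, rnorm(nrow(K))))
```

The nugget changes the covariance matrix only by 1e-8 on the diagonal, which is why a value of this order is usually enough to stabilize the factorization without visibly altering the simulations.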
An object containing the data structures linked to simulations. For light simulations, the output will be a matrix of simulated output values, with as many rows as requested random samples. For full simulations, the output will be a list with the matrix of simulated output values, along with the predicted mean, standard deviation and limits of the 95% confidence intervals at the simulation points.
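As a rough illustration of how the mean, standard deviation and 95% limits relate to each other, the sketch below builds Gaussian limits in base R. The numbers are made up, and the use of qnorm(0.975) is an assumption: the exact constant used internally by funGp may differ slightly.

```r
# Hypothetical predictive means and standard deviations at three points
pred_mean <- c(1.2, 0.4, -0.3)
pred_sd   <- c(0.10, 0.25, 0.05)

# Two-sided 95% limits under a Gaussian assumption
z <- qnorm(0.975)  # approximately 1.96
lower95 <- pred_mean - z * pred_sd
upper95 <- pred_mean + z * pred_sd

cbind(mean = pred_mean, sd = pred_sd, lower95 = lower95, upper95 = upper95)
```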
José Betancourt, François Bachoc, Thierry Klein and Jérémy Rohmer
* plot.simulate.fgpm for the simulation plot of a fgpm model;
* predict,fgpm-method for predictions based on a fgpm model;
* plot.predict.fgpm for the prediction plot of a fgpm model.
# light simulations _______________________________________________________________________
# building the model
set.seed(100)
n.tr <- 25
sIn <- expand.grid(x1 = seq(0,1,length = sqrt(n.tr)), x2 = seq(0,1,length = sqrt(n.tr)))
fIn <- list(f1 = matrix(runif(n.tr*10), ncol = 10), f2 = matrix(runif(n.tr*22), ncol = 22))
sOut <- fgp_BB3(sIn, fIn, n.tr)
m1 <- fgpm(sIn = sIn, fIn = fIn, sOut = sOut)

# generating input data for simulation
n.sm <- 100
sIn.sm <- as.matrix(expand.grid(x1 = seq(0,1,length = sqrt(n.sm)),
                                x2 = seq(0,1,length = sqrt(n.sm))))
fIn.sm <- list(f1 = matrix(runif(n.sm*10), ncol = 10), matrix(runif(n.sm*22), ncol = 22))

# making light simulations
m1.sims_l <- simulate(m1, nsim = 10, sIn.sm = sIn.sm, fIn.sm = fIn.sm)

# plotting light simulations
plot(m1.sims_l)

# full simulations ________________________________________________________________________
# building the model
set.seed(100)
n.tr <- 25
sIn <- expand.grid(x1 = seq(0,1,length = sqrt(n.tr)), x2 = seq(0,1,length = sqrt(n.tr)))
fIn <- list(f1 = matrix(runif(n.tr*10), ncol = 10), f2 = matrix(runif(n.tr*22), ncol = 22))
sOut <- fgp_BB3(sIn, fIn, n.tr)
m1 <- fgpm(sIn = sIn, fIn = fIn, sOut = sOut)

# making full simulations
m1.sims_f <- simulate(m1, nsim = 10, sIn.sm = sIn.sm, fIn.sm = fIn.sm, detail = "full")

# checking content of the list
summary(m1.sims_f)
# ~R output:~
#         Length Class  Mode
# sims    1000   -none- numeric
# mean     100   -none- numeric
# sd       100   -none- numeric
# lower95  100   -none- numeric
# upper95  100   -none- numeric

# plotting full simulations in full mode
plot(m1.sims_f)

# plotting full simulations in light mode
plot(m1.sims_f, detail = "light")
fgpm objects
Display the structure of a fgpm object and the value of the parameters (variance and length-scales).
## S4 method for signature 'fgpm'
summary(object, ...)
object |
An fgpm object. |
... |
Not used yet. |
This method is actually identical to the show method for this class, which is called when the name of the object is entered in an interactive session.
m <- xm@model
class(m)
summary(m)
m
Xfgpm objects
Display a summary of the structure of a Xfgpm object, with a short description of up to n fgpm objects visited during the ACO optimization.
## S4 method for signature 'Xfgpm'
summary(object, n = 24, ...)
object |
A Xfgpm object. |
n |
Maximal number of lines ( |
... |
Not used yet. |
The displayed information depends on the number of candidate inputs, in order to maintain compact tables. The inputs are labelled with integer suffixes, the prefix being "X" for scalar inputs and "F" for functional inputs.
With a small number of inputs, the list contains only one data frame. For each candidate input (either scalar or functional), a column with the input name indicates whether the input is active (cross x) or not (white space) in the fgpm object corresponding to the row. For each functional variable, the following are also shown: the distance used (D_), the dimension after dimension reduction (Bas_), and the type of basis used (B_). Recall that the kernel (Kern) is the same for all functional inputs. The value of the Leave-One-Out coefficient Q² is also shown.
With a large number of inputs, the list contains two data frames. The first one tells which inputs are active among the scalar and functional candidate inputs. The second data frame gives more details for functional inputs as before.
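For readers unfamiliar with the Q² coefficient shown in these tables, the following base R sketch computes it from observed outputs and their leave-one-out predictions. The helper q2 is hypothetical (it is not a funGp function) and assumes the usual definition, one minus the ratio of the prediction error sum of squares to the total sum of squares.

```r
# Hypothetical helper (not part of funGp): predictive coefficient Q2 from
# observed outputs y and their leave-one-out (or hold-out) predictions y_hat
q2 <- function(y, y_hat) {
  stopifnot(length(y) == length(y_hat))
  1 - sum((y - y_hat)^2) / sum((y - mean(y))^2)
}

y <- c(1.0, 2.0, 3.0, 4.0)
q2(y, y)                  # perfect predictions give 1
q2(y, rep(mean(y), 4))    # predicting the mean everywhere gives 0
```

Values close to 1 indicate good predictive quality, while values near or below 0 indicate a model that does no better than the sample mean.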
An object inheriting from list, actually a list containing one or two data frames depending on the number of inputs. In each data frame, the n rows provide information on the best fgpm objects visited.
summary(xm)
fgpm models
This method enables the update of data or hyperparameters of a fgpm model. It corresponds to an object of the class fgpm. The method allows addition, deletion and substitution of data points, as well as substitution and re-estimation of hyperparameters.
## S4 method for signature 'fgpm'
update(
  object,
  sIn.nw = NULL,
  fIn.nw = NULL,
  sOut.nw = NULL,
  sIn.sb = NULL,
  fIn.sb = NULL,
  sOut.sb = NULL,
  ind.sb = NULL,
  ind.dl = NULL,
  var.sb = NULL,
  ls_s.sb = NULL,
  ls_f.sb = NULL,
  var.re = FALSE,
  ls_s.re = FALSE,
  ls_f.re = FALSE,
  extend = FALSE,
  trace = TRUE,
  pbars = TRUE,
  control.optim = list(trace = TRUE),
  ...
)
object |
An object of class fgpm corresponding to the funGp model to update. |
sIn.nw |
An optional matrix of scalar input values to be added to the model. Each column must match an input variable and each row a scalar coordinate. |
fIn.nw |
An optional list of functional input values to be added to the model. Each element of the list must be a matrix containing the set of curves corresponding to one functional input. |
sOut.nw |
An optional vector (or 1-column matrix) containing the values of the scalar output at the new input points. |
sIn.sb |
An optional matrix of scalar input values to be used as substitutes of other scalar input values already stored in the model. Each column must match an input variable and each row a coordinate. |
fIn.sb |
An optional list of functional input values to be used as substitutes of other functional input values already stored in the model. Each element of the list must be a matrix containing the set of curves corresponding to one functional input. |
sOut.sb |
An optional vector (or 1-column matrix) containing the values of the scalar output at the substituting input points. |
ind.sb |
An optional numeric array indicating the indices of the input and output points stored in the model, that should be replaced by the values specified through sIn.sb, fIn.sb and/or sOut.sb. |
ind.dl |
An optional numeric array indicating the indices of the input and output points stored in the model that should be deleted. |
var.sb |
An optional number indicating the value that should be used to substitute the current variance parameter of the model. |
ls_s.sb |
An optional numerical array indicating the values that should be used to substitute the current length-scale parameters for the scalar inputs of the model. |
ls_f.sb |
An optional numerical array indicating the values that should be used to substitute the current length-scale parameters for the functional inputs of the model. |
var.re |
An optional boolean indicating whether the variance parameter should be re-estimated. Default is FALSE. |
ls_s.re |
An optional boolean indicating whether the length-scale parameters of the scalar inputs should be re-estimated. Default is FALSE. |
ls_f.re |
An optional boolean indicating whether the length-scale parameters of the functional inputs should be re-estimated. Default is FALSE. |
extend |
An optional boolean indicating whether the re-optimization should extend from the current
hyperparameters of the model using them as initial points. Default is FALSE, meaning that the
re-optimization picks brand new initial points in the way described in |
trace |
An optional boolean indicating whether funGp-native progress messages and a summary update
should be displayed. Default is TRUE. See the |
pbars |
An optional boolean indicating whether progress bars managed by |
control.optim |
An optional list to be passed as the control argument to |
... |
Not used. |
The arguments listed above enable the completion of the following updating tasks:
* Deletion of data points: ind.dl;
* Addition of data points: sIn.nw, fIn.nw, sOut.nw;
* Substitution of data points: sIn.sb, fIn.sb, sOut.sb, ind.sb;
* Substitution of hyperparameters: var.sb, ls_s.sb, ls_f.sb;
* Re-estimation of hyperparameters: var.re, ls_s.re, ls_f.re.
All the arguments listed above are optional since any of these tasks can be requested without need to request any of the other tasks. In fact, most of the arguments can be used even if the other arguments related to the same task are not. For instance, the re-estimation of the variance can be requested via var.re without requiring re-estimation of the scalar or functional length-scale parameters. The only two exceptions are: (i) for data addition, the new output sOut.nw should always be provided and the new input points should correspond to the set of variables already stored in the fgpm object passed for update; and (ii) for data substitution, the argument ind.sb is always mandatory.
Conflicting task combinations:
* Data points deletion and substitution;
* Substitution and re-estimation of the same hyperparameter.
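The two conflicting combinations can be checked before calling update. The sketch below is a hypothetical helper written in base R (find_update_conflicts is not part of funGp) that mirrors the argument names of update and simply lists the conflicts it detects. It assumes that any simultaneous request for deletion and substitution counts as a conflict, and it makes no claim about how update itself reacts to such requests.

```r
# Hypothetical helper (not part of funGp): lists conflicting update requests,
# using the same argument names and defaults as the update method
find_update_conflicts <- function(ind.sb = NULL, ind.dl = NULL,
                                  var.sb = NULL, ls_s.sb = NULL, ls_f.sb = NULL,
                                  var.re = FALSE, ls_s.re = FALSE, ls_f.re = FALSE) {
  conflicts <- character(0)
  # data points cannot be deleted and substituted in the same call
  if (!is.null(ind.sb) && !is.null(ind.dl))
    conflicts <- c(conflicts, "deletion and substitution of data points")
  # a hyperparameter cannot be both substituted and re-estimated
  if (!is.null(var.sb) && var.re)
    conflicts <- c(conflicts, "substitution and re-estimation of the variance")
  if (!is.null(ls_s.sb) && ls_s.re)
    conflicts <- c(conflicts, "substitution and re-estimation of scalar length-scales")
  if (!is.null(ls_f.sb) && ls_f.re)
    conflicts <- c(conflicts, "substitution and re-estimation of functional length-scales")
  conflicts
}

# substituting the variance while re-estimating only the length-scales is fine
find_update_conflicts(var.sb = 3, ls_s.re = TRUE)

# substituting and re-estimating the variance in the same call is a conflict
find_update_conflicts(var.sb = 3, var.re = TRUE)
```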
Note that the hyperparameters of the model will not be re-estimated after modifying the model unless explicitly requested through the var.re, ls_s.re and ls_f.re arguments. If, for instance, some points are added to the model without requesting parameter re-estimation, the new data will be included in the training-training and training-prediction covariance matrices, but the hyperparameters will not be updated. This makes it possible to update the data in ways that might improve predictions, without immediately running a training procedure that could be time-consuming. At any later time, the user can request the re-estimation of the hyperparameters, which will bring the model fully up to date.
An object of class fgpm representing the updated funGp model.
José Betancourt, François Bachoc, Thierry Klein and Jérémy Rohmer
* fgpm for creation of a funGp model;
* predict,fgpm-method for predictions based on a fgpm model;
* simulate,fgpm-method for simulations based on a fgpm model.
# deletion and addition of data points_____________________________________________________
# building the model
set.seed(100)
n.tr <- 25
sIn <- expand.grid(x1 = seq(0,1,length = sqrt(n.tr)), x2 = seq(0,1,length = sqrt(n.tr)))
fIn <- list(f1 = matrix(runif(n.tr*10), ncol = 10), f2 = matrix(runif(n.tr*22), ncol = 22))
sOut <- fgp_BB3(sIn, fIn, n.tr)
m1 <- fgpm(sIn = sIn, fIn = fIn, sOut = sOut)

# deleting two points
ind.dl <- sample(1:m1@n.tot, 2)
m1up <- update(m1, ind.dl = ind.dl)

# adding five points
n.nw <- 5
sIn.nw <- matrix(runif(n.nw * m1@ds), nrow = n.nw)
fIn.nw <- list(f1 = matrix(runif(n.nw*10), ncol = 10), f2 = matrix(runif(n.nw*22), ncol = 22))
sOut.nw <- fgp_BB3(sIn.nw, fIn.nw, n.nw)
m1up <- update(m1, sIn.nw = sIn.nw, fIn.nw = fIn.nw, sOut.nw = sOut.nw)

# substitution of data points______________________________________________________________
# building the model
set.seed(100)
n.tr <- 25
sIn <- expand.grid(x1 = seq(0,1,length = sqrt(n.tr)), x2 = seq(0,1,length = sqrt(n.tr)))
fIn <- list(f1 = matrix(runif(n.tr*10), ncol = 10), f2 = matrix(runif(n.tr*22), ncol = 22))
sOut <- fgp_BB3(sIn, fIn, n.tr)
m1 <- fgpm(sIn = sIn, fIn = fIn, sOut = sOut)

# generating substituting input data for updating
n.sb <- 2
sIn.sb <- matrix(runif(n.sb * m1@ds), nrow = n.sb)
fIn.sb <- list(f1 = matrix(runif(n.sb*10), ncol = 10), f2 = matrix(runif(n.sb*22), ncol = 22))

# generating substituting output data for updating
sOut.sb <- fgp_BB3(sIn.sb, fIn.sb, n.sb)

# generating indices for substitution
ind.sb <- sample(1:(m1@n.tot), n.sb)

# updating all, the scalar inputs, functional inputs and the outputs
m1up <- update(m1, sIn.sb = sIn.sb, fIn.sb = fIn.sb, sOut.sb = sOut.sb, ind.sb = ind.sb)

# updating only some of the data structures
m1up1 <- update(m1, sIn.sb = sIn.sb, ind.sb = ind.sb)   # only the scalar inputs
m1up2 <- update(m1, sOut.sb = sOut.sb, ind.sb = ind.sb) # only the outputs
m1up3 <- update(m1, sIn.sb = sIn.sb, sOut.sb = sOut.sb, ind.sb = ind.sb) # the scalar inputs and the outputs

# substitution of hyperparameters__________________________________________________________
# building the model
set.seed(100)
n.tr <- 25
sIn <- expand.grid(x1 = seq(0,1,length = sqrt(n.tr)), x2 = seq(0,1,length = sqrt(n.tr)))
fIn <- list(f1 = matrix(runif(n.tr*10), ncol = 10), f2 = matrix(runif(n.tr*22), ncol = 22))
sOut <- fgp_BB3(sIn, fIn, n.tr)
m1 <- fgpm(sIn = sIn, fIn = fIn, sOut = sOut)

# defining hyperparameters for substitution
var.sb <- 3
ls_s.sb <- c(2.44, 1.15)
ls_f.sb <- c(5.83, 4.12)

# updating the model
m1up <- update(m1, var.sb = var.sb, ls_s.sb = ls_s.sb, ls_f.sb = ls_f.sb)

# re-estimation of hyperparameters_________________________________________________________
# building the model
set.seed(100)
n.tr <- 25
sIn <- expand.grid(x1 = seq(0,1,length = sqrt(n.tr)), x2 = seq(0,1,length = sqrt(n.tr)))
fIn <- list(f1 = matrix(runif(n.tr*10), ncol = 10), f2 = matrix(runif(n.tr*22), ncol = 22))
sOut <- fgp_BB3(sIn, fIn, n.tr)
m1 <- fgpm(sIn = sIn, fIn = fIn, sOut = sOut)

# re-estimating the hyperparameters
m1up <- update(m1, var.re = TRUE)                                 # only the variance
m1up <- update(m1, ls_s.re = TRUE)                                # only the scalar length-scale parameters
m1up <- update(m1, ls_s.re = TRUE, ls_f.re = TRUE)                # all length-scale parameters
m1up <- update(m1, var.re = TRUE, ls_s.re = TRUE, ls_f.re = TRUE) # all hyperparameters

# same as above but now extending optimization from previously stored values
m1up <- update(m1, var.re = TRUE, extend = TRUE)
m1up <- update(m1, ls_s.re = TRUE, extend = TRUE)
m1up <- update(m1, ls_s.re = TRUE, ls_f.re = TRUE, extend = TRUE)
m1up <- update(m1, var.re = TRUE, ls_s.re = TRUE, ls_f.re = TRUE, extend = TRUE)
The fgpm_factory function returns an object of class "Xfgpm" with the function calls of all the evaluated models stored in the @log.success@args and @log.crashes@args slots. The which_on function interprets the arguments linked to any structural configuration and returns a list with two elements: (i) an array of indices of the scalar inputs kept active; and (ii) an array of indices of the functional inputs kept active.
which_on(sIn = NULL, fIn = NULL, args)
sIn |
An optional matrix of scalar input coordinates with all the original scalar input variables.
This is used only to know the total number of scalar input variables. Any |
fIn |
An optional list of functional input coordinates with all the original functional input
variables. This is used only to know the total number of functional input variables. Any |
args |
An object of class |
An object of class "list", containing the following information extracted from the args parameter: (i) an array of indices of the scalar inputs kept active; and (ii) an array of indices of the functional inputs kept active.
José Betancourt, François Bachoc, Thierry Klein and Jérémy Rohmer
Betancourt, J., Bachoc, F., Klein, T., Idier, D., Rohmer, J., and Deville, Y. (2024), "funGp: An R Package for Gaussian Process Regression with Scalar and Functional Inputs". Journal of Statistical Software, 109, 5, 1–51. (doi:10.18637/jss.v109.i05)
Betancourt, J., Bachoc, F., and Klein, T. (2020), R Package Manual: "Gaussian Process Regression for Scalar and Functional Inputs with funGp - The in-depth tour". RISCOPE project. [HAL]
* get_active_in for details on how to obtain the data structures linked to the active inputs;
* modelCall for details on the args argument;
* fgpm_factory for funGp heuristic model selection;
* Xfgpm for details on object delivered by fgpm_factory.
# extracting the indices of the active inputs in an optimized model________________________
# use precalculated Xfgpm object named xm
# active inputs in the best model
xm@log.success@args[[1]] # the full fgpm call
set.seed(100)
n.tr <- 32
sIn <- expand.grid(x1 = seq(0,1,length = n.tr^(1/5)), x2 = seq(0,1,length = n.tr^(1/5)),
                   x3 = seq(0,1,length = n.tr^(1/5)), x4 = seq(0,1,length = n.tr^(1/5)),
                   x5 = seq(0,1,length = n.tr^(1/5)))
fIn <- list(f1 = matrix(runif(n.tr*10), ncol = 10), f2 = matrix(runif(n.tr*22), ncol = 22))
which_on(sIn, fIn, xm@log.success@args[[1]]) # only the indices extracted by which_on
This is the formal representation of the assembly of data structures delivered by the model selection routines in the funGp package. An Xfgpm object contains the trace of an optimization process, conducted to build Gaussian process models of outstanding performance.
Main methods
fgpm_factory: structural optimization of fgpm models, creator of the "Xfgpm" class.
Plotters
plot,Xfgpm-method: plot of the evolution of the algorithm with which = "evolution" or of the absolute and relative quality of the optimized model with which = "diag".
factoryCall: Object of class "factoryCall". User call reminder.
model: Object of class "fgpm". Model selected by the heuristic structural optimization algorithm.
stat: Object of class "character". Performance measure optimized to select the model. One of "Q2loocv" or "Q2hout".
fitness: Object of class "numeric". Value of the performance measure for the selected model.
structure: Object of class "data.frame". Structural configuration of the selected model.
log.success: Object of class "antsLog". Record of models successfully evaluated during the structural optimization. It contains the structural configuration both in data.frame and "modelCall" format, along with the fitness of each model. The models are sorted by fitness, starting with the best model in the first position.
log.crashes: Object of class "antsLog". Record of models that crashed during the structural optimization. It contains the structural configuration of each model, both in data.frame and "modelCall" format.
n.solspace: Object of class "numeric". Number of possible structural configurations for the optimization instance resolved.
n.explored: Object of class "numeric". Number of structural configurations successfully evaluated by the algorithm.
details: Object of class "list". Further information about the parameters of the ant colony optimization algorithm and the evolution of the fitness along the iterations.
sIn: An object of class "matrix" containing a copy of the provided scalar inputs.
fIn: An object of class "list" containing a copy of the provided functional inputs.
sOut: An object of class "matrix" containing a copy of the provided outputs.
Manual funGp: An R Package for Gaussian Process Regression with Scalar and Functional Inputs (doi:10.18637/jss.v109.i05)
José Betancourt, François Bachoc, Thierry Klein and Jérémy Rohmer