3.1. FitSnap
The FitSnap class houses all objects needed for performing a fit. These objects
are instantiated from other core classes described in the rest of the docs. The two main
inputs to a FitSnap instance are (1) settings and (2) an MPI communicator. The
settings can be a nested dictionary as shown in some examples, while the MPI communicator
is typically the world communicator containing all resources dictated by the mpirun
command. After instantiating FitSnap with these inputs, the instance will contain
its own instance of ParallelTools, which houses functions and data structures that
operate on the resources in the input communicator. The settings will be stored in an instance
of the Config class. The FitSnap class is documented below.
FitSnap contains instances of two classes that help with MPI communication and
settings: ParallelTools and Config, respectively. These two classes are
explained below. In addition to these classes, there are other core classes,
Scraper, Calculator, and Solver, which are explained in later sections.
3.1.1. Parallel Tools
Parallel Tools are a collection of data structures and functions for transferring data in a FitSNAP workflow in a massively parallel fashion.
The ParallelTools instance pt of a FitSNAP instance fs can be used
to create shared memory arrays, for example:
# Create a shared array called `a`; nrows, ncols, and rank are assumed to be defined already.
fs.pt.create_shared_array('a', nrows, ncols)
# Change the shared array at a rank-dependent element.
fs.pt.shared_arrays['a'].array[rank,0] = rank
# Observe that the change happened on all procs.
print(f"Shared array on rank {rank}: {fs.pt.shared_arrays['a'].array}")
Currently these tools reside in a single file, parallel_tools.py, which houses some
classes described below.
- class fitsnap3lib.parallel_tools.DistributedList(proc_length)
This class is used for distributed memory Python lists. The class wraps Python’s list to ensure its size stays the same, allowing collection at the end. This class is normally used as follows, for example:
pt.add_2_fitsnap("Groups", DistributedList(nconfigs))
- Parameters:
proc_length (int) – Number of elements for the list on current process.
- _len
length of distributed list held by current proc
- Type:
int
- _list
local section of distributed list
- Type:
list
- get_list()
Returns deepcopy of internal list
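The fixed-size behavior described for DistributedList can be illustrated with a small stand-in class. This is a minimal sketch under stated assumptions, not the fitsnap3lib implementation: the name FixedLengthList and its item-access methods are hypothetical, and only the proc_length argument, the _len and _list attributes, and get_list() are taken from the description above.

```python
from copy import deepcopy


class FixedLengthList:
    """Hypothetical stand-in for a per-process list whose length never changes."""

    def __init__(self, proc_length):
        self._len = proc_length            # number of elements held by this process
        self._list = [None] * proc_length  # local section, pre-allocated

    def __len__(self):
        return self._len

    def __setitem__(self, index, value):
        # Only integer indices are accepted, so a slice assignment cannot
        # resize the list. An out-of-range index raises IndexError.
        if not isinstance(index, int):
            raise TypeError("only integer indices are supported")
        self._list[index] = value

    def __getitem__(self, index):
        return self._list[index]

    def get_list(self):
        # Return a deep copy so callers cannot resize the internal list.
        return deepcopy(self._list)


groups = FixedLengthList(3)
groups[0] = "group_a"
copy_of_groups = groups.get_list()
copy_of_groups.append("extra")  # only the copy grows
```

Because every process keeps a list of known, fixed length, the per-process sections can later be collected without first negotiating sizes.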
- exception fitsnap3lib.parallel_tools.GracefulError(*args, **kwargs)
- class fitsnap3lib.parallel_tools.ParallelTools(comm=None)
This class creates and contains arrays used for fitting, across multiple processors.
- check_fitsnap_exist
Checks whether fitsnap dictionaries exist before creating a new one; set to False to allow recreating a dictionary.
- Type:
bool
- add_2_fitsnap(name, an_object)
Add an object, such as a DistributedList, to the pt.fitsnap_dict dictionary. This dictionary contains configuration information such as group name, filename, testing bools, etc. This function is normally used in conjunction with the DistributedList class, where a distributed memory list is added under the key given by name.
- Parameters:
name (str) – Key name of the object being added.
an_object – A Python object to add, usually an instance of our DistributedList class.
The create_shared_array function creates a shared memory array as a key in the pt.shared_arrays dictionary. It uses the SharedArray class to instantiate a shared memory array under the supplied key name. If the key name already exists, this function will free the memory associated with the existing array. If not using MPI (i.e. running with stubs), a StubsArray is created instead.
- Parameters:
name (str) – Name of the array which will be the key name.
size1 (int) – First dimension size.
size2 (int) – Optional second dimension size, defaults to 1.
dtype (str) – Optional data type character, defaults to d for double.
- exception(err)
Gracefully exit with an exception.
- Parameters:
err (str) – Error message to exit with.
- free()
Free memory associated with all shared arrays.
- gather_fitsnap(name, allgather: bool = True)
Gather distributed lists.
- Parameters:
allgather (bool) – If True, perform an allgather. When the number of procs is large this will use more memory.
- get_ncpn(nconfigs)
Get number of configs per node; return nconfigs if stubs.
- Parameters:
nconfigs (int) – Number of configurations on this process, typically the length of the list of data dictionaries.
- Returns:
Number of configs per node, reduced across procs, or just nconfigs if stubs.
- new_slice_a()
Create an array showing which sub-A-matrix indices belong to which proc. For linear solvers, the A matrix may be composed of either summed per-atom descriptors or per-atom descriptors. For nonlinear solvers, the A matrix is composed of per-atom quantities such as bispectrum components.
- new_slice_b()
Create an array showing which sub-b-matrix indices belong to which proc.
- new_slice_c()
Create an array showing which sub-c-matrix indices belong to which proc.
- new_slice_dgrad()
Create an array showing which sub-dgrad-matrix indices belong to which proc.
- new_slice_neighlist()
Create an array showing which sub-neighlist-matrix indices belong to which proc.
- new_slice_t()
Create an array showing which sub-types-matrix indices belong to which proc.
- slice_array(name)
Slices an array using Python’s native slice function. Creates an attribute pt.shared_arrays[name].sliced_array containing the sliced array.
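The idea behind the new_slice_* methods and slice_array, recording which rows belong to which proc and then slicing out the local block, can be sketched with plain Python. This is an illustration only, assuming each proc owns one contiguous block of rows; the helper row_ranges, the row counts, and the list standing in for a shared array are all hypothetical and do not reflect how fitsnap3lib stores this information.

```python
from itertools import accumulate


def row_ranges(rows_per_proc):
    """Return one (start, stop) pair per process for contiguous row blocks."""
    stops = list(accumulate(rows_per_proc))
    starts = [0] + stops[:-1]
    return list(zip(starts, stops))


# Hypothetical row counts contributed by three processes.
rows_per_proc = [4, 2, 3]
ranges = row_ranges(rows_per_proc)

# A plain list of rows stands in for the shared array.
shared_rows = [[float(i)] for i in range(sum(rows_per_proc))]

# Each process slices out its own block with Python's built-in slice.
rank = 1
start, stop = ranges[rank]
sliced = shared_rows[slice(start, stop)]
```

Keeping the ownership table separate from the data means every process can read the full array while still knowing exactly which rows it is responsible for filling.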
Instantiating the SharedArray class will create a shared memory array in the array attribute.
- Parameters:
size1 (int) – First dimension of the array.
size2 (int) – Optional second dimension of the array, defaults to 1.
size3 (int) – Optional third dimension of the array, defaults to 1.
dtype (str) – Optional data type, defaults to d for double.
multinode (int) – Optional multinode flag used for scalapack purposes.
comms (MPI.Comm) – MPI communicator.
- array
Array of numbers that share memory across processes in the communicator.
- Type:
np.ndarray
- class fitsnap3lib.parallel_tools.StubsArray(size1, size2=1, size3=1, dtype='d')
Instantiating this class will create a stubs array in the array attribute. In plain terms, this is just a normal numpy array.
- Parameters:
size1 (int) – First dimension of the array.
size2 (int) – Optional second dimension of the array, defaults to 1.
size3 (int) – Optional third dimension of the array, defaults to 1.
dtype (str) – Optional data type, defaults to d for double.
- array
Array of numbers stored as a normal numpy array (no memory is shared across processes).
- Type:
np.ndarray
3.1.2. Config
The Config class is used for storing settings associated with a FitSNAP instance. Throughout the code and library examples, you may see code snippets like:
fs.config.sections["GROUPS"].group_table
where fs is the FitSNAP instance being accessed. In this snippet, the sections
attribute contains keys, such as "GROUPS", each of which holds attributes like the group
table that we can access. In this way, fs.config stores all the settings relevant
to a particular FitSNAP instance or fit, which can then be easily accessed anywhere else
throughout the code.
- class fitsnap3lib.io.input.Config(pt, input=None, arguments_lst: list = [])
Class for storing input settings in a config instance. The config instance is first created in io/output.py. If given a path to an input script, we use Python’s native ConfigParser to parse the settings. If given a nested dictionary, the sections are determined from the first keys and specific settings from the nested keys.
- Parameters:
pt – A ParallelTools instance.
input – Optional input can either be a filename or a dictionary.
arguments_lst – List of args that can be supplied at the command line.
- infile
String for optional input filename. Defaults to None.
- indict
Dictionary for optional input dictionary of settings, to replace input file. Defaults to None.
- convert_to_dict(original_input=False)
Convert the current config (settings) object to a dictionary. Note that datatypes may not be preserved.
- Parameters:
original_input – optional, set to True to return the original input
- Returns:
Python dictionary containing the same elements as the original or current config (settings) object.
- Return type:
config_dict
- parse_cmdline(arguments_lst: list = [])
Parse command line args if using executable mode, or a list if using library mode.
- view_state(sections: list | str = [], original_input=False)
Print a view to screen of the sections contained in the FitSNAP configuration object in its current state. When no ‘sections’ argument is provided, information for all sections will be printed to screen. If the argument contains an invalid section name, that name will be printed with a warning, but will not cause a crash.
- Parameters:
sections – optional list of sections or string of a single section name (e.g. [‘BISPECTRUM’, ‘CALCULATOR’, ‘REFERENCE’], or ‘GROUPS’)
original_input – optional, set this value to True to view the original input file (saved in self._original_config) instead of the current state. This can be useful for debugging or cloning FitSNAP input settings objects.
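The relationship between an input script and a nested dictionary, and the note that datatypes may not be preserved when converting back to a dictionary, can be illustrated with Python's standard ConfigParser alone. This is a sketch under stated assumptions rather than the Config implementation: the section names appear in this documentation, but the option names, values, and the helper to_dict are chosen here purely for illustration.

```python
from configparser import ConfigParser

# Hypothetical settings: first-level keys are sections, nested keys are options.
settings = {
    "BISPECTRUM": {"twojmax": 6},
    "CALCULATOR": {"energy": True},
}

# The same hypothetical settings written as an input script.
ini_text = """
[BISPECTRUM]
twojmax = 6

[CALCULATOR]
energy = True
"""

from_dict = ConfigParser()
from_dict.read_dict(settings)    # values are converted to strings

from_text = ConfigParser()
from_text.read_string(ini_text)  # values are read as strings


def to_dict(parser):
    """Convert a parser back to a nested dictionary of sections and options."""
    return {section: dict(parser[section]) for section in parser.sections()}


round_trip = to_dict(from_dict)
```

Both routes produce the same sections and options, but the round-tripped values are strings ("6" and "True") rather than the original int and bool, which is the kind of type loss the convert_to_dict note warns about.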