EquiformerV2¶
- class dipm.models.equiformer_v2.models.EquiformerV2(*args: Any, **kwargs: Any)¶
The EquiformerV2 model Flax module. It is derived from the
ForceModel class.
References
Yi-Lun Liao, Brandon Wood, Abhishek Das and Tess Smidt. EquiformerV2: Improved Equivariant Transformer for Scaling to Higher-Degree Representations. International Conference on Learning Representations (ICLR), January 2024. URL: https://openreview.net/forum?id=mCOBKZmrzD.
- config¶
Hyperparameters / configuration for the EquiformerV2 model, see
EquiformerV2Config.
- dataset_info¶
Hyperparameters dictated by the dataset (e.g., cutoff radius or average number of neighbors).
- __call__(edge_vectors: Array, node_species: Array, senders: Array, receivers: Array, n_node: Array, rngs: Rngs | None = None) Array¶
Compute node-wise energy summands. Every implementation of
ForceModel must override this method.
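The phrase "node-wise energy summands" suggests that per-graph energies are obtained by summing the returned values over the nodes of each graph. The sketch below illustrates that reduction in plain Python. It assumes that `n_node` lists the number of nodes in each graph of a concatenated batch; the helper `sum_per_graph` is hypothetical and not part of dipm.

```python
def sum_per_graph(node_energies, n_node):
    """Sum node-wise energy summands into one total per graph.

    node_energies: flat list with one value per node, graphs concatenated.
    n_node: number of nodes in each graph, in the same order (assumed layout).
    """
    if sum(n_node) != len(node_energies):
        raise ValueError("n_node must account for every node exactly once")
    totals, start = [], 0
    for count in n_node:
        # Nodes of one graph occupy a contiguous slice of the flat list.
        totals.append(sum(node_energies[start:start + count]))
        start += count
    return totals


# Two graphs: the first has 2 nodes, the second has 3.
node_energies = [-1.0, -2.0, -0.5, -0.25, -0.25]
n_node = [2, 3]
graph_energies = sum_per_graph(node_energies, n_node)
```

In an array library the same reduction would typically be a segment sum; the loop form is used here only to keep the example dependency-free.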
- class dipm.models.equiformer_v2.config.EquiformerV2Config(*, force_head: bool = False, param_dtype: DtypeEnum = DtypeEnum.F32, num_layers: Annotated[int, Gt(gt=0)] = 12, lmax: Annotated[int, Gt(gt=0)] = 6, mmax: Annotated[int, Ge(ge=0)] = 2, sphere_channels: Annotated[int, Gt(gt=0)] = 128, num_edge_channels: Annotated[int, Gt(gt=0)] = 128, atom_edge_embedding: str = 'isolated', num_rbf: Annotated[int, Gt(gt=0)] = 600, attn_hidden_channels: Annotated[int, Gt(gt=0)] = 64, num_heads: Annotated[int, Gt(gt=0)] = 8, attn_alpha_channels: Annotated[int, Gt(gt=0)] = 64, attn_value_channels: Annotated[int, Gt(gt=0)] = 16, ffn_hidden_channels: Annotated[int, Gt(gt=0)] = 128, norm_type: LayerNormType = LayerNormType.LAYER_NORM_SH, grid_resolution: Annotated[int, Gt(gt=0)] = None, use_m_share_rad: bool = False, use_attn_renorm: bool = True, use_gate_act: bool = False, use_grid_mlp: bool = True, use_sep_s2_act: bool = True, alpha_drop: float = 0.1, drop_path_rate: float = 0.05, avg_num_neighbors: float | None = 23.395238876342773, avg_num_nodes: float | None = 77.81317, atomic_energies: str | dict[int, float] | None = None)¶
The configuration / hyperparameters of the EquiformerV2 model.
- num_layers¶
Number of EquiformerV2 layers. Default is 12.
- Type:
int
- lmax¶
Maximum degree of the spherical harmonics (1 to 10).
- Type:
int
- mmax¶
Maximum order of the spherical harmonics (0 to lmax).
- Type:
int
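To see how `lmax` and `mmax` interact, it can help to count spherical harmonic coefficients per channel. A degree `l` has `2l + 1` orders `m` in `[-l, l]`; keeping only orders with `|m| <= mmax` leaves `2 * min(l, mmax) + 1` of them. The sketch below applies that counting rule. The helper `num_coefficients` is hypothetical, and whether dipm stores coefficients in exactly this truncated layout is an assumption.

```python
def num_coefficients(lmax, mmax):
    """Count coefficients with degree l <= lmax and order |m| <= mmax."""
    if lmax < 0 or not 0 <= mmax <= lmax:
        raise ValueError("require lmax >= 0 and 0 <= mmax <= lmax")
    # Each degree l contributes the orders m = -min(l, mmax) .. min(l, mmax).
    return sum(2 * min(l, mmax) + 1 for l in range(lmax + 1))


full = num_coefficients(6, 6)       # no truncation: (lmax + 1) ** 2
truncated = num_coefficients(6, 2)  # the documented defaults lmax=6, mmax=2
```

With the default values, truncating the orders reduces the count from 49 to 29 coefficients per channel.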
- sphere_channels¶
Number of spherical channels. Default is 128.
- Type:
int
- num_edge_channels¶
Number of channels for the edge invariant features. Default is 128.
- Type:
int
- atom_edge_embedding¶
Whether to use / share atomic embedding along with relative distance. Options are “none”, “isolated” (default) and “shared”.
- Type:
str
- num_rbf¶
Number of basis functions used in the embedding block. Default is 600.
- Type:
int
- attn_hidden_channels¶
Number of hidden channels used during SO(2) graph attention. Typical values are 64 (default) or 96; other values are allowed.
- Type:
int
- num_heads¶
Number of heads in the attention block. Default is 8.
- Type:
int
- attn_alpha_channels¶
Number of channels for alpha vector in each attention head.
- Type:
int
- attn_value_channels¶
Number of channels for value vector in each attention head.
- Type:
int
- ffn_hidden_channels¶
Number of hidden channels used in the feedforward network. Default is 128.
- Type:
int
- norm_type¶
Type of normalization layer. Options are “layer_norm”, “layer_norm_sh” (default) and “rms_norm_sh”.
- Type:
LayerNormType
- grid_resolution¶
Resolution of SO3Grid used in Activation. Examples are 18, 16, 14, None (default, decided automatically).
- Type:
int
- use_m_share_rad¶
Whether all m components within a type-L vector of one channel share radial function weights.
- Type:
bool
- use_attn_renorm¶
Whether to re-normalize attention weights.
- Type:
bool
- use_gate_act¶
If True, use gate activation. Otherwise, use S2 activation.
- Type:
bool
- use_grid_mlp¶
If True, the FFNs project features onto grids and apply MLPs there.
- Type:
bool
- use_sep_s2_act¶
If True, use separable S2 activation when use_gate_act is False.
- Type:
bool
- alpha_drop¶
Dropout rate for attention weights. Use 0.0 or 0.1 (default).
- Type:
float
- drop_path_rate¶
Graph drop path rate. Use 0.0 or 0.05 (default).
- Type:
float
- avg_num_nodes¶
The mean number of atoms per graph. If None, use the value from the dataset info. The default is the value from IS2RE (100k).
- Type:
float | None
- avg_num_neighbors¶
The mean number of neighbors per atom, used to rescale messages. If None, use the value from the dataset info. The default is the value from IS2RE (100k).
- Type:
float | None
- atomic_energies¶
How to treat the atomic energies. If set to None (default) or the string "average", the average atomic energies stored in the dataset info are used. If set to the string "zero", no atomic energies are used in the model. Lastly, one can pass a dictionary of atomic energies, which is then used instead of the one in the dataset info.
- Type:
str | dict[int, float] | None
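The three accepted forms of `atomic_energies` can be summarized as a small dispatch. The following is a minimal sketch of that logic, not the library's implementation: the helper `resolve_atomic_energies` and the argument `dataset_average` (a mapping from atomic number to average energy, standing in for the dataset info) are hypothetical, and "zero" is modeled here as a zero energy for every species.

```python
def resolve_atomic_energies(setting, dataset_average):
    """Return the per-species atomic energies implied by `setting`.

    setting: None, "average", "zero", or a dict[int, float].
    dataset_average: dict[int, float] standing in for the dataset info
        (hypothetical name, used for illustration only).
    """
    if setting is None or setting == "average":
        # Fall back to the averages stored with the dataset.
        return dict(dataset_average)
    if setting == "zero":
        # Assumption: "no atomic energies" is equivalent to all zeros.
        return {species: 0.0 for species in dataset_average}
    if isinstance(setting, dict):
        # A user-supplied dictionary replaces the dataset values.
        return dict(setting)
    raise ValueError(f"unsupported atomic_energies setting: {setting!r}")


dataset_average = {1: -0.5, 8: -2.0}
from_default = resolve_atomic_energies(None, dataset_average)
from_zero = resolve_atomic_energies("zero", dataset_average)
from_custom = resolve_atomic_energies({1: -0.6, 8: -2.1}, dataset_average)
```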