Abstract
Visualizing crystal structures is an important part of understanding the relationships between structures and their evolution in solid-state chemistry. Among the different methods, polyhedral models are preferred for visualizing the three-dimensional structures of materials. In this dataset, we use n-capped n-gonal prism models (n = 3–7) as the primary structural motif to visualize intermetallic compounds possessing AB stacking. By analysing the experimental data available in Pearson’s Crystal Data, we identified 645 unique prototype structures that possess one or more types of n-capped n-gonal prisms. The identified structural motifs are then plotted on the plane perpendicular to the layering axis. The analysis and plotting were performed using an in-house Python program that automatically analyses the compounds for the presence of these structural motifs and creates plots illustrating them.
Keywords: Crystal structure, Intermetallics, Capped-prisms, Crystal structure visualization, Material informatics
Specifications Table
| Subject | Materials Science / Materials Chemistry |
| Specific subject area | Visualization of n-capped n-gonal prism environments, such as tri-capped trigonal prisms and tetra-capped quadrilateral prisms, in intermetallic prototype compounds with AB stacking. |
| Type of data | Images of 3 × 3 supercells of prototype compounds with n-capped n-gonal prisms drawn (n = 3–7). |
| Data collection | The dataset was collected by identifying the n-capped n-gonal prism environments in the prototype structures and then plotting all such environments in a 3 × 3 supercell perpendicular to the stacking layer. This was achieved using an in-house Python program. The program uses Cifkit to identify unique sites and the atoms in their coordination sphere, and matplotlib to create the visualizations. |
| Data source location | The candidate prototypes were identified by analysing all intermetallic compounds available in Pearson’s Crystal Data (version 2024/25). |
| Data accessibility | Repository name: Figshare.com Data identification number: Direct URL to data: https://dx.doi.org/10.6084/m9.figshare.29105273 The data is available as a direct download of an archive. |
| Related research article | N/A |
1. Value of the Data
- Crystal structures are commonly described using polyhedral models. In this dataset, we visually describe compounds with AB-type packing using the n-capped n-gonal prism description.
- The dataset gives researchers the means to analyze structures in terms of similarities and dissimilarities in the presence of n-capped n-gonal prisms and their relative arrangement.
- Further, the presence of planes, lines, and separated capped environments can be visualized, and new candidate compounds can be proposed for experimental realization through crystal structure design strategies.
2. Background
Understanding the structural relationships between different compounds is important for understanding the evolution of structures and their effect on properties. These changes can be easily visualized using polyhedral models, an approach that has been extensively applied to families of compounds such as perovskites and spinels. A Bärnighausen tree provides a systematic approach to symmetry reduction through group-subgroup relationships [1]. The tree starts with a simple, highly symmetrical prototype, often called the aristotype. The relationships between the aristotype and other prototypes are then added downward, labelling the symmetry reduction in terms of point groups and translational symmetry (translationengleiche, klassengleiche, and general) and relating the Wyckoff positions to give a diagrammatic representation. Atoms of the same kind tend to occupy equivalent positions [2]; thus, visualizing the evolution of the environments of elements can guide synthetic efforts by interpolating or extrapolating structural motifs. For instance, the Cu3Au-type can be visualized as sheets of connected tetra-capped square prisms, and by changing the composition, one of the tetra-capped tetragonal prisms can be changed to give two tri-capped trigonal prisms sharing a face, yielding the U3Si2-type (Fig. 1). The transformation between these two prototypes can be rationalized through structures with distorted tetragonal-prism environments, for example the Pd13Ga5-type. Chemically, the transformation results in an increase in the electronegativity difference and a decrease in the radius of the most electronegative element in the compounds. Here, in an attempt to extend this approach to intermetallic compounds, we have created a dataset of intermetallic prototype compounds with AB stacking, with n-capped n-gonal prisms as the structural motif (in keeping with the chemical literature, Latin numerical prefixes are used in the text in lieu of Arabic numerals).
This dataset can be used for machine vision-based machine learning models in materials science [3], for providing structural descriptions that complement traditional descriptions [4,5], for systematizing intermetallic compounds [6], and for identifying structural transformations [7].
Fig. 1.
Scheme illustrating the evolution of the U3Si2-type from the Cu3Au-type through the Pd13Ga5-type. The top row shows the changes in the environments, the middle row shows the relevant prototypes, and the bottom row shows the periodic table heatmaps of the compositions of the Cu3Au-type (left) and U3Si2-type (right).
3. Data Description
The data is presented as images, which are separated into four directories.
- Overlay-full – contains images that are scaled according to their unit cell dimensions in the non-stacking plane and show all available capped-prism environments.
- Overlay-scaled – contains scaled images with the minimum length set to four inches.
- Individual-grouped-by-structure-type – contains a directory for each prototype with individual images for each applicable n (n = 3, 4, 5, 6, 7).
- Individual-grouped-by-n – contains five directories, one for each n (n = 3, 4, 5, 6, 7). Each directory contains images from different prototypes with the same n.
The dataset consists of consistently formatted figures of 645 unique intermetallic structure types showing n-capped n-gonal prisms. Values of n from three to seven are plotted, and Fig. 2 shows the structural representation of the prism corresponding to each n. The title of each figure consists of the chemical formula of the structure type and the space group symbol (international short symbol notation). The legend lists the constituent elements from the structure-type formula, color-coded by their group in the periodic table (blue for groups 1–3, grey for groups 4–10, pink for groups 11–12, and red for groups 13–16). The A and B layers are relative and are indicated with solid and empty markers. The figures are saved with filenames that contain the prototype formula, Pearson symbol, and space group number, separated by hyphens.
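The naming and colouring conventions described above can be sketched in a few lines of Python. This is only an illustration and is not taken from the authors' code: the function names, the `.png` extension, and the example filename `Cu3Au-cP4-221.png` are assumptions, and the order of the fields follows the order in which they are listed in the text.

```python
from pathlib import Path


def parse_figure_name(filename):
    """Split 'formula-PearsonSymbol-spaceGroupNumber' into its three parts.

    Assumes the three fields appear in this order; splitting from the
    right keeps the formula intact even if it contained a hyphen.
    """
    stem = Path(filename).stem
    formula, pearson_symbol, sg_number = stem.rsplit("-", 2)
    return {
        "formula": formula,
        "pearson_symbol": pearson_symbol,
        "space_group_number": int(sg_number),
    }


def group_color(group):
    """Map a periodic-table group number to the legend colour in the text."""
    if 1 <= group <= 3:
        return "blue"
    if 4 <= group <= 10:
        return "grey"
    if 11 <= group <= 12:
        return "pink"
    if 13 <= group <= 16:
        return "red"
    # Groups 17 and 18 are excluded from the dataset, so no colour is defined.
    raise ValueError(f"no legend colour defined for group {group}")


# Hypothetical filename, used only to show the expected shape of the result.
info = parse_figure_name("Cu3Au-cP4-221.png")
```

Such a helper could be used to index the image directories by Pearson symbol or space group number without opening the images themselves.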
Fig. 2.
Two- and three-dimensional depictions of the n-capped n-gonal prism (n = 3, 4, 5, 6, and 7) environments. In the two-dimensional plots, only one type of prism is shown in each structure, and the capping atoms are not linked to the centre for clarity. In the three-dimensional depictions, only one capped prism is shown, with its capping atoms linked via bonds. Filled and open circles indicate the different layers in the structures.
The prisms are extracted from the automatically identified coordination environment based on the d/dmin method (see Experimental Design, Materials and Methods for details). A capping atom on each face of the prism parallel to the layering axis is required for the environment type to be identified, which results in tri-capped trigonal prisms (CN 9), tetra-capped tetragonal prisms (CN 12), penta-capped pentagonal prisms (CN 15), hexa-capped hexagonal prisms (CN 18), and hepta-capped heptagonal prisms (CN 21).
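The coordination numbers listed above follow from simple counting: an n-gonal prism has 2n vertices (n in the layer above and n in the layer below the centre atom), and each of its n rectangular faces carries one capping atom, giving CN = 2n + n = 3n. A minimal sketch of this arithmetic, with illustrative function and dictionary names that are not part of the authors' code:

```python
PREFIXES = {3: "tri", 4: "tetra", 5: "penta", 6: "hexa", 7: "hepta"}
SHAPES = {
    3: "trigonal",
    4: "tetragonal",
    5: "pentagonal",
    6: "hexagonal",
    7: "heptagonal",
}


def capped_prism_cn(n):
    """Coordination number of an n-capped n-gonal prism."""
    prism_vertices = 2 * n  # n atoms above and n atoms below the centre
    capping_atoms = n       # one atom capping each rectangular face
    return prism_vertices + capping_atoms


def capped_prism_name(n):
    """Name of the environment, using the Latin prefixes from the text."""
    return f"{PREFIXES[n]}-capped {SHAPES[n]} prism"


# Coordination numbers for the five environments covered by the dataset.
cn_by_n = {n: capped_prism_cn(n) for n in range(3, 8)}
```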
The polyhedra are represented as n-gonal prisms, with layers differentiated as solid and empty polyhedral shapes. Examples are shown in Figs. 3–6. Users can generate their own figures in a similar format from a CIF, for comparison with the 645 structure types provided, using the code available at https://github.com/balaranjan/coord_env_analysis.git
Fig. 3.
Zr2P-type structure with red triangles indicating tri-capped trigonal prisms and green squares indicating tetra-capped tetragonal prisms.
Fig. 4.
Zr3NiSb7-type structure with red triangles indicating tri-capped trigonal prisms and green squares indicating tetra-capped tetragonal prisms.
Fig. 5.
Zr3Pd4P3-type structure with red triangles indicating tri-capped trigonal prisms, green squares indicating tetra-capped tetragonal prisms, and blue pentagons indicating penta-capped pentagonal prisms.
Fig. 6.
Zr5Pd9P7-type structure with red triangles indicating tri-capped trigonal prisms, green squares indicating tetra-capped tetragonal prisms, and blue pentagons indicating penta-capped pentagonal prisms.
4. Experimental Design, Materials and Methods
The structure types containing n-capped n-gonal prisms were extracted by analysing all intermetallic compounds (all reported compounds without noble gases, halides, H, He, B, N, O, and S) reported in Pearson’s Crystal Data (release 2024/25) [8]. This dataset was then cleaned using the cif-cleaner utility [9] to remove crystallographic information files (CIFs) with (i) data errors, (ii) partial or mixed occupancies, and (iii) a minimum interatomic distance of <1.2 Å. Following this data cleaning, 66,278 compounds belonging to 3853 prototypes were selected for further analysis. From these compounds, 846 unique structure types forming compounds with AB stacking were analysed for the presence of n-capped n-gonal prism environments by plotting the capped prisms on the plane perpendicular to the layer axis in a 3 × 3 supercell, using an in-house Python script (available at https://github.com/balaranjan/coord_env_analysis.git). In the collection of this dataset, n = 3–7 was analysed and plotted using this approach. These motifs are depicted in Fig. 2.
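The screening criteria above can be expressed as a small filter. The sketch below is not the cif-cleaner utility and does not read CIFs; it assumes that the formula, the minimum interatomic distance, and an occupancy flag have already been extracted by other means, and it interprets "halides" as the halogen elements. All names are illustrative.

```python
import re

NOBLE_GASES = {"He", "Ne", "Ar", "Kr", "Xe", "Rn"}
HALOGENS = {"F", "Cl", "Br", "I", "At"}  # assumption: "halides" = halogens
OTHER_EXCLUDED = {"H", "B", "N", "O", "S"}
EXCLUDED = NOBLE_GASES | HALOGENS | OTHER_EXCLUDED

MIN_DISTANCE = 1.2  # minimum allowed interatomic distance in angstrom


def elements_in(formula):
    """Return the set of element symbols in a simple formula string."""
    return set(re.findall(r"[A-Z][a-z]?", formula))


def passes_screen(formula, min_distance, has_partial_or_mixed_occupancy):
    """Apply the element, occupancy, and distance criteria from the text."""
    if elements_in(formula) & EXCLUDED:
        return False
    if has_partial_or_mixed_occupancy:
        return False
    return min_distance >= MIN_DISTANCE
```

For example, `passes_screen("Zr3NiSb7", 2.7, False)` passes all three checks (the distance value here is made up for illustration), whereas any formula containing an excluded element is rejected regardless of its geometry.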
This script finds the different sites and their neighbours using Cifkit [10], and the coordination number is determined by the d/dmin method, in which the distance between the centre and each neighbour is divided by the minimum of all centre-neighbour distances. The differences between consecutive values of the sorted d/dmin list are then computed, and the number of neighbours before the maximal difference is taken as the coordination number. Within this coordination environment, the script analyses the closest neighbours below and above the centre atom for the presence of an n-gonal prism that encapsulates the centre atom. If such a prism is found and a capping atom is present for each face parallel to the layering axis, the site is assigned the capped environment with that specific n. The code analyses n from three to seven, and after a compound has been analysed, all sites with these environments are plotted on a 3 × 3 supercell perpendicular to the layering axis.
The different capped environments in the dataset are summarized in Table 1, which shows that the smaller capped environments are predominant. The structure types containing capped prisms and the n values of the prisms are given in Supplementary Information Table S1. The duplicates (by name) in Table S1 are cases in which different structure types are adopted by compounds with identical composition (for example, CaCu-type in P21/m with tri-capped trigonal and penta-capped pentagonal prisms and CaCu-type in Pnma with the same environments). In the rare cases where compounds with identical composition adopt structure types belonging to the same space group, their Pearson symbols are provided in parentheses (e.g. CeCoAl-type in C2/m has two entries, mS12 and mS180).
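The d/dmin step described above can be sketched as follows. This is a minimal stand-alone illustration of the largest-gap rule, not the authors' implementation and not a Cifkit call; the function name and the example distances are hypothetical.

```python
def coordination_number(distances):
    """Estimate the coordination number with the d/dmin largest-gap rule.

    `distances` holds the centre-neighbour distances of one site,
    in any order and in any length unit.
    """
    if len(distances) < 2:
        raise ValueError("need at least two neighbour distances")
    d_min = min(distances)
    # Normalise every distance by the shortest one and sort ascending.
    ratios = sorted(d / d_min for d in distances)
    # Differences between consecutive normalised distances.
    gaps = [later - earlier for earlier, later in zip(ratios, ratios[1:])]
    # Index of the largest gap; ties resolve to the first occurrence.
    largest = max(range(len(gaps)), key=gaps.__getitem__)
    # All neighbours before the largest gap belong to the coordination sphere.
    return largest + 1


# Made-up distances: six prism atoms, three capping atoms, then a clear gap.
example = [2.8] * 6 + [3.0] * 3 + [4.1, 4.2, 4.5]
cn = coordination_number(example)
```

In this made-up example, the largest jump in d/dmin occurs after the ninth neighbour, so the site is assigned CN 9, as expected for a tri-capped trigonal prism.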
Table 1.
Number of prototypes forming intermetallic compounds with AB stacking, containing different n-capped n-gonal prisms.
| n of the environment | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|
| 3 | 515 | | | | |
| 4 | 383 | 509 | | | |
| 5 | 192 | 180 | 201 | | |
| 6 | 95 | 87 | 38 | 108 | |
| 7 | 4 | 4 | 3 | 1 | 4 |
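One way to read Table 1, stated here as an assumption rather than as the authors' definition, is as a lower-triangular co-occurrence matrix: each diagonal entry counts the prototypes containing a given n, and each off-diagonal entry counts the prototypes containing both values of n. Under that reading, the counts could be reproduced from a mapping of prototypes to their sets of n values, as sketched below with the four structure types and environments named in the captions of Figs. 3–6. The function name is illustrative.

```python
def cooccurrence_counts(prototype_ns, n_values=(3, 4, 5, 6, 7)):
    """Count prototypes containing both n values of each (row, column) pair.

    Only the lower triangle (column <= row) is filled, matching the
    layout of Table 1. Diagonal entries count prototypes containing n.
    """
    counts = {}
    for row in n_values:
        for col in n_values:
            if col > row:
                continue
            counts[(row, col)] = sum(
                1 for ns in prototype_ns.values() if row in ns and col in ns
            )
    return counts


# Environments as named in the captions of Figs. 3-6 (not the full dataset).
figure_examples = {
    "Zr2P": {3, 4},
    "Zr3NiSb7": {3, 4},
    "Zr3Pd4P3": {3, 4, 5},
    "Zr5Pd9P7": {3, 4, 5},
}
table = cooccurrence_counts(figure_examples)
```

For these four examples, all four prototypes contribute to the (3, 3), (4, 3), and (4, 4) cells, and two contribute to the cells involving n = 5.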
Heatmaps of the centre elements of the different capped environments are given in Figs. 7 and 8. They agree with the general intuition that n in the n-capped n-gonal prism increases as the centre element moves from the top right of the periodic table to the bottom left (increasing atomic radius).
Fig. 7.
Element histogram of centre elements occupying tri-capped trigonal (top) and tetra-capped square (bottom) prism environments.
Fig. 8.
Element histogram of centre elements occupying penta-capped pentagonal (top), hexa-capped hexagonal (middle), and hepta-capped heptagonal (bottom) prism environments.
Limitations
The dataset was constructed using data available in Pearson’s Crystal Data (version 2024/25). Environments with severe distortions from ideal geometries are also included.
Ethics Statement
The authors have read and followed the ethical requirements for publication in Data in Brief.
Credit Author Statement
Balaranjan Selvaratnam: Software, Data curation, Visualization, Conceptualization, Writing – original draft, review & editing. Emil Jaffal: Visualization, Writing – original draft, review & editing. Danila Shiryaev: editing. Anton Oliynyk: Conceptualization, Writing – review & editing.
Acknowledgements
Anton Oliynyk thanks Hunter College CUNY for startup funds.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Footnotes
Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.dib.2025.112138.
Appendix. Supplementary Materials
Data Availability
References
- 1. Müller U. Relating crystal structures by group–subgroup relations; 2011; pp 44–56. doi: 10.1107/97809553602060000795.
- 2. Brunner G.O. An unconventional view of the 'closest sphere packings'. Acta Crystallogr. Sect. A. 1971;27:388–390. doi: 10.1107/S0567739471000858.
- 3. Wang A.Y.-T., Murdock R.J., Kauwe S.K., Oliynyk A.O., Gurlo A., Brgoch J., Persson K.A., Sparks T.D. Machine learning for materials scientists: an introductory guide toward best practices. Chem. Mater. 2020;32:4954–4965. doi: 10.1021/acs.chemmater.0c01907.
- 4. Tyvanchuk Y., Babizhetskyy V., Baran S., Szytuła A., Smetana V., Lee S., Oliynyk A.O., Mudring A.-V. The crystal and electronic structure of RE23Co6.7In20.3 (RE = Gd–Tm, Lu): a new structure type based on intergrowth of AlB2- and CsCl-type related slabs. J. Alloys Compd. 2024;976. doi: 10.1016/j.jallcom.2023.173241.
- 5. Gvozdetskyi V., Selvaratnam B., Oliynyk A.O., Mar A. Revealing hidden patterns through chemical intuition and interpretable machine learning: a case study of binary rare-earth intermetallics RX. Chem. Mater. 2023;35:879–890. doi: 10.1021/acs.chemmater.2c02425.
- 6. Oliynyk A.O., Adutwum L.A., Rudyk B.W., Pisavadia H., Lotfi S., Hlukhyy V., Harynuk J.J., Mar A., Brgoch J. Disentangling structural confusion through machine learning: structure prediction and polymorphism of equiatomic ternary phases ABC. J. Am. Chem. Soc. 2017;139:17870–17881. doi: 10.1021/jacs.7b08460.
- 7. Oliynyk A.O., Mar A. Discovery of intermetallic compounds from traditional to machine-learning approaches. Acc. Chem. Res. 2018;51:59–68. doi: 10.1021/acs.accounts.7b00490.
- 8. Villars P., Cenzual K. Pearson’s Crystal Data: crystal structure database for inorganic compounds (on DVD). 2024/2025.
- 9. Jaffal E.I., Lee S., Shiryaev D., Vtorov A., Barua N.K., Kleinke H., Oliynyk A.O. Composition and structure analyzer/featurizer for explainable machine-learning models to predict solid state structures. Digit. Discov. 2025;4:548–560. doi: 10.1039/D4DD00332B.
- 10. Lee S., Oliynyk A.O. Cifkit: a Python package for coordination geometry and atomic site analysis. J. Open Source Softw. 2024;9:7205. doi: 10.21105/joss.07205.