This is a compiled dataset of raw X-ray reflectivity (XRR, reflectometry) measurements together with corresponding fit parameters, intentionally published to use as training or test data for machine learning models.
The initial dataset has been compiled by A. Greco and prepared for publication by L. Pithan.
All data is provided in an hdf5 file, following NeXus convention with respect to the provided metadata in the hdf5 attributes. The datesets have been measured in-situ and therefore there are always stacks of curves that correspond to the different layer thicknesses of the same material on top of $SiO_x$. The measured data is provided under experimental
and the corresponding fit parameters under fit
. Additional information is collected in metadata
.
The following model was used:
Air | |
---|---|
layer 1: thickness, roughness and SLD fitted | |
$SiO_x$: thickness, roughness and SLD kept constant for each stack | |
$Si$: roughness and SLD kept constant |
🔴 For all provided fits the q-range during fitting was limited form $q_{min} = 0.02 {\mathring{A}}^{-1}$ to $q_{max} = 0.15 {\mathring{A}}^{-1}$.
The dataset itself can be cited via the zenodo:
Please consider also citing the following publication, where this dataset is initially described: "Machine Learning for Scattering Data: Strategies, Perspectives, and Applications to Surface Scattering" by Alexander Hinderhofer, Alessandro Greco, Vladimir Starostin, Valentin Munteanu, Linus Pithan, Alexander Gerlach and Frank Schreiber (in preparation).
Dataset version: 1.0
Have a look at github and zenodo. In case you wish to contribute further curves to this dataset or have ideas how to improve the dataset or where else to deposit it, please contact the authors at softmatter AT ifap.uni-tuebingen.de.
In the following you find some plots presenting the provided experimental data & fits. Most of the dataset were measured insitu therefore there are many measured XRR curves of the same sample while the thin film was still growing (prepared via OMBD). Use the slider to browse through the stacks of curves.
The html version of this documentation contains interactive plots without the need to install additional software.
To work with the provided dataset any software package / library that supports hdf5
can be used. We suggest to have a look at the h5py or silx packages when working with python. To inspect the file manually have a look at silx view. An example how to extract the provided data into simple txt file is provided at the bottom of this notebook.
The example code provided together with dataset uses the refl1d python package, which is also used in our machine learning XRR package mlreflect.
Together with the dataset there is a requirements.txt
for pip
or conda
(please note that relf1d
is currently only available trough pip). Further we provide an dump of the conda environment use to run this notebook in conda_env.yml
, that can be used to recreate the respective conda environment.
#prepare jupyter notebook for plots shown below
%run prepare_plot.py
#imports
from silx.io.dictdump import nxtodict #read NeXus hdf5 to python dict
import numpy as np
import pandas as pd
from IPython.display import display
#use same q_range as during fitting
q_fit = np.linspace(0.02,0.15,130,endpoint=False)
#produce plots
for key, ds in nxtodict("xrr_dataset.h5").items():
if "@" not in key: #skip nexus attributes
print("Dataset: ",key)
print("Experimentalists: ",ds["metadata"].pop("Experimentalists","?"))
ds["metadata"].pop("@NX_class", "")
display(pd.DataFrame.from_dict({"Dataset":[key],**ds["metadata"]}))
fig = prepare_figure(ds,q_fit,str(ds["metadata"]["Layer_material"])+" on SiOx")
fig.show()
Dataset: DIP_1 Experimentalists: ['Kowark, Stefan']
Dataset | Layer_CAS | Layer_formula | Layer_material | Substrate_temperature | Substrate_temperature@unit | instrument | q_max_fit | q_max_fit@unit | year_experiment | |
---|---|---|---|---|---|---|---|---|---|---|
0 | DIP_1 | 188-94-3 | C32H16 | Diindenoperylene | 303 | K | ESRF, ID10b | 0.15 | 1/Ang | 2005 |
Dataset: DIP_2 Experimentalists: ['Hinderhofer, Alexander']
Dataset | Layer_CAS | Layer_formula | Layer_material | Substrate_temperature | Substrate_temperature@unit | instrument | q_max_fit | q_max_fit@unit | year_experiment | |
---|---|---|---|---|---|---|---|---|---|---|
0 | DIP_2 | 188-94-3 | C32H16 | Diindenoperylene | 303 | K | ESRF, ID10b | 0.15 | 1/Ang | 2010 |
Dataset: DIP_3 Experimentalists: ['Lorch, Christopher']
Dataset | Layer_CAS | Layer_formula | Layer_material | Substrate_temperature | Substrate_temperature@unit | instrument | q_max_fit | q_max_fit@unit | year_experiment | |
---|---|---|---|---|---|---|---|---|---|---|
0 | DIP_3 | 188-94-3 | C32H16 | Diindenoperylene | 303 | K | SLS, MXX04 | 0.15 | 1/Ang | 2012 |
Dataset: DNTT_PDIF_1to2 Experimentalists: ['Rußegger, Nadine' 'Greco, Alessandro']
Dataset | Layer_material | instrument | q_max_fit | q_max_fit@unit | year_experiment | |
---|---|---|---|---|---|---|
0 | DNTT_PDIF_1to2 | DNTT PDIF (1:2) | DESY, P08 | 0.15 | 1/Ang | 2021 |
Dataset: DNTT_PDIF_2to1 Experimentalists: ['Rußegger, Nadine' 'Greco, Alessandro']
Dataset | Layer_material | instrument | q_max_fit | q_max_fit@unit | year_experiment | |
---|---|---|---|---|---|---|
0 | DNTT_PDIF_2to1 | DNTT PDIF (2:1) | DESY, P08 | 0.15 | 1/Ang | 2021 |
Dataset: PDIC5 Experimentalists: ['Rußegger, Nadine' 'Greco, Alessandro']
Dataset | Layer_material | Substrate_temperature | Substrate_temperature@unit | q_max_fit | q_max_fit@unit | |
---|---|---|---|---|---|---|
0 | PDIC5 | PDIC5 | 303 | K | 0.15 | 1/Ang |
Dataset: PDIC8 Experimentalists: ['Rußegger, Nadine' 'Greco, Alessandro']
Dataset | Layer_material | Substrate_temperature | Substrate_temperature@unit | q_max_fit | q_max_fit@unit | |
---|---|---|---|---|---|---|
0 | PDIC8 | PDIC8 | 303 | K | 0.15 | 1/Ang |
Dataset: PDIC8CN2_DIP_1to1 Experimentalists: ['Rußegger, Nadine' 'Greco, Alessandro']
Dataset | Layer_material | Substrate_temperature | Substrate_temperature@unit | q_max_fit | q_max_fit@unit | |
---|---|---|---|---|---|---|
0 | PDIC8CN2_DIP_1to1 | PDIC8CN2 DIP (1:1) | 303 | K | 0.15 | 1/Ang |
Dataset: PEN_1 Experimentalists: ['Dax, Ingrid']
Dataset | Layer_formula | Layer_material | Substrate_temperature | Substrate_temperature@unit | instrument | q_max_fit | q_max_fit@unit | year_experiment | |
---|---|---|---|---|---|---|---|---|---|
0 | PEN_1 | C22H14 | Pentacene | 300 | K | lab source | 0.15 | 1/Ang | 2021 |
Dataset: PEN_2 Experimentalists: ['Dax, Ingrid']
Dataset | Layer_formula | Layer_material | Substrate_temperature | instrument | q_max_fit | q_max_fit@unit | year_experiment | |
---|---|---|---|---|---|---|---|---|
0 | PEN_2 | C22H14 | Pentacene | LT | lab source | 0.15 | 1/Ang | 2021 |
from silx.io.dictdump import h5todict #read hdf5 to python dict, without taking NeXus metadata into account
import numpy as np
import pandas as pd
dataset_name = "PEN_2"
curve_index = 1
pen2 = h5todict("xrr_dataset.h5",path=dataset_name)
#export q and reflectivity curve
np.savetxt(f"PEN_2_curve_{curve_index}.txt",list(zip(pen2['experiment']['q'],pen2['experiment']['data'][curve_index])),header="#q in 1/Ang\trefl (norm.)")
#export fit parameters
fits=pd.DataFrame.from_dict(pen2["fit"])
fits.loc[curve_index:curve_index].to_csv("PEN_2_fit_parameters.txt")