Commit 97c82fa1 authored by Johannes Wasmer

add fig/poster-2023-02-14/initial-results

parent 89a6f094
fig/poster-2023-02-14/initial-results/jij-poyen-soap-n-8-l-8-r-4-s-1.0-tt-split-0.2.png

130 B

with notebook
https://git.rwth-aachen.de/sisc/sisclab2022-project6-git/-/blob/63252c2f0511e2aa09c053a443eb5eca337209f1/notebooks/work-package-2/task_Jij_poyen.ipynb
changed:
- SOAP average='inner'
- removed poyen's np.concatenate of the SOAP vectors
- increased train-test split to 0.2
note: reruns with different random train-test splits gave widely varying
results, from R2 0.5 to R2 0.9. this suggests that the dataset is
highly imbalanced and that a simple random train-test split gives
unreliable results.
see for example
- https://www.google.com/search?channel=fs&client=ubuntu&q=how+does+train+test+split+affect+accuracy
- https://towardsdatascience.com/3-things-you-need-to-know-before-you-train-test-split-869dfabb7e50
- stratification, plus a few other requirements a split should satisfy
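
One way to act on the stratification hint for a continuous target such as Jij is to sort the samples by target value and draw one test sample from each block of consecutive samples, so that the rare high / low Jij values are represented in both sets. The following is a minimal numpy sketch, not the notebook's code. It assumes the split value (0.2) is the test fraction, and the function name `sorted_stratified_split` and its parameters are hypothetical.

```python
import numpy as np


def sorted_stratified_split(y, test_size=0.2, seed=0):
    """Split indices so that train and test both span the full target range.

    Hypothetical helper: samples are sorted by target value and cut into
    blocks of about 1 / test_size consecutive samples; one random sample
    per block goes to the test set, the rest go to the training set.
    """
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    order = np.argsort(y)  # sample indices, from lowest to highest target
    block = max(2, int(round(1.0 / test_size)))  # e.g. 5 for test_size=0.2
    test_idx = np.array(
        [rng.choice(order[start:start + block]) for start in range(0, y.size, block)]
    )
    train_idx = np.setdiff1d(np.arange(y.size), test_idx)
    return train_idx, test_idx
```

The resulting test fraction is only approximately `test_size`, because the block length is rounded to an integer. For class labels, scikit-learn's `train_test_split` offers a `stratify` argument; a continuous target would have to be binned first to use it.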
metrics for the run shown in the plot
jij-poyen-soap-n-8-l-8-r-4-s-1.0-tt-split-0.2.png
Mean absolute error on test set: 0.375 eV
R^2 score on test set: 0.920
the parity plot (predicted vs. true) shows the imbalance: almost all
points cluster around the origin, i.e. almost all samples have small
Jij, with only a few outliers at high / low Jij values.
metrics for the run shown in the plot
jij-poyen-soap-n-8-l-8-r-4-s-1.0-tt-split-0.25.png
Mean absolute error on test set: 0.436 eV
R^2 score on test set: 0.883
with train-test split 0.3, R2 was consistently 0.75.
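
To put a number on the run-to-run spread, the fit can be repeated over many random splits and the R2 scores collected. The following is a minimal sketch, assuming the split value is the test fraction. Ordinary least squares stands in for the notebook's actual model, and `r2_over_random_splits` is a hypothetical helper.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot


def r2_over_random_splits(X, y, test_size=0.2, n_repeats=20, seed=0):
    """Return the test-set R2 of each of n_repeats random train-test splits."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    n_test = int(round(test_size * y.size))
    scores = []
    for _ in range(n_repeats):
        perm = rng.permutation(y.size)
        test, train = perm[:n_test], perm[n_test:]
        # Stand-in model (assumption): linear least squares with an intercept.
        A_train = np.column_stack([X[train], np.ones(train.size)])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        A_test = np.column_stack([X[test], np.ones(test.size)])
        scores.append(r2_score(y[test], A_test @ coef))
    return np.array(scores)
```

Reporting the mean and standard deviation of the returned scores for each split value (0.2, 0.25, 0.3) would make the comparison less dependent on a single lucky or unlucky split.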
fig/poster-2023-02-14/initial-results/jij-poyen-soap-n-8-l-8-r-4-s-1.0-tt-split-0.25.png

130 B
