diff --git a/fig/poster-2023-02-14/initial-results/jij-poyen-soap-n-8-l-8-r-4-s-1.0-tt-split-0.2.png b/fig/poster-2023-02-14/initial-results/jij-poyen-soap-n-8-l-8-r-4-s-1.0-tt-split-0.2.png
new file mode 100644
index 0000000000000000000000000000000000000000..924a3a6f9daa654169a2c1a4726e58b606a3b14e
--- /dev/null
+++ b/fig/poster-2023-02-14/initial-results/jij-poyen-soap-n-8-l-8-r-4-s-1.0-tt-split-0.2.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bd211b27bef9b2e2664343cad9918bb8740dd215bbd8df8bda18823d8500113f
+size 20993
diff --git a/fig/poster-2023-02-14/initial-results/jij-poyen-soap-n-8-l-8-r-4-s-1.0-tt-split-0.2.txt b/fig/poster-2023-02-14/initial-results/jij-poyen-soap-n-8-l-8-r-4-s-1.0-tt-split-0.2.txt
new file mode 100644
index 0000000000000000000000000000000000000000..7770956c4d4b73ba99bc3e100b688e68c69745ae
--- /dev/null
+++ b/fig/poster-2023-02-14/initial-results/jij-poyen-soap-n-8-l-8-r-4-s-1.0-tt-split-0.2.txt
@@ -0,0 +1,32 @@
+with notebook
+https://git.rwth-aachen.de/sisc/sisclab2022-project6-git/-/blob/63252c2f0511e2aa09c053a443eb5eca337209f1/notebooks/work-package-2/task_Jij_poyen.ipynb
+changed:
+- SOAP average='inner'
+- remove poyen's np.concatenate SOAP vectors
+- increased train-test split to 0.2
+
+note that on different reruns (train-test-split) got widely varying
+results from R2 0.5 to R2 0.9. this indicates that the dataset is
+highly imbalanced and simple random train-test split leads to unreliable
+results.
+
+see for example
+- https://www.google.com/search?channel=fs&client=ubuntu&q=how+does+train+test+split+affect+accuracy
+  - https://towardsdatascience.com/3-things-you-need-to-know-before-you-train-test-split-869dfabb7e50
+  - stratification, and a few other requirements to satisfy
+
+metrics for the run from the plot
+jij-poyen-soap-n-8-l-8-r-4-s-1.0-tt-split-0.2.png
+Mean absolute error on test set: 0.375 eV
+R^2 score on test set: 0.920
+
+when you look at the parity plot (predicted-vs-true plot) you see why the dataset is
+imbalanced. almost all cluster around the origin, i.e. almost all have
+small Jij with only a few outliers with high / low Jij values.
+
+metrics for the run from the plot
+jij-poyen-soap-n-8-l-8-r-4-s-1.0-tt-split-0.25.png
+Mean absolute error on test set: 0.436 eV
+R^2 score on test set: 0.883
+
+with train-test-split 0.3, got consistently R2=0.75.
diff --git a/fig/poster-2023-02-14/initial-results/jij-poyen-soap-n-8-l-8-r-4-s-1.0-tt-split-0.25.png b/fig/poster-2023-02-14/initial-results/jij-poyen-soap-n-8-l-8-r-4-s-1.0-tt-split-0.25.png
new file mode 100644
index 0000000000000000000000000000000000000000..9e01c8b37cdd697048f8fb720864690d6f667500
--- /dev/null
+++ b/fig/poster-2023-02-14/initial-results/jij-poyen-soap-n-8-l-8-r-4-s-1.0-tt-split-0.25.png
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:09438d4ae15942e910fc309624fb96c4c35a6f01b1ce68cd3f24b52940e1290a
+size 21422
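The notes in the .txt file attribute the rerun-to-rerun spread in R2 to a few large-|Jij| outliers landing in the test set or not, and mention stratification as a remedy. A minimal sketch of that idea follows, using only the Python standard library. Everything in it is an assumption made for illustration: the synthetic `jij` values stand in for the real targets, and `OUTLIER_THRESHOLD` and `stratified_split` are hypothetical names that do not come from the notebook.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices) with each label split separately."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for index, label in enumerate(labels):
        groups[label].append(index)

    train, test = [], []
    for label in sorted(groups):
        members = groups[label]
        rng.shuffle(members)
        # At least one sample per label goes to the test set.
        n_test = max(1, round(test_fraction * len(members)))
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return sorted(train), sorted(test)


# Synthetic stand-in for the Jij targets (NOT the project data):
# 95 values clustered near zero plus 5 large-magnitude outliers.
data_rng = random.Random(42)
jij = [data_rng.gauss(0.0, 0.1) for _ in range(95)] + [5.0, -4.0, 6.0, -5.5, 4.5]

# Hypothetical threshold separating the bulk from the outliers.
OUTLIER_THRESHOLD = 1.0
is_outlier = [abs(value) > OUTLIER_THRESHOLD for value in jij]

train_idx, test_idx = stratified_split(is_outlier, test_fraction=0.2, seed=1)
outliers_in_test = sum(is_outlier[i] for i in test_idx)
print(f"test size: {len(test_idx)}, outliers in test: {outliers_in_test}")

# For comparison: a plain random split lets the outlier count drift with the seed.
for seed in range(5):
    shuffled = list(range(len(jij)))
    random.Random(seed).shuffle(shuffled)
    plain_test = shuffled[:20]
    print(seed, sum(is_outlier[i] for i in plain_test))
```

In the notebook itself, the same effect could probably be had by passing such bin labels to scikit-learn's `train_test_split` through its `stratify` argument. Whether a two-bin split on |Jij| is the right stratification for this dataset is an open question that the sketch does not settle.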