Weight initialization in new ACE model #202
Comments
I think initialization will be one of the main topics that we need to study carefully and gain experience with. Obvious places to learn from are MTP, qSNAP, pacemaker, and MACE.
Looking at pacemaker sounds like a good thing to do.
I think the first step is to implement a one-hot initialization that will make the new models equivalent to ACE1. This is a little tricky since in the new models What makes this a bit tricky is that in the regime of many or few elements the behaviour should probably be quite different... For a first attempt my suggestion would be to choose weights such that
where
If we wanted to get a little more ambitious, we could assign relative weights to different elements to give higher resolution to some element pairs than to others.
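To make the one-hot idea concrete, here is a minimal sketch in plain Python. It assumes (this is not stated in the thread) that the learnable radial basis has the form R_n = sum_q W[n][q] * P_q, with P_q a fixed polynomial basis. The names `one_hot_weights` and `apply_weights` are hypothetical and not part of any ACE package. With a one-hot W each learnable function simply reproduces one fixed basis function, which is the sense in which the model would fall back to a fixed-basis behaviour.

```python
def one_hot_weights(maxn, maxq):
    """Return a maxn x maxq weight matrix with W[n][q] = 1 if n == q, else 0.

    Assumption: maxn <= maxq, so every learnable function n can select
    a distinct fixed basis function q.
    """
    if maxn > maxq:
        raise ValueError("one-hot selection needs maxn <= maxq")
    return [[1.0 if q == n else 0.0 for q in range(maxq)] for n in range(maxn)]


def apply_weights(W, P):
    """Evaluate R_n = sum_q W[n][q] * P[q] for every row n of W."""
    return [sum(w * p for w, p in zip(row, P)) for row in W]


# Hypothetical values of a fixed basis P_q at some distance r.
P = [0.5, -1.25, 2.0, 0.75]

# Rectangular case: 3 learnable functions built from 4 fixed ones.
W = one_hot_weights(3, 4)
R = apply_weights(W, P)  # R reproduces the first three entries of P
```

An element-dependent version would add element indices to W; how those should be filled in the many-element versus few-element regime is exactly the open question above, so it is left out of the sketch.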
And don't forget MTP! Let's get something up and running and then learn from others to see how they improve...
I will find time to look into the one-hot initialization in the coming week. That also gives us some time to read through how others work with small non-linear MLIPs.
Hmm, it doesn't seem to be straightforward. One probably has to allow maxn \neq maxq to proceed, and this probably requires a discussion on the interface. Correct me if I am wrong?
I agree it is not entirely obvious. Let's try to find time tomorrow?
Currently, only zero or glorot_normal is allowed. I experimented with that, and naively applying glorot_normal doesn't seem to work well. This issue collects all discussion related to weight initialization. ref: #196 (comment)
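For reference, a minimal sketch of what the two options usually mean, written in plain Python rather than the package's own code. It assumes the standard Glorot (Xavier) normal definition, where entries are drawn from a normal distribution with mean 0 and standard deviation sqrt(2 / (fan_in + fan_out)). The function names here are illustrative and are not the project's API.

```python
import math
import random


def glorot_normal(fan_out, fan_in, rng):
    """Draw a fan_out x fan_in matrix with entries ~ N(0, 2 / (fan_in + fan_out))."""
    std = math.sqrt(2.0 / (fan_in + fan_out))
    return [[rng.gauss(0.0, std) for _ in range(fan_in)] for _ in range(fan_out)]


def zeros(fan_out, fan_in):
    """All-zero initialization of the same shape."""
    return [[0.0] * fan_in for _ in range(fan_out)]


rng = random.Random(0)           # fixed seed so the draw is reproducible
W = glorot_normal(8, 16, rng)    # 8 outputs, 16 inputs
Z = zeros(8, 16)
```

Note that the Glorot scale depends only on the matrix shape, so it knows nothing about the magnitude or conditioning of the basis functions it multiplies. That may be one reason a naive application behaves poorly here, though this is a guess and not something established in the thread.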