Weight initialization in new ACE model #202

Open
CheukHinHoJerry opened this issue Jun 2, 2024 · 7 comments

@CheukHinHoJerry
Collaborator

Currently, only zero or glorot_normal initialization is allowed. I experimented with this, and naively applying glorot_normal doesn't seem to work well. This issue collects all discussion related to weight initialization.

ref: #196 (comment)
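
For reference, the two options named above can be sketched as follows. This is a minimal illustration, assuming the standard Glorot (Xavier) normal definition with standard deviation sqrt(2 / (fan_in + fan_out)). The function names are hypothetical and are not the initializers the package actually calls.

```julia
using Random, Statistics

# Illustrative re-implementations; the `_sketch` suffix marks them as
# hypothetical names, not the initializers used by the package.
zero_init_sketch(fan_out::Integer, fan_in::Integer) = zeros(fan_out, fan_in)

function glorot_normal_sketch(rng::AbstractRNG, fan_out::Integer, fan_in::Integer)
    σ = sqrt(2 / (fan_in + fan_out))   # Glorot/Xavier normal standard deviation
    return σ .* randn(rng, fan_out, fan_in)
end

rng = MersenneTwister(1234)
W0 = zero_init_sketch(200, 300)            # all-zero weights
Wg = glorot_normal_sketch(rng, 200, 300)   # dense Gaussian weights
σ_target = sqrt(2 / (200 + 300))           # standard deviation the samples should approach
```

With a dense Gaussian matrix, every output channel mixes all input basis functions from the start, which is one way the two options differ in practice.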

@cortner
Member

cortner commented Jun 3, 2024

I think initialization will be one of the main topics that we need to study carefully and gain experience with. Obvious places to learn from are MTP, qSNAP, pacemaker, and MACE.

@CheukHinHoJerry
Collaborator Author

Looking at pacemaker sounds like a good thing to do.

@cortner
Member

cortner commented Jun 3, 2024

I think the first step is to implement a one-hot initialization that will make the new models equivalent to ACE1. This is a little tricky since in the new models n stands for the combined (n, z) in ACE1. So I think in this case the number of n channels must be large enough to accommodate both n and z.

What makes this a bit tricky is that in the regime of many or few elements the behaviour should probably be quite different...

For a first attempt my suggestion would be to choose weights such that

$R_{nl}(r_{ij}, Z_i, Z_j) = P_k(r_{ij})$

where n <-> (k,z) and the n are assigned through a loop ordering like

n = 0
for k = 1, 2, ...
    for z in zlist
        n += 1
        # ...
    end
end
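
The loop ordering above can be turned into a concrete weight tensor. The following is a minimal sketch under one possible reading of the proposal, not the package's implementation: it assumes the radial channels are a linear map of the polynomials P_k, that channel n ↔ (k, z) is non-zero only when the neighbour species equals z (mirroring the ACE1 species index), and that the weights do not depend on the centre species. The function name `onehot_radial_weights`, the bound `maxk`, and the tensor layout `W[n, k, jz]` are all hypothetical.

```julia
# Hypothetical helper for illustration; not part of any package API.
# Assumed layout: W[n, k, jz] multiplies P_k in channel n when the
# neighbour species is zlist[jz].
function onehot_radial_weights(maxk::Integer, zlist::AbstractVector)
    nz = length(zlist)
    maxn = maxk * nz              # one channel per (k, z) pair
    W = zeros(maxn, maxk, nz)
    n = 0
    for k in 1:maxk
        for jz in 1:nz
            n += 1
            W[n, k, jz] = 1.0     # channel n selects P_k for species zlist[jz]
        end
    end
    return W
end

# Usage: 3 radial polynomials and 2 species give 6 channels.
W = onehot_radial_weights(3, [:Si, :O])
P = [1.0, 2.0, 3.0]               # stand-in values for P_1(r), P_2(r), P_3(r)
R_Si = W[:, :, 1] * P             # channel values for a Si neighbour
R_O  = W[:, :, 2] * P             # channel values for an O neighbour
```

Under these assumptions the number of channels is `maxk * length(zlist)`, so the weight slice for each species is rectangular rather than square.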

If we wanted to get a little more ambitious then we could assign relative weights to different elements to give higher resolution to some element-pairs than to others.

@cortner
Member

cortner commented Jun 3, 2024

> Looking at pacemaker sounds like a good thing to do.

and don't forget MTP!

Let's get something up and running and then learn from others to see how they improve...

@CheukHinHoJerry
Collaborator Author

I will find time to look into the one-hot initialization in the coming week. That also gives us some time to read through how others work with small non-linear MLIPs.

@CheukHinHoJerry
Collaborator Author

CheukHinHoJerry commented Jun 4, 2024

Hmm, it doesn't seem to be straightforward. One probably has to allow maxn ≠ maxq to proceed, and this probably requires a discussion on the interface. Correct me if I am wrong?

@cortner
Member

cortner commented Jun 4, 2024

I agree it is not entirely obvious. Let's try to find time tomorrow?
