Weight initialization in new ACE model #202

CheukHinHoJerry · 2024-06-02T05:51:01Z

Currently, only zero or glorot_normal initialization is allowed. I experimented with that, and naively applying glorot_normal doesn't seem to work well. This issue collects all discussion related to weight initialization.
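For reference, here is a minimal sketch of what the two options amount to for a dense weight matrix, written with plain Julia arrays. The function name `glorot_normal_sketch` and the `(fan_out, fan_in)` layout are assumptions made for illustration, not the package's actual interface. Glorot (Xavier) normal draws entries from a normal distribution with variance `2 / (fan_in + fan_out)`.

```julia
using Random

# Hypothetical helper, for illustration only (not the package API).
# Glorot (Xavier) normal: entries ~ N(0, σ^2) with σ^2 = 2 / (fan_in + fan_out).
function glorot_normal_sketch(rng::AbstractRNG, fan_out::Integer, fan_in::Integer)
    σ = sqrt(2 / (fan_in + fan_out))
    return σ .* randn(rng, fan_out, fan_in)
end

rng = MersenneTwister(0)
W_glorot = glorot_normal_sketch(rng, 20, 10)   # dense random weights
W_zero = zeros(20, 10)                         # the "zero" option
```

Note that the Glorot scale depends only on the matrix shape, so it carries no information about which radial basis functions or elements a channel is meant to represent.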

ref: #196 (comment)

cortner · 2024-06-03T00:38:56Z

I think initialization will be one of the main topics that we need to study carefully and gain experience with. Obvious places to learn from are MTP, qSNAP, pacemaker, and MACE.

CheukHinHoJerry · 2024-06-03T00:44:58Z

Looking at pacemaker sounds like a good thing to do.

cortner · 2024-06-03T00:47:29Z

I think the first step is to implement a one-hot initialization that will make the new models equivalent to ACE1. This is a little tricky since in the new models n stands for the combined (n, z) in ACE1. So I think in this case the number of n channels must be large enough to accommodate both n and z.

What makes this a bit tricky is that in the regime of many or few elements the behaviour should probably be quite different...

For a first attempt my suggestion would be to choose weights such that

R_nl(r_ij, Z_i, Z_j) = P_k(r_ij)

where n <-> (k,z) and the n are assigned through a loop ordering like

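n = 0
for k in 1:maxk        # radial basis index; maxk is a placeholder for the number of P_k
    for z in zlist     # elements
        n += 1         # channel n <-> (k, z)
        # ...
    end
end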
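The loop ordering above can be turned into an explicit one-hot weight array. The following is only a sketch under stated assumptions: the names `channel_map` and `onehot_weights` and the layout `W[n, q, iz]` (channel, radial basis index, neighbour element index) are hypothetical and not taken from the package. It also assumes that channel n is zero for every neighbour element other than its own z, which is one possible reading of the suggestion rather than something stated in it.

```julia
# Hypothetical helper: enumerate channels n <-> (k, z), with k as the outer
# loop and z as the inner loop, matching the ordering suggested above.
function channel_map(maxk::Integer, zlist)
    spec = Tuple{Int, eltype(zlist)}[]
    for k in 1:maxk
        for z in zlist
            push!(spec, (k, z))   # spec[n] == (k, z)
        end
    end
    return spec
end

# Hypothetical helper: one-hot weights W[n, q, iz].
# Channel n selects P_k for neighbour element zlist[iz] and nothing else.
# ASSUMPTION: the channel vanishes for all other neighbour elements.
function onehot_weights(maxk::Integer, zlist)
    spec = channel_map(maxk, zlist)
    W = zeros(length(spec), maxk, length(zlist))
    for (n, (k, z)) in enumerate(spec)
        iz = findfirst(==(z), zlist)
        W[n, k, iz] = 1.0
    end
    return W, spec
end

# Example with 3 radial basis functions and 2 elements.
W, spec = onehot_weights(3, [:Ti, :Al])
```

In this sketch the number of channels is `maxk * length(zlist)`, which makes concrete why the number of n channels has to grow with the number of elements.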

If we wanted to get a little more ambitious then we could assign relative weights to different elements to give higher resolution to some element-pairs than to others.

cortner · 2024-06-03T00:49:06Z

> Looking at pacemaker sounds like a good thing to do.

and don't forget MTP!

Let's get something up and running and then learn from others to see how they improve...

CheukHinHoJerry · 2024-06-03T00:55:26Z

I will find time to look into the one-hot initialization in the coming week. That also gives us some time to read up on how others handle small nonlinear MLIPs.

CheukHinHoJerry · 2024-06-04T06:41:16Z

Hmm, it doesn't seem to be straightforward. One probably has to allow maxn ≠ maxq to proceed, and this probably requires a discussion of the interface. Correct me if I am wrong?

cortner · 2024-06-04T06:55:39Z

I agree it is not entirely obvious. Let's try to find time tomorrow?
