Fine-tuning? #2
Thanks, Adrian, for the kind words.
Indeed, I have thought about fine-tuning. However, there are a few challenges and aspects that make the exercise (in this particular context) not so fruitful.
First and foremost, the weights of the batch normalization layers are not available for this network. Davis King, in one of his forum posts, mentions that he accidentally overwrote the training weights file and lost them. Without those parameters, fine-tuning would be invalid.
When I ported the architecture to Keras I kept the BN layer because I can train from scratch. Note that during inference the BN layer is replaced by a Scale (also called Affine) layer.
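To make that BN-vs-Scale point concrete, here is a minimal numpy-only sketch (not the actual Keras port; all parameter values are made up) showing that batch normalization with fixed inference-time statistics collapses into a single affine transform:

```python
import numpy as np

# At inference, batch norm uses the stored moving statistics:
#   y = gamma * (x - mean) / sqrt(var + eps) + beta
# which is equivalent to one affine (Scale) layer y = a * x + b.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))

gamma, beta = rng.standard_normal(8), rng.standard_normal(8)
mean, var, eps = rng.standard_normal(8), rng.random(8) + 0.5, 1e-3

y_bn = gamma * (x - mean) / np.sqrt(var + eps) + beta

a = gamma / np.sqrt(var + eps)   # folded scale
b = beta - mean * a              # folded shift
y_affine = a * x + b

print(np.allclose(y_bn, y_affine))  # True
```

This folding is why the inference graph can drop BN entirely in favor of a Scale/Affine layer, and also why fine-tuning is stuck without the lost BN parameters.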
Now, for a moment, consider that the weights had not been lost. Even then, fine-tuning this particular network would not reap many benefits.
Here is my humble reasoning for the above argument:
Fine-tuning makes sense if you are dealing with a larger network and retraining takes a lot of time. In this case the network is relatively small, only 29 layers. On one 1080Ti it takes me about 10 hours to train on about 3 million (aligned) images.
Now again, let's take small vs. large networks and training time out of the equation. You may want to adjust (fine-tune) the network for new images (new classes of human faces, in this particular case).
The reason the above approach does not excite me is that, typically, for face recognition you plug an SVM model on top of the face representations. In my experiments, SVM has worked really well and often even compensates for the strength/accuracy of this particular network.
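As a rough sketch of that SVM-on-embeddings pipeline: the random clustered vectors below are a hypothetical stand-in for the 128-d representations the network would produce for aligned faces of a few identities.

```python
import numpy as np
from sklearn.svm import SVC

# Fake "embeddings": 3 identities, 20 samples each, clustered around a
# per-identity center in 128-d space (standing in for the network output).
rng = np.random.default_rng(42)
centers = rng.standard_normal((3, 128))
X = np.vstack([c + 0.05 * rng.standard_normal((20, 128)) for c in centers])
y = np.repeat([0, 1, 2], 20)

# Train an SVM classifier on top of the representations.
clf = SVC(kernel="linear")
clf.fit(X, y)

# Classify a new embedding of identity 1.
query = centers[1] + 0.05 * rng.standard_normal(128)
print(clf.predict([query]))
```

Adding a new identity then only requires retraining the lightweight SVM, not the network itself.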
I hope these arguments make sense. Again, this is based on my understanding and experiments; I am not an authority on this subject, so please feel free to challenge and suggest if you see it differently.
On May 27, 2019, at 7:40 AM, Adrian Rosebrock ***@***.***> wrote:
Really nice job, @ksachdeva! Congratulations on this implementation -- it's really nice!
I was curious about fine-tuning with Keras. For example, let's say we wanted to:
- Take the dlib model definition and trained weights
- Convert them to Keras
- And then use Keras to fine-tune a model on data the original model wasn't originally trained on
Have you experimented at all with that use case?
I'd really love to help with such a project so please do get back to me 😄
For face recognition, yes, you could take the 128-d embeddings from the network and then train a Logistic Regression or SVM on top of those representations. That can and will work for a small number of new face identities to recognize. To make the method more robust, however, one could fine-tune the model on a new dataset of example images. This new dataset would be smaller both in terms of (1) total images and (2) total number of unique individuals. It may also be impossible to train such a network from scratch using that dataset. There is an "in-between" situation where the SVM/LR approach could be too noisy (too many incorrect labelings) while training from scratch would be impossible. In those situations, fine-tuning might be worth exploring (at least in my opinion).
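The fine-tuning described here could be sketched as follows. This is only an illustration with made-up layer names and sizes, and a toy three-layer model standing in for the real 29-layer recognition network: freeze the early layers (generic low-level features) and continue training only the top on the new, smaller dataset.

```python
import tensorflow as tf

# Toy stand-in for the converted recognition network (NOT the real
# architecture): early features -> later features -> 128-d embedding.
inputs = tf.keras.Input(shape=(128,))
h = tf.keras.layers.Dense(64, activation="relu", name="early")(inputs)
h = tf.keras.layers.Dense(64, activation="relu", name="late")(h)
outputs = tf.keras.layers.Dense(128, name="embedding")(h)
base = tf.keras.Model(inputs, outputs)

# Freeze the early layer; only "late" and "embedding" will be updated
# when training resumes on the new dataset.
base.get_layer("early").trainable = False

base.compile(optimizer=tf.keras.optimizers.Adam(1e-4), loss="mse")
print(len(base.trainable_weights))  # kernel + bias for the two top layers
```

A low learning rate is the usual precaution here, so the pretrained features are adjusted gently rather than destroyed.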