Suggest a simple random crop augmenter #126
Comments
Thank you for the offer. Could you share more detail about it?
Here is an example implementation. If set to crop by Original: Augmented Text might be 1: Augmented Text might be 2: Of course, this might break the syntactic structure of the text, but it introduces a little noise into a small dataset. In my own use case, classifying a few thousand samples with tf-idf, it brings an improvement.
Thank you for sharing. RandomCharAug's delete action and RandomWordAug's delete action should serve the purpose. For sentence level, I will implement it in a later release.
Just checked
Maybe I didn't express it well in the previous example. In examples 1 and 2 below, the text within square brackets [ ] is the part that gets randomly cropped.
This crop behavior could be added as a fourth action to the
Got what you mean; it is similar to CropAug (for audio). I will include it in a coming release.
A very simple and naive augmenter that just randomly crops part of the original text. It can work at the char, word, or sentence level.
I have found it useful with tf-idf, especially when you have only a very small dataset. I can provide an implementation if you'd like.
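To make the idea concrete, here is a minimal standalone sketch of such a crop. It is not the implementation posted in this thread and does not use any nlpaug API; the function name `random_crop` and its parameters `level`, `crop_ratio`, and `rng` are hypothetical names chosen for illustration. It assumes "crop" means removing one contiguous span of units, and it uses a deliberately naive sentence splitter.

```python
import random
import re


def random_crop(text, level="word", crop_ratio=0.3, rng=None):
    """Remove one contiguous span covering roughly crop_ratio of the units.

    Hypothetical helper for illustration only; not part of nlpaug.
    """
    rng = rng or random
    if level == "char":
        units, joiner = list(text), ""
    elif level == "word":
        # Whitespace tokenization; runs of whitespace collapse to one space.
        units, joiner = text.split(), " "
    elif level == "sentence":
        # Naive splitter: break after '.', '!' or '?' followed by whitespace.
        units, joiner = re.split(r"(?<=[.!?])\s+", text.strip()), " "
    else:
        raise ValueError(f"unknown level: {level!r}")

    if len(units) < 2:
        return text  # too short to crop without emptying the text

    # Crop at least one unit, but always keep at least one.
    span = min(max(1, round(len(units) * crop_ratio)), len(units) - 1)
    start = rng.randrange(len(units) - span + 1)
    return joiner.join(units[:start] + units[start + span:])


if __name__ == "__main__":
    sample = "The quick brown fox jumps over the lazy dog"
    for _ in range(3):
        print(random_crop(sample, level="word", crop_ratio=0.3))
```

Unlike deleting words independently at random, this removes a single contiguous block, so the remaining text keeps its local word order on either side of the gap. Passing a seeded `random.Random` as `rng` makes the augmentation reproducible.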