Update wordlists (0.7.20).
finnbear committed Dec 18, 2023
1 parent 8d2d6ff commit 684231c
Showing 11 changed files with 210 additions and 341 deletions.
2 changes: 1 addition & 1 deletion Cargo.toml
@@ -1,7 +1,7 @@
[package]
name = "rustrict"
authors = ["Finn Bear"]
-version = "0.7.19"
+version = "0.7.20"
edition = "2021"
license = "MIT OR Apache-2.0"
repository = "https://github.com/finnbear/rustrict/"
2 changes: 1 addition & 1 deletion README.md
@@ -177,7 +177,7 @@ is used as a dataset. Positive accuracy is the percentage of profanity detected

| Crate | Accuracy | Positive Accuracy | Negative Accuracy | Time |
|-------|----------|-------------------|-------------------|------|
-| [rustrict](https://crates.io/crates/rustrict) | 79.86% | 93.96% | 76.34% | 8s |
+| [rustrict](https://crates.io/crates/rustrict) | 79.85% | 93.99% | 76.32% | 8s |
| [censor](https://crates.io/crates/censor) | 76.16% | 72.76% | 77.01% | 23s |

## Development
5 changes: 4 additions & 1 deletion src/censor.rs
@@ -8,6 +8,7 @@ use crate::{is_whitespace, Replacements, Type};
use std::iter::Filter;
use std::mem;
use std::ops::Deref;
+use std::ops::RangeInclusive;
use std::str::Chars;
use unicode_normalization::{Decompositions, Recompositions, UnicodeNormalization};

@@ -424,8 +425,10 @@ impl<I: Iterator<Item = char>> Iterator for Censor<I> {
raw_c, skippable, replacement
);

+const BLOCK_ELEMENTS : RangeInclusive<char> = '\u{2580}'..='\u{259F}';
+
if (!self.inline.separate || self.inline.last == Some(self.options.censor_replacement))
-&& raw_c == self.options.censor_replacement
+&& (raw_c == self.options.censor_replacement || BLOCK_ELEMENTS.contains(&raw_c))
{
// Censor replacement found but not beginning of word.
self.inline.self_censoring = self.inline.self_censoring.saturating_add(1);
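
The changed condition appears to count any character in the Unicode Block Elements range (U+2580 to U+259F, which includes `█`) as if it were the configured censor replacement. The following is a minimal standalone sketch of that range check only, not the crate's implementation; the helper name `looks_like_censor_char` and its parameters are hypothetical and do not exist in rustrict.

```rust
use std::ops::RangeInclusive;

/// Unicode "Block Elements" block, U+2580..=U+259F (for example '█', U+2588).
const BLOCK_ELEMENTS: RangeInclusive<char> = '\u{2580}'..='\u{259F}';

/// Hypothetical helper (not part of rustrict): returns true when `c` is the
/// configured censor replacement or any character in the Block Elements range.
fn looks_like_censor_char(c: char, censor_replacement: char) -> bool {
    c == censor_replacement || BLOCK_ELEMENTS.contains(&c)
}
```

With `'*'` as the replacement, this sketch returns true for `'*'` and for `'█'`, and false for an ordinary letter such as `'a'`.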
8 changes: 8 additions & 0 deletions src/dictionary_blacklist.txt
@@ -175,6 +175,9 @@ commies
condoms
coons
copulates
+cotton picker
+cotton farm
+cotton farmer
cowards
crackers
crapper(.*)
@@ -324,6 +327,7 @@ gyppo
h
hand job
handjobs
+hang yourself
hater
haters
hates
@@ -339,6 +343,9 @@ hijackers
hoars
hoes
holocausts
+homophile
+homophobia
+homophobic
homos
homosexual
honkeys
@@ -450,6 +457,7 @@ menstruates
menstruations
mi ger
micropenis
+mike hawk
milfs
missionary position
mo ron
3 changes: 3 additions & 0 deletions src/dictionary_extra.txt
@@ -71,6 +71,7 @@ enola gay
fatty acid
fatty food
few secs
+ffa game
fire cracker
fire crackers
francoitalian
@@ -171,6 +172,8 @@ pp. 6
pp. 7
pp. 8
pp. 9
+pussinboots
+puss in boots
ref'd
rip
saturated fat