Writings on Medium, music on Spotify, other projects on the Linktree in the URL. ML enthusiast, dancer, wanting to do some web3. Born in Owariasahi, Aichi.
HBOS (Histogram-Based Outlier Score) is a statistical algorithm:
create a histogram for each feature in the data
multiply the inverse heights of the bins that a data point's feature values fall into to get a sense of its density, a similar idea to the Naive Bayes algorithm
often used for fast semi-supervised anomaly detection
the speed comes from ignoring the interdependencies between features. This independence assumption is particularly safe when there are many one-off features that most other data points don't have (a large, sparse feature space).
e.g., HBOS can take 1 minute where k-NN can take more than 23 hours
bins can be created with:
static bins with a fixed bin width
dynamic bins, so that the number of instances per bin is about the same (but bin widths differ). This method is more robust when there are large outlying values.
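The scoring idea above can be sketched in a few lines of numpy. This is a minimal illustration with static fixed-width bins, not the reference implementation; the normalization (tallest bin = 1) follows the HBOS paper, and the log-space sum is just the numerically stable way to multiply inverse bin heights.

```python
import numpy as np

def hbos_scores(X, n_bins=10):
    """Histogram-Based Outlier Score, static fixed-width bins.

    Each feature gets its own histogram; a point's score is the sum of
    negative log bin densities for the bins its values fall into
    (i.e., multiplying inverse bin heights, done in log space).
    """
    n_samples, n_features = X.shape
    scores = np.zeros(n_samples)
    for j in range(n_features):
        counts, edges = np.histogram(X[:, j], bins=n_bins)
        # Normalize so the tallest bin has height 1, as in the HBOS paper.
        heights = counts / counts.max()
        # Find each value's bin (clip so the max value lands in the last bin).
        idx = np.clip(np.digitize(X[:, j], edges) - 1, 0, n_bins - 1)
        # Small floor avoids log(0); occupied bins always have height > 0.
        scores += np.log(1.0 / np.maximum(heights[idx], 1e-12))
    return scores

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[0] = [8.0, 8.0, 8.0]        # inject an obvious outlier
scores = hbos_scores(X)
print(scores.argmax())        # the injected outlier gets the top score
```

Note how the per-feature loop never looks at joint distributions; that is exactly the independence assumption that makes HBOS fast.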
rPCA is a modification of Principal Component Analysis.
The robustness comes from the covariance matrix being computed twice: points flagged as outliers by the first estimate are removed before the second estimate, an optimization similar to CMGOS from my second post.
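The compute-the-covariance-twice idea can be sketched as follows. This is a hypothetical minimal version, not the paper's exact procedure: the `trim` fraction and Mahalanobis-based scoring are illustrative choices (the Mahalanobis distance equals the sum of squared PCA projections weighted by inverse eigenvalues, so it is the natural PCA-flavored score here).

```python
import numpy as np

def robust_pca_scores(X, trim=0.1):
    """Two-pass covariance estimate for outlier scoring.

    Pass 1: covariance on all data -> Mahalanobis distances.
    Pass 2: recompute mean/covariance on the (1 - trim) least extreme
    points, then score every point against the cleaned estimate.
    """
    def mahalanobis_sq(X, mean, cov):
        diff = X - mean
        inv = np.linalg.pinv(cov)
        return np.einsum('ij,jk,ik->i', diff, inv, diff)

    # Pass 1: naive estimate, contaminated by outliers.
    d1 = mahalanobis_sq(X, X.mean(axis=0), np.cov(X, rowvar=False))
    # Drop the most extreme `trim` fraction and re-estimate.
    keep = d1 <= np.quantile(d1, 1 - trim)
    mean2 = X[keep].mean(axis=0)
    cov2 = np.cov(X[keep], rowvar=False)
    # Pass 2: score everyone against the robust estimate.
    return mahalanobis_sq(X, mean2, cov2)

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
X[:5] += 10.0                     # five planted outliers
scores = robust_pca_scores(X)
print(np.argsort(scores)[-5:])    # planted outliers rank highest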
The one-class SVM is often used for semi-supervised learning:
train the SVM on anomaly-free training data
the SVM later separates normal from anomalous instances
but it is inherently an unsupervised algorithm when using a soft margin.
Since SVMs are better explained visually, check out this video for a more intuitive understanding.
in the paper, η, a value that adjusts the estimated normality of each instance, is added to optimize the algorithm.
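The semi-supervised workflow above (train on clean data, then flag anomalies) can be sketched with scikit-learn's `OneClassSVM`; this assumes scikit-learn is installed, and the toy data and `nu` value are illustrative, not from the paper.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(2)
X_train = rng.normal(size=(500, 2))             # anomaly-free training data
X_test = np.vstack([rng.normal(size=(20, 2)),   # 20 normal points...
                    rng.uniform(5, 6, size=(5, 2))])  # ...plus 5 anomalies

# nu upper-bounds the fraction of training points left outside the
# boundary: this soft margin is what lets the method work unsupervised
# on contaminated data as well.
clf = OneClassSVM(kernel='rbf', nu=0.05, gamma='scale').fit(X_train)
pred = clf.predict(X_test)                      # +1 = normal, -1 = anomaly
print(pred[-5:])
```

The five test points far from the training distribution fall outside the learned boundary and come back as -1.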
Below are the comparative tables from the research we've been summarizing, for reference. Hope you enjoyed the read!






