Preface
Novelty detection decides whether a new data point is an outlier, whereas outlier detection decides whether a point in the training data is an outlier. In other words, outlier detection looks for the region where the training data is most concentrated.
For outlier detection, we first fit a density to the data. We then define a data point $x$ as an outlier if it lies in a low-density region, that is, if the density function $p$ satisfies $p(x) \le t$ for some threshold $t$.
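The density rule above can be sketched in a few lines. This is only an illustration under assumptions not made in the text: the density is taken to be a one-dimensional Gaussian fitted by maximum likelihood, and the data, the threshold `t=0.05`, and the helper names `fit_gaussian`, `gaussian_pdf`, and `flag_outliers` are made up for the example.

```python
import math

def fit_gaussian(values):
    """Fit a 1-D Gaussian by maximum likelihood (assumed density model)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, var

def gaussian_pdf(v, mean, var):
    """Density of N(mean, var) evaluated at v."""
    return math.exp(-((v - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def flag_outliers(values, t):
    """Return True for every training point whose fitted density is <= t."""
    mean, var = fit_gaussian(values)
    return [gaussian_pdf(v, mean, var) <= t for v in values]

# Toy training data (hypothetical): a tight cluster plus one distant point.
data = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.3, 6.0]
flags = flag_outliers(data, t=0.05)
print(flags)
```

On this toy data only the last point, 6.0, falls below the threshold. Note that the distant point also inflates the fitted variance, which is one reason a more robust density estimate may be preferable in practice.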
We can also find the boundary between inliers and outliers directly. To do this, we look for the smallest ball that contains all data points. The problem then becomes finding a center $c$ and a value $r$, which plays the role of the squared radius because it bounds the squared distance to the center, as below.
$$\min_{r,\,c} \; r \quad \text{s.t.} \quad (x_i - c)^T (x_i - c) \le r, \;\; i = 1, \dots, m, \quad r \ge 0 \qquad \text{(minimum enclosing ball)}$$
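$$L(r, c, \alpha) = r + \sum_{i=1}^{m} \alpha_i \left( (x_i - c)^T (x_i - c) - r \right), \quad \alpha_i \ge 0$$

$$\frac{\partial L}{\partial r} = 1 - \sum_i \alpha_i = 0, \qquad \frac{\partial L}{\partial c} = \sum_i \alpha_i \left( -2 x_i + 2 c \right) = 0$$

This leads to $\sum_i \alpha_i = 1$ and $c = \sum_i \alpha_i x_i$.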
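$$\begin{aligned}
L^* &= r + \sum_i \alpha_i \left( (x_i - c)^T (x_i - c) - r \right) \\
&= r + \sum_i \alpha_i x_i^T x_i - \sum_i \alpha_i x_i^T c - \sum_i \alpha_i c^T x_i + \sum_i \alpha_i c^T c - \sum_i \alpha_i r \\
&= \sum_i \alpha_i x_i^T x_i - \sum_i \sum_j \alpha_i \alpha_j x_i^T x_j
\end{aligned}$$

$$\text{s.t.} \quad \sum_i \alpha_i = 1, \quad \alpha_i \ge 0$$

Now the problem becomes a quadratic problem with simple constraints (the dual problem).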
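As a sanity check on the last simplification, the sketch below compares the final dual expression with the Lagrangian evaluated at $c = \sum_i \alpha_i x_i$ (where the $r$ terms cancel because $\sum_i \alpha_i = 1$). The random points, the seed, and the function names `dual_expression` and `lagrangian_at_optimal_c` are assumptions made for the illustration.

```python
import random

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def dual_expression(points, alpha):
    """sum_i a_i x_i^T x_i - sum_i sum_j a_i a_j x_i^T x_j"""
    linear = sum(a * dot(x, x) for a, x in zip(alpha, points))
    quadratic = sum(ai * aj * dot(xi, xj)
                    for ai, xi in zip(alpha, points)
                    for aj, xj in zip(alpha, points))
    return linear - quadratic

def lagrangian_at_optimal_c(points, alpha):
    """sum_i a_i (x_i - c)^T (x_i - c) with c = sum_i a_i x_i."""
    dim = len(points[0])
    c = [sum(a * x[k] for a, x in zip(alpha, points)) for k in range(dim)]
    total = 0.0
    for a, x in zip(alpha, points):
        diff = [xk - ck for xk, ck in zip(x, c)]
        total += a * dot(diff, diff)
    return total

random.seed(0)
points = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(6)]
weights = [random.random() for _ in range(6)]
alpha = [w / sum(weights) for w in weights]  # nonnegative, sums to 1

gap = abs(dual_expression(points, alpha) - lagrangian_at_optimal_c(points, alpha))
print(gap)
```

The two values agree up to floating-point rounding for any nonnegative `alpha` that sums to one, which is what the derivation claims.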
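$$\max_{\alpha} \; g(\alpha) = b^T \alpha - \alpha^T A \alpha \quad \text{s.t.} \quad \sum_i \alpha_i = 1, \quad \alpha_i \ge 0$$

where $A_{ij} = x_i^T x_j$, $b_i = x_i^T x_i$, and $\alpha = (\alpha_1, \dots, \alpha_m)$.

By complementary slackness, a point $x_i$ with $\alpha_i > 0$ lies exactly on the boundary of the ball, while every point strictly inside the ball has $\alpha_i = 0$. Thus, the solution $\alpha$ is very sparse.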
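To see the sparsity concretely, here is a minimal sketch that solves the dual approximately. It uses a Frank-Wolfe style iteration (for this problem the same update as the Badoiu-Clarkson scheme), which is a choice made for the example and not a method given in the text: the gradient of $g$ is largest at the point farthest from the current center $c = \sum_i \alpha_i x_i$, so each step shifts a little weight onto that point. The function names, the toy points, and the iteration count are all assumptions.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def sq_dist(u, v):
    """Squared Euclidean distance between u and v."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def center_of(points, alpha):
    """c = sum_i alpha_i x_i, from the stationarity condition."""
    dim = len(points[0])
    return [sum(a * x[k] for a, x in zip(alpha, points)) for k in range(dim)]

def solve_meb_dual(points, iterations=2000):
    """Approximately maximize b^T a - a^T A a over the simplex (Frank-Wolfe)."""
    m = len(points)
    alpha = [0.0] * m
    alpha[0] = 1.0  # start at a vertex of the simplex
    for k in range(1, iterations + 1):
        c = center_of(points, alpha)
        # The best simplex vertex for the linearized objective is the
        # point farthest from the current center.
        far = max(range(m), key=lambda i: sq_dist(points[i], c))
        step = 1.0 / (k + 1)
        alpha = [(1.0 - step) * a for a in alpha]
        alpha[far] += step
    c = center_of(points, alpha)
    r = max(sq_dist(x, c) for x in points)  # squared radius covering all points
    return alpha, c, r

def dual_value(points, alpha):
    """g(alpha) = b^T alpha - alpha^T A alpha, using alpha^T A alpha = c^T c."""
    c = center_of(points, alpha)
    return sum(a * dot(x, x) for a, x in zip(alpha, points)) - dot(c, c)

# Four corners of a square plus three interior points (hypothetical data).
points = [[1.0, 1.0], [-1.0, 1.0], [-1.0, -1.0], [1.0, -1.0],
          [0.0, 0.0], [0.5, 0.2], [-0.3, 0.4]]
alpha, c, r = solve_meb_dual(points)
support = [i for i, a in enumerate(alpha) if a > 0.0]
print(support, c, r)
```

For this data the exact ball is centered at the origin with squared radius 2. The interior points never receive any weight, so their $\alpha_i$ stay at zero, and the approximate center and squared radius approach the exact values as the number of iterations grows. A general-purpose QP solver could be used instead; the iteration here was picked only because it needs no external library.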