Preface
Novelty detection decides whether a new, previously unseen data point is an outlier, whereas outlier detection decides whether a point in the training data is an outlier. In other words, outlier detection looks for the region where the training data is most concentrated and treats points far from that region as outliers.
Reference
For outlier detection, we first fit a density to the data. A data point $x$ is then declared an outlier if it lies in a low-density region, that is, if its estimated density $p(x)$ is at most a threshold $t$:

$$p(x) \le t$$
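The density rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the text does not say which density model is fitted, so a 1-D Gaussian kernel density estimate is used here, and the names `gaussian_kde`, `density_outliers`, the bandwidth `0.5`, and the threshold `t=0.2` are all hypothetical choices.

```python
import math

def gaussian_kde(x, data, bandwidth):
    """Gaussian kernel density estimate at x from 1-D samples (assumed model)."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)

def density_outliers(data, bandwidth, t):
    """Return the training points whose estimated density is at most t."""
    return [x for x in data if gaussian_kde(x, data, bandwidth) <= t]

# Five points clustered around 0 and one isolated point at 5.0.
data = [0.0, 0.1, -0.1, 0.2, -0.2, 5.0]
outliers = density_outliers(data, bandwidth=0.5, t=0.2)
print(outliers)  # prints [5.0]
```

The isolated point only receives density from its own kernel, so it falls below the threshold, while the clustered points reinforce each other. In practice both the bandwidth and `t` would need tuning.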
We can also find the boundary between inliers and outliers directly. To do this, we look for the smallest ball that contains all data points. This becomes the problem of finding a center $c$ and a radius, as below. Note that the constraint compares a squared distance with $r$, so $r$ plays the role of the squared radius.
$$\min_{r,\,c}\; r \quad \text{s.t.} \quad (x_i - c)^T (x_i - c) \le r \;\; (i = 1, \dots, m), \quad r \ge 0$$

Minimum enclosing ball
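To make the constraints concrete: for any fixed center $c$, the smallest feasible $r$ is simply the largest squared distance from $c$ to a data point, so the problem amounts to choosing the $c$ that makes this maximum as small as possible. A minimal sketch, where the helper name `smallest_feasible_r` and the square-corner data are hypothetical:

```python
def smallest_feasible_r(points, c):
    """For a fixed center c, the smallest r satisfying every constraint."""
    return max(sum((xj - cj) ** 2 for xj, cj in zip(x, c)) for x in points)

# Corners of a square: the best center is the origin.
corners = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
r_centered = smallest_feasible_r(corners, (0.0, 0.0))  # 2.0
r_shifted = smallest_feasible_r(corners, (1.0, 1.0))   # 8.0
```

Moving the center away from the origin forces a larger $r$ to keep every point inside the ball.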
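The Lagrangian of this problem is

$$L(r, c, \alpha) = r + \sum_{i=1}^{m} \alpha_i \left( (x_i - c)^T (x_i - c) - r \right), \quad \alpha_i \ge 0$$

$$\frac{\partial L}{\partial r} = 1 - \sum_i \alpha_i = 0, \qquad \frac{\partial L}{\partial c} = \sum_i \alpha_i (-2 x_i + 2c) = 0$$

This leads to

$$\sum_i \alpha_i = 1, \qquad c = \sum_i \alpha_i x_i$$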
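Substituting these conditions back into the Lagrangian eliminates $r$ and $c$:

$$
\begin{aligned}
L &= r + \sum_i \alpha_i \left( (x_i - c)^T (x_i - c) - r \right) \\
  &= r + \sum_i \alpha_i x_i^T x_i - \sum_i \alpha_i x_i^T c - \sum_i \alpha_i c^T x_i + \sum_i \alpha_i c^T c - \sum_i \alpha_i r \\
  &= \sum_i \alpha_i x_i^T x_i - \sum_i \sum_j \alpha_i \alpha_j x_i^T x_j
\end{aligned}
$$

subject to $\sum_i \alpha_i = 1$ and $\alpha_i \ge 0$. Now the problem becomes a quadratic program with simple constraints (the dual problem).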
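$$\max_{\alpha}\; g(\alpha) = b^T \alpha - \alpha^T A \alpha \quad \text{s.t.} \quad \sum_i \alpha_i = 1, \quad \alpha_i \ge 0$$

where $A_{ij} = x_i^T x_j$ and $b_i = x_i^T x_i$.

By complementary slackness, $\alpha_i$ can be greater than 0 only for points that lie exactly on the boundary of the ball; every point strictly inside the ball has $\alpha_i = 0$. Since only a few points typically touch the boundary, the solution $\alpha$ is very sparse.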
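The dual above can be solved with any quadratic programming routine. As a rough, dependency-free sketch, the code below uses the Frank-Wolfe method, which is a solver choice made here for illustration and not something the text prescribes. It relies on the fact that the gradient of $g$ is $b_i - 2 x_i^T c$ with $c = \sum_i \alpha_i x_i$, which is largest for the point farthest from the current center. The function names, iteration count, and test data are hypothetical.

```python
def sq_dist(x, c):
    """Squared Euclidean distance between two points."""
    return sum((xj - cj) ** 2 for xj, cj in zip(x, c))

def weighted_center(points, alpha):
    """Center implied by stationarity: c = sum_i alpha_i x_i."""
    dim = len(points[0])
    return [sum(a * x[j] for a, x in zip(alpha, points)) for j in range(dim)]

def min_enclosing_ball_dual(points, n_iter=20000):
    """Approximately maximize b^T a - a^T A a over the simplex (Frank-Wolfe)."""
    m = len(points)
    alpha = [1.0 / m] * m  # feasible start: uniform weights
    for k in range(n_iter):
        c = weighted_center(points, alpha)
        # The gradient b_i - 2 x_i^T c is maximized by the point farthest from c.
        far = max(range(m), key=lambda i: sq_dist(points[i], c))
        step = 2.0 / (k + 2)  # standard diminishing step size
        alpha = [(1.0 - step) * a for a in alpha]
        alpha[far] += step  # move toward that vertex of the simplex
    c = weighted_center(points, alpha)
    r = max(sq_dist(x, c) for x in points)  # squared radius covering all points
    return c, r, alpha

# Four square corners plus two interior points.
points = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0), (0.0, 0.0), (0.5, 0.2)]
c, r, alpha = min_enclosing_ball_dual(points)
print(c, r, alpha)
```

On this data the center approaches the origin and $r$ approaches 2, while the two interior points end up with weight exactly 0, which illustrates the sparsity of $\alpha$. Frank-Wolfe converges slowly, so a dedicated QP solver would be preferable when high accuracy matters.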