beyond bag of frames: Homogenization of NICM by covariance

I chose to homogenize by covariance since it looks like that's the main anti-hub correlate for this data. The plot below shows the log-determinant of the covariance for each component of each model (that's 32 x 897 components). I'm convinced that the components with super tiny variance are just ones that collapsed in EM and should definitely be removed. So, I picked two thresholds for now: -300 to remove all the super tiny components, and -150 to remove almost everything outside of that massive band around -100.
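To make the thresholding step concrete, here is a minimal sketch in plain Python. It is not the code used for these experiments: the names `log_det` and `homogenize`, the `(weight, mean, covariance)` tuple layout, and the choice to renormalize the surviving weights so they sum to one are all assumptions made for illustration.

```python
import math

def log_det(cov):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(cov)
    chol = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = cov[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("covariance is not positive definite")
                chol[i][i] = math.sqrt(s)
                total += math.log(s)  # s is the squared Cholesky diagonal entry
            else:
                chol[i][j] = s / chol[j][j]
    return total

def homogenize(components, threshold):
    """Drop components whose covariance log-determinant is below threshold.

    components: list of (weight, mean, covariance) tuples (hypothetical layout).
    Renormalizing the remaining weights is an assumption, not something
    stated in the post.
    """
    kept = [c for c in components if log_det(c[2]) >= threshold]
    if not kept:
        raise ValueError("threshold removes every component")
    total_weight = sum(weight for weight, _, _ in kept)
    return [(weight / total_weight, mean, cov) for weight, mean, cov in kept]
```

Calling `homogenize(model, -300)` and `homogenize(model, -150)` on each song model would give the two homogenized versions compared here, under those assumptions.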

Artist R-precision increased for both homogenization thresholds (I'm not attempting another table):
35.85% for -300
38.24% for -150

That compares to 32.27% for the un-homogenized models (different from the last post because of some meta-data clean-up). The differences are significant under the Wilcoxon test (p-values ~ 0).
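For reference, here is a rough sketch of how an artist R-precision figure like the ones above can be computed from a song-to-song distance matrix. It assumes the usual definition (for each query, R is the number of other songs by the same artist, and the score is the fraction of the top R neighbors that share that artist); the function name and the input layout are hypothetical, and the exact evaluation code behind these numbers may differ.

```python
def mean_artist_r_precision(distances, artists):
    """Average artist R-precision over all queries that have a relevant song.

    distances: square list-of-lists, distances[q][i] is the distance from
               query song q to song i (smaller means more similar).
    artists:   artist label for each song, in the same order.
    """
    n = len(artists)
    scores = []
    for q in range(n):
        # R = number of other songs by the query's artist.
        r = sum(1 for i in range(n) if i != q and artists[i] == artists[q])
        if r == 0:
            continue  # no relevant songs, so R-precision is undefined here
        ranked = sorted((i for i in range(n) if i != q),
                        key=lambda i: distances[q][i])
        hits = sum(1 for i in ranked[:r] if artists[i] == artists[q])
        scores.append(hits / r)
    return sum(scores) / len(scores)
```

The per-query scores collected in `scores` are the kind of paired values a Wilcoxon signed-rank test would compare between the homogenized and un-homogenized models.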

Hubness seems to increase, though.
# of hubs (100-occurrences greater than 200):
105 for no homogenization
121 for -300
119 for -150

# of anti-hubs (100-occurrences less than 20):
121 for no homogenization
114 for -300
131 for -150
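The hub and anti-hub counts above follow from each song's 100-occurrence count, i.e. how many other songs have it among their 100 nearest neighbors. A small sketch of that bookkeeping follows; the function names and the distance-matrix layout are assumptions for illustration, with the 200 and 20 cutoffs taken from the definitions above.

```python
def n_occurrences(distances, k=100):
    """Count how often each song appears in other songs' top-k neighbor lists."""
    n = len(distances)
    counts = [0] * n
    for q in range(n):
        ranked = sorted((i for i in range(n) if i != q),
                        key=lambda i: distances[q][i])
        for i in ranked[:k]:
            counts[i] += 1
    return counts

def count_hubs_and_anti_hubs(counts, hub_above=200, anti_hub_below=20):
    """Return (number of hubs, number of anti-hubs) for the given cutoffs."""
    hubs = sum(1 for c in counts if c > hub_above)
    anti_hubs = sum(1 for c in counts if c < anti_hub_below)
    return hubs, anti_hubs
```

Running `count_hubs_and_anti_hubs(n_occurrences(distances))` once per condition (no homogenization, -300, -150) would produce a pair of numbers comparable to each row of the two lists.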

So, I guess we're trading a smooth hub distribution for precision.

I'll look into the other homogenization methods soon.

Tuesday, March 18, 2008
