What is the expected accuracy of the distribution classifier over the biased training set and the unbiased test set? I can’t seem to get above 75%, and the weights from the distribution classifier are not improving the accuracy of my original classifier. What kind of accuracy is needed to see any improvements? How much improvement are we supposed to be getting?

Sorry for the late reply, but yes, there is an upper bound on the accuracy of your distribution classifier. Suppose X is a picture of a shoe: pictures of shoes appear in both the training and the test distribution, so that image could have come from either one, and the classifier can never perfectly discriminate. In fact, the best it can do is set by the relative proportions of each type of image in the training and test sets: for a given type of image, it can only guess whichever set contains more of that type.
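To make that bound concrete, here is a rough sketch with made-up numbers (not the assignment's data). It assumes two image types, equally sized training and test sets, and a classifier that can identify the image type perfectly. The proportions and the function names `best_domain_accuracy` and `importance_weights` are hypothetical, chosen only for illustration.

```python
def best_domain_accuracy(p_train, p_test):
    """Highest accuracy any train-vs-test classifier can reach when the two
    sets are equally large and x takes the listed discrete values."""
    # For each image type, the optimal rule guesses whichever set
    # contains a larger share of that type.
    return 0.5 * sum(max(p, q) for p, q in zip(p_train, p_test))


def importance_weights(p_train, p_test):
    """Weights p_test(x) / p_train(x), recovered from the optimal classifier's
    output P(test | x) as P(test | x) / P(train | x).
    Assumes every image type has nonzero probability in the training set."""
    weights = []
    for p, q in zip(p_train, p_test):
        prob_test = q / (p + q)  # optimal classifier output, equal set sizes
        weights.append(prob_test / (1.0 - prob_test))
    return weights


# Hypothetical proportions of [shoes, shirts] in each set.
p_train = [0.8, 0.2]  # biased training set
p_test = [0.5, 0.5]   # unbiased test set

print(best_domain_accuracy(p_train, p_test))  # about 0.65
print(importance_weights(p_train, p_test))    # about [0.625, 2.5]
```

In this toy case even a perfect distribution classifier tops out at about 65%, yet the weights it yields are exactly the ratio of test to training proportions. So under these assumptions a modest accuracy is expected, and what matters for reweighting is how well calibrated the predicted probabilities are rather than the accuracy itself.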

You know, this maybe would have been helpful a week ago…

Sorry about that, I didn’t see this one. We’ll be lenient on this problem.