 # Environment

“For this scheme to work, we need that each data point in the target (test time) distribution had nonzero probability of occurring at training time. If we find a point where q(x)>0 but p(x)=0, then the corresponding importance weight should be infinity.”
Isn’t it the other way around?
q(x)=0 will cause $\beta \to \infty$.
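To make the direction concrete, here is a minimal sketch that computes the weights for a toy discrete case. It assumes the convention beta(x)=p(x)/q(x), with p as the target (test-time) distribution and q as the source (training-time) distribution. The function name `importance_weights` and the toy numbers are hypothetical, chosen only for illustration.

```python
import math

def importance_weights(p, q):
    """Return beta(x) = p(x) / q(x) for discrete distributions given as dicts.

    Assumed convention: p is the target (test-time) distribution,
    q is the source (training-time) distribution.
    """
    weights = {}
    for x, px in p.items():
        qx = q.get(x, 0.0)
        if qx > 0:
            weights[x] = px / qx      # ordinary case, including px == 0 -> weight 0
        elif px > 0:
            weights[x] = math.inf     # occurs at test time but never in training
        else:
            weights[x] = 0.0          # 0/0: x occurs in neither distribution
    return weights

# Hypothetical toy distributions over four points.
p = {"a": 0.5, "b": 0.3, "c": 0.2, "d": 0.0}  # target
q = {"a": 0.4, "b": 0.5, "c": 0.0, "d": 0.1}  # source

beta = importance_weights(p, q)
for x, w in beta.items():
    print(x, w)
```

Point "c" (q=0, p>0) gets an infinite weight, while point "d" (p=0, q>0) simply gets weight zero, which is harmless: training examples that never occur at test time are ignored.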

I think there is a typo in formula 4.9.2, where the denominator should be q as well.

$$\int p(\mathbf{x}) f(\mathbf{x}) \, d\mathbf{x} = \int p(\mathbf{x}) f(\mathbf{x}) \frac{q(\mathbf{x})}{q(\mathbf{x})} \, d\mathbf{x}$$
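For reference, a short sketch of how the q/q factor in the identity above turns into an importance weight. It assumes q(x)>0 wherever p(x)f(x) is nonzero, treats f as a stand-in for whatever quantity is being averaged, and writes beta(x)=p(x)/q(x):

$$
\begin{aligned}
\int p(\mathbf{x}) f(\mathbf{x}) \, d\mathbf{x}
&= \int p(\mathbf{x}) f(\mathbf{x}) \frac{q(\mathbf{x})}{q(\mathbf{x})} \, d\mathbf{x} \\
&= \int q(\mathbf{x}) f(\mathbf{x}) \frac{p(\mathbf{x})}{q(\mathbf{x})} \, d\mathbf{x} \\
&= \mathbb{E}_{\mathbf{x} \sim q}\left[ \beta(\mathbf{x}) f(\mathbf{x}) \right]
\end{aligned}
$$

Only the regrouping in the second step matters: the q in the numerator becomes the sampling distribution, and the q in the denominator ends up under p.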

Hi @Siyang, great catch! Thanks!

At the end of section 4.9.1.5 “Covariate Shift Correction” it is stated that the correction factor is infinity for p(x)=0 and q(x)>0. This conflicts with the definition of beta(x)=p(x)/q(x) (following equation 4.9.2). Should q(x) and p(x) be switched?

Can someone explain "When the distribution of labels shifts over time p(y)≠q(y) but the class-conditional distributions stay the same p(x)=q(x), our importance weights will correspond to the label likelihood ratios q(y)/p(y)."

What is the connection here?
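One way to see the connection, as a sketch: assume that "class-conditional distributions stay the same" means p(x|y)=q(x|y). The quoted p(x)=q(x) reads like shorthand or a typo for that, but this is an assumption here. Under that assumption, the ratio of joint densities collapses to a ratio of label marginals:

$$
\frac{p(\mathbf{x}, y)}{q(\mathbf{x}, y)}
= \frac{p(\mathbf{x} \mid y) \, p(y)}{q(\mathbf{x} \mid y) \, q(y)}
= \frac{p(y)}{q(y)}
$$

With the convention beta=p/q, where p is the test-time distribution, the weight is p(y)/q(y). The quoted q(y)/p(y) is the same ratio with the roles of p and q swapped, so which form applies depends on which symbol denotes the test-time distribution.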

I just found this video, which cleared up the confusion: https://www.youtube.com/watch?v=nAqQF-jU_YM