The material here is intended to supplement the material in our tutorial, not to provide an introduction to semi-supervised learning. If you're looking for such a general introduction, please come to the tutorial itself! If you want a written overview of semi-supervised learning in general, a good place to start is Jerry's semi-supervised learning literature survey.

Each page (links below) corresponds to one section of the tutorial. For every section, we have included several papers for further reading. Some of these papers are mentioned explicitly in the slides; others are simply papers we felt represent good research in a particular area of semi-supervised learning. On this page, we list Castelli and Cover, which we mention in the introduction of the tutorial. We hope you find these supplemental resources useful, and please feel free to contact us.

# Castelli and Cover

**Vittorio Castelli and Thomas Cover. The Relative Value of Labeled and Unlabeled Samples in Pattern Recognition with an Unknown Mixing Parameter. IEEE Transactions on Information Theory 1996.**

Castelli and Cover analyze a binary classification problem from the standpoint of the following generative model: we first choose a label $y \in \{1, 2\}$, with $y = 1$ chosen with probability $\eta$, and then draw a feature vector $x$ from the class-conditional density $p_{y}(x)$. Assuming that we know the form of the conditional densities $p_{1}(x)$ and $p_{2}(x)$, they demonstrate that the classification risk converges to the Bayes risk exponentially fast in the number of labeled samples but only polynomially in the number of unlabeled samples. The analysis provides valuable insight into how unlabeled and labeled examples differ for certain types of assumptions and estimation procedures. The standard setup for semi-supervised learning in NLP usually does not assume that we know the generating distribution, but rather makes restrictive assumptions on the *joint* distribution of labels and instances.
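To make the generative model concrete, here is a minimal sketch of how labeled and unlabeled samples arise in this setting. The 1-D Gaussian class-conditionals and their means are illustrative choices of ours, not from the paper (which only assumes the densities $p_1$ and $p_2$ are known); the mixing parameter $\eta$ is the unknown quantity the learner must estimate.

```python
import numpy as np

# A sketch of the Castelli-Cover generative model, assuming
# hypothetical 1-D Gaussian class-conditionals p_1 and p_2.
rng = np.random.default_rng(0)
eta = 0.3                    # mixing parameter: P(y = 1), unknown to the learner
means = {1: -1.0, 2: 2.0}    # illustrative class-conditional means

def draw(n):
    """Sample n pairs: first y with P(y=1) = eta, then x ~ p_y."""
    y = np.where(rng.random(n) < eta, 1, 2)
    x = rng.normal([means[c] for c in y], 1.0)
    return x, y

x_lab, y_lab = draw(10)      # labeled sample: keep both x and y
x_unlab, _ = draw(1000)      # unlabeled sample: the labels are discarded
```

Unlabeled draws reveal only the mixture $\eta p_1(x) + (1 - \eta) p_2(x)$, which is why they pin down $\eta$ more slowly than labeled draws, which also tie each $x$ to its component.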