This paper addresses the following question: how should we update our beliefs after observing some incomplete data, in order to make credible predictions about new, and possibly incomplete, data? There may be several answers to this question, depending on the model of the process that creates the incompleteness. This paper develops a rigorous modelling framework that makes clear the conditions that justify the different answers. On this basis, it derives a new conditioning rule for predictive inference that can be used in a wide range of states of knowledge about the incompleteness process, including near-ignorance, a case which, surprisingly, does not seem to have received attention so far. This case is particularly important, because modelling incompleteness processes can be highly impractical and because there are limitations to statistical inference with incomplete data: it is generally not possible to learn how incompleteness processes work by using the available data; and it may not be possible, as the paper shows, to measure empirically the quality of the predictions. Yet the predictions depend heavily on the assumptions made.
Keywords. Predictive inference, statistical inference, incomplete data, missing data, conservative inference rule, imprecise probability, conditioning, classification, data mining
Paper Download
The paper is available in the following formats:
Authors' addresses:
Galleria 2
CH-6928 Manno
Switzerland
E-mail addresses:
Marco Zaffalon: zaffalon@idsia.ch
Related Web Sites