Efficient Methods for Dealing with Missing Data in Supervised Learning




In: G. Tesauro, D. Touretzky, and T. Leen, eds., Advances in Neural Information Processing Systems 7, San Mateo, CA, Morgan Kaufmann, 1995.


Volker Tresp, Siemens AG, Central Research, Otto-Hahn-Ring 6, 81730 München, Germany
Ralph Neuneier, Siemens AG, Central Research, Otto-Hahn-Ring 6, 81730 München, Germany
Subutai Ahmad, Interval Research Corporation, 1801-C Page Mill Rd., Palo Alto, CA 94304

Abstract:

We present efficient algorithms for dealing with the problem of missing inputs (incomplete feature vectors) during training and recall. Our approach is based on approximating the input data distribution using Parzen windows. For recall, we obtain closed form solutions for arbitrary feedforward networks. For training, we show how the backpropagation step for an incomplete pattern can be approximated by a weighted average of backpropagation steps. The complexity of the solutions for training and recall is independent of the number of missing features. We verify our theoretical results on one classification and one regression problem.
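To make the recall-time idea concrete, the following is a minimal sketch (not the authors' code) of Parzen-window-based prediction with missing inputs. It assumes isotropic Gaussian kernels centred on the training inputs and uses a stand-in linear map in place of a trained feedforward network; the names predict_with_missing and sigma are invented for the example. The prediction for an incomplete pattern is a kernel-weighted average of the network's outputs on completed inputs, with each missing dimension filled in from a training point and the weights computed on the observed dimensions only.

```python
# Sketch: Parzen-window recall with missing inputs (assumptions noted above).
import numpy as np

rng = np.random.default_rng(0)

# Toy training data: N patterns, d input features.
N, d = 200, 3
X_train = rng.normal(size=(N, d))
w_true = np.array([1.0, -2.0, 0.5])
y_train = X_train @ w_true + 0.1 * rng.normal(size=N)

def network(x):
    """Stand-in for an arbitrary trained feedforward network."""
    return x @ w_true

def predict_with_missing(x, observed, sigma=0.5):
    """Predict for a pattern x whose entries are valid only where observed is True.

    E[f(x) | x_observed] is approximated by a Gaussian-kernel weighted average
    over the training set: each training point supplies candidate values for
    the missing features and is weighted by how well its observed features
    match x.
    """
    obs = np.asarray(observed, dtype=bool)
    # Kernel weights computed on the observed dimensions only.
    diff = X_train[:, obs] - x[obs]
    log_w = -0.5 * np.sum(diff**2, axis=1) / sigma**2
    w = np.exp(log_w - log_w.max())   # numerically stabilised weights
    w /= w.sum()
    # Complete each candidate input: observed values from x, missing from X_train.
    X_filled = np.where(obs, x, X_train)
    return np.sum(w * network(X_filled))

# Usage: second feature missing.
x = np.array([0.3, np.nan, -1.0])
print(predict_with_missing(x, observed=[True, False, True]))
```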




