data reduction


[′dad·ə ri‚dək·shən] (computer science) The transformation of raw data into a more useful form. (statistics) The conversion of the information in a data set into fewer dimensions for a particular purpose, for example, into a single measure such as a reliability measure.

Data reduction

The transformation of information, usually empirically or experimentally derived, into a corrected, ordered, and simplified form.

The term data reduction generally refers to operations on digitally represented numerical or alphabetical information, or to operations which yield digital information from empirical observations or instrument readings. In the latter case data reduction also implies conversion from analog to digital form, either by human reading and digital symbolization or by mechanical means. See Analog-to-digital converter; Digital computer.

In applications where the raw data are already digital, data reduction may consist simply of such operations as editing, scaling, coding, sorting, collating, and tabular summarization.
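As an illustration of scaling, sorting, and tabular summarization on data that are already digital, the following is a minimal sketch only. The record layout (a channel label paired with a reading in millivolts) and the function names are hypothetical and chosen purely for the example.

```python
from collections import defaultdict


def scale_and_sort(records, factor=1000.0):
    """Scale each reading (here millivolts to volts) and sort the records."""
    scaled = [(channel, reading / factor) for channel, reading in records]
    return sorted(scaled)


def summarize(records):
    """Tabulate the count and total of the readings for each channel."""
    table = defaultdict(lambda: {"count": 0, "total": 0.0})
    for channel, value in records:
        table[channel]["count"] += 1
        table[channel]["total"] += value
    return dict(table)


# Hypothetical raw records: (channel label, reading in millivolts).
raw = [("B", 1200), ("A", 500), ("B", 800), ("A", 1500)]

ordered = scale_and_sort(raw)
summary = summarize(ordered)

# Print one summary row per channel: label, count, total in volts.
for channel in sorted(summary):
    print(channel, summary[channel]["count"], summary[channel]["total"])
```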

More typically, the data reduction process is applied to readings or measurements involving random errors. These are the indeterminate errors inherent in the process of assigning values to observational quantities. In such cases, before the data can be coded and summarized, the most probable value of a quantity must be determined. Provided the errors are normally distributed, the most probable (or central) value of a set of measurements is given by the arithmetic mean or, in the more general case where the measurements are of unequal precision, by the weighted mean.
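The two estimates can be sketched as follows. The entry does not say how the weights are chosen. This sketch assumes inverse-variance weights, w_i = 1/sigma_i^2, a common choice when each measurement has independent, normally distributed error with a known standard deviation sigma_i. The function names are illustrative.

```python
def arithmetic_mean(values):
    """Mean of measurements that all have the same precision."""
    return sum(values) / len(values)


def weighted_mean(values, sigmas):
    """Weighted mean using assumed inverse-variance weights w_i = 1 / sigma_i**2."""
    if len(values) != len(sigmas):
        raise ValueError("values and sigmas must have the same length")
    weights = [1.0 / sigma ** 2 for sigma in sigmas]
    weighted_sum = sum(w * v for w, v in zip(weights, values))
    return weighted_sum / sum(weights)
```

With equal standard deviations the weighted mean reduces to the arithmetic mean. With unequal ones, the more precise measurements pull the result toward themselves.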

Data reduction may also involve operations of smoothing and interpolation, because the results of observations and measurements are always given as a discrete set of numbers, while the phenomenon being studied may be continuous in nature.
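As a rough sketch of both operations, the code below smooths a discrete series with a simple moving average and estimates a value between two sample points by linear interpolation. These are only two of many possible techniques, picked here as assumptions for illustration. The entry itself does not name a specific method.

```python
def moving_average(values, window=3):
    """Smooth a series by averaging each run of `window` consecutive values.

    Returns len(values) - window + 1 points (no padding at the ends).
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def linear_interpolate(xs, ys, x):
    """Estimate y at x from samples (xs, ys), with xs strictly increasing."""
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the sampled range")
    for i in range(len(xs) - 1):
        x0, x1 = xs[i], xs[i + 1]
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)  # fractional position between the samples
            return ys[i] + t * (ys[i + 1] - ys[i])
```

For example, `moving_average([1, 2, 3, 4, 5])` averages the three overlapping windows of the series, and `linear_interpolate([0, 1, 2], [0, 10, 20], 1.5)` estimates the value midway between the second and third samples.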