Challenges of Accurately Measuring and Using BMI and Other Indicators of Obesity in Children

SUPPLEMENT ARTICLE

CONTRIBUTOR: John H. Himes, PhD, MPH
Division of Epidemiology and Community Health, University of Minnesota, School of Public Health, Minneapolis, Minnesota

KEY WORDS
body mass index, overweight, obesity, child obesity, measurement

ABBREVIATIONS
CDC: Centers for Disease Control and Prevention
IOTF: International Obesity Taskforce
WHO: World Health Organization
CI: confidence interval
BIA: bioelectrical impedance analysis

www.pediatrics.org/cgi/doi/10.1542/peds.2008-3586D
doi:10.1542/peds.2008-3586D

Accepted for publication Apr 29, 2009

Address correspondence to John H. Himes, PhD, MPH, University of Minnesota, School of Public Health, Division of Epidemiology and Community Health, 1300 S 2nd St, Suite 300, Minneapolis, MN 55454. E-mail: himes001@umn.edu

PEDIATRICS (ISSN Numbers: Print, 0031-4005; Online, 1098-4275).
Copyright © 2009 by the American Academy of Pediatrics

FINANCIAL DISCLOSURE: The author has indicated he has no financial relationships relevant to this article to disclose.

abstract
BMI is an important indicator of overweight and obesity in childhood and adolescence. When measurements are taken carefully and compared with appropriate growth charts and recommended cutoffs, BMI provides an excellent indicator of overweight and obesity that is sufficient for most clinical, screening, and surveillance purposes. Accurate measurements of height and weight require that adequate attention be given to data collection and management. Choosing appropriate equipment and measurement protocols and providing regular training and standardization of data collectors are critical aspects that apply to all settings in which BMI will be measured and used. Proxy measures for directly measured BMI, such as self-reports or parental reports of height and weight, are much less preferred and should only be used with caution and cognizance of the limitations, biases, and uncertainties attending these measures. There is little evidence that other measures of body fat such as skinfolds, waist circumference, or bioelectrical impedance are sufficiently practicable or provide appreciable added information to be used in the identification of children and adolescents who are overweight or obese. Consequently, for most clinical, school, or community settings these measures are not recommended for routine practice. These alternative measures of fatness remain important for research and perhaps in some specialized screening situations that include a specific focus on risk factors for cardiovascular or diabetic disease. Pediatrics 2009;124:S3–S22

                                    Downloaded from www.aappublications.org/news by guest on October 10, 2021
                                                                                           PEDIATRICS Volume 124, Supplement 1, September 2009   S3
BMI (weight [kg]/height [m]²) has probably become the most common indicator used to assess overweight and obesity in a wide variety of settings, including clinical, public health, and community-based programs. Although it is certainly not a perfect surrogate for total body fatness and not without its technical limitations,1 BMI has been recommended as the most appropriate single indicator of overweight and obesity in children and adolescents outside of research settings.2–4

One of the attractive features of BMI is that it is derived from measurements of height and weight. These 2 anthropometric dimensions are the ones most commonly collected on children worldwide. These 2 measurements are noninvasive, relatively inexpensive to obtain, and relatively easily understood by health practitioners, the individuals being measured, and their families.

At the mention of child measurements of height and weight, individuals may be reminded of their own marks on the door sills and the bathroom scales of their childhood homes. So, although wide familiarity with height and weight enhances the use and understanding of a measure such as BMI, it also may desensitize health professionals to the need to give adequate attention to issues concerning how height and weight data are collected. Accordingly, one may hear the comment, “Anyone can measure height and weight.” Although one must actually agree with the language, if not the intent, of this easy declaration, many health professionals are unaware that there are consequences for the usefulness and accurate interpretation of BMI data that follow from decisions made concerning data collection.

In this article, challenges surrounding the measurement of BMI in US children (2–18 years of age) and the implications of these issues for the appropriate collection, use, and interpretation of BMI as it is used as an indicator of child and adolescent overweight and obesity are considered. Also, chief measurement issues related to other selected anthropometric indicators of overweight and obesity are briefly discussed.

SOME BASIC CONCEPTS FROM MEASUREMENT THEORY

Classical measurement theory includes some concepts that are helpful for understanding issues surrounding measurement of height, weight, and, therefore, BMI. Detailed explanations of measurement theory are available in standard textbooks concerning measurement and psychometrics.5,6 Different academic disciplines may use different terms to refer to the same concepts, but for the present discussion the terms usually found in the biomedical and epidemiologic literature will be used.

It is important to know that all measurements are imperfect and always measured with some error, whether the measurements be height, weight, skinfolds, or bioelectric impedance. Accordingly, an index such as BMI, which is derived from 2 other measurements, will include the components of measurement error inherent in the constituent height and weight measurements. The nature and magnitude of these measurement errors have some fairly predictable consequences related to the usefulness and interpretation of the measurements.

Some measurement errors are random, with the same probability of being smaller than or greater than the true value (a theoretical value measured without error). Consequently, the average or mean of random errors across a series of measurements is 0. For example, nurse Brown measured heights on a group of 4-year-old girls on Monday, and the mean height was 100 cm. She measured the same children a second time on Tuesday, again with a mean height of 100 cm. Nevertheless, for some girls there were small differences in height measurements between Monday and Tuesday, although the mean height of all girls remained the same for the 2 days. The differences between measured heights on Monday and Tuesday for the individual girls are examples of random errors of measurement.

Random errors of measurement are a concern, because they always add to the variability of the true measurements; their presence and extent are usually considered the measurement’s “reliability.” Poor measurement reliability is a concern because it may cause incorrect clinical judgments for individual children (misclassification) and alter conclusions for statistical analysis for groups of children. Because most inferential statistical tests use a measure of variation (eg, SD) as a denominator, statistical tests of differences between means, analysis of variance, correlations, regressions, and odds ratios are all attenuated (ie, less statistically significant) as the measurement reliability decreases and the variability term in the denominator increases. Random errors are usually reported in terms of a measurement error variance or a measurement error SD, or summarized in reliability coefficients (interclass or intraclass correlations) from replicate measurements of the same children.

In a second example, nurse Brown measured the same group of girls on Monday, again with a mean height of 100 cm. This time on Tuesday nurse Jones measured them for a second time and recorded heights exactly 1.0 cm taller than did nurse Brown for every girl. Now the mean height for all the girls on Tuesday was 101 cm. If we consider nurse Brown to be our gold
standard of measurement, this systematic measurement error (ie, all in 1 direction) of nurse Jones is an example of measurement bias.

Measurement bias is a concern because it may cause misclassification of individual children or groups of children. Nevertheless, as long as the bias is not differential among groups, pure measurement bias will not affect the results of statistical tests between or among groups, such as differences between means, analysis of variance, correlations (interclass), regressions, and odds ratios. In practice, differences between individual observers who measure the same children will also have a component of random measurement error between them. Not surprisingly, observers tend to measure more like themselves than like others, so interobserver errors are almost always larger than intraobserver errors.

Measurement theory usually specifies that measurement errors are independent and additive, that is, that the total measurement error variance is the sum of error variances from all sources.5 Also, when increments or differences between successive measurements are used, the measurement errors attending each of the 2 constituent measurements are included with the increment. So an increment has twice the random measurement error (variance) of an attained value and lower measurement reliability. Obviously, if measurement biases change over time, the accuracy of increments becomes questionable.

CHIEF SOURCES OF MEASUREMENT ERRORS FOR HEIGHT AND WEIGHT

When a child’s height and weight are measured, there are several possible sources of measurement error. A simplified theoretical model would say that the total variance of measurement error is the sum of that associated with the instrument used to measure, that associated with the child being measured, and that associated with the observer(s) doing the measuring. In most settings, however, errors associated with the child and with the observer(s) are the chief sources of measurement error in measurements of height and weight. Obviously, it is still important to have appropriate measuring equipment, but once it is installed and calibrated, little measurement error usually is due to the instruments per se.

Measurement Errors Due to Child Variation

The normal day-to-day variation within a child leads to a component of measurement error. This variation probably results from many sources including hydration, gastrointestinal and urinary bladder contents, diurnal hormonal fluctuations, saltatory growth, fidgeting, alterations in position, and fatigue.7,8

As early as 1724, Wasse recognized appreciable variation in stature during the day and concluded that “[t]he alteration in the human stature . . . proceeds from the yielding of the cartilages between the vertebrae to the weight of the body in an erect posture.”9 MRI studies have since confirmed that the diurnal variation in stature primarily results from increases in water content in the soft central portion of the intervertebral discs (nucleus pulposus) while at rest and water loss while standing or during other weight-bearing activities.10 For children, one can expect a mean height difference of ~1.5 cm (SD: 0.46 cm) between rising and late afternoon,11 with most of the change probably occurring during the first 2 to 3 hours of the day.12

In practice, it is helpful to understand the expected diurnal variation in child height but probably impractical to try to time height measurements to accommodate it, unless one is engaged in a rigorous research protocol that requires serial measurements on a small number of individual children. For the data included in the 2000 Centers for Disease Control and Prevention (CDC) growth charts,13 heights were measured from mornings through evenings so that the reference percentiles represent something like heights averaged throughout the day, and the associated within-child variation is included in the total variance in height captured in the published percentiles or z scores at an age.

For body weight, the within-child variation is related to the size of the child and should usually be within 1.5% of the measured weight (SD: 0.5%).14 Accordingly, the expected maximum within-child weight variation for children who weigh 25 and 50 kg should be ~375 g (0.83 lb) and 750 g (1.65 lb), respectively. In practice, it is difficult to standardize this physiologic within-child weight variation when children are measured, so it is usually ignored for most purposes.

Seasonal Variation

It has been known for a long time that in some environments children may grow differentially according to season of the year.15 It is important, however, to understand the contexts of these findings to determine the implications for current studies of height, weight, and BMI.

In developing countries with prevalent poverty, undernutrition, and infection, seasonal patterns of reduced average growth in height and weight are often linked to the rainy season(s), along with accompanying factors including reduced food availability and increased infection.16,17 In developed countries the evidence is mixed, but when seasonal patterns are present, they usually indicate relatively greater
growth in height and linear dimensions during the spring and summer and relatively greater growth in weight and fatness during the fall and winter.18,19 When seasonal fluctuations exist in developed countries, they are smaller and less common than those seen in children living in developing countries.

In studies in both Japan20 and the United States,21 seasonal fluctuations in growth were observed in earlier generations of children but disappeared within the same populations over 20 to 40 years because general health and nutrition conditions improved through time. Accordingly, for almost all children now living in the United States, there should be little if any seasonal variation in growth that would need to be accounted for in the design of studies or data-collection protocols.

Excess growth in BMI has been observed over summer vacation between kindergarten and first grade for children in the Early Childhood Longitudinal Survey.22 Nevertheless, this should probably be viewed as a school/no-school effect rather than seasonal variation per se.

Measurement Errors Due to Observer Variation

An important goal in measurement of height and weight should always be to collect the data with as little measurement error as possible, given the practical and financial constraints of the local situation.

In a highly controlled research laboratory with experienced anthropometrists, the mean interobserver (absolute) differences for standing height and weight are 0.3 cm and 0.02 kg, respectively, with corresponding SDs of 0.2 cm and 0.03 kg.23 These values should be viewed as close to the minimum values possible using current methods. In most situations there is more concern about observer reliability in measurements of height rather than weight, because height measurements include more “opportunities” for within-child and observer variation than do weight measurements.

Often, height and weight measurements for BMI are collected in clinical or other settings in which data collection may be hurried and observers may not have been trained as rigorously as observers in research settings. Actually, there are few studies available concerning measurement variation among those who probably collect most of the data used for BMI evaluation and screening. Ahmed et al24 evaluated the measurement variation among 2 sets of health visitors who each measured each of 10 children at ages 3 and 4.5 years 3 times with a portable stadiometer. The average value for the SD of measurement was 0.47 cm. In a small comparison trial on height of 5- and 6-year-old British children, school nurses had a pooled interobserver measurement SD of 0.32 cm, which compared favorably to that of a trained auxologist (0.35 cm) on the study.25 The nurses in this study had been trained in measuring height. Importantly, training can improve the precision of length and height measurements.8,26

Given the above-listed principles, it follows that when a large number of data collectors are required the interobserver measurement errors increase as well.27 Consequently, one would prefer to have as few individuals measuring height and weight as is practicable in the particular setting, especially if the resulting data will be used for research purposes or if serial measurements on the same children are being made.

Another strategy for reducing measurement errors is to take the measurements more than once and then use the mean of the replicates. The theory here is that a mean of replicates is a better estimate of the “true” measurement, because the random errors of measurement are reduced.28 The usefulness of taking replicate measurements depends on the reliability of the single measurement in question and how the data will be used.

Routinely obtaining replicates benefits most those measurements that have the lowest initial reliability, and the corresponding improvements in reliability are predictable.28 Measurement-reliability coefficients (R) express the proportion of the total observed variation that is captured by the “true” measurement variation. For single measurements of height and weight in a nonresearch setting, a reasonable expectation for values of R should be ~0.93 and 0.97, respectively. At these levels of measurement reliability, collecting a second measurement and using the mean raises the values of R to 0.963 and 0.984, respectively. These are not dramatic improvements in measurement reliability using a duplicate, because the initial levels of measurement reliability started out rather high.

Contrast these possible improvements when using replicate measurements with those for skinfold thicknesses, for which the measurement reliability for a single measurement in nonresearch settings is probably ~0.8. For successive numbers of replicate skinfold measurements and using the mean, the R values would be 0.88 for 2 measurements, 0.92 for 3 measurements, and 0.94 for 4 measurements.

The errors of measurement with low measurement reliability are usually assumed to be largely random. Consequently, how the data are to be used is a consideration in deciding whether the extra time and trouble should be spent routinely collecting replicate measurements. Purely random errors
will not affect the group means of height, weight, and BMI, although they will increase the SDs because of the added error variance. Similarly, the prevalence of children with a BMI above percentile cutoffs for age and gender will not be affected by the added random error because as many children should be misclassified above and below the cutoff value. If the BMI data are to be used for these purposes, routinely taking replicate measurements is probably not worthwhile.

For some uses of BMI data, however, routinely taking replicate measurements is recommended. If the BMI data will be used to make clinical decisions regarding treatment or referral of individual children, or for assessing changes in individuals over time, a second measurement of height and weight will reduce misclassification of current status and increase the ability to detect changes from one occasion to another. In research settings that include height, weight, and BMI as important variables, duplicate measurements of height and weight are recommended. If the height and weight replicates are averaged before calculating BMI, the latter calculation only needs to occur once.

CHALLENGES OF USING APPROPRIATE REFERENCE DATA AND CUTOFFS

Which Reference Data?

Usually, BMI will be evaluated in children relative to reference data or growth charts. The main challenge to the investigator is to choose the set of growth charts that is most appropriate for the intended purposes for which the BMI data will be used. For height, weight, and BMI, US investigators have the benefit of recent recommendations from an expert committee.4,29

For most purposes, US children aged 2 to 18 years should be evaluated relative to the 2000 CDC growth charts.13 These are high-quality growth charts that present selected percentiles and allow calculation of z scores of attained height, weight, and BMI for age and gender and in metric and English units. The primary data were collected in national surveys by using rigorous measurement protocols, and state-of-the-art statistical methods were used to derive and smooth the percentiles and z scores across the ages. More detailed technical information on methods and development is available elsewhere.13 Earlier sets of BMI reference data for US children (eg, Must et al30) should not be used because the cutoff values are slightly different, which will serve to complicate comparisons across studies.

Some other countries have developed and use their own growth charts, but 2 sets designed for international applications should be briefly mentioned, particularly relative to BMI. The International Obesity Taskforce (IOTF) sponsored a workshop with a goal of establishing a standard definition for child overweight and obesity worldwide.31 As a result, high-quality BMI data from 6 countries (Brazil, Great Britain, Hong Kong, Netherlands, Singapore, and the United States) were combined to develop age- and gender-specific cutoffs for children (birth to 20 years of age) corresponding to the locations of the BMI values of 25 and 30 kg/m² in the statistical distribution of adults.19 These latter BMI cutoffs are the conventional criteria that identify overweight and obesity in adults.32

The IOTF cutoffs that define overweight and obesity correspond approximately to percentiles 82 to 84 and 96 to 97, respectively, on the 2000 CDC growth charts for BMI for age, not very different from the 85th- and 95th-percentile cutoffs used customarily in the United States. Prevalences of overweight and obesity in children in countries outside the United States are now being reported in the literature rather frequently using the IOTF criteria,33,34 which has been useful in standardizing BMI criteria. Nevertheless, it should be noted that the IOTF charts contain no percentile or z-score curves other than the 2 cutoff lines, because they were specifically designed for reporting population prevalences of overweight and obesity. Accordingly, the IOTF charts should not be used to monitor BMI growth in individual children.

In 2006 the Department of Nutrition and Health at the World Health Organization (WHO) released a new growth standard for children from birth to 5 years of age based on longitudinal and cross-sectional data collected in 6 countries (Brazil, Ghana, India, Norway, Oman, and United States).35 The new attained growth curves, including BMI, were designed to represent how all children ought to grow under ideal circumstances. Accordingly, the mothers and children were carefully selected so that there were no known constraints to healthy growth; selection criteria included exclusive breastfeeding and appropriate introduction of solid foods.36 Because of the homogeneous nature of the WHO samples and some choices made to exclude the heaviest children, the upper BMI percentiles and z scores are somewhat restricted (ie, narrower) at an age compared with those in the 2000 CDC growth charts. Consequently, using the same percentile cutoff for BMI at an age (eg, ≥95th), the WHO standards will yield a higher prevalence of children than if the ≥95th percentile for age were used from the 2000 CDC growth charts.37 The opposite is true at the other end of the BMI distribution so that thinness defined by a low BMI percentile on the WHO standards will identify fewer children with low BMI compared with using the same percentile cutoff on the 2000 CDC growth charts.38
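
The reliability figures quoted earlier for replicate measurements (R rising from about 0.93 and 0.97 to 0.963 and 0.984 with a duplicate, and from about 0.8 to 0.88, 0.92, and 0.94 for 2, 3, and 4 skinfold replicates) are consistent with the Spearman-Brown formula for the reliability of a mean of k replicates, R_k = kR / (1 + (k - 1)R). The article does not name this formula, so its use here is an assumption, and the function name is purely illustrative. A minimal Python sketch under that assumption:

```python
def reliability_of_mean(r_single: float, k: int) -> float:
    """Reliability of the mean of k replicate measurements.

    Assumes the Spearman-Brown relationship (not named in the article):
    R_k = k * R / (1 + (k - 1) * R), where R is the reliability of a
    single measurement and errors are random and independent.
    """
    if not 0.0 <= r_single <= 1.0:
        raise ValueError("r_single must lie between 0 and 1")
    if k < 1:
        raise ValueError("k must be at least 1")
    return k * r_single / (1.0 + (k - 1) * r_single)


if __name__ == "__main__":
    # Single-measurement reliabilities quoted in the text for
    # nonresearch settings.
    single = {"height": 0.93, "weight": 0.97, "skinfold": 0.80}
    for name, r in single.items():
        values = [reliability_of_mean(r, k) for k in range(1, 5)]
        formatted = ", ".join(f"k={k}: {v:.3f}" for k, v in enumerate(values, 1))
        print(f"{name:9s} {formatted}")
```

Under this assumption, a duplicate raises a single-measurement reliability of 0.80 to about 0.89 but raises one of 0.97 by less than 0.02, which agrees with the point that replicates help most where initial reliability is lowest. The computed values match the quoted figures to within rounding or truncation in the third decimal place.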
One concern about using these new            As a personal recommendation for                 was smaller) were considered over-
WHO growth standards is the interpre-        health practitioners in the United               weight. Children or adolescents with a
tation in terms of the health or growth      surveillance, and monitoring of BMI              BMI at ≥85th percentile but <95th
of children who are in the extremes of       should be used for routine screening,            percentile were considered at risk of
the percentiles (eg, <5th, >95th) on         because they have been widely evalu-             overweight. At that time, the term
the basis of a standard that purport-        because they have been widely evalu-             “obese” was avoided, because obesity
edly only included healthy children.         ated and adopted, and they have been             was technically defined in terms of
Nevertheless, the WHO standards are          recommended by recent expert com-                body fat per se, and BMI was derived
so new that there are no data docu-          mittees.4,29 If investigators wish to            only from height and weight.
menting whether the new cutoffs are          communicate with international col-              In 2005, the Institute of Medicine (IOM)
better at identifying children at health     leagues in presentations and in the              consciously departed from the termi-
risk than the 2000 CDC growth charts.        scientific literature by citing the IOTF          nology discussed above and elected to
In 2007, the WHO released a growth ref-      or WHO criteria, they should also in-            define children with a BMI at ≥95th
erence for height, weight, and BMI for       clude at least prevalence results rela-          percentile for age and gender as obese
children aged 5 to 19 years that was         tive to the 2000 CDC growth charts so            rather than overweight.44 The IOM re-
designed to align with the 2006 WHO          that their findings can be compared               port expressed the seriousness, ur-
growth standards at 5 years and to           with those of other US studies. Hope-            gency, and medical nature of child-
be used internationally.39 The WHO re-       fully, as further research becomes               hood obesity and deliberately sought
analyzed the data comprising the US          available, more specific recommenda-              to express this concern by using the
National Center for Health Statistics        tions can be made on the basis of                term “obese” to refer to the children
growth curves, published in 1977,40          studies of sensitivity/specificity and            and adolescents with the highest BMI.
and proposed that they be used as a          differential risk among the various              A recent expert committee endorsed
single growth reference for screening,       BMI criteria currently available.                the IOM position and recommended to
surveillance, and monitoring of school-                                                       replace the terms “at risk of over-
aged children worldwide. As with chil-       A Rose by Any Other Name                         weight” and “overweight” with the
dren older than 24 months included in        Before 1994 the scientific literature on          terms “overweight” and “obese,” re-
the new WHO birth to 5 years refer-          overweight and obesity included a                spectively.4,29 Accordingly, the expert
ence,35 BMI values of >2 SDs were ex-        wide range of defining criteria (eg,              committee recommended that individ-
cluded as unhealthy for the 2007 5 to        percent ideal weight, skinfold thick-            uals 2 to 18 years of age with a BMI of
19 years reference.39 Because the            ness, ponderal index, BMI) and many              >30 kg/m2 or ≥95th percentile for
heaviest children were excluded, the         descriptive names to refer to the chil-          age and gender (whichever is smaller)
upper percentiles of BMI for the WHO         dren and adolescents who were con-               should be considered obese. Individu-
2007 reference are substantially be-         sidered the fattest. This variation in re-       als with a BMI at ≥85th percentile but
low the corresponding levels for the         porting made it difficult to compare              <95th percentile or 30 kg/m2 (which-
2000 CDC growth charts, especially in        findings because different indicators             ever is smaller) should be considered
later adolescence when high BMI val-         may actually identify different children         overweight.
ues are more common.                         as the fattest,41 and the differences in         The expert committee believed that the
There has been much informal dis-            terminology were sometimes confus-               terms “overweight” and “obese” better
cussion about the use of the IOTF and        ing. An expert committee considered              convey the seriousness and impor-
WHO references. Unfortunately, there         these issues, and their proceedings,             tance of the obesity epidemic to health
have been no formal recommenda-              published in 1994,2 had considerable             providers, parents, and children and in
tions from agencies or professional          effect toward standardizing the crite-           a less ambiguous manner than the
organizations in the United States re-       ria (BMI for age) and the nomenclature           previous terms, although no specific
garding their routine or partial use         for referring to the fattest children            literature was cited to support this
(eg, at certain ages or for certain pur-     and adolescents. Subsequently, these             view. Because BMI identifies the fattest
poses). This institutional silence is un-    definitions became preferred in de-               individuals with acceptable accuracy,
fortunate, because it will likely lead to    scribing weight status.3,42,43                   especially at the highest levels of
at least ambiguity and perhaps even          In the 1994 report,2 children with a BMI         BMI,45,46 the expert committee believed
confusion among health practitioners         that exceeded 30 kg/m2 or ≥95th per-             that choosing more direct terms that
and in the scientific literature.             centile for age and gender (whichever            may provide additional impetus for


treatment and change was to be pre-                                                                 from approximately the 92nd to the 97th
ferred to parsing technical concepts                                                                percentiles but increase in spread,
that would be unlikely to aid under-                                                                reaching from the 90th to 98th percen-
standing. Finally, the new terminology                                                              tiles at the older ages. The 95% CIs
comports with that from the IOTF BMI                                                                around the 99th BMI percentiles for
criteria for children and adolescents,47                                                            girls include from approximately the
with conventional terminology for                                                                   97th to effectively just less than the
adults,32,48 and with the International                                                             100th percentiles (because no point
Classification of Diseases, 9th Revi-                                                                can exceed percentile 100).
sion, Clinical Modification (ICD-9-CM).                                                              After ~18 years of age, the upper 95%
                                               FIGURE 1
Nomenclature really does matter; it is a       The 85th, 95th, and 99th percentiles for BMI in      confidence limit for the 85th percen-
sine qua non with standardized defi-            girls (straight horizontal lines) and 95% CIs cal-   tiles and the lower 95% confidence
                                               culated from the number of subjects included at      limit for the 95th percentiles are ap-
nitions of health conditions. Standard-        each age in the 2000 CDC growth charts.
ized nomenclature increases precision                                                               proximately coincident, and the upper
in scientific and public communication                                                               limit of the 95th and the lower limit of
and provides improved understanding            As an example, Fig 1 presents the 85th,              the 99th percentiles actually overlap.
in health guidance.                            95th, and 99th percentiles of BMI for                This means, for example, that a 19-
                                               girls as straight lines and the respec-              year-old girl with a BMI identified as
Precision of Percentile Estimates              tive 95% confidence intervals (CIs) cal-              being at the 99th percentile by the
                                               culated by using the method of Wil-                  2000 CDC growth charts (or by com-
Often, health providers and research-                                                               puter programs that calculate the ex-
                                               son49,50 and the unweighted sample
ers use the exact BMI-for-age cutoffs                                                               act percentiles) will probably have a
                                               sizes within age groups (15–20 years)
that define overweight and obesity as                                                                BMI somewhere between the 96th and
                                               used for the 2000 CDC growth charts.13
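The width of such intervals can be approximated with a short calculation. The sketch below is illustrative only: it assumes that the "method of Wilson" refers to the Wilson score interval for a binomial proportion, it treats a percentile as a proportion (the 85th percentile as p = 0.85), and it uses a hypothetical group size of 400. The function name `wilson_interval` is ours and does not come from the cited sources.

```python
import math


def wilson_interval(p, n, z=1.96):
    """Wilson score interval for a proportion p estimated from n subjects.

    z = 1.96 gives an approximate 95% interval.
    """
    if not 0.0 < p < 1.0 or n <= 0:
        raise ValueError("need 0 < p < 1 and n > 0")
    z2 = z * z
    centre = p + z2 / (2 * n)
    half_width = z * math.sqrt(p * (1 - p) / n + z2 / (4 * n * n))
    denom = 1 + z2 / n
    return (centre - half_width) / denom, (centre + half_width) / denom


# Illustrative only: the 85th percentile with an assumed sample of 400.
low, high = wilson_interval(0.85, 400)
print(f"85th percentile, n=400: {100 * low:.1f} to {100 * high:.1f}")
```

With these assumed inputs the interval runs from roughly the 81st to the 88th percentile, and it widens as n decreases, which is the pattern described for the oldest age groups.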
ironclad diagnostic criteria. Although                                                              100th percentiles.
                                               At younger ages the sample sizes
standardized definitions are essential,
                                               range from 400 to 639, and the 95% CIs               The precision of the upper percentile
as discussed above, the actual mea-
                                               are quite stable and similar to those at             cutoffs for BMI can be viewed from sev-
surements on the child will always
                                               15 and 16 years. The sample sizes and                eral different perspectives. First, the
vary somewhat as a result of child and         corresponding 95% CIs for boys are                   samples and CDC growth charts are as
observer factors. In addition, the ac-         similar to those for girls. The 99th per-            they are, and no revisions are antici-
tual percentile cutoffs themselves are         centile of BMI for age was not origi-                pated in the near future. Consequently,
statistical estimates of points that are       nally published with the growth charts               those who use the growth charts should
also subject to errors.                        but has been suggested as a useful                   understand their limitations in inter-
Let us assume that the basic data that         cutoff for identifying children at added             preting findings and not wait for more
were used to construct the 2000 CDC            health risk.51                                       precise estimates. Actually, additional
growth charts13 are truly representa-          On the basis of sample sizes, the 95%                imprecision beyond that related to
tive of the US population of children          CIs around the 85th BMI percentiles in-              sample size probably also occurs at
and adolescents, and that the statisti-        clude values approximately between                   some ages in adolescence because of
cal procedures used to smooth the              the 81st and 88th percentiles until                  differences in maturational status.52
percentile values across age were ap-          ⬃17.5 years, when the sample sizes                   Given the range of CIs surrounding the
propriate and unbiased. There still re-        decrease and the 95% CIs become                      BMI percentiles at all ages, health pro-
mains a degree of uncertainty regard-          wider. These CIs mean that at 20 years               viders and investigators should be a
ing the point estimates of the final            of age (the most extreme case), a girl               little less stringent in defining the ex-
percentile values related to the num-          whose BMI percentile corresponds to                  act location of a child or groups of chil-
ber of children that were included in          the 85th percentile on the CDC chart                 dren in the BMI distribution relative to
the samples within each 6-month age            may actually have a BMI anywhere be-                 the growth charts. Accordingly, BMI
group used to estimate the percen-             tween the 78th and 90th percentiles                  values just below or just above recom-
tiles. Simply put, the larger the sample,      because of the imprecision of the per-               mended cutoffs should be interpreted
the more precise the percentile esti-          centile estimates.                                   as only 1 indicator and not the only di-
mates, especially at the extremes of           For the 95th BMI percentile estimates                agnostic criterion for clinical deci-
the distribution.                              before ~17.5 years, the 95% CI range                 sions. Follow-up visits and repeated as-

sessments on other occasions should          the percentile charts cease to be use-           age, calculating the percentage excess
reduce the uncertainty of the child’s        ful for differentiating their growth sta-        of a BMI value or percentage over-
BMI status.                                  tus. Accordingly, cutoffs of less than           weight beyond a percentile value is in-
The fairly wide confidence limits             −2 z for height for age and weight for           appropriate, because it will have in-
around the percentiles do not invali-        age have become conventional defini-              consistent meaning from age to age.
date the recommended BMI cutoffs for         tions for stunting and wasting,
standardized reporting of population         respectively.53                                  CHALLENGES OF MEASURING
prevalences or for analyses of the           In a similar fashion, for overweight             HEIGHT, WEIGHT, AND BMI
associated risk profiles of groups of         and obesity in children and adoles-              Summary reminders concerning data
children.51 Nevertheless, investigators      cents, z scores can be useful for char-          collection and management are listed
should be cautious drawing inferences        acterizing individuals with a high BMI           in Table 1. The particular setting in
from risk ratios comparing the ob-           that exceeds the percentile levels               which data for BMI assessments will
served and expected prevalences be-          available on the growth charts. For ex-          be collected has implications for how
yond a given BMI cutoff because of the       ample, if the progress of a girl with a          or whether the recommended prac-
imprecision of the percentile cutoffs.       BMI that far exceeds the 97th percen-            tices can be implemented.
                                             tile for age (currently the highest per-
When Should z (SD) Scores                    centile available on the CDC charts) is          Equipment and Space
Be Used?                                     monitored, her attained BMI on the               If possible, height should be measured
A BMI z or SD score is the BMI of a child    growth chart is difficult to evaluate             to the nearest 0.1 cm (1⁄4 in) by using
transformed into a scale comprising          and impossible to meaningfully quan-             a stadiometer mounted on the wall or
the number of SD units it is away from       tify. On the other hand, by converting           a portable stadiometer that allows
the mean of the referent population of       her BMI to a z score, her progress can           the child to be positioned properly
the same age and gender. The 2000            be monitored and changes in subse-               with his or her back against a vertical
CDC growth charts13 were constructed         quent z scores have a direct interpre-           surface. A second choice is models
in such a way to allow calculation of z      tation relative to the referent popula-          that measure the child freely stand-
scores for BMI.                              tion of her age. Because z scores are            ing, but the measurement errors for
There are several advantages to using        calculated relative to age, noting a             these latter instruments tend to be
z scores compared with using the cor-        change in z score is an appropriate              larger than when the measurements
responding percentiles, although they        way to evaluate changes in BMI across            are taken with the child standing
both describe a child’s status relative      ages relative to what is expected in the         against a surface.55 The height mea-
to the same reference data set. The pe-      referent population.                             surements for the 2000 CDC growth
diatric applications of z scores that        An alternative to using z scores to              charts13 were taken by using wall-
are most common are probably in en-          evaluate change in individual children           mounted stadiometers. Many brands
docrinology or nutrition where chil-         with elevated BMI is to just use change          of acceptable stadiometers are avail-
dren who are very small relative to the      in BMI itself. These changes are un-             able, and searching on-line will pro-
growth charts are seen and z scores          derstandable to practitioners, adoles-           vide several good choices. Stadiom-
provide a more useful and manage-            cents, and families, and they allow set-         eters attached to scales that do not
able metric than percentiles to evalu-       ting of goals and monitoring of progress.        allow the child to be positioned cor-
ate and monitor status or treat-             Using z scores is currently the only             rectly are not recommended.
ment.53,54 For example, a 3-year-old boy     appropriate way available to quantify            Weight should be measured by using a
with a height-for-age z score of −3.6        the severity of obesity in children who          good-quality scale to the nearest 100 g
has a height that is 3.6 SDs lower than      have BMI levels that exceed the avail-           (1⁄4 lb). In the past, balance-beam scales
the age- and gender-specific mean for         able percentiles for age and gender.             were routinely recommended because
him on the growth charts; his corre-         Unfortunately, z scores require a com-           the only alternatives were spring scales
sponding height-for-age percentile is        puter program to calculate them                  that were less dependable. Now there
0.013.                                       readily, and the SD-related metric is            are many good electric scales avail-
When a high proportion of children           not familiar to many practitioners. Be-          able that are also quite portable. The
have heights and weights less than the       cause the total variation in BMI (eg, the        more expensive scales have multiple
lowest percentiles (eg, 3rd, 5th), as        distance between the 5th and 95th per-           pressure transducers under the weigh-
found in many developing countries,          centiles) progressively increases with           ing platform, so they are less sensitive


TABLE 1 Data-Collection and Management Practices for Reducing Errors for Height, Weight, and BMI
Equipment and space
  Choose appropriate equipment
  Check and calibrate equipment regularly
  Keep extra batteries for scales
  Provide a private area for child measurements, if possible
Measurement protocols
  Choose a protocol that matches that used in the growth charts
  Have written copies of measurement protocols available for review
  Train and standardize data collectors
  Make sure data are recorded in the appropriate units (eg, kilograms, pounds)
  Make sure data are measured and recorded to the nearest unit specified in the protocol (eg, 0.1 cm for height, 0.1 kg for weight)
  Collect some replicate measurements for assessment of reliability, if feasible
Personnel
  Use as few observers as is feasible to take measurements, especially for research studies
  Identify observers on data-collection forms or data-entry programs
Data management
  Use as exact ages as possible
  Have unique identifiers for children
  Calculate BMI, percentiles, and z scores by using tables or computer programs

                                                                                                          the patient confidentiality sought by in-
                                                                                                          stitutional human subjects committees.

                                                                                                          Measurement Protocols
                                                                                                          Because health providers and others
                                                                                                          who use BMI data will almost always
                                                                                                          compare them to the growth charts, it
                                                                                                          makes sense to strive to collect the
                                                                                                          height and weight measurements that
                                                                                                          comprise BMI by using protocols that
                                                                                                          match those used in the reference
                                                                                                          data as closely as possible. The mea-
                                                                                                          surement procedures used in the col-
                                                                                                          lection of the height, weight, and BMI
                                                                                                          data for the 2000 CDC growth charts13
                                                                                                          are currently available as a download-
                                                                                                          able file at the CDC National Health
                                                                                                          and Nutrition Examination Survey
                                                                                                          (NHANES) Web site (www.cdc.gov/nchs/
                                                                                                          data/nhanes/bm.pdf). These measure-
to variation in the child’s position and              ibrated by using a metal rod of a fixed              ment protocols follow closely those rec-
shifting of weight from one leg to the                length.                                             ommended by a US consensus group.57
other. Again, an Internet-based search                                                                    This publication has become the gold-
                                                      Good electric scales can be calibrated
will yield many good alternatives.                                                                        standard reference in the United States
                                                      or “zeroed.” In most areas of the
In a research setting, obviously, the                                                                     for anthropometry methods related to
                                                      United States, state agencies in de-
best-quality equipment should be                                                                          health issues, although slight differ-
                                                      partments of commerce, standards, or
chosen for maximum consistency over                                                                       ences exist for some measurements
                                                      agriculture have representatives who
time and for reliability among observ-                                                                    customarily used internationally.58
                                                      calibrate and certify scales in grocery
ers taking the measurements. In clini-                                                                    It is important to train data collectors
                                                      stores and in other commercial ven-
cal or community settings, cheaper al-                                                                    in the appropriate methods for mea-
                                                      ues. In some cases, these representa-
ternatives are often used, but given                                                                      suring height and weight. Again, the
                                                      tives can be called on to routinely
the heavy utilization in a busy clinic, for                                                               goal is to use the same measurement
                                                      check and calibrate scales at perma-
example, investing in sturdy anthropo-                                                                    protocols that were used for the deri-
                                                      nent sites. Alternatively, scales can be
metric equipment that can be cali-                                                                        vation of the growth charts. Some-
                                                      calibrated by using weights of known
brated if necessary will prove worth-                                                                     times, experienced clinic staff may
                                                      size. If models of electric scales are
while and increase confidence in the                                                                       take offense because they have been
                                                      used in clinic or in the field, ensuring             measuring height and weight for a
measurements. Cheaper models of
                                                      a supply of batteries of appropri-                  long time. Often, however, “the way we
stadiometers tend to have less-rigid
parts that wobble or bend with fre-                   ate size should be on the checklist for             do it here” includes some bad habits or
quent use.                                            routine equipment maintenance.                      deviations from the prescribed proto-
With repeated use or if equipment is                  Often, in busy clinic or school situa-              cols. Standardizing all data collectors
moved about fairly often, stadiometers                tions, stadiometers and scales are rel-             to a gold-standard trainer ensures
and scales should be checked to deter-                egated to hallways or even reception                that a single protocol is followed and
mine if they are calibrated correctly.                areas. Children and adolescents may                 that departures from the trainer are
It is important to develop a regular                  find it embarrassing to be measured,                 within acceptable limits.59
schedule for calibration (eg, daily in                and even more so to have witnesses to               For extended research protocols or
research, weekly in clinic) and assign                the procedures.56 Having a private or               for ongoing surveillance or clinical
someone to be responsible for these                   partially screened area for the height              activities, having a gold-standard
duties. Depending on the installation,                and weight measurements will in-                    trainer periodically visit and observe
good stadiometers usually can be cal-                 crease child cooperation and enhance                measurements or take some replicate

measurements will help prevent “drift”       first time take the measurements.                 touching required for anthropometric
in the measurement techniques. Also,         Over the course of the month of data             measurements.
these opportunities can be used to cor-      collection, the variation among observ-          As mentioned previously, having as
rect and recertify data collectors, if       ers, schools, and any study drift will           few data collectors as is feasible for
necessary.                                   be captured in the final reliability sam-         other practical demands will minimize
Laminated copies of the measurement          ple, which should include data on ~25            interobserver measurement variation.
protocols on-site provide a readily          children. For complicated protocols              Ensuring that unique observer codes
available reminder for data collectors       that involve many measurements or                are included on the data-collection
concerning child position, measure-          administration of other instruments,             forms or data-entry computer pro-
ment landmarks, and local policies           children may contribute only a repli-            grams will aid in quality-assurance ac-
regarding calibration, clothing, exclusion criteria, data recording, data flow, etc.

In research settings a certain proportion of the measurements should be repeated to evaluate measurement reliability. The proportion required depends on the number of different observers concerned, the number of children usually measured, and the period over which reliability will be assessed. In general, there should be enough replicates to capture the variation among data collectors and study design features and to capture a fairly stable estimate of the mean differences between replicates and the accompanying SD. The SD of differences between replicates is really a measure of variance, and the CIs for a variance begin to stabilize at sample sizes larger than 20 (in our case, 20 pairs of measurements).60

As an example, for a hypothetical study in school-aged children, a 3-person measurement team will visit 4 different schools during a month of data collection. Each school has an average of 30 children, and the team will average ~10 children measured per day. So, each school will require 3 days of data collection, and ~120 children will be measured. If a target of 25 replicates is sought for assessing measurement reliability, that amounts to an ~20% sample. One simple approach is to specify that the data collectors remeasure 2 children per day and that a different data collector from the one who measured the child the [...] replicate for one of the measurements so that the burden on any one child is small and the total of 25 replicates may represent many more individual children. The calculation of the relevant measurement-reliability statistics has been explained elsewhere.27,61

If the measurement protocols specify that duplicate measurements be routinely collected for all subjects (as recommended above), then these replicates can be used for assessing measurement reliability as long as all the different data collectors involved in the study take the replicates. If the mean of replicate measurements will be used in statistical analyses, the measurement reliability should take this into account.61 If different data collectors usually work on different days, then special scheduling may be required to fully capture the interobserver variation in the reliability sample.

Personnel
Experience shows that an advanced formal education is not required to take high-quality anthropometric measurements. Willing adults who will give adequate attention to detail and who meet the requirements for employment are usually satisfactory. Members of the community who are familiar with the local ethos and jargon may be excellent data collectors. In some situations, like-gender observers may make children and adolescents more comfortable with the [...] activities and can even be used in the statistical analyses if consistent observer measurement bias becomes apparent.

Data Management
Having chronological ages as exact as possible is important for the accurate calculation of percentiles and z scores, and it will aid in minimizing age-related variance in statistical analyses when children are grouped according to age. Chronological ages in years expressed to at least 2 decimal points are sufficient for most applications; this will capture exact ages to the nearest 3 days. For children less than 5 or 6 years of age it may be more convenient to express age in months to 1 decimal point, or in exact days.

Actual values for BMI, BMI percentiles, and BMI z scores are best calculated by using computer programs to avoid computational errors. There are many Web sites with BMI calculators that can be found easily by using Internet searches, including those provided by the CDC (http://apps.nccd.cdc.gov/dnpabmi/Calculator.aspx) and National Institutes of Health (www.nhlbisupport.com/bmi/bminojs.htm).

In some settings where immediate patient feedback or charting is conducted, calculating BMI by using tables may be preferred. Again, many Web sites provide such tables; the only caution is that some BMI tables are designed for adults and may not include the low heights and weights observed in children.2
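As an illustration of the data-management points above, the following minimal Python sketch computes a chronological age in decimal years (2 decimal places), an age in months (1 decimal place), and BMI in kg/m². The function names, the 365.25-day average year, and the example child are assumptions made for this sketch; they are not taken from the CDC or National Institutes of Health calculators. The sketch does not compute BMI percentiles or z scores, which require reference data from the growth charts.

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # assumed average year length, including leap years


def decimal_age_years(birth: date, measured: date) -> float:
    """Chronological age in years, rounded to 2 decimal places."""
    return round((measured - birth).days / DAYS_PER_YEAR, 2)


def age_in_months(birth: date, measured: date) -> float:
    """Chronological age in months, rounded to 1 decimal place."""
    return round((measured - birth).days / (DAYS_PER_YEAR / 12), 1)


def bmi(weight_kg: float, height_cm: float) -> float:
    """BMI in kg/m^2 from weight in kilograms and height in centimeters."""
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


if __name__ == "__main__":
    # Hypothetical child: dates and measurements are invented for illustration.
    born, seen = date(2000, 3, 15), date(2009, 9, 10)
    print(decimal_age_years(born, seen))
    print(age_in_months(born, seen))
    print(round(bmi(32.0, 135.0), 1))
```

Working from exact dates rather than reported whole-year ages keeps the rounding step explicit, so the same stored dates can be re-expressed in years, months, or days as an analysis requires.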

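The replicate scheme in the hypothetical school study described earlier in this section can be checked with a short sketch. The first function reproduces the arithmetic of the example (4 schools, 30 children each, about 10 children measured per day, 2 children remeasured per day). The second summarizes paired replicates by the mean difference and the SD of the differences, as discussed in the text, and also reports the technical error of measurement, a statistic commonly used in anthropometry that is included here as an assumption of this sketch rather than as the method of the cited references. All function names and the example heights are hypothetical.

```python
import math
from statistics import mean, stdev


def replicate_plan(schools, children_per_school, children_per_day, remeasured_per_day):
    """Planned number of children, measurement days, and replicates."""
    total_children = schools * children_per_school
    days_per_school = math.ceil(children_per_school / children_per_day)
    total_days = schools * days_per_school
    return {
        "total_children": total_children,
        "total_days": total_days,
        "replicates": total_days * remeasured_per_day,
    }


def replicate_summary(first, second):
    """Mean difference, SD of differences, and technical error of measurement."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need at least 2 matched pairs")
    diffs = [a - b for a, b in zip(first, second)]
    tem = math.sqrt(sum(d * d for d in diffs) / (2 * len(diffs)))
    return mean(diffs), stdev(diffs), tem


if __name__ == "__main__":
    print(replicate_plan(4, 30, 10, 2))
    # Invented heights (cm) from two observers measuring the same children.
    obs_a = [120.1, 131.4, 125.0, 140.2]
    obs_b = [120.3, 131.2, 125.1, 140.0]
    print(replicate_summary(obs_a, obs_b))
```

Under these inputs the plan yields 24 replicates over 12 measurement days, close to the target of 25 in the example. A mean difference that departs consistently from zero between two observers would point to observer bias, whereas the SD of the differences reflects random measurement error.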

TABLE 2 Selected Studies With Interclass Correlation Coefficients for Reported and Measured Height, Weight, and BMI According to Gender
     Source               Group/Location            Age or Grade Level      n, Male/Female       Height r           Weight r              BMI r
                                                                                             Male     Female    Male     Female    Male      Female
Davis and Gergen63   Mexican American/US                  12–19 y              392/437        0.86     0.86      0.95     0.93     0.87       0.85
Himes and Faricy64   All/US                               12–16 y              759/876        0.89     0.79      0.97     0.93     0.93       0.87
Himes and Story65    American Indian/Minnesota            12–19 y               41/28         0.91     0.71      0.96     0.91     0.90       0.80
Hauck et al66        American Indian/US                   12–19 y                536          0.83     0.62      0.95     0.90     0.88       0.79
Brener et al67       20 states/US                      Grades 9–12             957/1075       0.87     0.82      0.92     0.94     0.89       0.89
Himes et al68        Minnesota/US                         12–18 y             1936/1861       0.90     0.80      0.96     0.94     0.89       0.85
Tsigilis69           Trikala, Greece              Middle and high schools      141/159        0.94     0.93      0.97     0.97     0.90       0.94
Median r                                                                                      0.89     0.80      0.96     0.93     0.89       0.85
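The median row of Table 2 can be re-derived directly from the seven study rows. The short Python sketch below does this with the standard library; the dictionary keys are labels chosen for this sketch, and the values are copied from the table.

```python
from statistics import median

# Correlation coefficients from Table 2, one value per study, in table order.
TABLE2 = {
    "height_male":   [0.86, 0.89, 0.91, 0.83, 0.87, 0.90, 0.94],
    "height_female": [0.86, 0.79, 0.71, 0.62, 0.82, 0.80, 0.93],
    "weight_male":   [0.95, 0.97, 0.96, 0.95, 0.92, 0.96, 0.97],
    "weight_female": [0.93, 0.93, 0.91, 0.90, 0.94, 0.94, 0.97],
    "bmi_male":      [0.87, 0.93, 0.90, 0.88, 0.89, 0.89, 0.90],
    "bmi_female":    [0.85, 0.87, 0.80, 0.79, 0.89, 0.85, 0.94],
}

# With 7 studies per column, the median is the 4th value after sorting.
medians = {column: median(values) for column, values in TABLE2.items()}

if __name__ == "__main__":
    for column, value in medians.items():
        print(f"{column}: {value:.2f}")
```

The medians summarize study-level coefficients without weighting by sample size, so a small study counts as much as a large one.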

Exact BMI percentiles and BMI z scores can be calculated by using Epi Info, a free, user-friendly and downloadable computer program developed by the CDC (www.cdc.gov/epiinfo). At the CDC Web site, researchers can download a program for SAS statistical analysis software that generates a data set containing the percentiles and z scores for all the anthropometric measurements (including BMI) in the 2000 CDC growth charts (www.cdc.gov/nccdphp/dnpa/growthcharts/resources/sas.htm).

SELF-REPORTED HEIGHT, WEIGHT, AND BMI
Having older children and adolescents report their height and weight rather than having someone directly measure them is attractive economically and logistically. Costs of direct anthropometric measurements include additional time, personnel, training, and equipment. Logistically, direct measurements require an in-person examination, space, and additional time for participants. If direct measurements of height and weight are required, some study designs and data-collection strategies are summarily inadequate or eliminated (eg, mail surveys, classroom surveys, telephone surveys). Of course, the appropriateness of using self-reports of height, weight, and BMI depends on the reliability, bias, validity, and specific applications of these measures. In some cases, self-reported data may be all that exist, so it is important to understand when and how such data might be used appropriately.62

Measurement Reliability
No published data are available on reliability in self-reported height and weight as narrowly defined previously (ie, the random error associated with the same measurement being repeated). Such data would comprise the same children being asked for their reported height and weight at least twice over a period of time insignificant for growth.

Reliability in self-reports has been evaluated in adolescents, considering reliability as the random errors associated with the differences between self-reported height, weight, and BMI and the corresponding measured dimensions. A good summary measure of this reliability is the Pearson or interclass correlation coefficient. Correlation coefficients between reported and measured height, weight, and BMI are presented in Table 2 for some selected studies that reported the correlations according to gender. Overall, the correlations for reported and measured dimensions are relatively high, indicating that self-reported values are generally reasonable proxies for the corresponding measured values. On the basis of the correlation coefficients, boys generally do a little better than girls, and weight is usually more reliably reported than is height. Because self-reported BMI combines the random errors in both height and weight, self-reported BMI generally has lower correlations with measured BMI than corresponding associations observed for reported and measured height and weight.

The youngest-aged children included in these studies were 11 to 12 years old, and correlations between self-reported and measured dimensions, especially height, are usually lower at these ages than they are later in adolescence.64,70,71

A slightly different concern about young adolescents is that they are often unable or decline to report their heights and weights.62,72 In a study based on US national-level data, 41% of 12-year-olds and 25% of 13-year-olds had missing data for weight.64 These rates compared with 4% missing reported weights in 15- and 16-year-olds. It may be that for youth aged 11 to 13 years their height has not yet become as important to them as it will be as they get older, and they may not have regular opportunities to have their height measured.

Measurement Bias
Although Pearson correlation coefficients are useful indicators of reliability, they only provide average associations, and they only account for random errors between reported and measured values. Pearson correlations are blind to systematic errors or bias. Several different sources of bias

in self-reports of height and weight have been investigated, and they were recently reviewed for studies on US adolescents.62

For our discussion, it is important to recognize that the mean values of self-reported height are usually overestimated by ~1 to 2 cm, and mean self-reported weight is usually underestimated by 2 to 4 kg, especially so in girls.62,72 Thus, with overestimated height and underestimated weight, mean BMI values calculated from the self-reported data are usually less by 2 to 3 BMI units (kg/m²) than if they were measured.

Another source of bias that is important for understanding how self-reported data might be used in evaluating overweight and obesity is related to the body size of the children and adolescents providing the self-reports. The mean differences for self-reported values less measured values for height, weight, and BMI are presented in Fig 2 relative to categories of the measured dimensions for a sample of 3797 Minnesota youth aged 12 to 18 years.68

For height, the errors in self-reporting are largely positive because most of the youth overestimated their heights (mean differences: boys, 1.2 cm; girls, 2.4 cm). Nevertheless, a strong negative relationship between the errors in reporting height and the actual measured heights is evident so that the only group actually underestimating height was the very tallest boys. For self-reported weight and BMI, the errors in self-reports became increasingly negative (indicating underestimates) as categories of measured weight and BMI increased, with steeper slopes in girls than in boys.

FIGURE 2
Mean differences between self-reported and measured body size adjusted for age, socioeconomic status, and race/ethnicity, plotted against categories of the measured dimension: A, height; B, weight; C, BMI.68

This pattern of underestimation means that the greatest impact of the bias in self-reported BMI will be to underestimate prevalences of overweight and obesity defined by the upper percentiles (eg, 85th, 95th). For example, in a separate study of high school students by Brener et al,67 the prevalences for overweight (≥85th percentile) were 47.4% for directly measured BMI and 29.7% for self-reported BMI. Corresponding prevalences for obesity (≥95th percentile) were 26.0% for measured BMI and 14.9% for self-reported BMI. Unfortunately, there is no easy conversion from a prevalence based on self-reported BMI to what it would have been if height and weight were measured.

The biases in self-reports are entangled in idiosyncratic differences among samples in gender, age, underlying distributions of BMI, and perhaps race, mental health, and socioeconomic status.62,68

Measurement Validity
From the evidence for bias discussed above, it is not surprising that considerable misclassification occurs when children and adolescents are identified as overweight or obese on the basis of self-reports and the BMI-percentile criteria. In the Brener et al67 study, the sensitivity and specificity of self-reported BMI for identifying overweight adolescents were 60.5% and 98.0%, respectively. Corresponding values for sensitivity and specificity for identifying obese individuals were 54.9% and 99.2%, respectively. So, as few as 55% (sensitivity) of those who are truly obese will be correctly identified as such when using BMI calculated from self-reported heights and weights. Results from other studies of validity are not much more encouraging.62

The validity of BMI using self-reported data relative to total body fat has not been evaluated. Nevertheless, given the modest validity relative to measured BMI, BMI derived from self-reported data must be even poorer than measured BMI in its ability to correctly identify the fattest individuals on the basis of laboratory methods.

When Is It Appropriate to Use Self-reported BMI?
In some situations, BMI values derived from self-reported data are the only data available (eg, the CDC Youth Risk Behavior Surveillance System,73 which collects data through telephone interviews from a national sample). In other cases, the complexity and size of the survey make direct measurements impractical.74 Nevertheless, any use of
