Two recent papers have quantified long-term ozone (O₃) changes observed at northern midlatitude sites that are believed to represent baseline conditions (here understood as representative of continental to hemispheric scales). Three chemistry-climate models (NCAR CAM-chem, GFDL-CM3, and GISS-E2-R) have calculated retrospective tropospheric O₃ concentrations as part of the Atmospheric Chemistry and Climate Model Intercomparison Project and the Coupled Model Intercomparison Project Phase 5. We present an approach for quantitative comparisons of model results with measurements for seasonally averaged O₃ concentrations. There is considerable qualitative agreement between the measurements and the models, but there are also substantial and consistent quantitative disagreements. Most notably, the models (1) overestimate absolute O₃ mixing ratios, on average by ~5 to 17 ppbv in the year 2000, (2) capture only ~50% of the O₃ changes observed over the past five to six decades, and little of the observed seasonal differences, and (3) capture ~25 to 45% of the rate of change of the long-term changes. These disagreements are significant enough to indicate that only limited confidence can be placed in estimates of the present-day radiative forcing of tropospheric O₃ derived from modeled historic concentration changes, or in predicted future O₃ concentrations. Evidently our understanding of tropospheric O₃, or the incorporation of chemistry and transport processes into current chemistry-climate models, is incomplete. Modeled O₃ trends approximately parallel estimated trends in anthropogenic emissions of NOx, an important O₃ precursor, while measured O₃ concentrations have increased more rapidly than these emission estimates.