Biased, incomplete numerical models are often used to forecast the states of complex dynamical systems: an estimate of the “true” initial state is mapped into model phase space, the model makes a forecast, and the result is mapped back to the “true” space. While advances have been made in reducing the errors associated with model initialization and model forecasts, we lack a general framework for discovering optimal mappings between “true” dynamical systems and model phase spaces. Here, we propose a data‐driven approach to infer these maps. Our approach consistently reduces errors in the Lorenz‐96 system when forecasts are made with an imperfect model constructed to produce significant model errors relative to a reference configuration. The optimal pre‐ and post‐processing transforms leverage “shocks” and “drifts” in the imperfect model to make more skillful forecasts of the reference system. The machine learning architecture, built from neural networks with a custom analog‐adjoint layer, makes the approach generalizable across applications.
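The map, forecast, map-back workflow can be illustrated with a minimal NumPy sketch of the standard Lorenz‐96 equations. This is a sketch under stated assumptions, not the architecture used in this work: the forcing values (8.0 for the reference, 10.0 for the imperfect model), the step size, the state dimension, and the function names are all hypothetical choices made for illustration, and the identity functions stand in for the learned pre‐ and post‐processing transforms.

```python
import numpy as np


def l96_tendency(x, forcing):
    """Lorenz-96 tendency: dx_i/dt = (x_{i+1} - x_{i-2}) * x_{i-1} - x_i + F."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing


def rk4_step(x, forcing, dt):
    """Advance the state by one fourth-order Runge-Kutta step."""
    k1 = l96_tendency(x, forcing)
    k2 = l96_tendency(x + 0.5 * dt * k1, forcing)
    k3 = l96_tendency(x + 0.5 * dt * k2, forcing)
    k4 = l96_tendency(x + dt * k3, forcing)
    return x + (dt / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def integrate(x0, forcing, dt, n_steps):
    """Integrate forward n_steps and return the final state."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = rk4_step(x, forcing, dt)
    return x


def forecast_with_maps(x0_true, pre, post, forcing_model, dt, n_steps):
    """Map into model space, forecast with the imperfect model, map back."""
    x0_model = pre(x0_true)
    x_model = integrate(x0_model, forcing_model, dt, n_steps)
    return post(x_model)


def identity(x):
    """Placeholder transform; a learned map would replace this."""
    return x


# Hypothetical setup: the imperfect model differs from the reference
# only in its forcing term (an assumption, not the paper's configuration).
rng = np.random.default_rng(0)
x0 = 8.0 + 0.01 * rng.standard_normal(40)

truth = integrate(x0, forcing=8.0, dt=0.01, n_steps=50)
raw = forecast_with_maps(x0, identity, identity,
                         forcing_model=10.0, dt=0.01, n_steps=50)

# Baseline error that optimized pre- and post-processing maps aim to reduce.
rmse = np.sqrt(np.mean((raw - truth) ** 2))
print(f"RMSE with identity maps: {rmse:.3f}")
```

Replacing `identity` with trained transforms leaves the rest of the pipeline unchanged, which is why the forecast step is written to accept the two maps as arguments.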