Hidden Markov models (HMMs) are commonly used for disease progression modeling when the true patient health state is not fully known. Since HMMs typically have multiple local optima, incorporating additional patient covariates can improve parameter estimation and predictive performance. To allow for this, we develop hidden Markov recurrent neural networks (HMRNNs), a special case of recurrent neural networks that combine neural networks’ flexibility with HMMs’ interpretability. The HMRNN can be reduced to a standard HMM, with an identical likelihood function and parameter interpretations, but it can also combine an HMM with other predictive neural networks that take patient information as input. The HMRNN estimates all parameters simultaneously via gradient descent. Using a dataset of Alzheimer’s disease patients, we demonstrate how the HMRNN can combine an HMM with other predictive neural networks to improve disease forecasting and to offer a novel clinical interpretation compared with a standard HMM trained via expectation-maximization.