Convolutional neural networks (CNNs) typically suffer from slow convergence in training, which limits their wider application. This paper presents a new CNN learning approach, based on second-order methods, aimed at improving: (a) convergence rates of existing gradient-based methods, and (b) robustness to the choice of learning hyper-parameters (e.g., learning rate). We derive an efficient back-propagation algorithm for simultaneously computing both gradients and second derivatives of the CNN's learning objective. These are then input to a Long Short-Term Memory (LSTM) network that predicts optimal updates of the CNN parameters in each learning iteration. Meta-learning of the LSTM and learning of the CNN are conducted jointly. Evaluation on image classification demonstrates that our second-order backpropagation achieves faster convergence rates than standard gradient-based learning for the same CNN, and that it converges to better optima, leading to better performance under a budgeted training time. We also show that an LSTM trained to learn a small CNN can be readily used for learning a larger network.
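To make the learned-optimizer setup concrete, the sketch below shows a coordinate-wise LSTM that maps a per-parameter gradient and a diagonal second-derivative estimate to a predicted parameter update. This is an illustrative PyTorch sketch under our own assumptions, not the paper's implementation; the names `LSTMOptimizer` and `optimizer_step`, the hidden size, and the dummy inputs are all hypothetical.

```python
# Illustrative sketch (not the authors' code) of an LSTM optimizer that consumes
# gradients and diagonal second derivatives and predicts per-parameter updates.
import torch
import torch.nn as nn

class LSTMOptimizer(nn.Module):
    """Coordinate-wise LSTM: input = (gradient, second derivative), output = update."""
    def __init__(self, hidden_size=20):
        super().__init__()
        self.lstm = nn.LSTMCell(input_size=2, hidden_size=hidden_size)
        self.out = nn.Linear(hidden_size, 1)  # scalar update per coordinate

    def forward(self, grad, hess_diag, state):
        # Treat each parameter coordinate as a separate batch element.
        x = torch.stack([grad, hess_diag], dim=1)   # (num_params, 2)
        h, c = self.lstm(x, state)                  # coordinate-wise recurrence
        return self.out(h).squeeze(1), (h, c)       # predicted update per coordinate

def optimizer_step(params, grads, hess_diags, meta_opt, state=None):
    """Apply one learned update to a flat parameter vector (illustrative only)."""
    if state is None:
        h = torch.zeros(params.numel(), meta_opt.lstm.hidden_size)
        state = (h, h.clone())
    update, state = meta_opt(grads, hess_diags, state)
    return params + update, state

if __name__ == "__main__":
    torch.manual_seed(0)
    meta_opt = LSTMOptimizer()
    params = torch.randn(10)
    # Dummy signals standing in for the CNN's backpropagated first and second derivatives.
    grads, hess = torch.randn(10), torch.rand(10)
    new_params, state = optimizer_step(params, grads, hess, meta_opt)
    print(new_params.shape)  # torch.Size([10])
```

Because the LSTM operates coordinate-wise, the same learned update rule applies to parameter vectors of any size, which is consistent with reusing an optimizer trained on a small CNN for a larger one.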