I have read an interesting paper on the limitations of machine learning models: "Scaling Learning Algorithms towards AI". It discusses the limitations of two-layer neural networks and other two-layer models such as SVMs. These shallow models cannot learn some functions without an exponential number of components. For example, to learn the parity function over N input bits, they would need 2^N hidden neurons.
On the other hand, a deep model with N layers could compute parity with only N components.
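To make the contrast concrete, here is a minimal sketch in Python. It is not taken from the paper. It assumes the "deep" model is a chain of XOR components (one per input bit) and that the "shallow" model is a depth-2 OR-of-ANDs that needs one AND term per odd-parity input pattern. The function names `parity_deep`, `shallow_terms` and `parity_shallow` are hypothetical, chosen only for illustration.

```python
from itertools import product

def parity_deep(bits):
    """Deep sketch: fold the bits through a chain of XOR components.

    Returns (parity, number_of_components); the chain uses one
    component per input bit, so the count grows linearly with N.
    """
    acc = 0
    components = 0
    for b in bits:
        acc ^= b          # one XOR component per layer
        components += 1
    return acc, components

def shallow_terms(n):
    """Shallow sketch: list every n-bit pattern with odd parity.

    In a depth-2 OR-of-ANDs (an assumption of this example), each
    such pattern needs its own AND term.
    """
    return [p for p in product((0, 1), repeat=n) if sum(p) % 2 == 1]

def parity_shallow(bits, terms):
    """Output 1 if the input matches any stored odd-parity pattern."""
    return int(any(tuple(bits) == t for t in terms))

if __name__ == "__main__":
    for n in (2, 4, 8):
        terms = shallow_terms(n)
        _, deep_count = parity_deep([0] * n)
        print(f"N={n}: deep components={deep_count}, shallow terms={len(terms)}")
```

In this particular construction the shallow model needs 2^(N-1) terms, which is still exponential in N, while the chained version needs N components. Other shallow architectures may have different constants, so treat the counts as an illustration of the scaling rather than exact bounds.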