Along with the popularity of learning machines in fiction, machine learning is growing in research. Some applications are impressive, for instance distinguishing whether a photo was taken in a national park or whether it contains a bird. Many of these applications use deep learning, in which networks of many layers and nodes are created, with thousands or millions of parameters that need to be trained. The vast number of parameters creates a schism between engineers and researchers. Engineers care primarily about performance: how accurate the classification is and how fast it runs. Researchers want to somehow take those parameters and learn from them, but it is difficult to extract knowledge from millions of tuned parameters.
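To make the parameter counts concrete, here is a rough sketch (the layer sizes are my own illustrative assumptions, not from the text) of how quickly parameters accumulate in even a small fully connected network:

```python
# Counting trainable parameters (weights plus biases) in a small
# fully connected network. All layer sizes here are hypothetical.
def dense_param_count(layer_sizes):
    """Each consecutive layer pair contributes n_in * n_out weights
    plus n_out biases."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# A toy image classifier: 32x32 grayscale input, two hidden layers,
# 10 output classes.
sizes = [32 * 32, 512, 256, 10]
print(dense_param_count(sizes))  # 658698 — over half a million already
```

Scaling the input to a modest color photograph and adding convolutional layers pushes real networks well into the millions.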
The classical researcher generates several models reflecting inferred or predicted mechanisms, fits each, and selects the model and set of parameters that has the best combination of accuracy, physical meaning, and predictive power. All things being equal, being parsimonious is a good thing. But if the universe can be fully described by 26 dimensionless constants, why does it take millions of parameters to figure out whether something's a bird? The ultimate purity of physics (or mathematics) doesn't provide answers that are useful in an engineering sense. Too abstract, too far from our reality, such descriptions aren't even formally useful, as it's impossible to fully simulate the universe on a computer that resides in said universe (at least on some scales).
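The fit-several-models-and-prefer-parsimony workflow can be sketched with an information criterion such as AIC, which penalizes each extra parameter. Everything below (the synthetic data, the candidate polynomial degrees) is an illustrative assumption, not the author's method:

```python
import numpy as np

# Synthetic data from a truly linear mechanism plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = 2.0 * x + 0.3 + rng.normal(scale=0.1, size=x.size)

def aic(degree):
    """Akaike information criterion for a polynomial fit of the
    given degree: goodness of fit plus a penalty per parameter."""
    coeffs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coeffs, x)
    n, k = x.size, degree + 1
    # Gaussian log-likelihood up to an additive constant.
    return n * np.log(np.mean(resid ** 2)) + 2 * k

scores = {d: aic(d) for d in (1, 2, 5)}
best = min(scores, key=scores.get)
# With well-behaved noise the linear model (degree 1) typically wins:
# higher degrees shave a little residual but pay 2 points per parameter.
```

The penalty term is the formal version of "all things being equal, be parsimonious."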
We like simple models, but paradoxically we each have on the order of 10^10 neurons, each with at least a couple of parameters, and 10^14 synapses. Is a lifetime of experience alone sufficient to train a model with that many free parameters? No, but there aren't really that many free parameters. Much of our brain is formed whole-cloth by nature and requires relatively little tuning. The visual system has a very clear spatial ordering, and the first couple of steps of processing from the 10^8 photoreceptors in our retina take advantage of this to generate more information-rich, "compressed" visual primitives of various shapes that can be further processed to identify faces, snakes, or defects in metal surfaces. Such subsystems are assuredly the product of nature. These natural systems can then be structured through nurturing to understand decidedly synthetic things like writing.
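The "compressed visual primitive" idea can be sketched as a hand-wired feature detector: a fixed convolution kernel, with no learned parameters at all, that collapses patches of raw pixels into an edge-strength signal. The kernel and the toy image are illustrative assumptions on my part:

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide the kernel over the image (valid region only) and sum
    the elementwise products at each position."""
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A fixed, "nature-given" vertical-edge kernel (Sobel-like).
kernel = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)

# Toy retina: dark left half, bright right half.
image = np.zeros((5, 6))
image[:, 3:] = 1.0

edges = convolve2d_valid(image, kernel)
# Response is zero over uniform regions and peaks only where the
# window straddles the boundary — a compressed "edge here" primitive.
```

Nothing in this detector is trained; like the early retinal circuitry, its structure is fixed, and only the later stages that consume its output need to adapt.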
Jumping tracks slightly: we create systems that innately have some natural capability, and we assemble them into ever-more complex systems. Through us, Darwin operates on machines as we iterate on the most successful. The speed at which this happens is extreme, at least relative to other macroscopic forms of evolution. We've created inexplicable systems (e.g. the Internet) whose workings are beyond the cognitive limits of any individual. Their capabilities are fantastic, but we're beginning to realize that giving them a measure of adaptability lets them do the things we're poor at translating from our sensory experience to theirs (e.g. image processing). As we move forward, we need to evolve the machines to balance nature and nurture.
Hopefully they don't learn to become megalomaniacal.