Combining Natural and Artificial Neural Nets for Better Search
We know natural neural nets better by another name: animal brains, including human ones. Artificial neural nets are artificial intelligences built on the same basic principles as their natural counterparts, but much simpler and more focussed. Google, a pioneer of artificial intelligence designed for search, has posed the question of whether artificial intelligences can be taught to automatically label every song on the Internet.
In principle the answer is yes, since artificial neural nets are fundamentally capable of being every bit as powerful as their natural counterparts. However, the ones we have at the moment are nowhere close, hampered by our incomplete understanding of how natural brains function. So what if we were to use both natural and artificial neural nets in a single combined system? If the task is turned into something of a game, will the human participants enjoy themselves while they provide useful data and train the artificial net in what to expect?
That was the question posed in response by engineers at the University of California, San Diego. They set up an Internet-based system in which game-powered machine learning would enable music lovers to search every song on the web, well beyond the popular hits, with a simple text search using key words like "funky" or "spooky electronica".
The researchers, led by Gert Lanckriet, a professor of electrical engineering at the UC San Diego Jacobs School of Engineering, hope to create a text-based multimedia search engine that will make it far easier to access the explosion of multimedia content online. An automated approach is needed because humans working round the clock to label songs with descriptive text could never keep up with the volume of content being uploaded to the Internet. For example, YouTube users upload 60 hours of video content per minute. Only an artificial neural network dedicated solely to the task could even hope to keep up with this sheer volume of information.
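To put the upload figure in perspective, here is a quick back-of-the-envelope calculation. The 60 hours per minute comes from the article; the eight-hour listening shift is purely an assumption made for illustration.

```python
# Figure quoted in the article: hours of video uploaded per minute.
UPLOAD_HOURS_PER_MINUTE = 60

# Assumption for illustration only: one person listens for 8 hours a day.
LISTENING_HOURS_PER_PERSON_PER_DAY = 8

MINUTES_PER_DAY = 24 * 60

# Total hours of new content arriving every day.
uploaded_hours_per_day = UPLOAD_HOURS_PER_MINUTE * MINUTES_PER_DAY

# People needed just to play each upload once, with no time left to label it.
people_needed = uploaded_hours_per_day // LISTENING_HOURS_PER_PERSON_PER_DAY

print(uploaded_hours_per_day)  # 86400
print(people_needed)           # 10800
```

Under that assumption, more than ten thousand people would be needed every day merely to play through the new uploads once.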
In Lanckriet's solution, computers study examples of music that have been provided by music fans and labelled in categories such as "romantic", "jazz", "saxophone" or "happy". The computer then analyses the waveforms of recorded songs in these categories, looking for acoustic patterns common to each. It can then automatically label millions of songs by recognizing these patterns. Training computers in this way is referred to as machine learning. "Game-powered" refers to the millions of people already online whom Lanckriet's team is enticing to provide the sets of examples, by labelling music through a Facebook-based online game called Herd It.
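The learn-from-examples idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' method: it assumes each song has already been reduced to a short list of numbers (hypothetical "tempo" and "brightness" scores here), and it uses a simple nearest-centroid rule in place of whatever model the team actually trains on real waveforms.

```python
from collections import defaultdict
from math import dist


def train_centroids(examples):
    """Average the feature vectors of the human-labelled songs for each tag."""
    grouped = defaultdict(list)
    for features, tag in examples:
        grouped[tag].append(features)
    return {
        tag: tuple(sum(column) / len(vectors) for column in zip(*vectors))
        for tag, vectors in grouped.items()
    }


def auto_tag(features, centroids):
    """Label an unseen song with the tag whose average pattern is closest."""
    return min(centroids, key=lambda tag: dist(features, centroids[tag]))


# Hypothetical features per song: (tempo score, brightness score).
labelled_by_players = [
    ((0.9, 0.8), "happy"),
    ((0.8, 0.9), "happy"),
    ((0.2, 0.3), "romantic"),
    ((0.3, 0.1), "romantic"),
]

centroids = train_centroids(labelled_by_players)
new_song_tag = auto_tag((0.85, 0.7), centroids)
print(new_song_tag)  # happy
```

A handful of labelled examples is enough to tag any number of new songs, which is where the scalability comes from; a real system would extract far richer acoustic features.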
Another significant finding in the researchers' paper is that the machine can use what it has learned to design new games that elicit the most effective training data from the humans in the loop. "The question is, if you have only extracted a little bit of knowledge from people and you only have a rudimentary machine learning system, can the computer use that rudimentary version to determine the most effective next questions to ask the people?" said Lanckriet. "It's like a baby. You teach it a little bit and the baby comes back and asks more questions." For example, the machine may be great at recognizing the patterns in rock music but struggle with jazz. In that case, it might ask for more examples of jazz music to study.
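The rock-versus-jazz example suggests a simple way for the machine to choose its next question: check how well it does on each tag and ask about the weakest one. The sketch below assumes a small set of songs whose true tags are already known; the function names are hypothetical, and the selection rule is an illustration rather than the one Herd It uses.

```python
def accuracy_per_tag(checked):
    """checked: list of (true_tag, predicted_tag) pairs for songs with known tags."""
    totals, correct = {}, {}
    for true_tag, predicted_tag in checked:
        totals[true_tag] = totals.get(true_tag, 0) + 1
        correct[true_tag] = correct.get(true_tag, 0) + (true_tag == predicted_tag)
    return {tag: correct[tag] / totals[tag] for tag in totals}


def next_tag_to_ask_about(checked):
    """Pick the tag the machine currently gets wrong most often."""
    scores = accuracy_per_tag(checked)
    return min(scores, key=scores.get)


# Toy results: the machine is good at rock but struggles with jazz.
checked = [
    ("rock", "rock"),
    ("rock", "rock"),
    ("rock", "rock"),
    ("jazz", "rock"),
    ("jazz", "jazz"),
    ("jazz", "blues"),
]

weakest = next_tag_to_ask_about(checked)
print(weakest)  # jazz
```

The next round of games would then concentrate on jazz, so that the players' effort goes where the machine needs it most.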
It's the active feedback loop, combining human knowledge about music with the scalability of automated music tagging through machine learning, that makes a "Google for music" a real possibility. Although human knowledge about music is essential to the process, Lanckriet's solution requires relatively little human effort to achieve success. Essentially the only real human participation is via casual gaming software, with the AI picking up the results and aggregating the data received, so that it can weed out malicious attempts by "trolls" or "griefers" to lead it astray. "Through the active feedback loop, the computer automatically creates new Herd It games to collect the specific human input it needs to most effectively improve the auto-tagging algorithms," said Lanckriet. The game goes well beyond the two primary methods of categorizing music used today: paying experts in music theory to analyse songs, the method used by Internet radio sites like Pandora, and collaborative filtering, which online book and music sellers now use to recommend products by comparing a buyer's past purchases with those of people who made similar choices.
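The article does not say how the players' answers are aggregated, so the following is only one plausible sketch: accept a tag for a song when enough players have voted and a clear majority agree. The thresholds `min_votes` and `min_agreement` are hypothetical values chosen for illustration.

```python
from collections import Counter


def aggregate_labels(votes, min_votes=3, min_agreement=0.6):
    """Return the majority tag for one song, or None if the crowd is unsure.

    votes: list of tags submitted by different players for the same song.
    """
    if len(votes) < min_votes:
        return None  # too few players to trust any answer yet
    tag, count = Counter(votes).most_common(1)[0]
    return tag if count / len(votes) >= min_agreement else None


# One player tries to mislead the system; the other four outvote them.
print(aggregate_labels(["jazz", "jazz", "jazz", "polka", "jazz"]))  # jazz

# Too few votes, or no clear majority: the song stays unlabelled for now.
print(aggregate_labels(["jazz", "rock"]))                 # None
print(aggregate_labels(["jazz", "rock", "pop", "funk"]))  # None
```

With a rule like this, a single malicious answer cannot change a song's label unless most of the other players happen to agree with it.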
Both methods are effective up to a point. But paid music experts are expensive and can't possibly keep up with the vast expanse of music available online. Pandora has just 900,000 songs in its catalogue after 12 years in operation. Meanwhile, collaborative filtering only really works with books and music that are already popular and selling well. The game-powered system, by contrast, can in principle work with any song, and has next to no operating costs. At the end of the day, the highly trained AIs can be duplicated and used in multiple operations and, like humans, they need never stop learning.
There is no reason to believe a trained AI of this type would be limited to music alone, and every reason to believe that, if the trial is successful, such AIs can be trained to index every existing media type, and those not yet invented.