Problem: Images are often uni-sensory: they are designed to be seen, not heard, smelled, or touched. Thus the information they share is limited.
Solution: An AI company that designs technologies that give images more context and content. This would mean creating models that take a photo and provide the associated sounds and smells (and perhaps even textures) of that image.
This technology has been dreamed of both in fiction and in reality. The world of Harry Potter offers a great example: photos and pictures are dynamic and move. Apple also dabbled in this field by releasing Live Photos in 2015. The business would work to create images that you can hear, images that you can smell, or images that you can touch, using large datasets drawn from videos and movies.
Research in this direction is being conducted by individuals at MIT, UC Berkeley, and Google Research. As they write in the abstract for their 2016 paper, “Visually Indicated Sounds”:
Objects make distinctive sounds when they are hit or scratched. These sounds reveal aspects of an object’s material properties, as well as the actions that produced them. In this paper, we propose the task of predicting what sound an object makes when struck as a way of studying physical interactions within a visual scene. We present an algorithm that synthesizes sound from silent videos of people hitting and scratching objects with a drumstick. This algorithm uses a recurrent neural network to predict sound features from videos and then produces a waveform from these features with an example-based synthesis procedure. We show that the sounds predicted by our model are realistic enough to fool participants in a “real or fake” psychophysical experiment, and that they convey significant information about material properties and physical interactions.
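To make the "example-based synthesis" step in that abstract more concrete, here is a minimal sketch of the general idea: given sound features predicted for a video, look up the recorded example whose features are closest and reuse its waveform. This is not the authors' implementation. The function name `nearest_example`, the two-number feature vectors, and the toy database are all hypothetical, and the predicted features are hard-coded rather than produced by a neural network.

```python
import math

def nearest_example(predicted, database):
    """Return the database entry whose features are closest to `predicted`.

    Closeness is plain Euclidean distance (an assumption made for
    illustration; a real system could use a different measure).
    """
    return min(database, key=lambda entry: math.dist(entry["features"], predicted))

# Hypothetical toy database: each entry pairs sound features with a
# recorded waveform (here just a few made-up sample values).
database = [
    {"label": "wood", "features": [0.9, 0.1], "waveform": [0.0, 0.5, -0.5, 0.2]},
    {"label": "metal", "features": [0.1, 0.9], "waveform": [0.0, 0.9, -0.9, 0.7]},
]

# Stand-in for the features a trained model would predict from silent video.
predicted = [0.2, 0.8]

match = nearest_example(predicted, database)
sound = match["waveform"]  # the waveform that would be played back
```

With these toy values the predicted features sit closer to the "metal" entry, so its waveform is the one returned. The same lookup idea would apply to a smell or texture library, assuming such features could be predicted at all.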
In short, these researchers have shown that a model can generate sounds for silent videos that seem plausible to human listeners. More on the context of the idea in popular culture below:
In what industry would such a business play? Perhaps in the $678 billion news and media industry or the $11.2 billion photography industry. Either way, this technology would require significant R&D before it becomes profitable. Think of it as Photoshop for sounds and smells: audioshop or olfactory-shop!
Monetization: Selling or licensing this technology to stakeholders in those industries.
Contributed by: Michael Bervell (Billion Dollar Startup Ideas)