MultiSpeak — Billion Dollar Startup Ideas

(We originally posted this in 2020. You can read more of our original ideas in our archive.)

Problem: There is a racial divide in speech recognition systems. Technology from Amazon, Apple, Google, IBM and Microsoft misidentified about 35 percent of words spoken by Black speakers, compared with about 19 percent of words spoken by white speakers (according to a study published in the Proceedings of the National Academy of Sciences).
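For readers wondering how a figure like "35 percent of words" is measured: numbers like this are usually reported as a word error rate, meaning the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the human one, divided by the length of the human transcript. Below is a minimal sketch of that calculation. The function name `word_error_rate` and the sample sentences are our own illustrations, not anything taken from the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative sketch only: real evaluations normally also normalize
    punctuation, numbers, and spelling before comparing transcripts.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word was dropped
                curr[j - 1] + 1,     # insertion: extra word in the hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one wrong word and one dropped word out of four.
human_transcript = "the quick brown fox"
machine_transcript = "the quack brown"
print(word_error_rate(human_transcript, machine_transcript))
```

In this toy example the system gets one word wrong and drops another, so two errors over four reference words gives a rate of one half. A rate of 0.35 means roughly one error for every three words spoken.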

Solution: The root of this disparity is a lack of training data for these systems. Many systems cannot recognize the diverse accents of people from different countries, continents, and backgrounds because their algorithms have not been trained on diverse data. One business that could help with this is a service that collects machine learning and artificial intelligence training data for voice. It would be similar to Spare5 (which later became Mighty AI and was acquired by Uber) or CAPTCHAs (which I talked about on February 9), but focused on audio.

The trick would be creating a platform that people actually want to use, so that they provide training data as a side effect. Perhaps the app (aka the training data collector) lets you warm up your voice with tongue twisters, or maybe it helps you meditate by giving you one passage a day to read aloud from a philosophical book (maybe one by Marcus Aurelius or Albert Camus). Perhaps the business could even build training sets by combining existing media resources (YouTube, Black Twitter, celebrity Instagram lives, rap music and Genius, etc.) into new datasets that cover a wider range of vernacular speech.

Eventually, the goal of the business would be to become the gatekeeper and largest holder of multicultural training data, the kind that makes algorithms more equitable. What a great mission!

Monetization: Selling access to this training data to companies.

Contributed by: Michael Bervell (Billion Dollar Startup Ideas)
