How AI infrastructure captures the magic of the spoken word

22 July, 2019
Zacharias Mack
IBM

The U.S. Department of Health and Human Services estimates that 15 percent of the population has dyslexia. In fact, 1 in 5 students is dyslexic. Dyslexic kids may have trouble with reading comprehension, so assistive technology tools like audiobooks can make a big difference for struggling readers.

DeepZen is a London-based company building an end-to-end AI product that provides human-like speech for audiobooks and voiceovers. The company is working on sector-specific solutions that eliminate the need for studios and lengthy recording sessions by generating human-quality, emotive speech that can be edited with intuitive tools.

Users can pick voices from an existing library, modify and tweak them, and then generate speech directly from text. DeepZen has developed an AI solution for creating large, complex neural networks that can switch between multiple deep learning frameworks, combining emotive speech synthesis technology with natural language processing techniques. The generated speech covers the full range of human emotions and can be edited with intuitive tools.

DeepZen designed and implemented an on-premises infrastructure solution in its data center that could handle workloads as data-intensive as deep learning.

At the AI Summit in London, DeepZen CEO Taylan Kamis said: “DeepZen’s final product is on par with the human voice. We are working with Watson Machine Learning Accelerator, [which] works with IBM Power Systems AC922 and gets much faster training times compared to other systems.”

Purpose-built infrastructure fuels faster insights

Power Systems AC922 delivers unprecedented performance for modern HPC, analytics and artificial intelligence. This system has significant advantages over x86: 2 times the data throughput with PCIe Gen 4 and 5.6 times the data throughput to accelerators with NVIDIA NVLink. It is built for the world’s biggest AI challenges.

With this system, companies can deploy data-intensive workloads such as deep learning frameworks and accelerated databases with confidence. The AC922 enables the cutting-edge AI innovation data scientists need with the dependability IT requires. Watson Machine Learning Accelerator (WMLA), formerly PowerAI, makes deep learning and machine learning more accessible. It combines popular deep learning frameworks and efficient AI development tools in an easy-to-use package.

If you want to learn more about the AC922 system and WMLA, visit here. To learn more about DeepZen and IBM Power Systems, please visit here.

The post How AI infrastructure captures the magic of the spoken word appeared first on IBM IT Infrastructure Blog.