Latest from IBM Developer: A Python Flask audio search application

Note: This code pattern uses Watson Discovery V1 and will not work with Discovery V2. However, you can still use it to learn the Discovery features. Future plans include updating the code pattern to work with Discovery V2.

Summary

This code pattern explains how to create an application that you can use to search for a topic within video and audio files.

Description

While listening to a podcast or to video or audio files of courses, you often want to jump directly to the topic rather than listening to extraneous information. However, finding the topics and keywords in the entire recording can be challenging.

In this code pattern, you create an application that searches within video or audio files. The app not only finds matches but also highlights the text where the search string or topic occurs. It performs a natural language query search on the audio files and returns the results with the time frame in which your search topic is discussed. This example uses an IBM® Watson Machine Learning introduction video to illustrate the process.
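As a rough illustration of searching transcript chunks and reporting a time frame with highlighted text, here is a minimal sketch. It is not the code pattern's implementation: plain case-insensitive substring matching stands in for the natural language query that Watson Discovery performs, and the `TextChunk` structure, the `search_chunks` and `format_time` helpers, the `<mark>` highlight tag, and the sample sentences are all hypothetical names and data chosen for this example.

```python
import re
from dataclasses import dataclass


@dataclass
class TextChunk:
    """Transcript text for one audio chunk (hypothetical structure)."""
    start_s: int  # chunk start, in seconds from the beginning of the recording
    end_s: int    # chunk end, in seconds from the beginning of the recording
    text: str


def format_time(seconds: int) -> str:
    """Render a second offset as MM:SS."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes:02d}:{secs:02d}"


def search_chunks(chunks, query):
    """Return (timeframe, highlighted_text) for each chunk containing the query.

    Assumption: substring matching is a stand-in for the Watson Discovery
    natural language query used in the code pattern.
    """
    pattern = re.compile(re.escape(query), re.IGNORECASE)
    results = []
    for chunk in chunks:
        if pattern.search(chunk.text):
            # Wrap every match in <mark> tags so a UI could highlight it.
            highlighted = pattern.sub(
                lambda m: f"<mark>{m.group(0)}</mark>", chunk.text
            )
            timeframe = f"{format_time(chunk.start_s)}-{format_time(chunk.end_s)}"
            results.append((timeframe, highlighted))
    return results


# Made-up transcript chunks, one per minute of audio.
chunks = [
    TextChunk(0, 60, "Welcome to this introduction."),
    TextChunk(60, 120, "Machine learning models learn from data."),
    TextChunk(120, 180, "You can deploy a machine learning model as a service."),
]
results = search_chunks(chunks, "machine learning")
for timeframe, text in results:
    print(timeframe, text)
```

Keeping the start and end offsets next to each text chunk is what lets a search result point back to a position in the recording.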

When you have completed the code pattern, you will understand how to:

Prepare audio and video data and break it into smaller chunks to work with
Work with the Watson Speech to Text service through API calls to convert audio or video to text
Work with the Watson Discovery service through API calls to perform a search on text chunks
Create a Python Flask application and deploy it on IBM Cloud

Flow

The user uploads the video or audio file through the UI.
The video or audio file is processed with the moviepy and pydub Python libraries and split into smaller chunks.
The user interacts with the Watson Speech to Text service through the provided application UI. The audio chunks are converted into text chunks with Watson Speech to Text.
The text chunks are uploaded to Watson Discovery by calling the Watson Discovery APIs through the Python SDK.
The user performs a search query using Watson Discovery.
The results are shown on the UI.
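The chunking and upload-preparation steps in the flow above can be sketched with the standard library alone. This is an illustrative outline under stated assumptions, not the code pattern's actual code: `plan_chunks` and `build_documents` are hypothetical helpers, the 60-second chunk length is arbitrary, the document field names are invented for this example, and the transcripts are placeholders for what Watson Speech to Text would return. With pydub, each planned window could be cut from a loaded `AudioSegment` by millisecond slicing, for example `audio[start_ms:end_ms]`.

```python
import json


def plan_chunks(duration_ms, chunk_ms):
    """Split [0, duration_ms) into consecutive (start_ms, end_ms) windows."""
    if chunk_ms <= 0:
        raise ValueError("chunk_ms must be positive")
    return [
        (start, min(start + chunk_ms, duration_ms))
        for start in range(0, duration_ms, chunk_ms)
    ]


def build_documents(windows, transcripts):
    """Pair each time window with its transcript as a JSON-serializable dict.

    Assumption: the field names here are illustrative, not a schema
    required by Watson Discovery.
    """
    return [
        {"chunk_id": i, "start_ms": start, "end_ms": end, "text": text}
        for i, ((start, end), text) in enumerate(zip(windows, transcripts))
    ]


# A 150-second recording cut into 60-second chunks; the last chunk is shorter.
windows = plan_chunks(150_000, 60_000)

# Placeholder transcripts standing in for Speech to Text output.
transcripts = ["first minute", "second minute", "final thirty seconds"]

documents = build_documents(windows, transcripts)
print(json.dumps(documents[-1]))
```

Storing the start and end offsets in each document means the timing information travels with the text, so a later search hit can be mapped back to the part of the recording it came from.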

Instructions

Get detailed steps in the readme file. Those steps show how to:

Clone the GitHub repository.
Create the Watson Speech to Text service.
Create a Watson Discovery instance.
Run the application locally.
