Problem in summarization AI

Thread Starter

Tom gayle

Joined Sep 20, 2021
84
Hi, I'm working on a summarization AI project using speech recognition and nltk. First the user speaks, then the code converts the speech to text and displays the transcribed text, and finally it should summarize that text. The problem is that when I run it, the code only does the transcription: instead of a summary, it shows the whole transcribed text. I don't know what to do. Can anybody please help?


import speech_recognition as sr
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize, sent_tokenize
from collections import defaultdict


def summarize_text(text, num_sentences=3):
    # Tokenize the text into sentences
    sentences = sent_tokenize(text)

    # Tokenize the text into words
    words = word_tokenize(text.lower())

    # Remove stop words
    stop_words = set(stopwords.words('english'))
    words = [word for word in words if word.isalnum() and word not in stop_words]

    # Calculate word frequencies
    word_freq = defaultdict(int)
    for word in words:
        word_freq[word] += 1

    # Calculate sentence scores based on word frequencies
    sentence_scores = defaultdict(int)
    for sentence in sentences:
        for word in word_tokenize(sentence.lower()):
            if word in word_freq:
                sentence_scores[sentence] += word_freq[word]

    # Get the top 'num_sentences' sentences with highest scores
    summary_sentences = sorted(sentence_scores, key=sentence_scores.get, reverse=True)[:num_sentences]
    summary = ' '.join(summary_sentences)
    return summary

def speech_to_text():
    recognizer = sr.Recognizer()
    mic = sr.Microphone(device_index=0)  # Change the device index if needed

    with mic as source:
        print("Speak now...")
        recognizer.adjust_for_ambient_noise(source)
        audio = recognizer.listen(source)

    try:
        print("Transcribing...")
        text = recognizer.recognize_google(audio)
        return text
    except sr.UnknownValueError:
        print("Sorry, could not understand the audio.")
        return ""
    except sr.RequestError:
        print("Sorry, the service is unavailable.")
        return ""

# Get speech input
user_input = speech_to_text()

if user_input:
    # Get the summary
    summary = summarize_text(user_input)
    print("\nSummary:")
    print(summary)


I have also attached the same program file. Please, I need help.
 


BookerE1

Joined Feb 28, 2024
1
Hello,

I can try to help with your summarization project using speech recognition and nltk. The call summarize_text(user_input) is valid: because the second argument is omitted, the function uses its default, num_sentences=3, and returns at most three sentences. If the transcribed text contains three sentences or fewer, the "summary" is therefore the whole text. To control this, either pass the number of sentences you want in the summary, or modify the function so that it calculates the number of sentences from the length of the text.
For example, you can change this line of code:

summary = summarize_text(user_input)

To this:

summary = summarize_text(user_input, num_sentences=5) # Change 5 to any number you want

Or you can change the function definition to this:

def summarize_text(text):
    # Tokenize the text into sentences
    sentences = sent_tokenize(text)

    # Tokenize the text into words
    words = word_tokenize(text.lower())

    # Remove stop words
    stop_words = set(stopwords.words('english'))
    words = [word for word in words if word.isalnum() and word not in stop_words]

    # Calculate word frequencies
    word_freq = defaultdict(int)
    for word in words:
        word_freq[word] += 1

    # Calculate sentence scores based on word frequencies
    sentence_scores = defaultdict(int)
    for sentence in sentences:
        for word in word_tokenize(sentence.lower()):
            if word in word_freq:
                sentence_scores[sentence] += word_freq[word]

    # Calculate the number of sentences from the ratio of summary length to original length
    ratio = 0.2  # Change this to any value between 0 and 1
    # Keep at least one sentence so that short texts do not produce an empty summary
    num_sentences = max(1, int(len(sentences) * ratio))

    # Get the top 'num_sentences' sentences with highest scores
    summary_sentences = sorted(sentence_scores, key=sentence_scores.get, reverse=True)[:num_sentences]
    summary = ' '.join(summary_sentences)
    return summary
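
One thing to watch with a ratio: int() truncates, so int(len(sentences) * ratio) on its own is 0 for any text with fewer than five sentences at ratio = 0.2, and the slice then returns an empty summary. A minimal sketch of a clamp follows; the helper name summary_length is hypothetical and only for illustration.

```python
def summary_length(total_sentences, ratio=0.2):
    """Number of sentences to keep: a share of the text, but at least one."""
    if total_sentences <= 0:
        return 0  # nothing to summarize
    # int() truncates toward zero, so clamp to 1 for short texts
    return max(1, int(total_sentences * ratio))


# Show the sentence count chosen for a few text lengths
for n in (1, 4, 5, 10):
    print(n, "->", summary_length(n))
```

The result can be used as the slice length when picking the top-scoring sentences.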


I hope this helps you to fix your code and make your summarization AI work as expected. If you have any other questions or requests, please let me know.
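
It may also be worth printing how many sentences the tokenizer actually finds. recognize_google usually returns text with no sentence punctuation (please confirm this by printing your transcript). In that case a sentence splitter sees one long sentence, so any top-N selection returns the whole text. The sketch below uses only the standard library, so it runs without a microphone or nltk data. split_sentences is a simple regex stand-in for sent_tokenize, and chunk_words, segments, and the chunk size are assumptions for illustration, not part of your code.

```python
import re


def split_sentences(text):
    """Rough stand-in for nltk's sent_tokenize: split after ., ! or ?."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def chunk_words(text, size=12):
    """Fallback for unpunctuated text: fixed-size groups of words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def segments(text, size=12):
    """Use real sentences when there are several, otherwise word chunks."""
    sentences = split_sentences(text)
    return sentences if len(sentences) > 1 else chunk_words(text, size)


# An unpunctuated transcript, similar in shape to speech-to-text output
transcript = ("today we talk about speech recognition it turns audio into text "
              "then we count words and score each part of the text")
print(len(split_sentences(transcript)))  # number of sentences found
print(segments(transcript, size=8))
```

If the sentence count for your transcript is 1, scoring these word chunks in place of sentences is one possible workaround. Restoring punctuation before summarizing is another.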
 

Thread Starter

Tom gayle

Joined Sep 20, 2021
84
Thanks brother
 