How can I make this text generator more efficient?

Question

koveras vehcna 0 Newbie Poster

14 Years Ago

Hello everyone, I'm working on a random text generator (without using Markov chains), and currently it works without too many problems. First, here is the flow of my code:

1-Enter a sentence as input (this is called the trigger string and is assigned to a variable)
2-Get the longest word in the trigger string
3-Search the whole Project Gutenberg corpus (as shipped with NLTK) for sentences that contain this word, regardless of case
4-Return the longest sentence that contains the word searched for in step 3
5-Append the sentence from step 4 to the sentence from step 1
6-Assign the sentence from step 4 as the new trigger sentence and repeat the process. Note that I then have to get the longest word in the second sentence, and so on

And here is my code:

from nltk.corpus import gutenberg
from random import choice

triggerSentence = raw_input("Please enter the trigger sentence: ")  # get the input string

# all sentences of the Gutenberg corpus, each one a list of word tokens
listOfSents = gutenberg.sents()

while triggerSentence:
    # split the current trigger sentence into words
    split_str = triggerSentence.split()

    # find the longest word in the trigger sentence; both variables are
    # reset on every pass so the longest word of an earlier sentence is
    # not carried over into this one
    longestLength = 0
    longestString = ""
    for piece in split_str:
        if len(piece) > longestLength:
            longestString = piece
            longestLength = len(piece)

    # collect the sentences that contain the longest word and are longer
    # than 40 characters (note: this match is case-sensitive)
    sets = []
    for sentence in listOfSents:
        if longestString in sentence:
            sents = " ".join(sentence)
            if len(sents) > 40:
                sets.append(sents)

    # stop when nothing matches; choice() raises IndexError on an empty list
    if not sets:
        break

    # pick one of the matching sentences at random as the next trigger
    triggerSentence = choice(sets)
    print triggerSentence

My concern is that the loop usually reaches a point where the same sentence is printed over and over again. To counter this problem I decided to do the following:

*If the longest word in the current sentence is the same as in the previous sentence, delete this word from the current sentence and look for the next longest word.

I tried some implementations but could not get this to work. Any suggestions on how to find the second longest word? Thanks in advance.

python


TrustyTony 888 ex-Moderator

14 Years Ago

If you had the list of words in the sentence sorted by length, it would be simple, wouldn't it?
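A minimal sketch of that idea, assuming the sentence is split on whitespace as in the question. The helper names words_by_length and pick_search_word are hypothetical and do not come from the thread or from NLTK:

```python
def words_by_length(sentence):
    """Return the words of a sentence, longest first."""
    # sorted() is stable, so words of equal length keep their original order
    return sorted(sentence.split(), key=len, reverse=True)


def pick_search_word(sentence, previous_word=None):
    """Return the longest word that differs from previous_word, or None."""
    for word in words_by_length(sentence):
        if word != previous_word:
            return word
    # every word equals previous_word, or the sentence is empty
    return None
```

Inside the loop this could replace the longest-word search, for example longestString = pick_search_word(triggerSentence, longestString), so that a repeated longest word falls through to the next longest one.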


koveras vehcna 0 Newbie Poster · Answer 1 · 2010-08-30T00:46:15+00:00

If you had the list of words in the sentence sorted by length, it would be simple, wouldn't it?

Yup, just found it out, thanks.
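
On the efficiency question in the title, which the replies do not address: the loop in the question rescans every Gutenberg sentence on each pass. One possible approach, sketched here under the assumption that the corpus fits comfortably in memory, is to build a word-to-sentences index once before the loop. The names build_index and candidates are hypothetical. Keys are lowercased, so lookups ignore case as described in step 3.

```python
from collections import defaultdict


def build_index(sentences, min_chars=40):
    """Map each lowercased word to the joined sentences that contain it."""
    index = defaultdict(list)
    for tokens in sentences:
        text = " ".join(tokens)
        if len(text) <= min_chars:
            continue  # same length filter as in the question's loop
        # a set, so a sentence is listed once per word even if the word repeats
        for word in set(token.lower() for token in tokens):
            index[word].append(text)
    return index


def candidates(index, word):
    """Return the sentences that contain word, ignoring case."""
    return index.get(word.lower(), [])
```

With this, index = build_index(gutenberg.sents()) would run once before the loop, and sets = candidates(index, longestString) would replace the inner for loop. The trade-off is extra memory and a slower start in exchange for a fast lookup on each pass.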