Hi, I am working on a piece of work and may need a bit of help.
The problem I need to solve is as follows:
I am requesting a text-based document through an HTTP request, and I already have the document I want back from that request. Now I need to look for an element within this text and remove it.
Does anyone have an idea how I could do this? My code is below. I am requesting the document from a Solr database, which basically outputs the results of the query in a text view but
#!/usr/bin/env python2.5
import pickle
import sys
from urllib2 import urlopen


def set_solr_query(docID):
    # The real query URL is internal and cannot be published.
    return '[internal network http]'


def request_doc(url):
    conn = urlopen(url)
    try:
        # The response body is evaluated as a Python literal. eval() runs
        # whatever text the server returns, so only use it on a trusted server.
        rsp = eval(conn.read())
    finally:
        conn.close()
    docedit = rsp['response']['docs'][0]
    find_keyword(docedit)
    print "number of matches=", rsp['response']['numFound']
    for doc in rsp['response']['docs']:
        print 'Year =', doc['year']


def find_keyword(x):
    # For now this only pickles the document; finding and removing the
    # keyword is the part I still need to write.
    file_pi = open('filename_pi.obj', 'wb')  # pickle data is binary
    try:
        pickle.dump(x, file_pi)
    finally:
        file_pi.close()
    print "opened File object and dumped doc into pickle"


def main(argv):
    print '+++++++++++++++++++++++++++++++++++++++CONFIGURATIONS++++++++++++++++++++++++++++++++++++++++++++++'
    docID = argv[1]
    keyword = argv[2]
    url = set_solr_query(docID)
    request_doc(url)
    print docID
    print keyword
    print url
    print '++++++++++++++++++++++++++++++++++++++++++++END++++++++++++++++++++++++++++++++++++++++++++++++++++'


if __name__ == "__main__":
    main(sys.argv)
The only part of the above code I cannot publish is the internal HTTP request URL. Apologies for this.
I hope someone can point a first-time Python user in the right direction; I would be very grateful. I look forward to hearing back from someone.
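To make the goal more concrete, here is a minimal sketch of the kind of removal step I have in mind. It assumes the document returned by Solr is a plain dict mapping field names to strings, lists of strings, or numbers. The helper names remove_keyword and remove_field, and the sample field names other than 'year', are made up for illustration and are not part of the script above.

```python
# Sketch only: assumes a Solr document is a dict of field name -> value.
try:
    string_types = basestring  # Python 2: covers both str and unicode
except NameError:
    string_types = str  # Python 3


def remove_keyword(doc, keyword):
    """Return a copy of doc with keyword cut out of every string value."""
    cleaned = {}
    for field, value in doc.items():
        if isinstance(value, string_types):
            cleaned[field] = value.replace(keyword, '')
        elif isinstance(value, list):
            # Multi-valued fields: strip the keyword from each string item.
            cleaned[field] = [
                item.replace(keyword, '') if isinstance(item, string_types) else item
                for item in value
            ]
        else:
            # Numbers and other non-text values are copied unchanged.
            cleaned[field] = value
    return cleaned


def remove_field(doc, field):
    """Return a copy of doc without the named field, if it is present."""
    return dict((k, v) for k, v in doc.items() if k != field)


# Hypothetical document; real field names depend on the Solr schema.
sample = {'id': '42', 'year': 2009, 'body': 'alpha beta gamma', 'tags': ['beta', 'delta']}
stripped = remove_keyword(sample, 'beta')
without_body = remove_field(sample, 'body')
```

If this is roughly the right idea, the keyword from argv[2] could be passed into request_doc and applied to docedit before it is pickled. Both helpers return new dicts, so the original response is left untouched.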
Thanks
Dan