counting collocates -- Help for Linguist, Newbie

Question

ChrisP_Buffalo 0 Newbie Poster

17 Years Ago

I'm a linguist (Python newbie) trying to use Python to help me with some simple NLP tasks. I have extracted verb + preposition + VBG triples from the British National Corpus, so I have large CSV files containing stuff like this:

plan,on,selling
criticised,for,getting
opened,after,receiving
were,in,identifying
visited,before,returning
recruited,from,including
attended,by,including
given,by,joining
...
...

The Python script below counts tokens within any given column (e.g., how many times does the verb "prevent" occur, or how many times does the preposition "by" occur).

import csv
out_stream = file('counted_test_file.csv', 'w')
x = csv.reader(open('test_file.csv', 'rb'))

count = {}

for verb, prep, vbg in x:
	if verb not in count:
		count[verb] = 0
	count[verb] += 1

for (key, val) in count.items():
	print>>out_stream, "%s,%s" % (key, val)

out_stream.close()
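
For readers on Python 3, where `file()` and `print>>` no longer exist, the same per-column count might look roughly like the sketch below. This is not the poster's code: the function name `count_column` and the in-memory sample rows are made up for illustration, and `collections.Counter` stands in for the hand-rolled dictionary.

```python
import csv
import io
from collections import Counter

def count_column(rows, column=0):
    """Count how often each value appears in one column of the rows."""
    counts = Counter()
    for row in rows:
        counts[row[column]] += 1
    return counts

# Hypothetical sample standing in for test_file.csv.
sample = io.StringIO("plan,on,selling\ngiven,by,joining\nattended,by,including\n")
prep_counts = count_column(csv.reader(sample), column=1)

# Write "value,count" lines, as the original script does.
out = io.StringIO()
writer = csv.writer(out)
for key, val in prep_counts.items():
    writer.writerow([key, val])
```

With a real file, `open('test_file.csv', newline='')` would replace the `io.StringIO` sample.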

Now I'm trying to get this code to count all combinations (e.g., how many times does 'prevent' occur with 'from'). I tried the following variation, but this just counts preps (the second element in the csv file):

for verb, prep, vbg in x:
	if (verb and prep) not in count:
		count[(verb and prep)] = 0
	count[(verb and prep)] += 1
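
The reason this variation counts only prepositions: in Python, `a and b` does not build a pair. It evaluates to `a` if `a` is falsy, and otherwise to `b`. Any non-empty string is truthy, so `(verb and prep)` is simply `prep`. A quick check, using sample values from the data above (the variable names `key_with_and` and `key_with_tuple` are made up for illustration):

```python
verb, prep = "plan", "on"

# `and` returns its second operand when the first is truthy.
key_with_and = (verb and prep)

# A tuple keeps both values together and can be used as a dictionary key.
key_with_tuple = (verb, prep)

print(key_with_and)    # on
print(key_with_tuple)  # ('plan', 'on')
```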

Any help would be greatly appreciated!

python


All 3 Replies

woooee 814 Nearly a Posting Maven

17 Years Ago

If you want both verb and prep to be found in the count dictionary created by the existing code:

from collections import defaultdict
total_found_dic = defaultdict(int)
if (verb in count) and (prep in count):
    total_found_dic[(verb, prep)] += 1

Note that you will want to test sub-words and print the results to see what happens. I doubt you are looking for "the", but as an example, searching for "the" may or may not give a hit for the word "they", depending on how the dictionary is arranged.
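
The sub-word concern can be checked directly. As far as plain Python dictionaries go, `in` tests for an exact key, so "the" does not match a key "they"; a substring hit only appears when `in` is applied to a string. A small sketch with made-up counts:

```python
count = {"they": 3, "prevent": 5}  # made-up sample counts

# Dictionary membership compares whole keys.
exact_hit = "the" in count        # False

# String membership is a substring test.
substring_hit = "the" in "they"   # True

# Substring matches across keys have to be requested explicitly.
partial_keys = [key for key in count if "the" in key]  # ['they']
```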


BearofNH 104 Posting Whiz · Answer 1 · 2008-04-25T23:26:16+00:00

17 Years Ago

How about count[(verb,prep)] += 1 ?

ChrisP_Buffalo 0 Newbie Poster · Answer 2 · 2008-04-25T23:33:19+00:00

How about count[(verb,prep)] += 1 ?

Ahh, yes, count[(verb,prep)] += 1 worked. So simple. Thanks!
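
Put together, the tuple-key suggestion might look like the sketch below. It assumes Python 3 and uses `collections.defaultdict(int)` so that the `if ... not in count` initialization can be dropped. The sample rows come from the question, with one made-up repeat added so that a pair occurs twice, and the name `pair_count` is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Stand-in for test_file.csv; the last row is a made-up repeat.
sample = io.StringIO(
    "plan,on,selling\n"
    "given,by,joining\n"
    "attended,by,including\n"
    "given,by,including\n"
)

pair_count = defaultdict(int)  # missing keys start at 0
for verb, prep, vbg in csv.reader(sample):
    pair_count[(verb, prep)] += 1

# One "verb,prep,count" line per distinct pair.
for (verb, prep), n in sorted(pair_count.items()):
    print("%s,%s,%d" % (verb, prep, n))
```

With the original hand-rolled dictionary, the same key `(verb, prep)` would need to be used in all three lines: the `not in` test, the initialization, and the increment.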
