hey all.
I am trying to assign unique IDs to a list of strings I get from a file

Let's say my list (list.txt) is:

Shoes from Italy 1
Shirts made in Japan
Shoes from Italy 2
Shirts made in France
Boots made in United Kingdom
Socks

I want the IDs to be exactly:

SFI1
SMIJ
SFI2
SMIF
BMIU
S

As you can see, these are the first letters of each word (up to 4 words) of each item in the list (SFI1 stands for Shoes From Italy 1, and BMIU doesn't include the K of Kingdom).

I couldn't figure out an easy way to do this.

Thanks for your help

If the IDs don't have to be descriptive, then you might consider the uuid module.
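
For reference, a minimal sketch of what the uuid route could look like. The sample names are taken from the question; the variable names are only for illustration. The IDs are random, so they are unique in practice but tell you nothing about the item.

```python
import uuid

names = ["Shoes from Italy 1", "Shirts made in Japan", "Socks"]

# Map each name to a random (version 4) UUID string.
ids = {name: str(uuid.uuid4()) for name in names}

for name, item_id in ids.items():
    print(item_id, name)
```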

But if you want them to be descriptive, then the following questions need to be answered, IMHO:

Is this a real ID, i.e. one ID <-> one name?
Then you need an algorithm that makes an ID from a name and also gets the name back from the ID.

I suppose it has to be a real ID.
So we need a data structure to store the name-ID pairs.

Something like:

class ID(object):
  def __init__(self):
    self._id_name = dict()  # ID -> name
    self._name_id = dict()  # name -> ID
  def get_name(self, id):
    return self._id_name[id]

And so on...

The algorithm, as a method of this class, should be something like:

def put_name(self, name):
  if name in self._name_id: return self._name_id[name]
  id = ''.join([x[0] for x in name.upper().split() if x != 'KINGDOM'])  # initials
  self._name_id[name] = id
  self._id_name[id] = name
  return id

The 'id=' line will become a big complicated mess after a while. That is because it is very hard to come up with an algorithm that is bijective (a hash, for example, is not), so you have to check that the generated ID does not already exist.
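
To make that lookup concrete, here is a rough sketch of how the two dicts and the "does this ID already exist" check could fit together. The class name IdRegistry, the helper make_id, the 4-word limit taken from the question, and the choice to raise ValueError on a clash are all assumptions for illustration, not part of the code above.

```python
class IdRegistry(object):
    """Hypothetical two-way name <-> ID store (sketch only)."""

    def __init__(self):
        self._id_name = {}
        self._name_id = {}

    @staticmethod
    def make_id(name):
        # First letter of each of the first 4 words, upper-cased.
        return ''.join(word[0] for word in name.upper().split()[:4])

    def put_name(self, name):
        # A name that is already registered keeps its ID.
        if name in self._name_id:
            return self._name_id[name]
        new_id = self.make_id(name)
        # The mapping is not bijective, so check whether the ID is taken.
        if new_id in self._id_name:
            raise ValueError('%s already used by %r'
                             % (new_id, self._id_name[new_id]))
        self._id_name[new_id] = name
        self._name_id[name] = new_id
        return new_id

    def get_name(self, id_):
        return self._id_name[id_]


reg = IdRegistry()
print(reg.put_name('Boots made in United Kingdom'))  # BMIU
print(reg.get_name('BMIU'))
```

Raising on a clash is only one option; appending a counter is another.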

I once built a name lookup search for a bank. They wanted a search term (ID) for customer names that humans could construct themselves.

The construction was simplified to the following:
Remove all accents from the name (ö -> o etc.)
Take the first character of each part of the name, in uppercase

So John Alva Doe became JAD.

The system generated this ID, but appended a counter after the letters if needed. If there was also a James Arian Dee, he became JAD1.

The user typed in the search term, got all the customers whose ID began with it, and could pick one.
We ended up with an alternate customer number anyway. And hundreds of John Smiths :)
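
The scheme described above could be sketched roughly like this. The function names (strip_accents, make_search_id, search) and the use of unicodedata to drop the accents are assumptions for illustration, not what the bank system actually ran.

```python
import unicodedata


def strip_accents(text):
    # Decompose accented characters and drop the combining marks.
    decomposed = unicodedata.normalize('NFKD', text)
    return ''.join(c for c in decomposed if not unicodedata.combining(c))


def make_search_id(name, taken):
    # Upper-case initials of every part of the name.
    base = ''.join(part[0] for part in strip_accents(name).upper().split())
    candidate, counter = base, 0
    # Append a counter only if the plain initials are already in use.
    while candidate in taken:
        counter += 1
        candidate = '%s%d' % (base, counter)
    taken[candidate] = name
    return candidate


def search(term, taken):
    # All customers whose ID begins with the search term.
    return sorted(n for i, n in taken.items() if i.startswith(term.upper()))


customers = {}
print(make_search_id('John Alva Doe', customers))    # JAD
print(make_search_id('James Arian Dee', customers))  # JAD1
print(search('jad', customers))
```

The search returns both customers for the term "jad", which is exactly why the user still had to pick one from a list.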


Whoa... I think we've drifted very far from the original post, folks :)
Forget about IDs.

I just want another list, as below, where each item is represented by the first letter of each word (up to 4 words).
If an item is made of more than 4 words, just pick the first 4 words (I don't understand why 'kingdom' is hard-coded in the solution you posted, since it was purely an example).

If you are interested in why these are IDs, we have run some investigation before and concluded that the only way to represent items (in a unique way) is this one.

Thanks ya all

well it was simple enough..

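ids = []
for line in open("list.txt"):
    words = line.strip().split()  # each item is split into words
    s = ""
    for i in range(0, 4):  # use at most the first 4 words
        try:
            w = words[i]
            s = s + w[0].upper()
        except IndexError:  # the item has fewer than 4 words
            break
    ids.append(s)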

thanks anyway

If you want only this then:

id = ''.join([x[0] for x in name.upper().split()[:4]])

I don't have python at hand, but I think
a=[1,2,3]
a[:4]
produces [1,2,3]
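
Putting the pieces of the thread together, here is a small sketch that can be run as-is. The function name short_id and its max_words parameter are just for illustration, and the sample lines are inlined instead of being read from list.txt.

```python
def short_id(item, max_words=4):
    # Slicing past the end of a list is safe: ['SOCKS'][:4] is just ['SOCKS'].
    words = item.upper().split()[:max_words]
    return ''.join(word[0] for word in words)


items = [
    "Shoes from Italy 1",
    "Shirts made in Japan",
    "Shoes from Italy 2",
    "Shirts made in France",
    "Boots made in United Kingdom",
    "Socks",
]

for item in items:
    print(short_id(item))
```

This should print SFI1, SMIJ, SFI2, SMIF, BMIU and S, one per line, matching the list in the original post. To work from the file instead, items could be replaced by the lines of list.txt.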
