Hi.

I have a function that needs to split a string on the separators passed to the function. So if I had the separators as !.?: then it should split the string at those points.

If I have a string 'hello. world! hello.'
how can I return a list of its parts?
I'll have to use .split on the string. The problem is that .split only splits on one separator, and in this case I want to split the string on 2 characters:

'!' and '.'

How can I achieve that without using regex?
The separators can be anything, so I need some sort of loop that will use the characters I pass as separators.

Write a recursive function

# python 2 and 3

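def splitmany(s, seps=""):
    """Yield the pieces of s split on every character in seps."""
    if seps:
        # Split on the first separator, then recurse with the remaining ones
        u, seps = seps[0], seps[1:]
        for word in s.split(u):
            for item in splitmany(word, seps):
                yield item
    else:
        # No separators left: the piece is final
        yield s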
            
if __name__ == "__main__":
    print(list(splitmany('hello. world! hello.', "!.")))
    
"""my output -->
['hello', ' world', ' hello', '']
"""

But why don't you want to use regexes?

Rolling your own is the best way to go. Alternatively, you could replace "?" and "!" with "." and split on the period, but that requires a separate pass through the string for each replacement.
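For what it's worth, the replace-then-split idea could look roughly like this. It is only a sketch: split_by_replace is a made-up name, and it assumes every separator is a single character.

```python
def split_by_replace(s, seps):
    """Split s on every character in seps (hypothetical helper)."""
    if not seps:
        return [s]
    first = seps[0]
    # Turn every other separator into the first one:
    # one pass over the string per extra separator
    for sep in seps[1:]:
        s = s.replace(sep, first)
    return s.split(first)

print(split_by_replace('hello. world! hello.', '!.?'))
# ['hello', ' world', ' hello', '']
```

Like the recursive version, this keeps the leading spaces and the empty string after the final period.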

To get you started if you want to roll your own: iterate over each character and append it to a sub-list unless the character is one of the dividers. If it is a divider, join the sub-list and append the result to the final list, reset the sub-list to an empty list, and continue appending until the next divider. When the loop ends, join and append whatever is left in the sub-list.
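The character-by-character approach described above might be sketched like this. The name split_chars and its variable names are my own, not anything standard.

```python
def split_chars(s, seps):
    """Split s on any character in seps, one character at a time (hypothetical helper)."""
    dividers = set(seps)      # set membership tests are fast
    result = []               # the final list of pieces
    current = []              # the sub-list for the piece being built
    for ch in s:
        if ch in dividers:
            # Divider found: close off the current piece and start a new one
            result.append(''.join(current))
            current = []
        else:
            current.append(ch)
    # Do not forget whatever follows the last divider
    result.append(''.join(current))
    return result

print(split_chars('hello. world! hello.', '!.?'))
# ['hello', ' world', ' hello', '']
```

This makes a single pass over the string no matter how many separators there are. If you don't want the empty strings, filter them out afterwards.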

This is a place for itertools.groupby, in my opinion:

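import itertools

# Note: the space is included as a separator here, so the words come back stripped
separator = set('.!? ')
data = 'hello. world! hello.'
result = []
# groupby yields runs of consecutive characters that share the same key:
# True for ordinary characters, False for separators
for isin, group in itertools.groupby(data, lambda x: x not in separator):
    if isin:
        result.append(''.join(group))
print(result)  # ['hello', 'world', 'hello']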

Split with regex. It could work; maybe slower, but more flexible.
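If regex turns out to be acceptable after all, a sketch with re.split could look like this. split_regex is a hypothetical name, and the separators are assumed to be single characters, so they are put into one character class.

```python
import re

def split_regex(s, seps):
    """Split s on any single character in seps using re.split (hypothetical helper)."""
    if not seps:
        return [s]
    # re.escape keeps characters such as '.', '?', ']' or '^' literal
    pattern = '[' + re.escape(seps) + ']'
    return re.split(pattern, s)

print(split_regex('hello. world! hello.', '!.?'))
# ['hello', ' world', ' hello', '']
```

The flexibility comes from the pattern: changing it to something like '[' + re.escape(seps) + ']+' would treat a run of separators as one split point.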
