User Name Password Register
DaniWeb IT Discussion Community
All
What is DaniWeb IT Discussion Community?
You're currently browsing the Python section within the Software Development category of DaniWeb, a massive community of 457,706 software developers, web developers, Internet marketers, and tech gurus who are all enthusiastic about making contacts, networking, and learning from each other. In fact, there are 3,268 IT professionals currently interacting right now! Registration is free, only takes a minute and lets you enjoy all of the interactive features of the site.
Please support our Python advertiser: Programming Forums
Oct 11th, 2008
Views: 696
This snippet defines a class which can group a collection of strings according to a given set of prefixes. Each string goes in the group of the longest prefix it contains.
Last edited : Oct 11th, 2008.
python Syntax | 5 stars
  1. #!/usr/bin/env python
  2.  
  3. # Gribouillis at www.daniweb.com
  4. """
  5. The class Grouper implements grouping strings by prefixes.
  6.  
  7. A Grouper object is created with a sequence of prefixes
  8. G = Grouper(["pref1", "pref2", ...])
  9.  
  10. (an empty prefix is automatically added). The Grouper object
  11. is a dictionary "prefix --> set of strings". Initially, it looks
  12. like
  13. {
  14. "pref1" : set([]),
  15. "pref2" : set([]),
  16. }
  17.  
  18. The method 'add_strings' can be used to add a sequence of strings
  19. to the grouper. Each string goes in one of the sets according to
  20. the longest prefix it contains.
  21. After
  22. G.add_strings(["pref1aaaaa", "pref2jjj", "pref1s"])
  23.  
  24. it's content will be
  25.  
  26. {
  27. "pref1" : set(["pref1aaaaa", "pref1s"]),
  28. "pref2" : set(["pref2jjj"]),
  29. }
  30.  
  31. More strings can be added.
  32. """
  33.  
  34. from collections import deque
  35. from bisect import bisect_left ,bisect_right
  36.  
  37. class Grouper (dict ):
  38. def __init__ (self ,iterable ):
  39. prefixes =set (iterable )
  40. prefixes .add ("")
  41. dict .__init__ (self ,[(p ,set ())for p in prefixes ])
  42. self .skeys =[]
  43.  
  44. def add_strings (self ,iterable ):
  45. strings =sorted (set (iterable ))
  46. if len (self .skeys )!=len (self ):
  47. self .skeys =sorted (self )[1 :]
  48. stack =deque ([("",chr (255 ))])
  49. k =0
  50. for p in self .skeys :
  51. while stack [-1 ][1 ]<=p :
  52. q ,qs =stack .pop ()
  53. k =self ._forward (q ,k ,qs ,strings )
  54. k =self ._forward (stack [-1 ][0 ],k ,p ,strings )
  55. stack .append ((p ,p [:-1 ]+chr (ord (p [-1 ])+1 )))
  56. while stack :
  57. q ,qs =stack .pop ()
  58. k =self ._forward (q ,k ,qs ,strings )
  59.  
  60. def _forward (self ,q ,k ,qs ,strings ):
  61. ks =bisect_left (strings ,qs ,k )
  62. self [q ].update (strings [i ]for i in xrange (k ,ks ))
  63. return ks
  64.  
  65. if __name__ =="__main__":
  66. grouper =Grouper (["/usr","/usr/lib","hello","heat"])
  67. grouper .add_strings ([
  68. "/usr/bin/python","/usr/lib/python2.5","/dev",
  69. "hello world","hello","heat foo","heat","her",
  70. "/usr/liberation",
  71. ])
  72. for k ,s in grouper .items ():
  73. print ("%s: %s"%(repr (k ),str (s )))
  74. print "Adding more strings"
  75. grouper .add_strings ([
  76. "hello fred","/usrssss",
  77. ])
  78. for k ,s in grouper .items ():
  79. print ("%s: %s"%(repr (k ),str (s )))
Post Comment

Only community members can submit or comment on code snippets. You must register or log in to contribute.

DaniWeb Marketplace (Sponsored Links)
All times are GMT -4. The time now is 6:25 pm.
Forum system based on vBulletin Copyright ©2000 - 2008, Jelsoft Enterprises Ltd.
©2003 - 2008 DaniWeb® LLC