Hi All,

I'm extremely new to Python and would like to create a program to help automate some of the things I do at work. I need to take information from an IDX shell and create a word document with it. Does this sound possible? Would anyone be interested in switching to E-mail and helping me figure this out? I'd have to start from scratch, but I learn fast.

Any help would be appreciated.

Thanks,
Ryan

Why not do it on the forum so all of us can learn?

It's a good idea. For example, I don't know what an IDX shell is.

I'd be more than happy to do it here, I just want to make sure it is easy for someone to help me.

IDX (aka Reflections) is a telnet application we use to access a db of patient names, demographics, appointment history, and schedules.

I work in a medical records department where we print information from the shell, then manually create labels for the charts and other specific paperwork. I would like very much to streamline as much of this process as possible and figure the labels would be a good start to get to know the python language, then try to tackle other parts of the process.

Also, IDX supports Visual Basic Macros. I don't know if these can be initiated from outside the shell or if I will have to create and run macros from inside IDX and maybe print lists to files, then use Python to move data around? Sorry to be such a noob, but I really don't know what I'm doing.

I suppose that the information which you print could easily be sent to a file instead of the printer? I think it would be a good idea to show us an example of this printed information and also of the information that you create manually from it.

Hello Again,

Sorry it took so long to get back, I had to discuss the options for printing to a file with our I.T. Department. I finally managed to find a way to generate a text document with the information I need.

I have attached a shortened version of the schedule file I can create. The columns will always look like they do in the attached file as far as spacing. I had to replace any identifying information due to Health Information Protection, but the format of the report has not changed at all.

I also attached a template for printing labels with the patients' information. I filled in one of the labels to show how the information is to be organized. The name, which is capitalized on the schedule, will need to appear the way it does on the labels, with only the first letter capitalized.

This is a start to the project - eventually I'd like to be able to populate the Name and Date of Birth fields on other forms that we use.

Thanks again for helping out,
Ryan

Hello?

Can anyone help me use Python to extract information from the text file attached to my last post and enter it into the word template that is also attached?

I would really appreciate the help.

Let's do this one step at a time. The text file is kind of ugly, but here's a quick example of extracting info.

>>> f = open('Schedule.txt')
>>> r = f.readlines()
>>> f.close()
>>>
>>> data_found = 0
>>> for each_line in r:
...     if not data_found:
...         if each_line.strip()[:4] == 'Time':
...             data_found = 1
...             my_data = {}
...             for each_entry in [ i.strip() for i in each_line.split('  ') if i ]:
...                 my_data[each_entry] = ''
...     else:
...         if each_line[:3] != '   ' and each_line.strip():
...             print [ i.strip() for i in each_line.strip().split('  ') if i ]
...     
['08:30AM', 'DOE, JANE', '3######', '##/##/####', '4MDC', '###-###-####', 'FES', '40', 'XXXX MD,XXXXXX', 'PNL', 'MFG', '3XXXXXXX']
['08:30AM', 'DOE, JANE', '3######', '##/##/####', '5UNI', '###-###-####', 'FES', '40', 'XXXX MD,XXXXXX', 'PNL', 'MFG', '3XXXXXXX']
['09:20AM', 'DOE, JANE', '3######', '##/##/####', '5UNI', '###-###-####', 'IGS', '40', 'XXXX MD,XXXXXX', 'PNL', 'MFG', '3XXXXXXX']
['09:20AM', 'DOE, JANE', '3######', '##/##/####', '5PPO', '###-###-####', 'IGS', '40', 'XXXX MD,XXXXXX', 'PNL', 'MFG', '3XXXXXXX']
['10:10AM', 'DOE, JANE', '3######', '##/##/####', '4OMH', '###-###-####', 'UFU', '40', 'XXXX MD,XXXXXX', 'PNL', 'MFG', '3XXXXXXX']
['10:10AM', 'DOE, JANE', '3######', '##/##/####', '4MPE', '###-###-####', 'AMN', '40', 'XXXX MD,XXXXXX', 'PNL', 'MFG', '3XXXXXXX']
>>> my_data
{'Hm/Wk Phone': '', 'Appt#': '', 'Loc': '', 'DOB': '', 'Provider': '', 'Patient Name': '', 'Dept': '', 'MRN': '', 'FSC1': '', 'Time': '', 'Dur': '', 'Typ': ''}
>>>

Now we'll need to put some thought into how we want to handle this. We could set up a list and for each patient record, fill in my_data structure, and then deep copy it into that master list of records. After the extraction stage is done we'll need to think about what data we want to put into the word template.
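Here is a rough sketch of that record-collecting idea. It is only an illustration: SAMPLE is a made-up stand-in for the real Schedule.txt (only four columns, placeholder values), and split_columns and parse_schedule are hypothetical helper names, not anything from the original script. Building a fresh dictionary for every patient line sidesteps the need for a deep copy.

```python
# Hypothetical sample standing in for Schedule.txt; the real layout may differ.
SAMPLE = [
    "Schedule report\n",
    "\n",
    " Time     Patient Name    MRN       DOB\n",
    " 08:30AM  DOE, JANE       3######   ##/##/####\n",
    "     Comment line belonging to the appointment above\n",
    " 09:20AM  SMITH, JOHN     3######   ##/##/####\n",
]


def split_columns(line):
    """Split on two spaces, then drop pieces that are empty once stripped."""
    return [piece.strip() for piece in line.split('  ') if piece.strip()]


def parse_schedule(lines):
    """Return one dictionary per patient line, keyed by the column headings."""
    headers = None
    records = []
    for line in lines:
        if headers is None:
            # Still in the report header: look for the column heading line.
            if line.strip()[:4] == 'Time':
                headers = split_columns(line)
        elif line[:3] != '   ' and line.strip():
            # Patient line: pair each heading with its value in a new dict.
            records.append(dict(zip(headers, split_columns(line))))
    return records


records = parse_schedule(SAMPLE)
for record in records:
    # title() turns 'DOE, JANE' into 'Doe, Jane' for the label.
    print(record['Patient Name'].title())
```

One caveat on title(): it capitalizes only the first letter of each word, so a name like MCDONALD would come out as Mcdonald and might need manual handling.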

The final step is actually filling in the word template which may prove to be much more difficult. I've never worked with MS Word COM objects (which I'm assuming we'll need here), so hopefully somebody else on the forum has some insights.

Thanks but I think you may have to walk me through that a little. It looks like you're opening the text file, extracting the data, and closing the file. Then are you assigning each string item to a variable? I'm not at all familiar with Python and am only vaguely familiar with C++.

Ah sorry, sometimes I just forget to explain myself ;)

Let me break it down:

>>> f = open('Schedule.txt')
>>> r = f.readlines()
>>> f.close()

So we open the file with the open() function. This function takes the path to the file as its first argument, and the mode that you want as the second argument, returning a file handle. If the second argument is not provided it defaults to Read mode.

Alternately you can do open( my_file, 'r' ) to specify read. Similarly, open( my_file, 'w' ) opens a file for writing; however, keep in mind that an existing file will be cleared out, and a new file will be created if it doesn't exist. Using a mode of 'a' would be append (i.e., don't clear an existing file, just add onto the end). The link above explains in detail other modifiers that can be used with the modes, such as '+' and 'b'.
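A small sketch of how the three modes behave, in case it helps. The file name modes_demo.txt is made up, and it is created in a temporary directory so nothing real gets overwritten.

```python
import os
import tempfile

# Hypothetical scratch file inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'modes_demo.txt')

f = open(path, 'w')      # 'w' creates the file, or empties it if it exists
f.write('first line\n')
f.close()

f = open(path, 'a')      # 'a' keeps the existing contents and adds at the end
f.write('second line\n')
f.close()

f = open(path)           # no mode given, so the file is opened for reading
contents = f.read()
f.close()

f = open(path, 'w')      # opening with 'w' again wipes what was there
f.close()

f = open(path, 'r')
after_rewrite = f.read()
f.close()

print(repr(contents))
print(repr(after_rewrite))
```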

On the next line I asked the file handle to return a list containing each line as a single element. I could've alternately used read() to get the entire block of text as a string, but that's ugly. You could roll your own iteration by using readline(), which reads a single line at a time and updates the file handle so it always knows the position of the "cursor", letting you call readline() repeatedly until EOF is reached.
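To compare the three reading styles side by side, here is a minimal sketch. The file lines_demo.txt and its contents are invented for the demonstration.

```python
import os
import tempfile

# Hypothetical three-line file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'lines_demo.txt')
f = open(path, 'w')
f.write('one\ntwo\nthree\n')
f.close()

f = open(path)
whole_text = f.read()          # the entire file as one string
f.close()

f = open(path)
all_lines = f.readlines()      # a list with one element per line
f.close()

f = open(path)
one_at_a_time = []
while True:
    line = f.readline()        # one line per call
    if not line:               # readline() returns '' at end of file
        break
    one_at_a_time.append(line)
f.close()

print(repr(whole_text))
print(all_lines)
print(one_at_a_time == all_lines)
```

Each line keeps its trailing newline, which is one reason the later code calls strip() so often.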

Finally we close the file handle.

>>> data_found = 0
>>> for each_line in r:
...     if not data_found:
...         if each_line.strip()[:4] == 'Time':
...             data_found = 1
...             my_data = {}
...             for each_entry in [ i.strip() for i in each_line.split('  ') if i ]:
...                 my_data[each_entry] = ''
...     else:
...         if each_line[:3] != '   ' and each_line.strip():
...             print [ i.strip() for i in each_line.strip().split('  ') if i ]

So data_found is a marker to know that we've moved past the header information of the file and into the actual patient data. Our for loop is iterating over r, which if you remember has each line of the file as an element (so the loop is effectively reading the file line by line).

Since data_found = 0, the first conditional statement within the loop is true. Since everything in Python is an object, all built-in objects have a boolean value. So for numbers, 0 is False and anything else is True. For strings, '' (the empty string) is False, and as soon as even a single character is stored in the string it becomes True. I check for the boolean value of our variable and negate it with the not keyword: if not data_found. So basically this says: if data_found is false, perform this action.

To denote the beginning of our data, I've chosen the line with the "headings" of our categories. So I have a secondary check for the word "Time" at the beginning of a line. On every line that I'm checking I strip off any whitespace with strip(), and then slice off the first 4 characters with [:4] (this denotes from the beginning of the string up to and including the fourth character). So if that portion of the line is equal to 'Time' I know I'm onto the good part, so I set data_found to 1 (True) and initialize a dictionary via {}. The next part is complicated so I'll break it down. It's called a list comprehension, and it combines many things into a single step.
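If you want to see the truth values for yourself, a quick sketch like this prints them. The values in samples are arbitrary picks for illustration.

```python
# A handful of arbitrary values to test.
samples = [0, 1, -5, '', 'a', [], [0], {}, None]

# bool() shows the truth value Python uses in an if statement.
truthiness = [bool(value) for value in samples]
for value, truth in zip(samples, truthiness):
    print('%r -> %s' % (value, truth))

# The same test the loop performs on its first pass.
data_found = 0
if not data_found:
    message = 'still in the header'
else:
    message = 'reading patient data'
print(message)
```

Empty containers such as [] and {} count as False too, while a list holding a single 0 is True because it is not empty.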

[ i.strip() for i in each_line.split('  ') if i ]

Would be the same thing as doing the following:

breakdown = each_line.split('  ')
new_data = []
for element in breakdown:
    if element:
        new_data.append(element.strip())

So first I split apart the string using '  ' (two spaces) as a delimiter. This is because some elements have a single space inside them, and categories are all separated by at least two spaces.

I next initialized an empty list with [], and then began iterating over my container with the split-apart line. Since some parts of the line contained more than just two spaces, each occurrence of two spaces gets divided and our container is full of empty strings. That is why I check if element: so that as long as it's not an empty string, I strip any trailing/leading whitespace off with strip(), and then add it to my new_data container.

Finally, we have a container containing only the category names and no garbage or extra whitespace. So in the original example I then iterate over this container, and add each category name to the dictionary. This is essentially unnecessary and could be skipped since we've not made use of this container in the example.
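To make the empty strings visible, here is a small sketch on a made-up heading line (the spacing is invented, not copied from the real file). The last two lines show a regular-expression alternative that was not part of the original example, included only as one possible variation.

```python
import re

# Hypothetical heading line with uneven gaps between the columns.
line = 'Time   Patient Name     MRN\n'

# Splitting on exactly two spaces leaves empty and padded pieces behind.
raw_pieces = line.split('  ')
print(raw_pieces)

# The list comprehension from the example drops the empty pieces
# and strips the leftover whitespace from the rest.
cleaned = [i.strip() for i in raw_pieces if i]
print(cleaned)

# Alternative (an assumption, not the thread's method): treat any run
# of two or more spaces as a single delimiter.
with_regex = re.split(r' {2,}', line.strip())
print(with_regex)
```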

So after all of that, we've gotten past the header lines of the file, and parsed the categories into a dictionary for an unknown purpose. The next iteration of the outermost for loop will kick us into the else clause, since data_found is now 'True'...

I next applied the exact same list comprehension as before to each line that contained patient data. To separate vital patient data from the "comments" or whatever comes after it, I looked at the amount of whitespace before each line. Patient data only has one leading space, and the notes have four or five spaces. So first I check if each_line[:3] != '   ' (if the first three characters of this line are all spaces (three spaces), then skip it... otherwise, do the list comprehension). As long as that condition holds true, we should only perform the list comprehension on the patient data proper.

So once again, we strip the whitespace from the line, and then split it using two spaces as our delimiter. We go through and remove the '' empty strings, and strip whitespace off any non-empty strings. I chose to simply print the lines out with print to validate that we were only receiving the data that we wanted and not any extra garbage.
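The leading-whitespace test can be pulled out into its own little function if that makes it easier to follow. This is just a sketch: classify is a hypothetical name, and the three example lines are invented to match the layout described above.

```python
def classify(line):
    """Label a line from the data section by its leading whitespace."""
    if not line.strip():
        return 'blank'        # nothing but whitespace
    if line[:3] == '   ':
        return 'note'         # three or more leading spaces
    return 'patient'          # at most a couple of leading spaces


# Made-up lines in the shape the schedule file is described as having.
examples = [
    ' 08:30AM  DOE, JANE  3######\n',
    '     Follow-up comment text\n',
    '\n',
]
labels = [classify(line) for line in examples]
print(labels)
```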

WOW. Sorry for lacking brevity. It's not a skill that I possess. I hope something in that rambling mess helped you.

I definitely have a better idea of what I'm looking at now, and can probably wrap my head around it even more if I can get it to work.

This is a little embarrassing, but I have no idea how to run this program. I know python is not at all like C++ as far as compiling and running, but that's about the only thing I know about it. Can you help me out here too?

Thanks for the help so far, it looks like this is exactly what I need.

When you want to run a python script, generally the easiest thing is to open a command window. Change (cd) to the directory where your script is located, and then type python <name_of_script>.py <and_then_any_options> .

But note that if python isn't in your path (i.e., if you're on Windows and haven't added it yourself), you'll need to do something like: C:\Python25\python.exe <name_of_script>.py .

HTH

Do you have python installed on your PC?

If so, and you are running Windows, it should be as easy as double-clicking the file (assuming you gave it a .py extension).

If not, go to python.org and grab it.

If you are running some flavor of *nix, add the path to the python executable at the top of the script (e.g. #!/usr/bin/env python ), make sure you have execute rights to the file ( chmod a+x <filename> ) and run it like you normally would run an executable.

Sorry for the long break, I finally got some time to read through all this and try the script. I saved it as a .py file and put it in a folder with the schedule.txt file. When I run the script, a DOS box pops up, thinks for a second and then closes. That's it. The schedule.txt file does not change and there is no new file created. Am I missing something here?

Try editing it in IDLE; you are probably getting an error/exception. If you double-click to start your python script and there is an error, it will not tell you what it is; the window and program will just close. If you run it from IDLE (F5) and an error occurs, it will tell you where and why.

Hope that helps.

Sometimes it helps to put a line at the end of your script:

raw_input('Press Enter to Continue...')

But if it is failing due to an exception this probably won't catch it.

An even better solution is to simply run the script from the command line, so that everything is captured in the same window.

Open up a command prompt, navigate to the directory with the python script and then use C:\Python25\python.exe <scriptname>.py (and make sure to check that the C:\ path matches your configuration)
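If you would still rather double-click the script, one possible workaround is to catch the exception yourself and print the traceback before the window closes. This is a sketch only: main stands in for the real script body, run_and_report is a hypothetical helper, and the missing file name is made up so the failure happens on purpose.

```python
import traceback


def main():
    # Stand-in for the real script; this file is assumed not to exist.
    f = open('no_such_schedule_file_demo.txt')
    f.close()


def run_and_report(func):
    """Run func and return the traceback text if it raises, else None."""
    try:
        func()
        return None
    except Exception:
        report = traceback.format_exc()
        print(report)
        return report


report = run_and_report(main)
# In a double-clicked script, the pause would go after this point,
# for example the raw_input(...) line mentioned above.
```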

Alright, so I realized that part of the problem was the fact that I copied over the numbers before the lines in the script. Once I figured that out, I learned that Python cares very much about the way you indent your code. After some trial and error editing the script in notepad, I gave up and tried running it one line at a time in IDLE (is this even correct?) and when I got to line 13. "else :", I got a syntax error.

Does this help at all? Can you believe what a newbie I am? I have no clue how to properly utilize IDLE. It is an alien environment.

No, that doesn't tell us anything, because if you run it line by line an else statement is going to be invalid syntax when there is no if before it. What you should do is run the whole program in IDLE and then post the error you get.

Alright, how do I run the whole thing without entering it line by line?

There are many ways.
1) Double click the file, which should open python, run the script, and exit; which sometimes requires a statement like raw_input("Press ENTER to exit...") to be able to read any output.

2) Open a command line ( Start -> Run -> cmd ), and navigate to the directory where your script is. Then type C:\Python25\python.exe <name_of_script>.py . Note that if you don't have Python 2.5 installed or have installed it to a non-default location the C:\ path will need to change.

If you want to run the whole thing in IDLE, go to where the program is located and right-click it. Then choose Edit with IDLE.

If that option does not exist, you can also open IDLE, then go to File - Open and open your .py or .pyw file for editing.

To run it in IDLE once it is open press F5.

Hope that helps
