Member Avatar for gianniszaf

Hi there,

I am running a program in order to generate some reports. The program starts connects to the database takes the first customer and then calls a method in order to create a report for this customer. When it completes takes the second customer generates the report etc etc.
But each customer needs 5 hours to complete.

Is it possible to trigger all the reports for all the customers without waiting each one to complete?

I read something about threads but I am not sure if this is the solution.

for c in customers:
        create_reports(a,b,c,d)

Any ideas?

If your called function for a customer takes 100%CPU load you will take longer to complete the whole task due to scheduling the tasks. If the function is an RPC you move the load to the server, only if it is a server farm or it has multi CPU you will gain speed/time by triggering multiple RPC requests.
To my knowledge the Python interpreter is single threaded (in C or OS), the Python threads are scheduled in the interpreter, so no gain for a multi core CPU. You have to have to lauch as much python iterperters as there are cores in the computer and with a commandline parameters separate the work. The parameterss can be "process every n'th customer, start with customer x (0..n-1)"

If you have 4 cores in your computer, start 4 commandshells and give the following commands, one in each shell

python customerreport.py --start 0 --step 4
python customerreport.py --start 1 --step 4
python customerreport.py --start 2 --step 4
python customerreport.py --start 3 --step 4

What is the report doing? In 5 hours you can calculate a lot?

Member Avatar for gianniszaf

If your called function for a customer takes 100%CPU load you will take longer to complete the whole task due to scheduling the tasks. If the function is an RPC you move the load to the server, only if it is a server farm or it has multi CPU you will gain speed/time by triggering multiple RPC requests.
To my knowledge the Python interpreter is single threaded (in C or OS), the Python threads are scheduled in the interpreter, so no gain for a multi core CPU. You have to have to lauch as much python iterperters as there are cores in the computer and with a commandline parameters separate the work. The parameterss can be "process every n'th customer, start with customer x (0..n-1)"

If you have 4 cores in your computer, start 4 commandshells and give the following commands, one in each shell

python customerreport.py --start 0 --step 4
python customerreport.py --start 1 --step 4
python customerreport.py --start 2 --step 4
python customerreport.py --start 3 --step 4

What is the report doing? In 5 hours you can calculate a lot?

Making counts on a months data

Making counts on a months data

You should be able to process the data file(s) once, i.e. one 5 hour period, and store the counts for all of the customers in a dictionary or SQL file in one pass. Worst case is that you process the data once and output a separate, smaller file for each customer which is then processed in much less time. If you give us some sample input data and explain what you are trying to do, we can probably help with this.

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.