Quick and dirty Condor log preparation for analysis.
Overview and Assumptions
This page describes the processing applied to logs taken from the Condor batch system in order to analyze the running time of jobs.
At the end of the following process a file with the following columns will be available: hostname, jobID, submission time, and end time.
The following fields were taken from the logs:
- Starting time: JobStartDate
- The end time is seen as: CompletionDate
- Another field that might be of interest is: JobFinishedHookDone
- The number of times a job was run: JobRunCount (this field was not used yet)
The process:
Splitting the single log, which contains all the computers, into one file per computer name was done with:
time awk -F"/" '{ print >> $2; close ($2) }' ../condor_statistics_Start_End_date
real 1m54.137s
user 0m39.838s
sys 1m13.889s
For an input file of this size (wc output): 9061533 36397236 666625490 ../condor_statistics_Start_End_date
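The split step can be exercised on made-up sample lines; the "hosts/<hostname>/spool/..." path layout used here is an assumption inferred from the cleanup sed commands further below:

```shell
# Made-up sample lines in the assumed "hosts/<hostname>/spool/..." format.
cat > sample_log <<'EOF'
hosts/node01/spool/job_queue.log.1:103 12345 JobStartDate 1200000000
hosts/node02/spool/job_queue.log.1:103 67890 JobStartDate 1200000100
hosts/node01/spool/job_queue.log.1:103 12345 CompletionDate 1200003600
EOF
# Splitting on "/", field 2 is the hostname, so each line is appended to
# a file named after its host; close() avoids running out of open files.
awk -F"/" '{ print >> $2; close($2) }' sample_log
wc -l node01 node02
```

The close() after every line is what keeps the file-descriptor count low at the price of reopening files constantly, which likely accounts for the large sys time above.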
Counting the lines that look like errors:
grep "\.\-1" condor_statistics_Start_End_date | wc
2141392 8565568 143013296
Removing the lines that look like errors:
grep -v "\.\-1" condor_statistics_Start_End_date > condor_statistics_Start_End_date1
wc condor_statistics_Start_End_date1
6920141 27831668 523612194 condor_statistics_Start_End_date1
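The pattern "\.\-1" matches a literal ".-1"; a sketch on made-up lines, under the assumption that a job id ending in ".-1" is what marks an error line:

```shell
# Made-up sample: a job id ending in ".-1" marks an error line (an
# assumption about what the pattern "\.\-1" is meant to catch).
cat > err_demo <<'EOF'
hosts/node01/spool/job_queue.log.1:103 12345.0 JobStartDate 1200000000
hosts/node01/spool/job_queue.log.1:103 12345.-1 JobStartDate 1200000000
EOF
grep -c "\.\-1" err_demo                   # count the error lines
grep -v "\.\-1" err_demo > err_demo_clean  # drop them
```

As a sanity check on the real data, the two counts add up: 2141392 error lines plus 6920141 kept lines equals 9061533, the line count of the original file.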
Cleaning the file:
grep -v Offset condor_statistics_Start_End_date1 | sed "s@\/@\ @g" | sed "s/\ spool//" | sed "s/hosts\ //" | sed "s/job_queue.log.*\:103\ //" > condor_statistics_Start_End_date2
wc condor_statistics_Start_End_date2
6907549 27628176 303928866 condor_statistics_Start_End_date2
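To see what the chain of substitutions does, here is one made-up raw line (the format is an assumption) pushed through the same four substitutions, collapsed into a single sed invocation:

```shell
# One made-up raw line run through the same cleanup: slashes become
# spaces, then "spool", "hosts", and the job_queue.log.*:103 prefix
# are stripped, leaving "hostname jobID field value".
cleaned=$(echo "hosts/node01/spool/job_queue.log.1:103 12345 JobStartDate 1200000000" |
  sed -e 's@/@ @g' -e 's/ spool//' -e 's/hosts //' -e 's/job_queue.log.*:103 //')
echo "$cleaned"   # -> node01 12345 JobStartDate 1200000000
```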
Sorting the statistics file by computer and then by job number:
time sort -k 1,2 condor_statistics_Start_End_date2 > condor_statistics_Start_End_date3
real 0m56.376s
user 0m54.015s
sys 0m1.244s
In order to get a proper lexicographic order, the field names were renamed in a temporary file (they are renamed back below), and the file was re-sorted by hostname, jobID, field name, and timestamp, so that for each job the start, finish, and completion lines appear in that order:
time less condor_statistics_Start_End_date2_temp | sort -k 1,4 > condor_statistics_Start_End_date3
real 1m5.967s
user 0m59.020s
sys 0m2.060s
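The creation of condor_statistics_Start_End_date2_temp is not shown; judging by the rename-back step below, the field names were presumably prefixed so that, within one job, the start line (A...) sorts before the completion line (Z...). A minimal sketch on made-up data:

```shell
# Prefix the field names so a job's start line sorts before its
# completion line under sort -k 1,4; the sample lines are made up.
renamed=$(printf 'node01 12345 CompletionDate 200\nnode01 12345 JobStartDate 100\n' |
  sed -e 's/JobStartDate/AJobStartDate/' -e 's/CompletionDate/ZCompletionDate/' |
  sort -k 1,4)
echo "$renamed"
```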
Changing the field names back:
sed "s/AJobStartDate/JobStartDate/" condor_statistics_Start_End_date3 | sed "s/ZCompletionDate/CompletionDate/" > condor_statistics_Start_End_date4
Concatenating each job's start time and end time onto a single line (a job's JobFinishedHookDone line comes before its CompletionDate line):
time awk '{if (index($0,"JobStartDate")) {printf "\n" $1 " " $2 " " $3 " " $4 " " ; comp_name=$1 ; jobid=$2 } else if ( (index($0,"Complet") || index($0,"JobFinished")) && (jobid==$2) && (comp_name==$1) ) { printf $3 " " $4 " " "\n"; comp_name="ZZZ"; } }' condor_statistics_Start_End_date4 >> condor_statistics_Start_End_date5
real 0m12.563s
user 0m11.889s
sys 0m0.360s
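The pairing logic can be exercised on a small made-up input: one complete job plus a job that has a start line but no end line (such start-only records are what the next step removes):

```shell
# Made-up input in the date4 format; the second job has no end line, so
# its half-filled output line is dropped by the egrep at the end.
paired=$(printf '%s\n' \
  'node01 12345 JobStartDate 100' \
  'node01 12345 CompletionDate 200' \
  'node02 67890 JobStartDate 300' |
awk '{ if (index($0,"JobStartDate")) { printf "\n" $1 " " $2 " " $3 " " $4 " "; comp_name=$1; jobid=$2 }
  else if ((index($0,"Complet") || index($0,"JobFinished")) && jobid==$2 && comp_name==$1) { printf $3 " " $4 " \n"; comp_name="ZZZ" } }' |
egrep "JobFinishedHookDone|Complet")
echo "$paired"
```

Setting comp_name="ZZZ" after a match ensures only the first end line of each job is appended to its record.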
Removing the blank lines and any records that never got an end time:
time egrep "JobFinishedHookDone|Complet" condor_statistics_Start_End_date5 > condor_statistics_Start_End_date6
real 0m1.147s
user 0m0.776s
sys 0m0.152s
And we have:
wc condor_statistics_Start_End_date6
1181105 7086630 85965243 condor_statistics_Start_End_date6
That's it.
This file can be loaded into R or Matlab and processed.
Feel free to complain about the content to: eddiea-@_@-This_should_be_remoVed_cs.tau.ac.il