Quick and dirty Condor log preparation for analysis.


Overview and Assumptions

This page describes a quick procedure applied to logs taken from the Condor batch system in order to analyze the running time of jobs. At the end of the process a file with the following columns will be available: hostname, jobID, job start time, and job end time.
Of the fields in the logs, JobRunCount (the number of times a job was run) was also considered, but has not been used yet.
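
Judging from the cleanup commands below, each input line appears to be grep output over the per-host job_queue.log files, roughly of this form (a hypothetical example; the hostname, jobID, and timestamp are made up):

    hosts/compute-0-1/spool/job_queue.log:103 12345.0 JobStartDate 1234567890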

The process:

  • Splitting the single log that contains entries from all the computers into one file per computer name (a commented version of this command follows the timing):

    time awk -F"/" '{ print >> $2; close ($2) }' ../condor_statistics_Start_End_date


    real 1m54.137s
    user 0m39.838s
    sys 1m13.889s

    For an input file of this size (wc output: lines, words, bytes): 9061533 36397236 666625490 ../condor_statistics_Start_End_date
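
    The same one-liner with comments (a sketch of what it does):

    # -F"/" splits each line on "/", so $2 is the hostname component
    # (lines start with "hosts/<hostname>/...").  Each line is appended
    # to a file named after its host, and close() prevents awk from
    # running out of open file descriptors when there are many hosts.
    awk -F"/" '{ print >> $2; close($2) }' ../condor_statistics_Start_End_date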
  • Counting the lines that look like errors (job IDs ending in ".-1"):

    grep "\.\-1" condor_statistics_Start_End_date | wc


    2141392 8565568 143013296
  • Removing those error lines from the log:

    grep -v "\.\-1" condor_statistics_Start_End_date > condor_statistics_Start_End_date1

    wc condor_statistics_Start_End_date1
    6920141 27831668 523612194 condor_statistics_Start_End_date1
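
    As a sanity check, the removed and the remaining lines add up to the original file: 2141392 + 6920141 = 9061533.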
  • Cleaning the file of the path components and the log-file prefix (a commented version of this pipeline follows the word count):

    less condor_statistics_Start_End_date1| grep -v Offset | sed "s@\/@\ @g" | sed "s/\ spool//" | sed "s/hosts\ //" | sed "s/job_queue.log.*\:103\ //" > condor_statistics_Start_End_date2

    wc condor_statistics_Start_End_date2
    6907549 27628176 303928866 condor_statistics_Start_End_date2
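
    The same pipeline with the substitutions spelled out (a sketch; it should behave identically):

    # Drop lines mentioning Offset, then:
    #   s@/@ @g                   -> turn "/" path separators into spaces
    #   s/ spool//                -> drop the "spool" path component
    #   s/hosts //                -> drop the leading "hosts" component
    #   s/job_queue.log.*:103 //  -> drop the log file name and the "103" record tag
    grep -v Offset condor_statistics_Start_End_date1 \
        | sed -e 's@/@ @g' -e 's/ spool//' -e 's/hosts //' \
              -e 's/job_queue.log.*:103 //' \
        > condor_statistics_Start_End_date2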
  • Sorting the statistics file by computer and then by job number (see the note on sort keys after the timing):

    time sort -k 1,2 condor_statistics_Start_End_date2 > condor_statistics_Start_End_date3


    real 0m56.376s
    user 0m54.015s
    sys 0m1.244s
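
    Note that -k 1,2 compares the span from field 1 through field 2 as plain text, so jobIDs are ordered lexicographically ("10" sorts before "9"). If a numeric order on the jobID were needed, per-field keys would do it (an untested variant):

    sort -k1,1 -k2,2n condor_statistics_Start_End_date2 > condor_statistics_Start_End_date3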
  • In order to make the lexicographic order put each job's lines in the order start, finish, completion, the field names were first renamed (that step is reconstructed after the timing below), and the file was sorted by hostname, jobID, field name, and timestamp:

    time less condor_statistics_Start_End_date2_temp | sort -k 1,4 > condor_statistics_Start_End_date3


    real 1m5.967s
    user 0m59.020s
    sys 0m2.060s
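
    The step that produces condor_statistics_Start_End_date2_temp does not appear above; judging from the rename-back step in the next bullet, it was presumably something like:

    # Hypothetical reconstruction: prefix the field names so that a plain
    # lexicographic sort orders each job's lines start -> finish -> completion
    # (AJobStartDate < JobFinishedHookDone < ZCompletionDate).
    sed -e "s/JobStartDate/AJobStartDate/" -e "s/CompletionDate/ZCompletionDate/" \
        condor_statistics_Start_End_date2 > condor_statistics_Start_End_date2_temp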

  • Changing the field names back:

    sed "s/AJobStartDate/JobStartDate/" condor_statistics_Start_End_date3 | sed "s/ZCompletionDate/CompletionDate/" > condor_statistics_Start_End_date4

  • Concatenating each job's start time and end time onto a single line (after the sort, a JobFinished line comes before a Complete line, so the first end line found is the one used; a commented version of this program follows the timing):

    time awk '{if (index($0,"JobStartDate")) {printf "\n" $1 " " $2 " " $3 " " $4 " " ; comp_name=$1 ; jobid=$2 } else if ( (index($0,"Complet") || index($0,"JobFinished")) && (jobid==$2) && (comp_name==$1) ) { printf $3 " " $4 " " "\n"; comp_name="ZZZ"; } }' condor_statistics_Start_End_date4 >> condor_statistics_Start_End_date5


    real 0m12.563s
    user 0m11.889s
    sys 0m0.360s
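
    The same awk program spelled out with comments (an equivalent sketch; input columns are hostname, jobID, field name, timestamp):

    # pair_start_end.awk
    {
        if (index($0, "JobStartDate")) {
            # Start line: open a new output record and remember its host/job.
            printf "\n%s %s %s %s ", $1, $2, $3, $4
            comp_name = $1; jobid = $2
        } else if ((index($0, "Complet") || index($0, "JobFinished")) &&
                   jobid == $2 && comp_name == $1) {
            # First end line for the remembered host/job: close the record and
            # invalidate comp_name so any later end lines for it are skipped.
            printf "%s %s \n", $3, $4
            comp_name = "ZZZ"
        }
    }
    # Run as: awk -f pair_start_end.awk condor_statistics_Start_End_date4 >> condor_statistics_Start_End_date5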
  • Cleaning out the blank lines (this also drops start lines that never got a matching end line, since only lines containing an end field are kept):

    time egrep "JobFinishedHookDone|Complet" condor_statistics_Start_End_date5 > condor_statistics_Start_End_date6

    real 0m1.147s
    user 0m0.776s
    sys 0m0.152s
    And we have:

    wc condor_statistics_Start_End_date6

    1181105 7086630 85965243 condor_statistics_Start_End_date6
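
    Each of these lines has exactly six fields (7086630 / 1181105 = 6): hostname, jobID, the start field name with its timestamp, and the end field name with its timestamp.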
  • That's it

    This file can now be loaded into R or Matlab and processed.

    Feel free to complain about the content to: eddiea-@_@-This_should_be_remoVed_cs.tau.ac.il