bash - How to count the number of forked (sub-?)processes
A bash script that somebody else has written (tm) forks many sub-processes and needs optimization. I'm looking for a way to measure "how bad" the problem is.
Can I count, and how can I count, how many sub-processes the script forks all-in-all / recursively?
This is how a simplified version of the existing, forking code looks - a poor man's grep:
    #!/bin/bash

    file=/tmp/1000lines.txt
    match=$1

    let cnt=0
    while read line
    do
        cnt=`expr $cnt + 1`
        linearray[$cnt]="${line}"
    done < $file
    totallines=$cnt

    cnt=0
    while [ $cnt -lt $totallines ]
    do
        cnt=`expr $cnt + 1`
        matches=`echo ${linearray[$cnt]}|grep $match`
        if [ "$matches" ] ; then
            echo ${linearray[$cnt]}
        fi
    done

It takes the script about 20 seconds to look for $1 in 1000 lines of input. This code forks way too many sub-processes. In the real code, there are even longer pipes (e.g. proga | progb | progc) operating on each line using grep, cut, awk, sed and so on.
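To give an idea of the pattern (the field positions and patterns below are made up for illustration, not taken from the real code), the real scripts do something like this for every single line, so each iteration costs several forks:

    # hypothetical per-line processing: each backtick pipeline forks several processes
    timestamp=`echo "$line" | cut -d' ' -f1 | sed 's/\[//;s/\]//'`
    errors=`echo "$line" | awk '{print $3}' | grep -c ERROR`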
This is a busy system with lots of other stuff going on, so a count of how many processes were forked on the entire system during the run-time of the script would be of some use to me, but I'd prefer a count of the processes started by the script and its descendants. And I guess I could analyze the script and count them myself, but the script is long and rather complicated, so I'd rather instrument it with a counter for debugging, if that's possible.
To clarify:

- I'm not looking for the number of processes under $$ at any given time (e.g. via ps), but for the number of processes run during the entire life of the script.
- I'm not looking for a faster version of this particular example script (I can do that myself). I'm looking for a way to determine which of the 30+ scripts to optimize first to use bash built-ins.
 
You can count the forked processes by trapping the SIGCHLD signal. If you can edit the script file, you can do this:
    set -o monitor        # or set -m
    trap "((++fork))" CHLD

So the fork variable will contain the number of forks. At the end you can print this value:
    echo $fork forks

For a 1000-line input file it will print:
    3000 forks

This code forks for 2 reasons: once for each `expr ...` and once for `echo ...|grep ...`. So in the reading while-loop it forks every time a line is read, and in the processing while-loop it forks 2 times per line (once because of `expr ...` and once for `echo ...|grep ...`). For a 1000-line file that is 3000 forks.
But this count is not exact! It only counts the forks done by the calling shell. There are even more forks, because `echo ...|grep ...` forks to start a bash to run that code, and that bash then forks twice more: once for echo and once for grep. So that is 3 forks, not one, and the real total is rather 5000 forks, not 3000.
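Putting those two lines to work, here is a minimal, self-contained sketch of how an existing script can be instrumented with the counter (the script body is made up for illustration):

    #!/bin/bash
    # enable job control so the shell reports each terminating child via SIGCHLD
    set -o monitor          # same as set -m
    fork=0
    trap '((++fork))' CHLD

    # --- the existing script body would go here; as an example, three command
    # --- substitutions, each of which forks one subshell:
    a=`expr 1 + 1`
    b=`expr 2 + 2`
    c=`date`

    echo "$fork forks"      # should report 3 forks for the body above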
If you need to count the forks of the forks (of the forks...) as well, or if you cannot modify the bash script, or you want to do this for other scripts, a more exact solution can be used:
    strace -fo s.log ./x.sh

It will print lines like this:
    30934 execve("./x.sh", ["./x.sh"], [/* 61 vars */]) = 0

Then you need to count the unique PIDs, using something like this (the first number is the PID):
    awk '{n[$1]} END {print length(n)}' s.log

In the case of this script I got 5001 (the +1 is the PID of the original bash script).
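Since the goal is to decide which of the 30+ scripts (and which pipelines inside them) to attack first, it can also help to see which external programs get exec'd most often. A sketch along the same lines, assuming GNU strace on Linux (the log file name exec.log is arbitrary):

    # trace only execve across all children; each hit is one external program being started
    strace -f -e trace=execve -o exec.log ./x.sh

    # tally the exec'd programs; field 2 (between the first pair of quotes) is the program path
    awk -F'"' '/execve\(/ { n[$2]++ } END { for (p in n) print n[p], p }' exec.log | sort -rn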
Actually, in this case all of these forks can be avoided:
Instead of
    cnt=`expr $cnt + 1`

use
    ((++cnt))

Instead of
    matches=`echo ${linearray[$cnt]}|grep $match`
    if [ "$matches" ] ; then
        echo ${linearray[$cnt]}
    fi

you can use bash's internal pattern matching:
    [[ ${linearray[cnt]} =~ $match ]] && echo ${linearray[cnt]}

Mind that bash's =~ uses ERE, not basic RE (like grep). So it will behave like egrep (or grep -E), not grep.
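Applying both substitutions, the processing loop of the example script becomes fork-free (a sketch of the rewritten loop, not the questioner's real code):

    cnt=0
    while [ $cnt -lt $totallines ]
    do
        ((++cnt))                                         # arithmetic, no expr fork
        # bash's own ERE matching replaces the echo|grep pipeline
        [[ ${linearray[cnt]} =~ $match ]] && echo "${linearray[cnt]}"
    done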
I assume that the defined linearray is not pointless (otherwise the matching could be tested right in the reading loop and linearray would not be needed at all) and that it is used for some other purpose as well. In that case I may suggest a slightly shorter version:
    readarray -t linearray <infile
    for line in "${linearray[@]}"; { [[ $line =~ $match ]] && echo $line; }

The first line reads the complete infile into linearray without a loop. The second line processes the array element-by-element.
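For completeness, a sketch of the whole example script rewritten this way, with the fork counter from above left in so the result can be verified (paths and names are taken from the example above, not from the real code):

    #!/bin/bash
    set -o monitor
    fork=0
    trap '((++fork))' CHLD

    file=/tmp/1000lines.txt
    match=$1

    readarray -t linearray < "$file"        # whole file in one builtin, no loop
    for line in "${linearray[@]}"; do
        [[ $line =~ $match ]] && echo "$line"
    done

    echo "$fork forks" >&2                  # expected to print "0 forks"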
Measures

Original script with 1000 lines (on cygwin):
    $ time ./test.sh
    3000 forks

    real    0m48.725s
    user    0m14.107s
    sys     0m30.659s

Modified version:
    forks

    real    0m0.075s
    user    0m0.031s
    sys     0m0.031s

The same on Linux:
    3000 forks

    real    0m4.745s
    user    0m1.015s
    sys     0m4.396s

and
    forks

    real    0m0.028s
    user    0m0.022s
    sys     0m0.005s

So this version uses no fork (or clone) at all. But I would suggest using this version only for small (<100 KiB) files. In other cases grep, egrep, awk and so on perform better than the pure bash solution. This should be checked with a performance test.
For a thousand lines on Linux I got the following:
    $ time grep solaris infile # solaris is not in the infile

    real    0m0.001s
    user    0m0.000s
    sys     0m0.001s