Hal Ashburner
2014-09-15 00:54:04 UTC
I want to turn stdout into multiple files. I have markers where I
would like this to happen.
stdout:
data
data
data
this_is_a_marker
data2
data2
data2
data2
only it's very large
I have this function which works, but is slow.
Better ideas would include:
1) re-write everything in another language, e.g. Python
2) re-write split_reports in C
3) ask CLUG if anyone has a faster way of doing this using standard
bash 4.1.2 or older on a Red Hat Enterprise Linux/CentOS system.
Yeah I just asked about optimising a shell script, I already feel bad
and you don't have to point out that I should. ;-)
function split_reports()
{
    local input_file="$1"
    local first_report="$2"
    local second_report="$3"
    # generalise the above using $@ if more than 2 needed
    local breaks_seen=0
    local line=""
    # clobber the first output too, so a re-run does not append to stale data
    : > "${input_file}.0"
    # IFS= and -r keep leading whitespace and backslashes intact
    while IFS= read -r line
    do
        if [[ $line =~ start_report ]]; then
            breaks_seen=$((breaks_seen + 1))
            # clobber it before using it
            # don't write out the marker
            : > "${input_file}.${breaks_seen}"
        else
            case $breaks_seen in
                # printf rather than echo, so a line such as "-n" is written as-is
                [01]) printf '%s\n' "${line}" >> "${input_file}.${breaks_seen}" ;;
                # echo_stderr is a helper defined elsewhere in the script
                *) echo_stderr "error breaks_seen is ${breaks_seen} - should be 0-1";;
            esac
        fi
    done < "${input_file}"
    mv "${input_file}" "${input_file}.orig"
    mv "${input_file}.0" "${first_report}"
    mv "${input_file}.1" "${second_report}"
}
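
One possible way to speed this up without leaving the standard toolset is to hand the per-line work to a single awk process, since the bash read loop is usually the slow part. The sketch below is untested against the real data and makes some assumptions: the name split_reports_awk is made up, the marker is taken to be any line containing start_report (as in the function above), there are at most two reports, and the output paths contain no backslashes (awk -v would interpret them as escapes).

```shell
#!/bin/bash
# Hypothetical alternative to split_reports: awk reads the file once and
# switches output files at the marker, instead of bash reading line by line.
split_reports_awk()
{
    local input_file="$1"
    local first_report="$2"
    local second_report="$3"

    awk -v first="$first_report" -v second="$second_report" '
        # create/truncate the first report before any data arrives
        BEGIN { out = first; printf "" > out }
        # marker line: switch to the second report and drop the marker itself
        /start_report/ {
            n++
            if (n > 1) {
                print "error: more than one marker seen" > "/dev/stderr"
                exit 1
            }
            out = second
            printf "" > out
            next
        }
        # every other line goes to whichever report is current
        { print > out }
    ' "$input_file" || return 1

    mv "$input_file" "${input_file}.orig"
}
```

It takes the same three arguments as split_reports, e.g. split_reports_awk big_output report_a report_b. Unlike the original it writes straight to the report files rather than to ${input_file}.0 and ${input_file}.1 first.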