body (GPL)

The UNIX "head" and "tail" utilities are two of the most useful tools. Head outputs the first part of a file, while tail outputs the last part. But sometimes you want the middle part of a file. This is where the "body" script comes in. Body outputs the middle part of a file, from the starting line number to the ending line number specified on the command line. Body reads from stdin and writes to stdout, so you can pipe a file into it.

body is distributed as executable source code under the GNU General Public License. Please see the license agreement elsewhere on this site.

Usage

  Usage:   body "start line number" "end line number"

Example

  cat readme.txt | body 20 40

Pipes readme.txt into body, which outputs the text from line 20 to line 40.

Attached File: body (411 B)

Chieh Cheng
Fri, 6 Jan 2006 15:30:30 -0800

As much as I hate non-structured programming, I finally gave in to the demon when I decided to make the body shell script more efficient. By introducing a single "break" statement, the script can stop as soon as it has output the last specified line, rather than running through the rest of the text file.
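The attached script is not reproduced in this thread, so the following is only a minimal sketch of what a read loop with an early "break" could look like. The function name body_loop and its variable names are hypothetical. It deliberately uses a plain "read", which trims leading and trailing white space from each line.

```shell
#!/bin/sh
# Hypothetical sketch of a read-loop "body" with an early exit.
# This is NOT the attached script; names are invented for illustration.
body_loop() {
    start=$1
    end=$2
    n=0
    # Plain "read" trims leading/trailing white space and interprets backslashes.
    while read line; do
        n=$((n + 1))
        if [ "$n" -ge "$start" ]; then
            printf '%s\n' "$line"
        fi
        if [ "$n" -ge "$end" ]; then
            break    # last requested line handled; skip the rest of the input
        fi
    done
}

# Example: prints lines 2 to 3 of a four-line input.
printf 'one\ntwo\nthree\nfour\n' | body_loop 2 3
```

Without the "break", the loop would still read every remaining line of the input even though nothing more would be printed.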

I've also changed the command-line argument check. Although the previous implementation worked, the new one is more semantically correct.

The only problem left? The "head" and "tail" UNIX tools preserve the white space at the front of each line. The "read" statement in "body" strips it. How could we re-implement body to keep the white space?
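One common way to keep the white space in a POSIX shell read loop is to clear IFS for the "read" call and pass the -r option. The sketch below only contrasts the two behaviors; the function names strip_demo and keep_demo are made up for illustration, and this is not necessarily how the script itself should be changed.

```shell
#!/bin/sh
# Contrast between a default "read" and "IFS= read -r".
# strip_demo and keep_demo are hypothetical names used only for this demo.
strip_demo() {
    # Default IFS: leading and trailing blanks are trimmed from each line.
    while read line; do
        printf '[%s]\n' "$line"
    done
}

keep_demo() {
    # Empty IFS turns off field splitting, so blanks survive;
    # -r keeps backslashes literal.
    while IFS= read -r line; do
        printf '[%s]\n' "$line"
    done
}

printf '   indented\n' | strip_demo   # prints [indented]
printf '   indented\n' | keep_demo    # prints [   indented]
```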

Attached File: 1 - body (494 B)

Chieh Cheng
Mon, 25 Jun 2007 15:36:10 -0700

A while back, I thought of a way to greatly improve the speed of this "body" utility: take advantage of the efficiency of "head" and "tail". But I wasn't motivated to fix the problem until today.

Today, I found that the "body" utility is hard to use in shell scripts because it inherently strips extra white space. That makes shell programming unpredictable and makes things fail in unexpected ways. So I decided to re-implement body with my "head" and "tail" idea.

Attached below is the latest version of the "body" utility, built on the efficient UNIX "head" and "tail" utilities. This version fixes the white space stripping problem, and I suspect it is faster than the original implementation. I am in the process of testing the performance and will post the result as soon as I have it.
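The "head" and "tail" idea can be sketched roughly as follows. This is an assumption about the approach, not the attached file: the function name body_ht is hypothetical, and the real script may compute the "tail" argument differently. Here "head" keeps everything up to the end line and "tail -n +N" then drops everything before the start line.

```shell
#!/bin/sh
# Hypothetical sketch of "body" built from head and tail.
# This is NOT the attached script; body_ht is an invented name.
body_ht() {
    start=$1
    end=$2
    # head passes lines 1..end; tail -n +start outputs from line "start" onward.
    head -n "$end" | tail -n "+$start"
}

# Example: prints lines 2 to 3 of a four-line input.
printf 'one\ntwo\nthree\nfour\n' | body_ht 2 3
```

Because neither tool re-parses the lines, leading and trailing white space passes through unchanged.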

Attached File: 2 - body (362 B)

Chieh Cheng
Tue, 21 Aug 2007 00:13:25 +0000

I now have the performance data. As I suspected, the "head" and "tail" algorithm is much faster than the original implementation. For the test, I wrote a simple script that retrieves lines 10000 through 10010 from the "body" of a large file and prints a time stamp before and after. The large file is a spam IP list that you can get from "Fight Comment Spam, Ban IP's". The following is the test script.

ls -aogF BannedList.txt
date
cat BannedList.txt | body 10000 10010
date

With the second implementation of "body", the run took about three and a half minutes (on a 300 MHz notebook computer), as shown below.


-rw-r--r-- 1 5296239 2007-08-20 17:59 BannedList.txt
Mon Aug 20 18:11:26 PDT 2007
125.213.38.217
125.213.39.122
125.213.39.16
125.213.39.76
125.213.40.122
125.213.40.144
125.213.40.157
125.213.41.161
125.213.41.171
125.213.41.232
125.213.4.134
Mon Aug 20 18:15:01 PDT 2007

Using the latest release of "body" with the "head" and "tail" algorithm, the process took less than a second (see below)!

-rw-r--r-- 1 5296239 2007-08-20 17:59 BannedList.txt
Mon Aug 20 18:15:20 PDT 2007
125.213.38.217
125.213.39.122
125.213.39.16
125.213.39.76
125.213.40.122
125.213.40.144
125.213.40.157
125.213.41.161
125.213.41.171
125.213.41.232
125.213.4.134
Mon Aug 20 18:15:20 PDT 2007

Chieh Cheng
Tue, 21 Aug 2007 01:22:44 +0000

I improved the script's error handling. This version of body detects a start line number greater than the end line number and writes an error message to stderr.
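A check of that kind could look roughly like the sketch below. The function name body_checked, the wording of the error message, and the return status of 1 are all assumptions made for illustration, and the sketch assumes both arguments are positive integers. It is not the attached script.

```shell
#!/bin/sh
# Hypothetical sketch of argument checking for "body".
# Message text and exit status are assumptions, not taken from the attached file.
body_checked() {
    if [ "$#" -ne 2 ]; then
        echo 'Usage:   body "start line number" "end line number"' >&2
        return 1
    fi
    if [ "$1" -gt "$2" ]; then
        # Report the problem on stderr so stdout stays clean for pipelines.
        echo "body: start line $1 is greater than end line $2" >&2
        return 1
    fi
    head -n "$2" | tail -n "+$1"
}

# Valid range: prints lines 2 to 3.
printf 'a\nb\nc\n' | body_checked 2 3

# Invalid range: nothing on stdout, a message on stderr, non-zero status.
printf 'a\nb\nc\n' | body_checked 3 2 || echo "exit status: $?"
```

Sending the message to stderr matters in a pipeline: a downstream command reading body's stdout never sees the error text mixed into the data.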

Attached File: 3 - body (527 B)

Chieh Cheng
Tue, 30 Oct 2012 20:32:43 +0300
