getline – A per-line character counting program
Here is my second C program, “getline”. It is a little trickier than the last one. It started as an exercise in “The C Programming Language”, but the source code below barely resembles the book's original, which printed only the longest line.
Some of the objectives of this program were:
- Remove white space from the end of lines
- Remove blank lines completely
- Present the count of each line next to its actual content
Todo:
- Possibly find a more efficient way to remove whitespace than looping backwards through the character array
- Add argument support as “wc” does
- Sort all lines by length and let the user choose the sort order. This could also be accomplished by piping the program's output into “sort -n”.
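For the first to-do item, one possible alternative to the backward loop is to remember, while moving forward through the line, the position just past the last non-blank character. The sketch below works on a string already in memory; the helper name trimmed_len is hypothetical and not part of the program, and the same bookkeeping could presumably be folded into the getchar() loop. The backward loop only ever touches the trailing white space, so any gain is likely to be small.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not in the original program): returns the length
 * of `line` without any trailing spaces or tabs. Scanning stops at the
 * first '\n' or at the terminating '\0'. */
size_t trimmed_len(const char *line)
{
    size_t i, end = 0;   /* end: index just past the last non-blank seen */

    assert(line != NULL);

    for (i = 0; line[i] != '\0' && line[i] != '\n'; ++i) {
        if (line[i] != ' ' && line[i] != '\t')
            end = i + 1;
    }
    return end;
}
```

A caller could then terminate the string with line[trimmed_len(line)] = '\0' and treat a result of 0 as a blank line, which would make a separate special case for empty lines unnecessary.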
/*
* getline.c - Count characters per-line from stdin. Strips white
* space from end of line, and ignores 0-length lines.
*
* Author: Ryan R. Uber <ryan@blankbmx.com>
* Date: Thu Nov 11 01:04:13 CST 2010
*
* Modified from section 1.9 (Character Arrays)
* "The C Programming Language" by Brian Kernighan and Dennis Ritchie
*
*/
#include <stdio.h>
#define MAXLINE 1000
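/* Forward declaration (prototype) of getline(). It tells the compiler
 * the function's return type and parameter types before main() calls
 * it. Parameter names in a prototype are optional and need not match
 * those in the definition. Older compilers accept the program without
 * it by assuming an implicitly declared function returning int; C99
 * and later require a declaration before the call.
 */
int getline(char line[], int maxline);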
int main(void)
{
    int len;
    char line[MAXLINE];
    /* Gather lines that have a non-zero length. This would include *any*
     * line in a file, even a blank line, as that would have a length of
     * 1 for the '\n' character.
     */
    while ((len = getline(line, MAXLINE)) > 0 )
    {
        /* As stated previously, a blank line has a length of 1. Therefore,
         * in order to skip empty lines, we only want to look at lines with
         * length of 2 or greater.
         */
        if ( len > 1 )
            printf("%6d %s", len-1, line);
    }
    return 0;
}
int getline(char s[], int lim)
{
    int c, i, n;
    for ( i=0; i < lim-1 && (c=getchar()) != EOF && c != '\n'; ++i )
        s[i] = c;
    if ( c == '\n' )
    {
        /* This part of the program strips the white space from the end of
         * each line. This is tricky because we cannot do this until after
         * the entire line of input is read, including the white space. Thus,
         * we have to remove the white space characters from the s[] array,
         * and decrement any counters we are using to compensate for the
         * loss of characters. Here we loop backwards through the s[] array
         * once a newline is encountered.
         */
        for ( n = i-1; n >= 0; n-- )
        {
            /* The variable "n" is the offset of the element being checked,
             * starting with the one just before the newline. Let's see if
             * that character is white space.
             */
            if ( s[n] == ' ' || s[n] == '\t' )
            {
                /* Here we zero the white space element in the array and
                 * decrement the character count variable "i".
                 */
                s[n] = '\0';
                i--;
            }
            /* Once a character that is not white space is reached, add
             * the newline after it, increment "i", and break out of the
             * backward loop.
             */
            else
            {
                s[i] = c;
                ++i;
                break;
            }
        }
        /* A crude workaround for lines containing *only* '\n' or only white
         * space. Without it they would return 0, which main() treats as
         * end of input. This entire function needs work so that this
         * special case can be removed.
         */
        if ( i < 1 )
            i = 1;
    }
    s[i] = '\0';
    return i;
}
/* EOF */
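The declaration of getline() near the top of the listing is a function prototype. As a small illustration of why it is there, the sketch below declares a function with one set of parameter names, calls it, and only then defines it with different names. The functions count_nonblank and first_call are made up for this example and do not appear in the program.

```c
#include <assert.h>
#include <stddef.h>

/* Prototype: the compiler only needs the return type and the parameter
 * types here. The names "text" and "limit" are just documentation. */
int count_nonblank(const char text[], int limit);

/* This call compiles because the prototype above has already been seen,
 * even though the definition comes later in the file. */
int first_call(void)
{
    return count_nonblank("a b\tc", 10);
}

/* Definition: same types as the prototype, different parameter names.
 * Counts characters that are neither space nor tab among the first
 * `lim` characters of `s`. (Hypothetical example function.) */
int count_nonblank(const char s[], int lim)
{
    int i, n = 0;

    assert(s != NULL);

    for (i = 0; i < lim && s[i] != '\0'; ++i)
        if (s[i] != ' ' && s[i] != '\t')
            ++n;
    return n;
}
```

With the prototype removed, a C99 or later compiler would be expected to reject or at least warn about the call in first_call(), because the function has not been declared at that point.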
Here is some sample output:
$ cat ~/.bash_profile | getline
15 # .bash_profile
31 # Get the aliases and functions
25 if [ -f ~/.bashrc ]; then
12 . ~/.bashrc
2 fi
48 # User specific environment and startup programs
20 PATH=$PATH:$HOME/bin
11 export PATH
14 unset USERNAME
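The sorting item from the to-do list could be approached along these lines if the lines were first collected into an array of pointers. This is only a sketch: sort_lines and by_length are hypothetical names, and storing every line in memory is an assumption the current program does not make, since it prints each line as soon as it is read.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort() comparator: orders C strings by ascending length. Each
 * argument points at one element of the array, i.e. at a char pointer. */
static int by_length(const void *a, const void *b)
{
    size_t la = strlen(*(const char *const *)a);
    size_t lb = strlen(*(const char *const *)b);

    return (la > lb) - (la < lb);
}

/* Hypothetical helper: sorts an array of n line pointers by length.
 * Pass a non-zero `descending` to get the longest lines first. */
void sort_lines(const char **lines, size_t n, int descending)
{
    size_t i;

    assert(lines != NULL);

    qsort(lines, n, sizeof lines[0], by_length);

    if (descending) {
        /* Reverse the ascending result in place. */
        for (i = 0; i < n / 2; ++i) {
            const char *tmp = lines[i];
            lines[i] = lines[n - 1 - i];
            lines[n - 1 - i] = tmp;
        }
    }
}
```

Piping the existing output into “sort -n” remains the simpler route; a built-in sort would mainly be useful if the order were selectable through a command-line argument.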
