entab – A utility to convert from white-space to tabbed output

Posted by Ryan Uber | C Programming | Friday 19 November 2010 2:05 pm

The following program is another exercise from “The C Programming Language” by Brian Kernighan and Dennis Ritchie. The exercise appears at the end of the first chapter, and no starting point is provided. Using what I had learned from the reading and the previous exercises, I was able to get through the whole program, and it seems to work perfectly so far.

Here is the description of the exercise, directly from the book:

Write a program entab that replaces strings of blanks by the minimum number of tabs and blanks to achieve the same spacing. Use the same tab stops as for detab*. When either a tab or a single blank would suffice to reach a tab stop, which should be given preference?

Some additional thoughts / potential TODO’s before you read the program:

  • Should the tab-stop length (in spaces) be hard-coded? It might be better as a command-line option, for use on programs and files where the programmer has set their editor to use, say, 4 spaces in place of a ‘\t’.
  • Possibly trim white space from the end of each line, as in my previous getline.c program. This should likely happen before converting to tabbed output.
  • Should the end-of-line white space trimming be in a function of its own? Perhaps there is already such a function available in string.h.
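
On the first question, one option is a small helper that turns a command-line argument into a tab width and falls back to the compiled-in default when the argument is missing or unusable. This is only a sketch: parse_tab_width() and the upper limit of 64 are assumptions of mine, not something from the book or from the program below.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: turn a command-line argument into a tab width.
 * Returns fallback when arg is missing, not a number, or out of range.
 * The upper limit of 64 is an arbitrary assumption for this sketch. */
int parse_tab_width(const char *arg, int fallback)
{
    char *end;
    long value;

    if (arg == NULL || *arg == '\0')
        return fallback;

    errno = 0;
    value = strtol(arg, &end, 10);

    /* Reject trailing junk ("4x"), overflow, and non-positive widths. */
    if (errno != 0 || *end != '\0' || value < 1 || value > 64)
        return fallback;

    return (int)value;
}
```

With main declared as int main (int argc, char *argv[]), the width could then be read with parse_tab_width(argc > 1 ? argv[1] : NULL, TAB) and used wherever the program currently uses TAB directly.
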
/*
 *  entab.c - Convert white space into tabbed output, being mindful that
 *            a tab stop is not simply a fixed number of consecutive
 *            white space.
 *
 *  Author: Ryan R. Uber <ryan@blankbmx.com>
 *  Date:   Fri Nov 19 04:39:29 CST 2010
 *
 */

#include <stdio.h>

#define MAXLINE 1000    /*  Maximum input per line */
#define TAB     8       /*  Tab stop interval */
#define TRUE    1       /*  Just symbolic names for readability. */
#define FALSE   0       /*  These could have just as effectively been
                            written as a 1 or 0 in the program */

void copy (char from, char to[]);
int getline (char line[], int lim);

/*  Replace white space with proper tabbing */
int main (void)
{
    int len, i, j, nspaces, stop;
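    /*  output must start out as an empty string, because copy() looks for
     *  the terminating '\0' to find where to append.
     */
    char line[MAXLINE], output[MAXLINE] = "";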

    nspaces = 0;

    while ((len = getline(line, MAXLINE)) > 0)
    {
        for (i = 0; i < len; ++i)
        {
            /*  This determines whether we are at a tab stop position. The
             *  modulus operator (%) yields zero when i is an exact multiple
             *  of TAB. As an example, try:
             *      ( 8 % 8 )
             *  in a separate C program. This is the exact operation run
             *  if TAB is set to 8 and i is 8, which is the 9th character
             *  of input because i counts from zero.
             */
            if ( i != 0 && i % TAB == 0 )
            {
                stop = TRUE;
            }

            else
            {
                stop = FALSE;
            }

            /*  We will not be adding any detected white space to our array
             *  if we are not currently at a tab stop. However, we need to
             *  keep track of the detected white space so we can later
             *  determine whether to replace it with the '\t' character.
             */
            if (line[i] == ' ' && stop == FALSE)
            {
                ++nspaces;
            }

            /*  The following tests that white space was detected all the way
             *  from the last non-white space up to the next tab stop. If this
             *  condition is true, we replace the white space with a tab
             *  character.
             */
            else if (nspaces > 0 && stop == TRUE)
            {
                /*  Copy the '\t' character to the output array, and set our
                 *  space counter back to 0 in preparation for the next test.
                 */
                copy ('\t', output);
                nspaces = 0;

                /*  The current position we are at (i) has not yet had its
                 *  corresponding character added to the output array. We only
                 *  want to copy this character if it is not white space.
                 */
                if ( line[i] != ' ' )
                {
                    copy (line[i], output);
                }

                /*  As before, keep track of the white space if detected for
                 *  accurate white space replacement in the next tab stop.
                 */
                else
                {
                    ++nspaces;
                }
            }

            /*  If we have detected 1 or more white spaces that do not lead us
             *  all the way up until a tab stop, we need to add the actual white
             *  space to our output array so they are not lost.
             */
            else
            {
                /*  Add white space for each ' ' character counted */
                for (j = 0; j < nspaces; ++j)
                {
                    copy (' ', output);
                }

                /*  Set space count to 0 to count any remaining space before we
                 *  encounter the next tab stop. Copy non-white space characters.
                 */
                nspaces = 0;
                copy (line[i], output);
            }
        }
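
        /*  Display the converted line, then empty the output buffer and the
         *  space counter so the next line starts clean. Printing only once
         *  after the loop would let output grow past MAXLINE on longer input.
         */
        printf("%s", output);
        output[0] = '\0';
        nspaces = 0;
    }
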
    return 0;
}

/*  Append provided output to a designated character array */
void copy (char from, char to[])
{
    /*  Set character position to 0 */
    int pos;
    pos = 0;

    /*  Determine where we will append characters in the pre-existing array */
    while ( to[pos] != '\0' )
    {
        ++pos;
    }

    /*  Copy provided character to the last position in the array */
    to[pos] = from;
    to[pos+1] = '\0';
}

/*  Read input from stdin */
int getline (char s[], int lim)
{
    int c, i;

    /*  Read in characters until a newline '\n' is encountered */
    for (i = 0; i < lim-1 && (c=getchar()) != EOF && c != '\n'; ++i)
        s[i] = c;

    /*  If a newline was encountered, add to the array and increment counter */
    if (c == '\n')
    {
        s[i] = c;
        i++;
    }

    /*  Terminate input line, return number of characters found in the line */
    s[i] = '\0';
    return i;
}

/* EOF */
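
For comparison, here is a sketch of a different approach that tracks the current column instead of testing the input index at each tab stop. It is not the program above: entab_line() is a hypothetical helper, it assumes the caller supplies an output buffer at least as large as the input (the result is never longer), and it answers the book's question by keeping a single blank whenever one blank alone reaches a tab stop.

```c
#include <stddef.h>

/* Hypothetical helper: write the entabbed form of in to out.
 * Assumes out has room for at least as many bytes as in, terminator
 * included, and that tab is a positive tab-stop interval. */
void entab_line(const char *in, char *out, int tab)
{
    int col = 0;       /* column where the next input character sits */
    int pending = 0;   /* blanks seen since the last character written */
    size_t o = 0;

    for (; *in != '\0'; ++in) {
        if (*in == ' ') {
            ++pending;
            ++col;
            if (col % tab == 0) {
                /* The run of blanks reaches a tab stop: one tab replaces
                 * it, unless a single blank does the same job. */
                out[o++] = (pending > 1) ? '\t' : ' ';
                pending = 0;
            }
        } else if (*in == '\t') {
            /* A tab already jumps to the next stop, so any blanks
             * pending in front of it are redundant. */
            pending = 0;
            out[o++] = '\t';
            col += tab - col % tab;
        } else {
            /* Blanks that fell short of a tab stop stay as blanks. */
            while (pending > 0) {
                out[o++] = ' ';
                --pending;
            }
            out[o++] = *in;
            col = (*in == '\n') ? 0 : col + 1;
        }
    }

    /* Keep any blanks left over at the very end of the input. */
    while (pending > 0) {
        out[o++] = ' ';
        --pending;
    }
    out[o] = '\0';
}
```

Called with a tab width of 8, the string "a" followed by seven blanks and "b" comes back as "a", one tab, and "b", while "abcdefg b" comes back unchanged because its single blank already lands on the tab stop.
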

Futurama Quotes WordPress Plugin

Posted by Ryan Uber | PHP | Sunday 14 November 2010 1:11 pm

Futurama has been one of my favorite TV shows for years now. Lately I’ve had a bit of spare time on my hands now that standard time is here again. It’s dark long before I get home from work, so I recently went through a pretty good chunk of the Futurama episodes I have. It brought back some very fond memories.

You know that “Hello Dolly” plugin that always comes with WordPress? The one you never enable? Well, I thought it would be handy as a framework for the plugin I decided to make: Futurama Quotes! This plugin is stupid-simple. There is nothing to it: just an array of quotes that I snagged from the internet and made HTML-friendly, a little CSS, and a randomizer. The result, I think, was worth it.

This zip file contains just one PHP script that will add this functionality to your blog. The text will only show up in your admin portal (not on public-facing pages).

Download:
Futurama_Quotes_1.0

getline – A per-line character counting program

Posted by Ryan Uber | C Programming | Thursday 11 November 2010 9:42 am

Here is my second C program, “getline”. It is a little trickier than the last in the way it works. This was actually an exercise in “The C Programming Language”. The source code below bears little resemblance to the original in the book, which prints only the longest line.

Some of the objectives of this program were:

  • Remove white space from the end of lines
  • Remove blank lines completely
  • Present the count of each line next to its actual content

Todo:

  • Possibly find a more efficient way to remove whitespace than looping backwards through the character array
  • Add argument support as “wc” does
  • Sort all lines by length and allow user to specify which order to sort by. This could also be accomplished by piping the output of the program into a “sort -n”.
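
On the first Todo item: a backward scan is hard to avoid, but it can live in a function of its own and write the terminator only once instead of zeroing each character. A minimal sketch, assuming a hypothetical rtrim() helper that is handed the line and its length as returned by getline():

```c
#include <stddef.h>

/* Hypothetical helper: strip trailing blanks and tabs from s[0..len).
 * A trailing newline, if present, is kept after the stripped text.
 * Returns the new length. */
size_t rtrim(char *s, size_t len)
{
    int had_newline = (len > 0 && s[len - 1] == '\n');

    /* Step over the newline so the scan starts at the last real character. */
    if (had_newline)
        --len;

    /* Walk backwards past blanks and tabs without modifying anything yet. */
    while (len > 0 && (s[len - 1] == ' ' || s[len - 1] == '\t'))
        --len;

    /* Put the newline back right after the last kept character. */
    if (had_newline)
        s[len++] = '\n';

    s[len] = '\0';
    return len;
}
```

A line holding only white space and a newline comes back with length 1, which matches the "blank line has a length of 1" convention that main() in the program below relies on to skip empty lines.
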
/*
 *  getline.c - Count characters per-line from stdin. Strips white
 *              space from end of line, and ignores 0-length lines.
 *
 *  Author: Ryan R. Uber <ryan@blankbmx.com>
 *  Date:   Thu Nov 11 01:04:13 CST 2010
 *
 *  Modified from section 1.9 (Character Arrays)
 *  "The C Programming Language" by Brian Kernighan and Dennis Ritchie
 *
 */

#include <stdio.h>
#define MAXLINE 1000

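/*  Here we declare getline() so the compiler knows its return type and
 *  argument types before main() calls it. This is a function prototype;
 *  the parameter names in it are only documentation and need not match
 *  the names used in the actual function definition below. Older
 *  compilers accept the call without it by assuming an int return type,
 *  but that is not something to rely on.
 */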
int getline(char line[], int maxline);

int main()
{
    int len;
    char line[MAXLINE];

    /*  Gather lines that have a non-zero length. This would include *any*
     *  line in a file, even a blank line, as that would have a length of
     *  1 for the '\n' character.
     */
    while ((len = getline(line, MAXLINE)) > 0 )
    {
        /*  As stated previously, a blank line has a length of 1. Therefore,
         *  in order to skip empty lines, we only want to look at lines with
         *  length of 2 or greater.
         */
        if ( len > 1 )
            printf("%6d %s", len-1, line);
    }

    return 0;
}

int getline(char s[], int lim)
{
    int c, i, n;

    for ( i=0; i < lim-1 && (c=getchar()) != EOF && c != '\n'; ++i )
        s[i] = c;

    if ( c == '\n' )
    {
        /*  This part of the program strips the white space from the end of
         *  each line. This is tricky because we cannot do this until after
         *  the entire line of input is read, including the white space. Thus,
         *  we have to remove the white space characters from the s[] array,
         *  and decrement any counters we are using to compensate for the
         *  loss of characters. Here we loop backwards through the s[] array
         *  once a newline is encountered.
         */
        for ( n = i-1; n >= 0; n-- )
        {
            /*  The variable "n" is the previous element's offset in the array.
             *  Since we found a newline character, let's see if the character
             *  before it was white space.
             */
            if ( s[n] == ' ' || s[n] == '\t' )
            {
                /*  Here we zero the white space element in the array and
                 *  decrement the character count variable "i".
                 */
                s[n] = '\0';
                i--;
            }

            /*  If there was no white space at the end of the string, add the
             *  newline, increment "i", and break out of the backward loop.
             */
            else
            {
                s[i] = c;
                ++i;
                break;
            }
        }

        /*  A crude workaround for the 0-return on lines containing *only* '\n'.
         *  This entire function needs work so that I can remove it.
         */
        if ( i < 1 )
            i = 1;
    }

    s[i] = '\0';

    return i;
}

/* EOF */

Here is some sample output:

$ cat ~/.bash_profile | getline
    15 # .bash_profile
    31 # Get the aliases and functions
    25 if [ -f ~/.bashrc ]; then
    12  . ~/.bashrc
     2 fi
    48 # User specific environment and startup programs
    20 PATH=$PATH:$HOME/bin
    11 export PATH
    14 unset USERNAME