How do I find a function?

Get that feeling of deja vu as you go through your code? Think you saw that same assignment or cout just a few lines ago? Maybe you should think about refactoring your code to pull out those common calculations, inputs, and outputs as a function.

Or, perhaps you just feel that your code looks cluttered -- there are too many details in the main flow of the code and you want to hide those away in a function to make things more readable. Excellent! That is a form of refactoring and really the easiest place to start...

Clutter Removal: Identify the Culprit

Our first task is to look through our program's code to find those bits that seem more clutter than use. Let's start with something like this:

#include <iostream>
#include <cmath>

using namespace std;

int main(void)
{
    double side1, side2, side3,      // 3 side lengths for triangle
           area;                     // area of triangle
    double s;          // Heron's semi-perimeter...left as s
                       // because most don't know what it is
                       // and only recognize it as s.

    // prints a welcome message on the screen.
    cout << "\nWelcome to the 3-Sided Triangle Program!\n\n"
            "Follow the prompts and you enjoy the ride!\n"
         << endl;

    cout << "Enter length of the triangle's first side:  ";
    cin >> side1;
    cout << "Enter length of the triangle's second side:  ";
    cin >> side2;
    cout << "Enter length of the triangle's third side:  ";
    cin >> side3;

    /*
       calculates area of a triangle from Heron's formula.  Needs
       three sides' lengths as input.
       
       Heron's formula (sides are a, b, and c):
       
                a+b+c
           s = -------
                  2
                 __________________
        area = \/ s(s-a)(s-b)(s-c)
      
        s here is known as the semi-perimeter (half the perimeter of
        the triangle) and is a temporary value to simplify the actual
        calculation of the area.
    */
    s = (side1+side2+side3)/2;
    area = sqrt(s*(s-side1)*(s-side2)*(s-side3));

    // prints area in nice message format.
    cout << "\nThe area of your triangle is "
         << area << '.' << endl;

    // prints a goodbye message on the screen.
    cout << "\n\nThank you for using the 3STP!\n\n"
            "Please endeavour to have a bountiful day!\n"
         << endl;

    return 0;
}

Our most obvious clutter culprit is the area calculation itself. Just the comments take up half the program! Let's pull that out into a function as is:

    {
        double area,                     // area of triangle
               s;          // Heron's semi-perimeter...left as s
                           // because most don't know what it is
                           // and only recognize it as s.
        /*
           calculates area of a triangle from Heron's formula.  Needs
           three sides' lengths as input.
           
           Heron's formula (sides are a, b, and c):
           
                    a+b+c
               s = -------
                      2
                     __________________
            area = \/ s(s-a)(s-b)(s-c)
          
            s here is known as the semi-perimeter (half the perimeter of
            the triangle) and is a temporary value to simplify the actual
            calculation of the area.
        */
        s = (side1+side2+side3)/2;
        area = sqrt(s*(s-side1)*(s-side2)*(s-side3));
        return area;
    }

Now I'll pull that huge comment out to above the function since it is describing the entire function:

    /*
       calculates area of a triangle from Heron's formula.  Needs
       three sides' lengths as input.
       
       Heron's formula (sides are a, b, and c):
       
                a+b+c
           s = -------
                  2
                 __________________
        area = \/ s(s-a)(s-b)(s-c)
      
        s here is known as the semi-perimeter (half the perimeter of
        the triangle) and is a temporary value to simplify the actual
        calculation of the area.
    */
    {
        double area,       // area of triangle
               s;          // Heron's semi-perimeter...left as s
                           // because most don't know what it is
                           // and only recognize it as s.
        s = (side1+side2+side3)/2;
        area = sqrt(s*(s-side1)*(s-side2)*(s-side3));
        return area;
    }

Now I'll remove the redundant local variable area. Since the result is simply being stored and then returned, I don't really need to make that space, fill that space, copy from that space, and then destroy that space during a call. I can just return the result of the calculation directly:

    /*
       calculates area of a triangle from Heron's formula.  Needs
       three sides' lengths as input.
       
       Heron's formula (sides are a, b, and c):
       
                a+b+c
           s = -------
                  2
                 __________________
        area = \/ s(s-a)(s-b)(s-c)
      
        s here is known as the semi-perimeter (half the perimeter of
        the triangle) and is a temporary value to simplify the actual
        calculation of the area.
    */
    {
        double s;          // Heron's semi-perimeter...left as s
                           // because most don't know what it is
                           // and only recognize it as s.
        s = (side1+side2+side3)/2;
        return sqrt(s*(s-side1)*(s-side2)*(s-side3));
    }

And now for a head... We are returning a double result that represents the area of a triangle. In order to calculate that area, we need from the caller the lengths of the three sides of the triangle. Each of these will be a double -- just in case. So:

    /*
       Calculates area of a triangle from Heron's formula.  Needs
       three sides' lengths as input.
       
       Heron's formula (sides are a, b, and c):
       
                a+b+c
           s = -------
                  2
                 __________________
        area = \/ s(s-a)(s-b)(s-c)
      
        s here is known as the semi-perimeter (half the perimeter of
        the triangle) and is a temporary value to simplify the actual
        calculation of the area.
    */
    double area_triangle(double side1, double side2, double side3)
    {
        double s;          // Heron's semi-perimeter...left as s
                           // because most don't know what it is
                           // and only recognize it as s.
        s = (side1+side2+side3)/2;
        return sqrt(s*(s-side1)*(s-side2)*(s-side3));
    }

Now that's a fine head and a complete function definition!

Replacing the Old Code with a Call

Next I'll go back to the main program (the caller in this example) and replace that huge chunk of comment and calculation with a simple call to our new 'factored out' function:

int main(void)
{
    double side1, side2, side3,      // 3 side lengths for triangle
           area;                     // area of triangle

    // prints a welcome message on the screen.
    cout << "\nWelcome to the 3-Sided Triangle Program!\n\n"
            "Follow the prompts and you enjoy the ride!\n"
         << endl;

    cout << "Enter length of the triangle's first side:  ";
    cin >> side1;
    cout << "Enter length of the triangle's second side:  ";
    cin >> side2;
    cout << "Enter length of the triangle's third side:  ";
    cin >> side3;

    area = area_triangle(side1, side2, side3);

    // prints area in nice message format.
    cout << "\nThe area of your triangle is "
         << area << '.' << endl;

    // prints a goodbye message on the screen.
    cout << "\n\nThank you for using the 3STP!\n\n"
            "Please endeavour to have a bountiful day!\n"
         << endl;

    return 0;
}

That couldn't be simpler, could it?! The main is so much more fluid now. It's like we did that Asian furniture thing ...but with code. *grin*

Where Do I Put All the Parts?

We've placed the call in the main (where we removed the original calculations, comments, and the s variable). We just need to place the prototype and definition in useful places to complete this simple de-cluttering.

The Prototype

Before the compiler will allow a call to the function, it needs to know that function's name, necessary input(s), and (usually) what kind of information is coming out. All of that is in the head. So, we place a copy of the head at the top of the program -- after any #include's and using but before the main:

    #include < ... >
    using namespace std;

    /*
       Calculates area of a triangle from Heron's formula.  Needs
       three sides' lengths as input.
    */
    double area_triangle(double side1, double side2, double side3);

    int main(void)
    {
        // ...
        return 0;
    }

Note the semi-colon added to the end of the head. The head by itself isn't a complete thought in C++. But with the semi-colon, the compiler knows that we are done with what we were trying to say. In this case, we were declaring the future existence of a function with the specified interface/head. Newer programmers may call this a function declaration (like a variable declaration). But traditionalists (older programmers) will know it as a function prototype. I will call it a prototype.

Note also that the comments on the prototype are much shorter than those we had on the definition. Those on the definition typically describe not only what the function does, what it needs to do that, and what comes back out to the caller, but also how those tasks are accomplished and how that output is generated. Those on the prototype, however, simply tell a potential caller what they need to know -- enforcing the black-box principle or encapsulation. That is, prototype comments only need to tell what the function does, what it needs to do that, and what comes back out to the caller.

The Definition

Now we need to tell the compiler how the function works by showing it the body. This is placed under another copy of the head below the main program:

    int main(void)
    {
        // ...
        return 0;
    }

    /*
       Calculates area of a triangle from Heron's formula.  Needs
       three sides' lengths as input.
       
       Heron's formula (sides are a, b, and c):
       
                a+b+c
           s = -------
                  2
                 __________________
        area = \/ s(s-a)(s-b)(s-c)
      
        s here is known as the semi-perimeter (half the perimeter of
        the triangle) and is a temporary value to simplify the actual
        calculation of the area.
    */
    double area_triangle(double side1, double side2, double side3)
    {
        double s;          // Heron's semi-perimeter...left as s
                           // because most don't know what it is
                           // and only recognize it as s.
        s = (side1+side2+side3)/2;
        return sqrt(s*(s-side1)*(s-side2)*(s-side3));
    }

This is known (be ye old or new) as the function definition. It defines how the function does its job.

The Call

The final piece of a function is the call. This is a stripped version of the head. The exact form varies from head to head, but one thing remains the same: there are NO data types in the call.

You can call a function from any other function. Certainly you can call functions from the main(). But, as long as you don't call a function from within its own body (that is called recursion and is a topic for another day), you may call it from anywhere.

We saw the call above, but here is the whole program with the area calculation factored out of the main:

    #include <iostream>
    #include <cmath>

    using namespace std;

    /*
       Calculates area of a triangle from Heron's formula.  Needs
       three sides' lengths as input.
    */
    double area_triangle(double side1, double side2, double side3);

    int main(void)
    {
        double side1, side2, side3,      // 3 side lengths for triangle
               area;                     // area of triangle
    
        // prints a welcome message on the screen.
        cout << "\nWelcome to the 3-Sided Triangle Program!\n\n"
                "Follow the prompts and you enjoy the ride!\n"
             << endl;
    
        cout << "Enter length of the triangle's first side:  ";
        cin >> side1;
        cout << "Enter length of the triangle's second side:  ";
        cin >> side2;
        cout << "Enter length of the triangle's third side:  ";
        cin >> side3;
    
        area = area_triangle(side1, side2, side3);
    
        // prints area in nice message format.
        cout << "\nThe area of your triangle is "
             << area << '.' << endl;
    
        // prints a goodbye message on the screen.
        cout << "\n\nThank you for using the 3STP!\n\n"
                "Please endeavour to have a bountiful day!\n"
             << endl;

        return 0;
    }

    /*
       Calculates area of a triangle from Heron's formula.  Needs
       three sides' lengths as input.
       
       Heron's formula (sides are a, b, and c):
       
                a+b+c
           s = -------
                  2
                 __________________
        area = \/ s(s-a)(s-b)(s-c)
      
        s here is known as the semi-perimeter (half the perimeter of
        the triangle) and is a temporary value to simplify the actual
        calculation of the area.
    */
    double area_triangle(double side1, double side2, double side3)
    {
        double s;          // Heron's semi-perimeter...left as s
                           // because most don't know what it is
                           // and only recognize it as s.
        s = (side1+side2+side3)/2;
        return sqrt(s*(s-side1)*(s-side2)*(s-side3));
    }

Isn't That Longer Now?!

Skeptics and nay-sayers may note that the program is now several lines longer than it was originally. However, the main program is clean and clear and more readable than it once was. At each point of the function's existence, we see/know only what is necessary. Each part is focused on its duty and is a sleek and functional (pardon the pun) entity.

And, if we were to have a re-usable function factored out, we'd be replacing its code with simple calls in several places.

Deja-Code: Refactoring Again

Let's look at the input code in the main. It seems not only a bit cluttery (more evident now that the area calculations have been factored out), but also a bit redundant. Certainly there are three sides to read and we can't get away from that, but we can surely find a common pattern to re-use here! Examining it more closely we find that:

    cout << "Enter length of the triangle's first side:  ";
    cin >> side1;
    cout << "Enter length of the triangle's second side:  ";
    cin >> side2;
    cout << "Enter length of the triangle's third side:  ";
    cin >> side3;

Note all of the parts that these three prompt/reads have in common. They differ in only two places: the word naming the side (first, second, third) and the variable being read into (side1, side2, side3). If only we could replace those differing parts with variables somehow, we'd be set!

Differences Are Variables

But, Jason, what do you mean? I see the differences in the three prompt/read statement pairs, but what's this about replacing them with variables? I don't get it...

Okay, think back to math. If you were given this function:

   f(x) = 4x^2 - 3x + 12

And then you were asked to evaluate the function at x = -3, x = 1/2, and x = 4:

    f(-3) = 4(-3)^2 - 3(-3) + 12
          = 4(9) - (-9) + 12
          = 36 + 21
          = 57

    f(1/2) = 4(1/2)^2 - 3(1/2) + 12
           = 4(1/4) - 3/2 + 12
           = 1 - 3/2 + 24/2
           = 2/2 + 21/2
           = 23/2

    f(4) = 4(4)^2 - 3(4) + 12
         = 4(16) - 12 + 12
         = 64 + 0
         = 64

You started this process by 'plugging in' or substituting the specified value for the variable x in the function's definition. That's actually pretty close to how the compiler executes functions you write. It plugs in or substitutes the actual arguments' values for the formal arguments in the function's code/definition.

But, we've gone just a tad too far. Step back to the substitution step:

    f(-3) = 4(-3)^2 - 3(-3) + 12

    f(1/2) = 4(1/2)^2 - 3(1/2) + 12

    f(4) = 4(4)^2 - 3(4) + 12

Now, we'll just remove the left side of that equation (and the = sign):

    4 * (-3)^2  - 3 * (-3)  + 12

    4 * (1/2)^2 - 3 * (1/2) + 12

    4 * (4)^2   - 3 * (4)   + 12

Notice how the expressions all share a common form. What if, instead of having been given the function f(x) to start with, we were given these three expressions? Now instead of simply evaluating them, our job is to create a function that serves the common purpose: multiply the square by 4, subtract thrice the original value, and add 12. We look through the expressions to see what they have in common and what differs to determine this purpose or functionality. Then, by replacing the differing bits with a variable, we extract a function:

Phase I

Note similarities and differences in expressions.

    4 * (-3)^2  - 3 * (-3)  + 12

    4 * (1/2)^2 - 3 * (1/2) + 12

    4 * (4)^2   - 3 * (4)   + 12

Phase II

Replace differences with a variable.

    4 * (x)^2   - 3 * (x)   + 12

Phase III

Name the function.

   f(x) = 4x^2 - 3x + 12

Applying What We Learned

So, back to C++, we had three prompt/input pairs with some common parts and two types of differences:

    cout << "Enter length of the triangle's first side:  ";
    cin >> side1;
    cout << "Enter length of the triangle's second side:  ";
    cin >> side2;
    cout << "Enter length of the triangle's third side:  ";
    cin >> side3;

The second difference actually IS a variable already. And, it seems to be the entire point of the input operation. So I'll consider that for the function's output/result.

The first difference, though, is a part of a string -- a sub-string! Surely we can't do anything with that?!

Let's think it through. It is being simply output to the screen. It doesn't have to be physically part of the surrounding string -- it just has to appear in that order/position relative to the first part of the string and the last part. It's a single word that represents which of the sides we are currently requesting. Plus, I have a data type that can hold words (sequences of characters): string.

Putting all of this together, I get an idea:

    double side;
    cout << "Enter length of the triangle's " << which_side << " side:  ";
    cin >> side;
    return side;

That seems to summarize what we've got going on and parameterizes (i.e. makes parameters out of) the parts that differed in the original code. [Parameterize is actually a mathematics term and, truth be told, you should avoid using the term 'parameter' and its derivatives in the programming world unless you know what your immediate audience thinks it means. It can have drastically different meanings in different contexts!]

Now to simply complete the body, add a head, and attach a comment:

    /*
        Prompts the user for the designated side and reads it in.
        The length of the side is returned.
    */
    double read_side(string which_side)
    {
        double side;
        cout << "Enter length of the triangle's " << which_side
             << " side:  ";
        cin >> side;
        return side;
    }

There really isn't much magic to this function, so the prototype's comment will be the same as the definition's was:

    /*
        Prompts the user for the designated side and reads it in.
        The length of the side is returned.
    */
    double read_side(string which_side);

And then we just need to put it into action back in the main program:

    side1 = read_side("first");
    side2 = read_side("second");
    side3 = read_side("third");

We've taken up half as much space in the main and made things more readable to boot!

Once more, the entire program, re-factored:

    #include <iostream>
    #include <string>
    #include <cmath>

    using namespace std;

    /*
        Prompts the user for the designated side and reads it in.
        The length of the side is returned.
    */
    double read_side(string which_side);

    /*
       Calculates area of a triangle from Heron's formula.  Needs
       three sides' lengths as input.
    */
    double area_triangle(double side1, double side2, double side3);

    int main(void)
    {
        double side1, side2, side3,      // 3 side lengths for triangle
               area;                     // area of triangle
    
        // prints a welcome message on the screen.
        cout << "\nWelcome to the 3-Sided Triangle Program!\n\n"
                "Follow the prompts and you enjoy the ride!\n"
             << endl;
    
        side1 = read_side("first");
        side2 = read_side("second");
        side3 = read_side("third");
    
        area = area_triangle(side1, side2, side3);
    
        // prints area in nice message format.
        cout << "\nThe area of your triangle is "
             << area << '.' << endl;
    
        // prints a goodbye message on the screen.
        cout << "\n\nThank you for using the 3STP!\n\n"
                "Please endeavour to have a bountiful day!\n"
             << endl;

        return 0;
    }

    /*
       Calculates area of a triangle from Heron's formula.  Needs
       three sides' lengths as input.
       
       Heron's formula (sides are a, b, and c):
       
                a+b+c
           s = -------
                  2
                 __________________
        area = \/ s(s-a)(s-b)(s-c)
      
        s here is known as the semi-perimeter (half the perimeter of
        the triangle) and is a temporary value to simplify the actual
        calculation of the area.
    */
    double area_triangle(double side1, double side2, double side3)
    {
        double s;          // Heron's semi-perimeter...left as s
                           // because most don't know what it is
                           // and only recognize it as s.
        s = (side1+side2+side3)/2;
        return sqrt(s*(s-side1)*(s-side2)*(s-side3));
    }

    /*
        Prompts the user for the designated side and reads it in.
        The length of the side is returned.
    */
    double read_side(string which_side)
    {
        double side;
        cout << "Enter length of the triangle's " << which_side
             << " side:  ";
        cin >> side;
        return side;
    }

Notice that we had to add string to the #include section at the top of the program!

So That's It?

Of course not! We can use [re]factoring to de-clutter the rest of the main as well! The three remaining statements are simply syntactic clutter, after all -- monolithic cout's gallivanting about taking up as much space as they think they can get away with!

Messages

The first and last ones of these -- the greeting and closing -- are simple affairs, really. They know what to print and they do that job. They have no need of input and no results to give back to us. The usefulness of their jobs is in their side effects. That is, when they print messages on the screen, the user gets information and that proves useful to the main program.

Their definitions might look something like this:

    void greeting(void)
    {
        cout << "this is the welcome message";
        return;
    }

    void closing(void)
    {
        cout << "and now it's time to say good-bye";
        return;
    }

But, even just looking at these simplified forms, we notice that deja-code feeling again:

    void greeting(void)
    {
        cout << "this is the welcome message";
        return;
    }

    void closing(void)
    {
        cout << "and now it's time to say good-bye";
        return;
    }

Comparing the two bodies, everything is common except the string literal being printed. (We only look at the bodies since any differences/commonalities in the heads would have been a result of the differences/commonalities in their body code.) Since the difference is a string, we can simply make that an input argument for the function. Generalizing the name of the function we now have:

    // prints the desired message for the user to read...
    void print_message(string display_me)
    {
        cout << display_me;
        return;
    }

And a prototype:

    // prints the desired message for the user to read...
    void print_message(string display_me);

And the calls:

    print_message("\nWelcome to the 3-Sided Triangle Program!\n\n"
                  "Follow the prompts and you enjoy the ride!\n\n");

    // ...

    print_message("\n\nThank you for using the 3STP!\n\n"
                  "Please endeavour to have a bountiful day!\n\n");

(Notice how the compiler combines the two literal strings separated only by white-space together -- even when passing them to a function!)

Results

The results of this program are fairly simple and we'll just encapsulate them to remove that clutter...

A function:

    // prints area in nice message format.
    void print_results(double area)
    {
        cout << "\nThe area of your triangle is "
             << area << '.' << endl;
        return;
    }

A prototype:

    // prints area in nice message format.
    void print_results(double area);

And a call:

    print_results(area);

Final Form of Refactored Program

Once all of these cluttery or re-usable parts are factored out of the main program, we have many functions, but each is simpler and easier to read/maintain/update. Here is how the whole program looks now:

    #include <iostream>
    #include <string>
    #include <cmath>

    using namespace std;

    // prints the desired message for the user to read...
    void print_message(string display_me);

    /*
        Prompts the user for the designated side and reads it in.
        The length of the side is returned.
    */
    double read_side(string which_side);

    /*
       Calculates area of a triangle from Heron's formula.  Needs
       three sides' lengths as input.
    */
    double area_triangle(double side1, double side2, double side3);

    // prints area in nice message format.
    void print_results(double area);

    int main(void)
    {
        double side1, side2, side3,      // 3 side lengths for triangle
               area;                     // area of triangle
    
        print_message("\nWelcome to the 3-Sided Triangle Program!\n\n"
                      "Follow the prompts and you enjoy the ride!\n\n");
    
        side1 = read_side("first");
        side2 = read_side("second");
        side3 = read_side("third");
    
        area = area_triangle(side1, side2, side3);
    
        print_results(area);
    
        print_message("\n\nThank you for using the 3STP!\n\n"
                      "Please endeavour to have a bountiful day!\n\n");

        return 0;
    }

    /*
       Calculates area of a triangle from Heron's formula.  Needs
       three sides' lengths as input.
       
       Heron's formula (sides are a, b, and c):
       
                a+b+c
           s = -------
                  2
                 __________________
        area = \/ s(s-a)(s-b)(s-c)
      
        s here is known as the semi-perimeter (half the perimeter of
        the triangle) and is a temporary value to simplify the actual
        calculation of the area.
    */
    double area_triangle(double side1, double side2, double side3)
    {
        double s;          // Heron's semi-perimeter...left as s
                           // because most don't know what it is
                           // and only recognize it as s.
        s = (side1+side2+side3)/2;
        return sqrt(s*(s-side1)*(s-side2)*(s-side3));
    }

    /*
        Prompts the user for the designated side and reads it in.
        The length of the side is returned.
    */
    double read_side(string which_side)
    {
        double side;
        cout << "Enter length of the triangle's " << which_side
             << " side:  ";
        cin >> side;
        return side;
    }

    // prints the desired message for the user to read...
    void print_message(string display_me)
    {
        cout << display_me;
        return;
    }

    // prints area in nice message format.
    void print_results(double area)
    {
        cout << "\nThe area of your triangle is "
             << area << '.' << endl;
        return;
    }

Just In Case...

The terms 'refactor', 'refactoring', etc. come from algebra's notion of factoring. When we refactor code we are taking code that's been around and working for some length of time (however long or short -- no pun intended) and pulling out the common or cluttery parts as functions. Since the pulling out of the common parts is similar to factoring numbers or algebraic expressions, we liked the term.

But we weren't using it in the pure and mathematical sense overall, so we simply said we were re-factoring the code. (After all, any well designed program should already be broken into functions to begin with. We were just pulling out commonality ...or clutter... that the original programmer didn't notice during the design phase.)