The Wizard of Excel

Up front, you should know that I have been hacking around with spreadsheets, in the arcane sense of that word, since Lotus 1-2-3. I am self-taught, with all of the limitations that implies; in other words, sometimes I know what I don’t know, other times I don’t know what I don’t know, and the rest of the time I stumble around in total foggery.

But sometimes little nuggets of understanding fall from the sky and land in my lap. Today was such a day.

I exported some data from a SQL database to Excel, for the purpose of finding missing pieces of information (does that make sense?). In other words, I had to see what I had in order to know what I didn’t have. The spreadsheet was 6 columns and 3,789 rows, not too big in the hands of a power user, but plenty unwieldy for me. I put the old auto filter to work to begin to sort the pieces of data that I was looking for. It quickly became clear that my data was buried within a column that contained multiple data types. It could not be separated by a sort or a filter.

Stymied again was my first thought, defeated again by the complexities that lay beyond my understanding.

But then I grasped that the data was actually text, and that the particular data set I needed was made up of 4 characters or fewer, while the unwanted entries were all 5 characters long, either letters or numbers. I recalled that in my programming classes we had used a function (Len) to measure the length of a string and either slice or concatenate text to suit our needs. Could Excel have such functionality?

Of course it does:

=LEN(A2)     returns the length of the string in cell A2.
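For anyone who thinks in code rather than cells, here is a rough Python analogue of LEN. It is only a sketch: the cell values are made up for illustration and are not from my export.

```python
# Hypothetical cell values standing in for one spreadsheet column
# (illustrative only, not the real exported data).
cells = ["AB12", "XYZ", "12345", "ABCDE"]

# len() plays the role of Excel's LEN: it counts the characters in a string.
lengths = [len(value) for value in cells]

print(lengths)
```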

Further, you can tell Excel to pull characters from the left, right, or middle of a string. All I needed to do was have Excel give me the length of the string in each cell. From a source found on the web, TechRepublic:

Figure A

We’ll use functions to extract certain portions of the entries in column A.

Those strings contained three distinct parts:

  • The first three characters (the K-numbers) represent a product code.
  • The second two digits (the B-numbers) represent a price code.
  • The final three digits represent a customer code.

My coworker wanted to separate out those three pieces into different columns, and she was retyping them from scratch! That approach is so wasteful and inefficient it makes my skin crawl. Fortunately, I got to be the hero by showing her how to use Excel’s string functions to extract the codes automatically.

Fun with string functions

All of you veteran spreadsheet users know this drill by heart. Here’s how it works.
Grabbing the first three characters. To extract the first three characters of the text entries, you enter the Left function like this:
=Left(source_string,number_of_characters)
In this case, we entered into cell B2 the function =Left(A2,3) and then copied that formula to cells B3:B8. Figure B shows the results.

Figure B

The Left function eliminates the need to re-key the first three letters from the entries in column A.

Pulling out the two characters in the middle. To extract the two characters in the middle of the string, we’ll use the Mid function, which takes the form:
=Mid(source_string,start_position,length)
Since we know that the string we want to extract always starts in position 4, we entered into cell C2 the function =Mid(A2,4,2) and then copied that formula to cells C3:C8. Figure C shows the results.

Figure C

The Mid function lets you pull a string out of the middle.

Extracting the last three characters of a string. In order to extract the last three characters of a string, you use the Right function in the form:
=Right(source_string,number_of_characters)
In our example, we entered in cell D2 the function =Right(A2,3) and copied it into cells D3:D8. As Figure D shows, that function returns the three rightmost characters in the source string.

Figure D

The Right function makes it easy to copy a set of characters from the right side of a string.
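The three functions map neatly onto string slicing. Below is a minimal Python sketch of Left, Mid, and Right, assuming an entry laid out like the ones described above (three-character product code, two-character price code, three-character customer code). The helper names and the sample entry "K12B1789" are hypothetical; the real values in the figure are not reproduced here.

```python
def left(text, count):
    """Rough analogue of Excel's LEFT: the first `count` characters."""
    return text[:count]


def mid(text, start, length):
    """Rough analogue of Excel's MID: `length` characters from 1-based `start`."""
    begin = start - 1  # Excel counts positions from 1, Python from 0
    return text[begin:begin + length]


def right(text, count):
    """Rough analogue of Excel's RIGHT: the last `count` characters."""
    return text[-count:] if count > 0 else ""


entry = "K12B1789"  # hypothetical sample entry, made up for illustration

product_code = left(entry, 3)     # like =Left(A2,3)
price_code = mid(entry, 4, 2)     # like =Mid(A2,4,2)
customer_code = right(entry, 3)   # like =Right(A2,3)

print(product_code, price_code, customer_code)
```

The only real trap is the starting position: Mid counts from 1, so the sketch subtracts one before slicing.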

Once you’ve extracted the strings, then what?
After you’ve used the string functions to parse the source string into substrings, you’re free to sort or subtotal your data on any of the columns that contain those substrings. It only takes a minute or two to compose the function call and copy it to the appropriate cells. This technique comes in handy when you’re importing text files that have been dumped from a mainframe database or from some other application.

Pretty cool, huh? So all I had to do was put the LEN formula in an adjacent cell, and then run a simple IF statement….

An explanation of the IF function in Excel:

Returns one value if a condition you specify evaluates to TRUE and another value if it evaluates to FALSE.

Use IF to conduct conditional tests on values and formulas.

Syntax

IF(logical_test,value_if_true,value_if_false)

Logical_test     is any value or expression that can be evaluated to TRUE or FALSE. For example, A10=100 is a logical expression; if the value in cell A10 is equal to 100, the expression evaluates to TRUE. Otherwise, the expression evaluates to FALSE. This argument can use any comparison calculation operator.

Value_if_true     is the value that is returned if logical_test is TRUE. For example, if this argument is the text string “Within budget” and the logical_test argument evaluates to TRUE, then the IF function displays the text “Within budget”. If logical_test is TRUE and value_if_true is blank, this argument returns 0 (zero). To display the word TRUE, use the logical value TRUE for this argument. Value_if_true can be another formula.

Value_if_false     is the value that is returned if logical_test is FALSE. For example, if this argument is the text string “Over budget” and the logical_test argument evaluates to FALSE, then the IF function displays the text “Over budget”. If logical_test is FALSE and value_if_false is omitted, (that is, after value_if_true, there is no comma), then the logical value FALSE is returned. If logical_test is FALSE and value_if_false is blank (that is, after value_if_true, there is a comma followed by the closing parenthesis), then the value 0 (zero) is returned. Value_if_false can be another formula.
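In programming terms, IF behaves like a conditional expression. Here is a small Python sketch of that behavior, reusing the A10 and budget examples from the description above. The name excel_if is hypothetical, and the sketch only covers the basic cases: a true test, a false test, and an omitted value_if_false (which gives back FALSE). It does not model the blank-argument cases that return 0.

```python
def excel_if(logical_test, value_if_true, value_if_false=False):
    """Rough analogue of Excel's IF (hypothetical helper).

    Leaving out value_if_false returns False, mirroring the case
    where the argument is omitted in Excel.
    """
    return value_if_true if logical_test else value_if_false


a10 = 100  # stand-in for the value in cell A10

within = excel_if(a10 == 100, "Within budget", "Over budget")
over = excel_if(a10 == 90, "Within budget", "Over budget")
omitted = excel_if(a10 == 90, "Within budget")

print(within, over, omitted)
```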

So, by now you’ve probably got it figured out. I used the Len function to count the characters in each cell of the column, and then I used an IF statement to copy the text I wanted into yet another column, while leaving blank the text that I did not want. And it worked!
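As a sketch of the logic (not the actual worksheet), the whole trick fits in a few lines of Python. In Excel, a formula along the lines of =IF(LEN(A2)<=4,A2,"") would do the same job in one cell; that formula and the sample values below are illustrations, not copies of what was in the spreadsheet.

```python
# Hypothetical mixed column: wanted entries have 4 characters or fewer,
# unwanted entries have 5 (values are made up for illustration).
column = ["AB12", "12345", "XYZ", "ABCDE", "Q7"]

# Measure each entry, keep it if it is short enough, otherwise leave a blank,
# which is what an IF wrapped around LEN would do in the next column over.
helper_column = [value if len(value) <= 4 else "" for value in column]

# Dropping the blanks is the equivalent of filtering the helper column.
kept = [value for value in helper_column if value]

print(helper_column)
print(kept)
```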

To say that I felt a sense of achievement would be a gross understatement.

What a day for your scribe: a nice blend of programming techniques and a deeper understanding of Excel, the combination of which is at least as satisfying as a nice, dry martini and a Fuente 8-5-8.

Yea Me!


2 thoughts on “The Wizard of Excel”

  1. maia ingle

    My eyes glazed over about 1/2 way through. However, I am very impressed with your fortitude and tenacity!

  2. Poppy

    Never have I had more fun with a string function. I don’t know how you do this and work, too. Interesting though.
