======Exercise 9: Arrays And Strings======

In the last exercise you went through an introduction to creating basic arrays and how they map to strings. In this exercise we'll show the similarity between arrays and strings more completely, and get further into memory layouts.

This exercise shows you that C stores its strings simply as an array of bytes, terminated with the '\0' (nul) byte. You probably clued into this in the last exercise since we did it manually. Here's another way to do it that makes it even clearer, by comparing it to an array of numbers:

  #include <stdio.h>

  int main(int argc, char *argv[])
  {
      int numbers[4] = {0};
      char name[4] = {'a'};

      // first, print them out raw
      printf("numbers: %d %d %d %d\n",
              numbers[0], numbers[1],
              numbers[2], numbers[3]);

      printf("name each: %c %c %c %c\n",
              name[0], name[1],
              name[2], name[3]);

      printf("name: %s\n", name);

      // setup the numbers
      numbers[0] = 1;
      numbers[1] = 2;
      numbers[2] = 3;
      numbers[3] = 4;

      // setup the name
      name[0] = 'Z';
      name[1] = 'e';
      name[2] = 'd';
      name[3] = '\0';

      // then print them out initialized
      printf("numbers: %d %d %d %d\n",
              numbers[0], numbers[1],
              numbers[2], numbers[3]);

      printf("name each: %c %c %c %c\n",
              name[0], name[1],
              name[2], name[3]);

      // print the name like a string
      printf("name: %s\n", name);

      // another way to use name
      char *another = "Zed";

      printf("another: %s\n", another);

      printf("another each: %c %c %c %c\n",
              another[0], another[1],
              another[2], another[3]);

      return 0;
  }

In this code, we set up some arrays the tedious way, by assigning a value to each element. In numbers we are setting up numbers, but in name we're actually building a string manually.
======What You Should See======

When you run this code you should first see the arrays printed with their contents initialized to zero (apart from the single 'a' in name), then in their fully assigned form:

  $ make ex9
  cc -Wall -g ex9.c -o ex9
  $ ./ex9
  numbers: 0 0 0 0
  name each: a
  name: a
  numbers: 1 2 3 4
  name each: Z e d
  name: Zed
  another: Zed
  another each: Z e d
  $

You'll notice some interesting things about this program:

  * I didn't have to give all 4 elements of the arrays to initialize them. This is a short-cut that C has where, if you set just one element, it'll fill the rest in with 0.
  * When each element of numbers is printed they all come out as 0.
  * When each element of name is printed, only the first element 'a' shows up because the '\0' character is special and won't display.
  * The first time we print name as a string it prints only "a". The array is filled with 0 after the first 'a' in the initializer, so the string is correctly terminated by a '\0' character.
  * We then set up the arrays with a tedious manual assignment to each element and print them out again. Look at how they changed. Now the numbers are set, but see how the name string prints my name correctly?
  * There are also two syntaxes for making a string: char name[4] = {'a'} on line 6 vs. char *another = "Zed" on line 44. The first one is less common and the second is what you should use for string literals like this.

Notice that I'm using the same syntax and style of code to interact with both an array of integers and an array of characters, but that printf treats name as just a string. Again, this is because to the C language there's no difference between a string and an array of characters.

Finally, when you make string literals you should usually use the char *another = "Literal" syntax. For reading and printing this works out to be the same thing, but it's more idiomatic and easier to write.
======How To Break It======

Almost all bugs in C come from forgetting to have enough space, or forgetting to put a '\0' at the end of a string. In fact it's so common and hard to get right that the majority of good C code just doesn't use C style strings. In later exercises we'll actually learn how to avoid C strings completely.

In this program the key to breaking it is to forget to put the '\0' character at the end of the strings. There are a few ways to do this:

  * Get rid of the initializers that set up name.
  * Accidentally set name[3] = 'A'; so that there's no terminator.
  * Set the initializer to {'a','a','a','a'} so there are too many 'a' characters and no space for the '\0' terminator.

Try to come up with some other ways to break this, and as usual run all of these under Valgrind so you can see exactly what is going on and what the errors are called. Sometimes you'll make these mistakes and even Valgrind can't find them, but try moving where you declare the variables to see if you get the error. This is part of the voodoo of C: sometimes just where the variable is located changes the bug.

======Extra Credit======

  * Assign the characters into numbers and then use printf to print them a character at a time. What kind of compiler warnings do you get?
  * Do the inverse for name, trying to treat it like an array of int and print it out one int at a time. What does Valgrind think of that?
  * How many other ways can you print this out?
  * If an array of characters is 4 bytes long, and an integer is 4 bytes long, then can you treat the whole name array like it's just an integer? How might you accomplish this crazy hack?
  * Take out a piece of paper and draw out each of these arrays as a row of boxes. Then do the operations you just did on paper to see if you get them right.
  * Convert name to be in the style of another and see if the code keeps working.

Copyright (C) 2010 Zed. A. Shaw