PA3

Programming Assignment Three: PA3 (anagrams)
Milestone: Wednesday night, May 13 @ 11:59pm
Final: Tuesday night, May 19 @ 11:59pm
Overview:
The purpose of this programming assignment is to gain more experience with C programming, the Standard C
Library routines, system calls, dynamic memory allocation on the Heap using malloc() and realloc(),
sorting using qsort() and associated compare functions, hash table creation and insertion using hcreate()
and hsearch(), reading data from files with fgets(), and more string manipulation using strlen(),
strchr(), strncpy(), and strncmp().
In this program you will find various anagrams of a word read from stdin. You are writing an interactive
program that prompts for input, takes input from stdin, directs the output to stdout, and prompts for another
input. The program will not exit until the user types the control sequence to indicate no more input (EOF). On
Unix the sequence is Control-D (^D), and Control-Z (^Z) on DOS-based environments.
Grading:
README: 10 points
Compiling: 5 points
Using our Makefile; no warnings.
Style: 10 points
Correctness: 75 points
-10 points for each unimplemented module or module written in the wrong language (C vs
Assembly and vice versa).
Includes both abnormal and normal output, both to the correct destination (stderr vs stdout).
Make sure you have all files tracked in Git - we will be checking for multiple commits of each
file and that they were meaningful commits.
Wrong Language: -10 points
-10 for each module in the wrong language, C vs. Assembly or vice versa.
Extra Credit: 5 points
NOTE: If what you turn in does not compile with given Makefile, you will receive 0 points for this assignment.
Setup and Git:
You are required to use Git with this and all future programming assignments. Look at the PA0 writeup for
additional information on Git.
Setting up a local repository
Create your pa3 directory and initialize a local git repository:
[cs30xyz@ieng9]:$ mkdir ~/pa3
[cs30xyz@ieng9]:$ cd ~/pa3
[cs30xyz@ieng9]:pa3$ git init
Starter Files
[cs30xyz@ieng9]:pa3$ cp ~/../public/pa3.h ~/pa3
[cs30xyz@ieng9]:pa3$ cp ~/../public/pa3_strings.h ~/pa3
[cs30xyz@ieng9]:pa3$ cp ~/../public/pa3_globals.c ~/pa3
[cs30xyz@ieng9]:pa3$ cp ~/../public/Makefile-PA3 ~/pa3/Makefile
[cs30xyz@ieng9]:pa3$ cp ~/../public/test.h ~/pa3
[cs30xyz@ieng9]:pa3$ cp ~/../public/testcharCompare.c ~/pa3
[cs30xyz@ieng9]:pa3$ cp ~/../public/words ~/pa3
[cs30xyz@ieng9]:pa3$ cp ~/../public/anagram_phrases ~/pa3
Example Output
A sample stripped executable for you to try, and compare your output against, is available at:
~/../public/pa3test
When there is a discrepancy between the sample output in this document and the pa3test output,
follow the pa3test output.
Below are some example outputs of this program.
1. Invalid Output
1.1. Wrong number of arguments.
[cs30xyz@ieng9]:pa3:527$ ./anagrams words1 words2
Usage: ./anagrams dictionary_file
dictionary_file - containing a list of words
1.2. Run out of memory.
[cs30x2@ieng9]:pa3.sp15.anagrams-hash:553$ ulimit -d 8
[cs30x2@ieng9]:pa3.sp15.anagrams-hash:555$ ./anagrams words
Error creating hash table: Not enough space
1.3. Invalid dictionary file.
[cs30xyz@ieng9]:pa3:544$ ./anagrams words1
Error opening dictionary file: No such file or directory
2. Valid Output
2.1. Word with several anagrams.
[cs30xyz@ieng9]:pa3:545$ ./anagrams words
Enter a word to search for anagrams [^D to exit]: stop
Anagram(s) are: opts, OSTP, Post, post, POTS, pots, SPOT, spot, TOPS, tops
Enter a word to search for anagrams [^D to exit]:
2.2. Word with no anagrams.
[cs30xyz@ieng9]:pa3:546$ ./anagrams words
Enter a word to search for anagrams [^D to exit]: thisisnotaword
No anagrams found.
Enter a word to search for anagrams [^D to exit]:
2.3. Word with only itself as an anagram
[cs30xyz@ieng9]:pa3:547$ ./anagrams words
Enter a word to search for anagrams [^D to exit]: Zygote
Anagram(s) are: zygote
Enter a word to search for anagrams [^D to exit]: zygote
No anagrams found.
Enter a word to search for anagrams [^D to exit]:
2.4. Exiting successfully.
[cs30xyz@ieng9]:pa3:548$ ./anagrams words
Enter a word to search for anagrams [^D to exit]: ^D
Overview
The function prototypes for the various C and Assembly functions are as follows.
C routines
int appendAnagram(struct anagramHeader *head, struct anagram *anagram)
struct anagramHeader* createAnagramHeader(const char *key)
void createHashKey(char *key, const char *src,
int (*compare)(const void *, const void *))
int initTable(FILE *file)
void printAnagrams(struct anagramHeader *head, char *word)
int main(int argc, char* argv[])
void searchForAnagrams()
Assembly routines:
int charCompare(const void *lhs, const void *rhs)
void printUsage(const char *programName)
void stripNewLine(char *str)
Memory Overview
Seen below is a diagram to help demonstrate the layout of the program. We will be using the C stdlib hash
table functions: hcreate(), hsearch(), and hdestroy().
Each element of the table is a type ENTRY which is a struct containing two char pointers, key and data. The
key is used to find elements within the table so each ENTRY must have a unique key.
In our case the key is pointing to a character array of lower-case sorted letters. All words that are anagrams of
each other share this key. For example “post” and “pots” are anagrams of one another and they both are
“opst” when lower-case and sorted.
The data will point to a struct anagramHeader (note that data is defined as a char* so we will need to
use casting) which contains an array of characters (for the key in the entry to point to), a pointer to a dynamic
array of struct anagram (each struct anagram has a char array for the word it represents, all of which
share the same lower-case sorted key), and a count of the number of elements currently in the array. This
struct anagramHeader is dynamically allocated using malloc(). The array of struct anagram that it
holds a pointer to is also dynamically allocated, but using realloc() so that it can be resized as more words
are appended to it.
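The layout described above can be sketched roughly as follows. This is only an illustration: the field name word, the value chosen for MAX_WORD_LENGTH, and the helper names insertHeader() and lookupHeader() are assumptions, since the real definitions live in pa3.h. The sketch also assumes the caller has already created the table with hcreate().

```c
#include <search.h>
#include <stddef.h>

#define MAX_WORD_LENGTH 64  /* assumed value; the real one is in pa3.h */

/* One dictionary word (field name "word" is hypothetical). */
struct anagram {
  char word[MAX_WORD_LENGTH];
};

/* Everything stored for one lower-case sorted key. */
struct anagramHeader {
  char key[MAX_WORD_LENGTH];  /* the ENTRY key points here */
  struct anagram *anagrams;   /* dynamic array grown with realloc() */
  int numElements;            /* number of words in the array */
};

/* Hypothetical helper: insert a header into the table under its own key.
 * Returns 0 on success and -1 if hsearch() could not enter it. */
int insertHeader(struct anagramHeader *head) {
  ENTRY item;
  item.key = head->key;
  item.data = (void *)head;  /* data is not typed as a header pointer */
  return hsearch(item, ENTER) == NULL ? -1 : 0;
}

/* Hypothetical helper: find the header stored under key, or NULL. */
struct anagramHeader *lookupHeader(const char *key) {
  ENTRY query;
  ENTRY *found;
  query.key = (char *)key;
  query.data = NULL;
  found = hsearch(query, FIND);
  return found == NULL ? NULL : (struct anagramHeader *)found->data;
}
```

Because the ENTRY key points into the header itself, the key stays valid for as long as the header is allocated.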
C Modules
1. appendAnagram
int appendAnagram(struct anagramHeader *head, struct anagram *anagram)
This function allocates additional memory at the end of the struct anagram array pointed to by the
anagramHeader that is passed in using realloc() and copies the struct anagram that is passed in to
the newly allocated memory at the end of the array using memcpy().
Return Value
Returns -1 if the realloc() fails, otherwise returns the number of elements in the array after the new
addition.
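A minimal sketch of the grow-and-copy step described above, assuming the struct layout shown in the comments (the field name word and the value of MAX_WORD_LENGTH are assumptions; use the definitions from pa3.h in the real module):

```c
#include <stdlib.h>
#include <string.h>

#define MAX_WORD_LENGTH 64  /* assumed value; the real one is in pa3.h */

struct anagram {
  char word[MAX_WORD_LENGTH];  /* hypothetical field name */
};

struct anagramHeader {
  char key[MAX_WORD_LENGTH];
  struct anagram *anagrams;
  int numElements;
};

int appendAnagram(struct anagramHeader *head, struct anagram *anagram) {
  size_t newCount = (size_t)head->numElements + 1;
  struct anagram *grown;

  /* realloc(NULL, n) behaves like malloc(n), so the first append works. */
  grown = realloc(head->anagrams, newCount * sizeof(struct anagram));
  if (grown == NULL) {
    return -1;  /* the old array is untouched and still owned by head */
  }

  /* Copy the new element into the slot just added at the end. */
  memcpy(&grown[head->numElements], anagram, sizeof(struct anagram));
  head->anagrams = grown;
  head->numElements = (int)newCount;
  return head->numElements;
}
```

Assigning the result of realloc() to a temporary first avoids losing the original array if the allocation fails.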
2. createAnagramHeader
struct anagramHeader* createAnagramHeader(const char *key)
This function allocates memory for one struct anagramHeader using malloc() and initializes its values.
The key is set by using strncpy() to copy the key value passed in, the anagrams pointer is set to NULL, and
numElements is set to zero.
Return Value
Returns NULL if the malloc() fails, otherwise returns a pointer to the struct.
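The allocation and initialization could look roughly like this. The struct layout and the value of MAX_WORD_LENGTH are assumptions standing in for the definitions in pa3.h.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_WORD_LENGTH 64  /* assumed value; the real one is in pa3.h */

struct anagram {
  char word[MAX_WORD_LENGTH];  /* hypothetical field name */
};

struct anagramHeader {
  char key[MAX_WORD_LENGTH];
  struct anagram *anagrams;
  int numElements;
};

struct anagramHeader *createAnagramHeader(const char *key) {
  struct anagramHeader *head = malloc(sizeof(struct anagramHeader));
  if (head == NULL) {
    return NULL;
  }

  /* strncpy() does not terminate the string when the source fills the
   * buffer, so terminate it explicitly. */
  strncpy(head->key, key, MAX_WORD_LENGTH - 1);
  head->key[MAX_WORD_LENGTH - 1] = '\0';

  head->anagrams = NULL;  /* no words yet */
  head->numElements = 0;
  return head;
}
```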
3. createHashKey
void createHashKey(char *key, const char *src,
int (*compare)(const void *, const void *))
This function copies each character from src to key, converting each to lower case using tolower(), then
sorts the characters using qsort() and the passed in comparison function compare.
Return Value
None.
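A sketch of the lower-case-then-sort step, assuming the caller supplies a key buffer large enough for src plus the terminating null. The function compareChars() is a hypothetical C stand-in for the Assembly charCompare() so that the example is self-contained.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the Assembly charCompare(). */
int compareChars(const void *lhs, const void *rhs) {
  char a = *(const char *)lhs;
  char b = *(const char *)rhs;
  return (a > b) - (a < b);
}

void createHashKey(char *key, const char *src,
                   int (*compare)(const void *, const void *)) {
  size_t len = strlen(src);
  size_t i;

  /* Copy src into key one character at a time, lower-casing as we go. */
  for (i = 0; i < len; i++) {
    key[i] = (char)tolower((unsigned char)src[i]);
  }
  key[len] = '\0';

  /* Sort only the letters; the terminating null stays in place. */
  qsort(key, len, sizeof(char), compare);
}
```

With this sketch, both "Post" and "pots" produce the key "opst", which is what lets them share one table entry.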
4. initTable
int initTable(FILE *file)
This function reads one line of the file at a time using fgets() and inserts the word into the hash table using
hsearch(). If an ENTRY with the same key is already inserted in the table, the word must be appended to the
anagrams array associated with that entry using appendAnagram(). If there is not already an ENTRY, then a
struct anagramHeader must be allocated using createAnagramHeader() and a corresponding ENTRY
inserted into the table before appending the new word.
Return Value
If createAnagramHeader(), appendAnagram() or hsearch() fails for any reason, return -1, otherwise
return the number of words inserted into the table.
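The read-and-insert loop can be sketched as below. To keep the example self-contained, the work of createHashKey(), createAnagramHeader(), appendAnagram() and stripNewLine() is written inline rather than called, which is not how the real module should be structured. The struct layout, the value of MAX_WORD_LENGTH, and the helpers cmpChar() and makeKey() are assumptions, and the caller is assumed to have already called hcreate().

```c
#include <ctype.h>
#include <search.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORD_LENGTH 64  /* assumed value; the real one is in pa3.h */

struct anagram { char word[MAX_WORD_LENGTH]; };  /* hypothetical field */
struct anagramHeader {
  char key[MAX_WORD_LENGTH];
  struct anagram *anagrams;
  int numElements;
};

/* Hypothetical stand-in for charCompare(). */
static int cmpChar(const void *lhs, const void *rhs) {
  char a = *(const char *)lhs, b = *(const char *)rhs;
  return (a > b) - (a < b);
}

/* Hypothetical stand-in for createHashKey(). */
static void makeKey(char *key, const char *src) {
  size_t i, len = strlen(src);
  for (i = 0; i < len; i++) key[i] = (char)tolower((unsigned char)src[i]);
  key[len] = '\0';
  qsort(key, len, sizeof(char), cmpChar);
}

int initTable(FILE *file) {
  char line[MAX_WORD_LENGTH];
  int count = 0;

  while (fgets(line, sizeof(line), file) != NULL) {
    char key[MAX_WORD_LENGTH];
    struct anagramHeader *head;
    struct anagram *grown;
    ENTRY query, *found;
    char *newline = strchr(line, '\n');

    if (newline != NULL) *newline = '\0';  /* what stripNewLine() does */
    makeKey(key, line);

    query.key = key;
    query.data = NULL;
    found = hsearch(query, FIND);
    if (found != NULL) {
      head = (struct anagramHeader *)found->data;
    } else {
      /* First word with this key: allocate a header and enter it. */
      head = malloc(sizeof(struct anagramHeader));
      if (head == NULL) return -1;
      strcpy(head->key, key);  /* key is shorter than MAX_WORD_LENGTH */
      head->anagrams = NULL;
      head->numElements = 0;
      query.key = head->key;   /* the stored key must outlive this loop */
      query.data = (void *)head;
      if (hsearch(query, ENTER) == NULL) { free(head); return -1; }
    }

    /* Append the word to the header's array. */
    grown = realloc(head->anagrams,
                    ((size_t)head->numElements + 1) * sizeof(struct anagram));
    if (grown == NULL) return -1;
    strcpy(grown[head->numElements].word, line);
    head->anagrams = grown;
    head->numElements++;
    count++;
  }
  return count;
}
```

The key stored in the table points at the header's own key array rather than the local buffer, because the local buffer is overwritten on the next iteration.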
5. printAnagrams
void printAnagrams(struct anagramHeader *head, char *word)
This function must print out each word in the struct anagram array pointed to within head. It must not print any
words that match the word passed in (because a word isn’t an anagram of itself).
Return Value
None.
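One possible shape for the printing loop is sketched below. The output text is modeled on the sample runs above, but the real strings come from pa3_strings.h. The struct layout, the value of MAX_WORD_LENGTH, and the helper countOtherWords() are assumptions; such a helper is one way a caller could detect the case where the only match is the word itself.

```c
#include <stdio.h>
#include <string.h>

#define MAX_WORD_LENGTH 64  /* assumed value; the real one is in pa3.h */

struct anagram { char word[MAX_WORD_LENGTH]; };  /* hypothetical field */
struct anagramHeader {
  char key[MAX_WORD_LENGTH];
  struct anagram *anagrams;
  int numElements;
};

/* Hypothetical helper: how many stored words differ from word? */
int countOtherWords(const struct anagramHeader *head, const char *word) {
  int i, count = 0;
  for (i = 0; i < head->numElements; i++) {
    if (strncmp(head->anagrams[i].word, word, MAX_WORD_LENGTH) != 0) {
      count++;
    }
  }
  return count;
}

void printAnagrams(struct anagramHeader *head, char *word) {
  int i, printed = 0;

  printf("Anagram(s) are: ");  /* assumed text, based on the sample runs */
  for (i = 0; i < head->numElements; i++) {
    /* The comparison is case-sensitive, so "Zygote" still lists "zygote". */
    if (strncmp(head->anagrams[i].word, word, MAX_WORD_LENGTH) == 0) {
      continue;
    }
    printf("%s%s", printed > 0 ? ", " : "", head->anagrams[i].word);
    printed++;
  }
  printf("\n");
}
```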
6. main
int main(int argc, char* argv[])
This function drives the program by checking for the expected number of arguments, reading the file passed in
on the command line, creating the hash table using hcreate(), filling the table by calling initTable(),
interacting with the user using searchForAnagrams(), then destroying the table using hdestroy().
Make sure to check for errors with any of the mentioned function calls and print the correct string found in
pa3_strings.h. For the calls to hcreate() and fopen() make sure to use perror() for error message
printing.
Return Value
If any function main() calls encounters an error, call exit(EXIT_FAILURE), otherwise return 0.
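As an illustration of the perror() requirement, the dictionary file could be opened along these lines. The helper name openDictionary() is hypothetical, and the message prefix is taken from the sample output above rather than from pa3_strings.h.

```c
#include <stdio.h>

/* Hypothetical helper: open the dictionary for reading, or report why not.
 * perror() writes the prefix, a colon, and the errno description to stderr,
 * e.g. "Error opening dictionary file: No such file or directory". */
FILE *openDictionary(const char *path) {
  FILE *file = fopen(path, "r");
  if (file == NULL) {
    perror("Error opening dictionary file");
  }
  return file;
}
```

On a NULL return, main() would then call exit(EXIT_FAILURE). The same pattern applies to a failed hcreate() call.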
7. searchForAnagrams
void searchForAnagrams()
This function reads in a word from stdin using fgets(), then looks for an ENTRY in the table that has a
matching key. If a match is found (that isn’t only the word itself), printAnagrams() is called. If there is no
match, print that no anagrams were found. Keep re-prompting the user for a new word until they type in ^D.
Note that when you find an ENTRY and access its data pointer (which must be cast to a pointer to
struct anagramHeader), Lint will report an error about “improper alignment”. Add the comment “/*
LINTED */” immediately above the cast to suppress the Lint error.
Return Value
None.
Assembly Modules
1. charCompare
int charCompare(const void *lhs, const void *rhs)
This function takes two pointers to characters (the prototype uses two void pointers, but it can be assumed
that they are char pointers) and compares them. This function must be a leaf subroutine.
Return Value
Return -1 if the first char is smaller, +1 if the first char is larger, and 0 if they are the same.
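The module itself must be written in Assembly, but a C reference with the same behavior can be handy when deciding what your unit tests should expect. The name charCompareRef() is hypothetical.

```c
/* Hypothetical C reference for the expected behavior of charCompare(). */
int charCompareRef(const void *lhs, const void *rhs) {
  char a = *(const char *)lhs;  /* treat both arguments as char pointers */
  char b = *(const char *)rhs;

  if (a < b) {
    return -1;
  }
  if (a > b) {
    return 1;
  }
  return 0;
}
```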
2. printUsage
void printUsage(const char *programName)
This function must print the usage message to stderr.
Return Value
None.
3. stripNewLine
void stripNewLine(char *str)
This function must use strchr() to look for a newline character in the str. If one is found, it should be
changed to a null character.
Return Value
None.
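For reference, the equivalent logic in C is only a few lines. The real module must be written in Assembly, and the name stripNewLineRef() is hypothetical.

```c
#include <string.h>

/* Hypothetical C reference for the expected behavior of stripNewLine(). */
void stripNewLineRef(char *str) {
  char *newline = strchr(str, '\n');  /* first newline, or NULL if none */
  if (newline != NULL) {
    *newline = '\0';
  }
}
```

A string without a newline is left unchanged, which is a case worth covering in teststripNewLine.c.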
Unit Testing
Provided in the Makefile for PA3 are rules for compiling and running tests on individual functions that are part
of this assignment. You are given the source for testcharCompare.c used to test your charCompare()
function. Use testcharCompare.c as a template for you to write modules for your other test functions.
Unit tests you need to write:
testappendAnagram.c
testcreateAnagramHeader.c
testcreateHashKey.c
teststripNewLine.c
Think of how to test each of these functions -- boundary cases, special cases, general cases, extreme limits,
error cases, etc. as appropriate for each function.
As part of the grading, we will run all the required unit tests using the targets in the Makefile and manually
grade your required unit test programs.
README File
Along with your source code, you will be turning in a README file with every assignment. Use all caps and no
file extension: the file is README, not README.txt. Use vi/vim to edit this file!
Your README file for this and all assignments should contain:
- Header with your name, cs30x login
- High level description of what your program does
- How to compile it (usually just typing "make")
- How to run it (give an example)
- An example of normal output and where that normal output goes (stdout or a file or ???)
- An example of abnormal/error output and where that error output goes (stderr usually)
- How you tested your program
- Anything else that you would want/need to communicate with someone who has not read the writeup
Extra Credit
There are 5 points total for extra credit on this assignment.
[2 Points] Early turnin, 48 hours before regular due date and time. (1 point if you get it 24 hours early)
[3 Points] Modify your program to accept phrases with whitespace and punctuation. Strip the non-alphanumeric
characters before saving the lowercase sorted encoding of the line. Examples can be found in the
anagram_phrases file.
If you choose to do the extra credit, you will need to increase the MAX_WORD_LENGTH constant defined in
pa3.h to support the phrases in anagram_phrases.
Milestone and Turnin Instructions
Milestone Check - due Wednesday night, May 13 @ 11:59 pm
[15 points of Correctness Section]
Before the final and complete turnin of your assignment, you are required to turn in several modules for the
Milestone check.
Files required for Milestone:
appendAnagram.c
charCompare.s
createAnagramHeader.c
createHashKey.c
stripNewLine.s
pa3.h
pa3_strings.h
Makefile
Each module is worth 3 points for a total of 15 points. Each module must pass all of our unit tests in order to
receive full credit. The function charCompare() must be written as a leaf subroutine or it will not receive points
on the milestones and may have additional points taken off on the final assignment.
A working Makefile with all the appropriate targets and any required header files must be turned in as well. All
five Makefile test targets must compile successfully via the commands make test***, one for each of the five
modules required for the Milestone.
In order for your files to be graded for the Milestone Check, you must use the milestone specific turnin script.
cd ~/pa3
cse30_pa3milestone_turnin
Complete Turnin - due Tuesday night, May 19 @ 11:59 pm
Once you have checked your output, compiled, executed your code, and finished your README file (see
above), you are ready to turn it in. Before you turn in your assignment, you should do make clean in order to
remove all the object files, lint files, core dumps, and executables.
How to Turn in an Assignment
First, you need to have all the relevant files in a subdirectory of your home directory. The subdirectory should
be named: pa#, where # is the number of the homework assignment.
Besides your source/header files, you may also have one or more of the following files. Note the capitalization
and case of each letter of each file.
Makefile: To compile your program with make -- usually provided or you will be instructed to modify an
existing Makefile.
README: Information regarding your program.
Again, we emphasize the importance of using the above names *exactly* otherwise our Makefiles won't find
your files.
When you are ready to submit your pa3, type:
cd ~/pa3
cse30turnin pa3
Additionally, you can type the following to verify that everything was submitted properly.
cse30verify pa3
Failure to follow the procedures outlined here will result in your assignment not being collected properly and
will result in a loss of points. Late assignments WILL NOT be accepted.
If, at a later point you wish to make another submittal BEFORE the deadline:
cd
cse30turnin pa3
(Or whatever the current pa# is.) The new archive will replace/overwrite the old one.
To verify the time on your submission file:
cse30verify pa3
It will show you the time and date of your most recent submission. The governing time will be the one which
appears on that file, (the system time). The system time may be obtained by typing "date".
Your files must be located in a subdirectory of your home directory, named paX (where X is the assignment
number, without capitalizations). If the files aren't located there, they cannot be properly collected. Remember
to cd to your home directory first before running turnin.
If there is anything in these procedures which needs clarifying, please feel free to ask any tutor, the instructor,
or post on the Piazza Discussion Board.
Style Requirements
You will be graded for the style of programming on all the assignments. A few suggestions/requirements for
style are given below. Read carefully, and if any of them need clarification do not hesitate to ask.
- Use reasonable comments to make your code clear and readable.
- Use file headers and function header blocks to describe the purpose of your programs and functions. Sample
file/function headers are provided with PA0.
- Explicitly comment all the various registers that you use in your assembly code.
- In the assembly routines, you will have to give high level comments for the synthetic instructions, specifying
what the instruction does.
- You should test your program to take care of invalid inputs like nonintegers, strings, no inputs, etc. This is
very important. Points will be taken off if your code doesn't handle exceptional cases of inputs.
- Use reasonable variable names.
- Error output goes to stderr. Normal output goes to stdout.
- Use #defines and assembly constants to make your code as general as possible.
- Use a local header file to hold common #defines, function prototypes, type definitions, etc., but not variable
definitions.
- Judicious use of blank spaces around logical chunks of code makes your code much easier to read and
debug.
- Keep all lines less than 80 characters, split long lines if necessary.
- Use 2-4 spaces for each level of indenting in your C source code (do not use tab). Be consistent. Make sure
all levels of indenting line up with the other lines at that level of indenting.
- Do use tabs in your Assembly source code.
- Always recompile and execute your program right before turning it in just in case you commented out some
code by mistake.
- Before running turnin please do a make clean in your project directory.
- Do #include only the header files that you need and nothing more.
- Always macro guard your header files (#ifndef … #endif).
- Never have hardcoded magic numbers. This means we shouldn't see magic constants sitting in your code.
Use a #define if you must instead.
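To illustrate the header-related items above (macro guards, #defines instead of magic numbers, prototypes but no variable definitions), a local header might be laid out like this. The guard name PA3_SKETCH_H, the value 64, and the field name word are assumptions; the provided pa3.h is the authority.

```c
/* Sketch of a macro-guarded local header. */
#ifndef PA3_SKETCH_H
#define PA3_SKETCH_H

/* Named constant instead of a magic number (value is an assumption). */
#define MAX_WORD_LENGTH 64

/* Type definitions shared by several modules. */
struct anagram {
  char word[MAX_WORD_LENGTH];  /* hypothetical field name */
};

/* Function prototypes only; no variable definitions in a header. */
int charCompare(const void *lhs, const void *rhs);
void stripNewLine(char *str);

#endif /* PA3_SKETCH_H */
```

The guard makes it harmless to include the header more than once in the same source file.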