CSC B09 Assignment 3, Winter 2015: Write a trivial shell in C
Due by the end of Friday March 20, 2015; no late assignments without written explanation.
This assignment involves implementing basic command execution such as is performed by any unix shell.
The parsing of the command line is supplied; your task is to implement the fork/exec/wait, i/o redirection,
and pipes, as follows.
The supplied parsing code
In /cmshome/ajr/b09/a3 there is skeleton source code for a feeble little unix shell which I call ‘‘tsh’’. The
code there does some simple parsing of a command line, resulting in the following structure:
struct cmdline {
    char *inputfile, *outputfile;  /* i/o redirection with '<' and '>' */
    char **argv;  /* ends with NULL; if argv[0] is NULL, it's a blank line */
    struct cmdline *pipedinto;
};
‘‘inputfile’’ and ‘‘outputfile’’ are the file names which the command is redirected from or to with ‘<’
and/or ‘>’, respectively. If a redirection has not been performed, they will be NULL.
‘‘argv’’ is an array of the words in the command, where argv[0] is the program to be executed (after
looking up its location using the PATH variable). However, unlike the arguments to main(), there is no
argc value but rather, the array is terminated with a NULL pointer value. This makes it suitable for
passing to execve() directly.
‘‘pipedinto’’ is NULL for a simple command, or the command to the right of this one in the pipeline.
In the struct object pointed to by pipedinto, inputfile and outputfile will always be NULL. For the
purpose of this assignment you only need to handle pipelines of up to two components, e.g. ‘‘foo | bar’’
and not ‘‘foo | bar | baz’’.
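To make the shape of this structure concrete, here is a minimal sketch of walking it. The struct is repeated only so that the fragment stands alone (real code would include parse.h instead), and the helper names count_args() and pipeline_length() are hypothetical, not part of the supplied code.

```c
#include <stddef.h>

/* Repeated from above so that this fragment is self-contained. */
struct cmdline {
    char *inputfile, *outputfile;
    char **argv;
    struct cmdline *pipedinto;
};

/* Count the words in a NULL-terminated argv array (there is no argc). */
int count_args(char **argv)
{
    int n = 0;

    while (argv[n] != NULL)
        n++;
    return n;
}

/*
 * Count the components of a pipeline by following the pipedinto links.
 * A blank line (argv[0] is NULL) counts as a pipeline of length zero.
 */
int pipeline_length(const struct cmdline *p)
{
    int n = 0;

    for (; p != NULL && p->argv[0] != NULL; p = p->pipedinto)
        n++;
    return n;
}
```

For ‘‘foo | bar’’, pipeline_length() would return 2, and the second struct is reached through the first one's pipedinto member.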
An example of calling parse() is in the supplied skeleton tsh.c in main(). In fact you do not need to
look inside parse.c for most of the assignment, and you do not need to change the supplied main().
Suggested sequence of implementation
1. First, compile and run the distributed tsh.c and parse.c and type some commands to it. Type zero or
more argv lists separated by vertical bars, possibly with an input or output redirection for the entire
command.
2. Make execute() execute a simple command which uses an absolute path name, by using execve().
(Remove the existing dummy execute() contents.) Thus, so long as p->argv[0] is not a null pointer, you
can use p->argv[0] as the first parameter to execve(), and p->argv itself as the second parameter. The
third parameter to execve() will be the global variable ‘‘environ’’. You can declare it with ‘‘extern char
**environ;’’.
Note that this means that you can’t type ‘‘cat file’’ but must instead type ‘‘/bin/cat file’’. We’ll fix
that in the next step.
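As a rough illustration of the fork/exec/wait pattern for this step, here is a sketch under some assumptions: the function name run_simple() and its return convention are hypothetical (the real execute() in tsh.c has its own interface), and the exit status 127 for a failed exec is only a common shell convention.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/*
 * Run the program named by argv[0] (a path name containing a slash) in a
 * child process and wait for it.  Returns the child's exit status, or -1
 * if fork or wait failed or the child did not exit normally.
 */
int run_simple(char **argv)
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        /* Child: execve() returns only if it failed. */
        execve(argv[0], argv, environ);
        perror(argv[0]);
        _exit(127);   /* assumed convention for "could not execute" */
    }
    /* Parent: wait for that specific child. */
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The child calls _exit() rather than returning, so that a failed exec does not leave a second copy of the shell reading commands.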
3. If p->argv[0] does not contain a slash, construct a string consisting of a directory name from the
searchlist array concatenated with p->argv[0] (you may impose a length limit of, say, 1000 chars, so long
as you check that this is not exceeded no matter what wacky things the user either types in or supplies as a
PATH variable). Go through the searchlist array in order, calling stat() on each one to determine whether
it exists, and stopping when you find a file which is executable. Pass that file path name as the first
parameter to execve() instead of argv[0].
Note that parsePATH(), previously called from main(), initializes the searchlist array (which doesn’t
subsequently change). As distributed, parsePATH() puts a hard-coded path in searchlist, but you will
change this later to parse the PATH environment variable.
So for example, if the elements of searchlist are ‘‘/bin’’ and ‘‘/usr/bin’’ (thus searchlistsize is 2), you
will try to stat /bin/cat and then /usr/bin/cat, except that the stat of /bin/cat would succeed so you would
stop there.
If the command is not found in any of the directories in the search list, print the usual error message:
‘‘%s: Command not found\n’’.
Note that the above routine with concatenating strings only applies if argv[0] does not contain a
slash. Test with strchr(p->argv[0], '/'). For example, the user can still type /bin/cat, and this doesn’t
mean /bin/bin/cat, or /usr/bin/bin/cat—it just means /bin/cat as typed. Also, ‘‘./cat’’ means to run cat in
the current directory, even though ‘‘/bin/./cat’’ would be a valid name for /bin/cat. (That is to say, ‘‘./cat’’
is not the same as ‘‘cat’’!) To summarize this paragraph in other words, if a slash appears anywhere in the
argv[0] string, it is a complete file pathname (absolute or relative), not to have the search directories
prepended.
After a failed execve(), call perror(). The parameter to perror() should be the first parameter to
execve() including the prepended directory name.
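The search described in this step might be sketched as follows. The names find_in_path() and resolve() are hypothetical, the directory list is passed in as parameters rather than read from the real searchlist globals, and treating ‘‘executable’’ as ‘‘a regular file with any execute permission bit set’’ is one possible interpretation, not a requirement stated here.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Look for name in dirs[0..ndirs-1], in order.  On success, leave the full
 * path name in buf and return 0; if nothing suitable is found, return -1.
 */
int find_in_path(const char *name, char **dirs, int ndirs,
                 char *buf, size_t bufsize)
{
    struct stat st;
    int i;

    for (i = 0; i < ndirs; i++) {
        /* dir + '/' + name + '\0' must fit; skip candidates that do not. */
        if (strlen(dirs[i]) + 1 + strlen(name) + 1 > bufsize)
            continue;
        sprintf(buf, "%s/%s", dirs[i], name);
        if (stat(buf, &st) == 0 && S_ISREG(st.st_mode)
                && (st.st_mode & (S_IXUSR | S_IXGRP | S_IXOTH)))
            return 0;
    }
    return -1;
}

/*
 * Decide what to pass as the first parameter to execve(): the name as typed
 * if it contains a slash, else the result of the search.  Returns NULL
 * (after printing the usual message) if the command is not found.
 */
const char *resolve(const char *name, char **dirs, int ndirs,
                    char *buf, size_t bufsize)
{
    if (strchr(name, '/') != NULL)
        return name;
    if (find_in_path(name, dirs, ndirs, buf, bufsize) == 0)
        return buf;
    fprintf(stderr, "%s: Command not found\n", name);
    return NULL;
}
```

Checking the length before calling sprintf() is what keeps an over-long command name or PATH entry from overflowing the buffer.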
4. Implement i/o redirection. You have to open the appropriate files after the fork(), in the child only.
Test your implementation with commands such as ‘‘ls >file’’ and ‘‘tr e f <file’’.
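One way the redirection might look is sketched below. The function name redirect() and the helper move_fd() are hypothetical; the sketch assumes it is called in the child after fork() and before execve(), and that ‘>’ should create or truncate the output file (mode 0666, subject to the umask).

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Make target refer to whatever fd refers to, then discard fd. */
static int move_fd(int fd, int target)
{
    if (fd == target)
        return 0;            /* already in place; closing would undo it */
    if (dup2(fd, target) < 0) {
        perror("dup2");
        return -1;
    }
    close(fd);
    return 0;
}

/*
 * Point standard input and/or standard output at the named files.
 * Either name may be NULL, meaning "no redirection".  Returns 0 or -1.
 */
int redirect(const char *inputfile, const char *outputfile)
{
    int fd;

    if (inputfile != NULL) {
        if ((fd = open(inputfile, O_RDONLY)) < 0) {
            perror(inputfile);
            return -1;
        }
        if (move_fd(fd, 0) < 0)
            return -1;
    }
    if (outputfile != NULL) {
        fd = open(outputfile, O_WRONLY | O_CREAT | O_TRUNC, 0666);
        if (fd < 0) {
            perror(outputfile);
            return -1;
        }
        if (move_fd(fd, 1) < 0)
            return -1;
    }
    return 0;
}
```

Doing this in the child only means the shell's own standard input and output are left untouched for the next prompt.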
5. Implement pipelines of length two. That is, if p->pipedinto is non-null, do a pipe() call in the child
process, then fork again, then rearrange file descriptors as appropriate in the two youngest processes, and
exec. Make sure that simple commands still work! Now is also a good time to make sure that you just
get another prompt if you just press return (which yields a ‘‘pipeline’’ of length zero), with no extra
lingering processes. (Getting two prompts would be one sign of the lingering processes problem.)
Pipelines of length greater than two are trickier and you don’t have to do them for this assignment,
but they will be implemented in my sample solution.
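The process arrangement for a two-component pipeline might be sketched like this. The names run_pipeline2() and exec_or_die() are hypothetical, both commands are assumed to be given as path names already, and having the child run the right-hand command (so that the shell's wait covers the last command in the pipeline) is a design choice of this sketch rather than something specified above.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Exec argv[0]; if that fails, report it and terminate this process. */
static void exec_or_die(char **argv)
{
    execve(argv[0], argv, environ);
    perror(argv[0]);
    _exit(127);
}

/*
 * Run "left | right".  Returns the exit status of the right-hand command,
 * or -1 if fork or wait failed or it did not exit normally.
 */
int run_pipeline2(char **left, char **right)
{
    int pipefd[2], status;
    pid_t child = fork();

    if (child < 0) {
        perror("fork");
        return -1;
    }
    if (child == 0) {
        pid_t grandchild;

        if (pipe(pipefd) < 0) {          /* pipe() in the child ... */
            perror("pipe");
            _exit(126);
        }
        if ((grandchild = fork()) < 0) { /* ... then fork again */
            perror("fork");
            _exit(126);
        }
        if (grandchild == 0) {
            /* Writer: standard output becomes the pipe's write end. */
            dup2(pipefd[1], 1);
            close(pipefd[0]);
            close(pipefd[1]);
            exec_or_die(left);
        }
        /* Reader: standard input becomes the pipe's read end. */
        dup2(pipefd[0], 0);
        close(pipefd[0]);
        close(pipefd[1]);
        exec_or_die(right);
    }
    /* The shell itself never holds the pipe open; it just waits. */
    if (waitpid(child, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both processes close both original pipe descriptors after dup2(); if the reader kept the write end open, it would never see end-of-file.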
6. Implement parsePATH(). As shown in the distributed code, begin with getenv("PATH") to get the
PATH variable from the environment. If the variable PATH is not set in the environment, getenv returns
NULL; this is extremely unusual and simply exiting with an appropriate message (as already
implemented) is an adequate reaction.
The PATH variable contains directory names separated by colons. You want to store each directory
name in a separate element of the searchlist array, and leave the appropriate value in searchlistsize. If the
number of entries exceeds MAXSEARCHLIST, you can just abort with an error.
You will have to malloc() the appropriate strings, but you may want to use the ‘‘estrsavelen’’ function
in parse.c, which is exported for this possible purpose.
For the purposes of this assignment, you don’t have to worry about the possibility of empty strings in
the PATH (e.g. /bin:/usr/bin: (note the trailing colon)).
Note: parsePATH() is worth only about ten percent of the value of the assignment.
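The colon-splitting part of this step can be illustrated in isolation. The function split_path() below is a hypothetical helper, not the real parsePATH(): it takes the string and the destination array as parameters instead of using getenv(), searchlist and searchlistsize, and it copies with plain malloc() rather than estrsavelen(), whose interface is not described here.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Split a colon-separated list such as "/bin:/usr/bin" into separately
 * malloc'd strings stored in list[0], list[1], ...  Returns the number of
 * entries, or -1 if there are more than maxlist entries or malloc fails.
 * (On -1, entries already stored are not freed; the caller is expected to
 * abort with an error message.)  Empty components are stored as empty
 * strings, with no special treatment.
 */
int split_path(const char *path, char **list, int maxlist)
{
    const char *start = path;
    int n = 0;

    for (;;) {
        const char *end = strchr(start, ':');
        size_t len = (end != NULL) ? (size_t)(end - start) : strlen(start);
        char *copy;

        if (n >= maxlist)
            return -1;
        if ((copy = malloc(len + 1)) == NULL)
            return -1;
        memcpy(copy, start, len);
        copy[len] = '\0';
        list[n++] = copy;

        if (end == NULL)
            break;            /* that was the last component */
        start = end + 1;      /* continue just past the colon */
    }
    return n;
}
```

Copying each component, rather than writing '\0' characters into the string returned by getenv(), leaves the environment itself unmodified.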
To think about: freeparse()
Free()ing the data structure created by parse() is something else which needs to be written as part of a
complete implementation of tsh, although it is not part of this assignment. For the purposes of this
assignment, you can leave the dummy freeparse() in parse.c.
How would you write freeparse()? You may find it instructive to produce a draft version, although
that is not to be submitted. My sample solution will contain a correct freeparse() implementation.
Epilogue: Memory leak
This part is worth about five to ten percent of the value of the assignment.
The supplied parse.c contains a bug of the kind called a ‘‘memory leak’’—over time, the tsh program
will use more and more memory; not everything would be freed properly even after the hypothetical
freeparse() is called.
The reason for this is that there is a place where a pointer variable (data area) is assigned to be the
return value from malloc() but that variable might already contain the only copy of another return value
from malloc(), so the previous malloc() pointer is lost and cannot be freed.
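The general pattern, shown on a made-up example which is not the code in parse.c: the functions set_name_leaky() and set_name() below are hypothetical and exist only to illustrate how overwriting the sole copy of a malloc() pointer loses the earlier block.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Leaky version: if *slot already holds the only copy of an earlier
 * malloc() result, that block can never be freed after this assignment.
 */
int set_name_leaky(char **slot, const char *s)
{
    *slot = malloc(strlen(s) + 1);
    if (*slot == NULL)
        return -1;
    strcpy(*slot, s);
    return 0;
}

/*
 * Fixed version: release the old block before overwriting the pointer.
 * free(NULL) does nothing, so this is also safe the first time through,
 * provided *slot was initialized to NULL.
 */
int set_name(char **slot, const char *s)
{
    char *copy = malloc(strlen(s) + 1);

    if (copy == NULL)
        return -1;
    strcpy(copy, s);
    free(*slot);
    *slot = copy;
    return 0;
}
```

Calling set_name_leaky() twice on the same variable leaks the first string; calling set_name() twice does not.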
Find this in parse.c and fix it. Submit your revised parse.c under the name ‘‘parse-fixed.c’’.
Be sure to diff your parse-fixed.c with the original. The change you make should be minimal. If you
make changes throughout the file you will get zero for this part of the assignment; full marks requires
changing only the portion of the code which has this memory leak problem.
Note: Your tsh.c will be compiled with the original parse.c (and parse.h); your parse-fixed.c will be
graded separately.
Other notes
You will want to begin by making a subdirectory to hold the .c files. You’ll want to copy in the starter
files from /cmshome/ajr/b09/a3, i.e. ‘‘cp /cmshome/ajr/b09/a3/* . ’’. You can type ‘‘make’’ to use the
supplied Makefile to build your program, or you can simply type ‘‘gcc -Wall tsh.c parse.c’’.
Your C programs must be in standard C. They must compile on the UTSC linux machines with
‘‘gcc -Wall’’ with no errors or warning messages, and may not use linux-specific or GNU-specific
features.
Your revised tsh.c file will be compiled with the original versions of all of the other files for
automated testing. If you have edited the other files (e.g. to fix parse.c’s memory leak, or e.g. to put in
debugging printfs somewhere), I strongly recommend copying over all other modified files anew from
/cmshome/ajr/b09/a3 (perhaps in a new directory, then also copying in your tsh.c) and doing ‘‘make
clean’’ and then ‘‘make’’ to produce a tsh for your final testing.
Once you are satisfied with your files, you can submit them for grading with the command
submit -c cscb09w15 -a a3 tsh.c parse-fixed.c
and the other ‘‘submit’’ commands are also as before.
Please see the assignment Q&A web page at
http://mathlab.utsc.utoronto.ca/courses/cscb09w15/a3/qna.html
for other reminders, and answers to common questions.
Remember:
This assignment is due by midnight at the end of Friday, March 20. Late assignments are not ordinarily
accepted and always require a written explanation. If you have not finished your assignment by the
submission deadline, you should just submit what you have, for partial marks.
Despite the above, I’d like to be clear that if there is a legitimate reason for lateness, please do submit
your assignment late and send me that written explanation.
And above all: please be careful not to commit an academic offence in your work on this (or any)
assignment, even if you’re under pressure. Just submit what you can do yourself; do not look at other
students’ assignments, and do not show your assignment (complete or partial) to other students. Even a
zero out of 10% is far better than cheating and suffering an academic penalty. Students also receive
academic offence penalties for giving their assignment to other students, since they are helping that other
student to commit an academic offence. Your friend might promise in all sincerity not to hand in your
work as their own, but if they can’t do the assignment themselves, a copy of your solution is not going to
help them enough and when the deadline approaches, they might hand in some of your work. Don’t
tempt them.