Lab 7 - Dynamic Program and Library Loading
MIDWAY SUBMISSION DUE: Tuesday, November 22nd by 11:59 PM
FINAL SUBMISSION DUE: Friday, December 9th by 11:59 PM
Objectives
By the end of this lab students will be able to:
- Understand how to dynamically load programs and libraries into memory.
- Understand how to alter the location for relocatable variables.
- Create new system calls to manipulate shared libraries.
Lab Sections
1. Background
Today operating systems such as Linux and Windows support the execution of programs from a backing storage. The typical helloworld program when compiled by gcc creates an executable that can be executed from a shell (the default name generated by gcc is typically a.out). However in XINU, all processes are created and executed using a function pointer. There is currently no way to execute a program that has been compiled outside of XINU. Your job is to provide a mechanism and interface to allow for the execution of dynamic code in XINU. You will do this in two parts:
- Part 1: The loading and execution of a program that is stored outside of the XINU executable.
- Part 2: The dynamic loading of libraries that contain functions that can be called from within XINU.
File Storage
Executable and library files will be stored in XINU's remote file system (RFS). The RFS consists of two pieces:
- The rfserver - a network server that runs on a XINU frontend (e.g. xinu19.cs.purdue.edu)
- The RFS client within XINU - runs within XINU on the backend
The rfserver, which runs on your frontend workstation, serves files to the XINU instance running on the backend. XINU connects to the rfserver through a UDP port, and sends/receives messages to perform file operations on the file system where the rfserver is running. For this lab, you will only need to use the RFS to read the binary images for the programs and the libraries.
The source code for the rfserver is located inside the tarball for the lab (see below) in the rfserver directory. You will be provided via email with a UDP port number specifically for you to use for the RFS server. You will need to change the macro RF_SERVER_PORT in the file include/rfilesys.h to that port number.
To allow XINU to connect to the rfserver, the IP address for the server needs to be set in include/rfilesys.h. The value of RF_SERVER_IP must be changed to match the IP address of your frontend workstation. For example, if your frontend workstation is xinu19.cs.purdue.edu then the RF_SERVER_IP value in include/rfilesys.h needs to be changed to 128.10.136.69.
Running make in the directory rfserver generates the executable “rfserver” for the RFS server and places it in the programs directory. You will need to run this code before you start your XINU image on the backend, like so:
$ ./rfserver
NOTE: Every time you change include/rfilesys.h, make sure you run make clean and then make prior to running your XINU image on a backend.
To open a file in the remote file system, you will use the open system call, specifying the name of the file that you wish to open and RFILESYS as the device name (this tells XINU that you are opening a new file in the remote file system). For example:
int fd = open(RFILESYS, "helloworld", "or");
This opens the file helloworld, located in the remote file system, for reading (“r” tells XINU to open the file for reading; “o” tells XINU to require that the file already exists, and SYSERR is returned if it does not). The resulting file descriptor is stored in fd. If there is a problem opening the file (timeout connecting to the rfserver, etc.), then fd will contain the value SYSERR.
NOTE: If you do not use “o” in file open mode, a new file will be created.
NOTE: the remote file server creates a file system that is relative to the directory where the rfserver is executed. For example, if your current working directory on your frontend workstation is xinu-fall2016-lab7/programs and you run the rfserver command, then the root directory that XINU will see on the open system call will be xinu-fall2016-lab7/programs on the frontend workstation.
NOTE: do NOT use a leading slash '/' in your file path, or '..' anywhere in the file path. Doing so will cause open to fail. The RFS does not use a root directory with '/'. The remote file server will always resolve your path relative to the directory that was the current working directory when rfserver was executed.
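The two path rules above can be checked before calling open. The following is a minimal host-side sketch, not part of the lab code: the helper name rfs_path_ok is hypothetical, and it uses the standard C library rather than XINU's own string routines.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true when 'path' obeys the two RFS rules
 * stated above (no leading '/', no ".." anywhere in the path). */
bool rfs_path_ok(const char *path)
{
    if (path == NULL || path[0] == '\0') {
        return false;               /* nothing to open */
    }
    if (path[0] == '/') {
        return false;               /* the RFS has no '/' root */
    }
    if (strstr(path, "..") != NULL) {
        return false;               /* ".." anywhere is rejected */
    }
    return true;
}
```

A caller could return SYSERR early when the check fails instead of waiting for the remote open to fail.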
Assuming the file opens successfully, you can later read the file with the read system call, specifying the file descriptor, a read buffer, and read size. For example:
char read_buffer[300];
int rc = read(fd, read_buffer, sizeof(read_buffer));
This will return the number of bytes actually read from the file or SYSERR if an error occurred.
You can request the size of the file in bytes by using the control function like so:
int32 filesize = control(RFILESYS, RFS_CTL_SIZE, fd, 0);
Here fd is the file descriptor returned by open(..).
HINT: To read the entire file, you can first find out the size of the file, allocate enough memory to hold the entire file and then read the entire file.
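One way to follow the hint is a loop that keeps reading until the requested number of bytes has arrived. The sketch below is host-side C under stated assumptions: read_fn is a hypothetical stand-in for XINU's read, CHUNK is an assumed per-call limit rather than a value taken from the RFS code, and demo_read exists only to exercise the loop with short reads.

```c
#include <string.h>

/* Hypothetical callback standing in for XINU's read(fd, buf, len): returns
 * the number of bytes read, 0 at end of file, or a negative value on error. */
typedef int (*read_fn)(int fd, char *buf, int len);

#define CHUNK 1024   /* assumed per-call limit; adjust to your RFS */

/* Read up to 'size' bytes into 'buf' by issuing repeated reads.
 * Returns the number of bytes stored, or -1 on error. */
int read_fully(read_fn rd, int fd, char *buf, int size)
{
    int done = 0;

    while (done < size) {
        int want = size - done;
        if (want > CHUNK) {
            want = CHUNK;
        }
        int rc = rd(fd, buf + done, want);
        if (rc < 0) {
            return -1;      /* propagate the error */
        }
        if (rc == 0) {
            break;          /* file ended earlier than expected */
        }
        done += rc;
    }
    return done;
}

/* Stand-in data source used only to try the loop on a host: serves bytes
 * from a static buffer, at most 7 per call. */
static const char demo_file[] = "0123456789abcdefghij";
static int demo_pos = 0;

int demo_read(int fd, char *buf, int len)
{
    int left = (int)sizeof(demo_file) - 1 - demo_pos;

    (void)fd;
    if (len > 7)    len = 7;
    if (len > left) len = left;
    memcpy(buf, demo_file + demo_pos, (size_t)len);
    demo_pos += len;
    return len;
}
```

In XINU the buffer would come from getmem using the size reported by control, and the callback would simply be read.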
You can close an opened file using the close() system call like so:
close(fd);
Here, fd is the file descriptor returned by open(..).
File Format
Compiled files will be created by gcc in the executable and linkable format (ELF). For part 1 (loading of an executable) the file will contain a single function with the symbol name “main”. For part 2, the file may contain one or more functions of different names.
As part of this project you will have to familiarize yourself with the ELF format and be able to parse the headers and various sections of the ELF file.
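A reasonable first parsing step is to sanity-check the ELF file header before touching any section. The sketch below declares the ELF32 header as laid out in the ELF specification; the function name elf_header_ok is hypothetical, and the code assumes a little-endian host and an image that has already been read into memory.

```c
#include <stdint.h>
#include <string.h>

/* ELF32 file header, laid out as in the ELF specification (52 bytes). */
typedef struct {
    uint8_t  e_ident[16];
    uint16_t e_type;
    uint16_t e_machine;
    uint32_t e_version;
    uint32_t e_entry;
    uint32_t e_phoff;
    uint32_t e_shoff;
    uint32_t e_flags;
    uint16_t e_ehsize;
    uint16_t e_phentsize;
    uint16_t e_phnum;
    uint16_t e_shentsize;
    uint16_t e_shnum;
    uint16_t e_shstrndx;
} Elf32_Ehdr;

#define ELFCLASS32   1   /* e_ident[4]: 32-bit objects  */
#define ELFDATA2LSB  1   /* e_ident[5]: little-endian   */
#define EM_386       3   /* e_machine: Intel 80386      */

/* Returns 1 if 'image' (of 'len' bytes) starts with a plausible 32-bit
 * little-endian x86 ELF header, 0 otherwise. */
int elf_header_ok(const uint8_t *image, uint32_t len)
{
    Elf32_Ehdr eh;

    if (len < sizeof(eh)) return 0;
    memcpy(&eh, image, sizeof(eh));          /* avoid unaligned access */

    if (memcmp(eh.e_ident, "\177ELF", 4) != 0) return 0;
    if (eh.e_ident[4] != ELFCLASS32)  return 0;
    if (eh.e_ident[5] != ELFDATA2LSB) return 0;
    if (eh.e_machine != EM_386)       return 0;

    /* The section header table must lie inside the file. */
    if (eh.e_shoff > len) return 0;
    if ((uint32_t)eh.e_shnum * eh.e_shentsize > len - eh.e_shoff) return 0;
    return 1;
}
```

Rejecting a bad header early lets load_program and load_library return SYSERR before any memory is allocated for the image.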
Things to consider in the ELF file:
- A loaded program can make calls to existing XINU functions (e.g. kprintf). When the compiler compiles the program, it does not know the address of kprintf. The compiler builds relocation entries for the symbols that it cannot resolve at compile time. These functions are present in the XINU text section at runtime. Your job is to read the relocation entries in the ELF file and fix up the relocation offsets as part of the load_program and load_library system calls. For more information on relocation, see the ELF specification (links provided in references section).
- When a program is compiled resulting in an ELF file, the file contains a section called “.symtab” (symbol table). The symbol table contains the names and addresses of all symbols in the executable. As you know, XINU is also compiled into an ELF file (xinu.elf). At the time of resolving relocation entries in the loaded program, you will need the addresses of XINU functions. Your task is to read the xinu.elf file (using RFS), specifically the symbol table in the xinu.elf file. You only need to do this once during a run.
- When you run make from the compile directory to compile XINU, the resulting xinu.elf file is also placed in the programs directory. You will open this xinu.elf file using RFS, to read the XINU symbol table.
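The fix-up described in the first item can be sketched for the two 32-bit x86 relocation types defined in the i386 ELF supplement, R_386_32 and R_386_PC32. This is a sketch under assumptions: the object file is built without position-independent code, relocations use the SHT_REL form (the addend is the value already stored at the patch site), and the host is little-endian. The function name apply_reloc is hypothetical. The run-time address is passed separately from the pointer so that the code also works on a 64-bit host; inside XINU the two are the same value.

```c
#include <stdint.h>
#include <string.h>

#define R_386_32    1   /* S + A     : absolute address            */
#define R_386_PC32  2   /* S + A - P : relative to the patch site  */

/* Patch one 32-bit relocation site.
 *   site      - pointer to the 4 bytes to patch inside the loaded copy
 *   site_addr - run-time address of those bytes (P)
 *   sym_addr  - resolved address of the referenced symbol (S)
 *   type      - relocation type taken from the entry's r_info field
 * Returns 0 on success, -1 for a relocation type this sketch ignores. */
int apply_reloc(uint8_t *site, uint32_t site_addr,
                uint32_t sym_addr, uint32_t type)
{
    uint32_t addend, value;

    memcpy(&addend, site, sizeof(addend));   /* implicit addend (A) */
    switch (type) {
    case R_386_32:
        value = sym_addr + addend;
        break;
    case R_386_PC32:
        value = sym_addr + addend - site_addr;
        break;
    default:
        return -1;
    }
    memcpy(site, &value, sizeof(value));
    return 0;
}
```

For instance, a call whose operand sits at address 0x1001 with a stored addend of -4 and a target symbol at 0x2000 is patched to 0xffb, so that the address of the next instruction (0x1005) plus 0xffb lands on 0x2000.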
Part 1: Running a program executable in XINU
Recall the format for the create system call:
pid32 create(
        void    *funcaddr,  /* Address of the function      */
        uint32  ssize,      /* Stack size in words          */
        pri16   priority,   /* Process priority > 0         */
        char    *name,      /* Name (for debugging)         */
        uint32  nargs,      /* Number of args that follow   */
        ...
)
NOTE: the create system call takes as the first parameter the pointer to the function that the new process will execute. In part 1, you need to create an interface to allow a program to be loaded into XINU's memory from a program located in a file system. To do this you will be required to write a new system call with the following format:
void* load_program(char* path);
This system call takes as a parameter the path name to a program and returns a function pointer to the program that has been loaded into memory. If the operating system could not load the program (e.g. the file doesn't exist, there is no more memory to load the program into, etc.) then the function returns SYSERR cast to a void pointer:
(void*)SYSERR
As previously mentioned, the program to be loaded must contain only a single function with the symbol named “main”. If more than one function exists in the file, then return (void*)SYSERR.
NOTE: the name “main” for the symbol of the function is only used to find the location of the program entry point. Once the load_program system call returns, the pointer to the function will be used.
Once the program is loaded, the program can then be executed by using the create system call. For example, consider the following “helloworld” program:
#include <xinu.h>

int32 main(int32 argc, char* argv[])
{
    kprintf("Hello World\n");
    return 0;
}
Using the new system calls, this program should be able to be executed with the following:
void* helloworld = load_program("helloworld");
resume(create(helloworld, 4096, 20, "helloworld", 0, NULL));
Part 2: Loading a library into XINU
For part 2, you will load a dynamic library (i.e. a file that contains multiple functions), and will create a mechanism that allows an existing XINU function to call any of the functions in the dynamic library. As an example, suppose a library file contains functions x, y, and z. Once the library has been loaded, a XINU process can:
- Ask for the address of x and then
- Use the address to call function x
To make the example concrete, suppose you wanted to create two utility functions in XINU and dynamically load them for programs to use:
int add1(int val)
{
    return val + 1;
}

int add2(int val)
{
    return val + 2;
}
You created these functions in a file myadd.c and compiled the file to myadd. You later want to load these functions into XINU and call them. To do that you will need to provide two new system calls:
syscall load_library(char* path);
void* find_library_function(char* name);
The first system call, load_library, is similar to load_program as it loads the image from the remote file system and performs the fix up for relocatable variables. However, it does not return a function pointer. Instead it stores the pointers to all the functions within the file in a table. Another system call, find_library_function, can then be called to find the function pointer for a dynamically loaded function by name.
For the myadd example:
int myvalue = 2;

/* Load the library */
if(load_library("myadd") == SYSERR) {
    return;
}

/* Find the add1 function */
int32 (*add1)(int32) = find_library_function("add1");
if((int32)add1 == SYSERR) {
    return;
}

/* Call the function */
add1(myvalue);
If the load_library function cannot load the library for any reason, it returns SYSERR. Similarly, if the find_library_function call cannot successfully retrieve the function pointer for the given name, then it will return (void*)SYSERR.
For this lab:
- You will have to be able to support a maximum of 3 loaded libraries, each with a maximum of 10 functions (30 functions total). If a library contains more than 10 functions then return SYSERR.
- A library can only be loaded once, so if a process attempts to load the same library twice then return SYSERR.
- A function loaded from a library can only exist once in XINU, so if two libraries both define the same function name then return SYSERR when the second library is loaded.
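The three rules above map naturally onto a small fixed-size table. The following is one possible layout, offered as a host-side sketch rather than the required design: the names libtab, lib_register, and lib_lookup and the NAME_LEN limit are all hypothetical, and lib_lookup returns NULL where the real find_library_function would return (void*)SYSERR.

```c
#include <string.h>

#define MAX_LIBS       3    /* limits stated in the lab text */
#define MAX_LIB_FUNCS 10
#define NAME_LEN      64    /* assumed maximum name length */

struct libfunc { char name[NAME_LEN]; void *addr; };
struct libent {
    int  used;
    char path[NAME_LEN];
    int  nfuncs;
    struct libfunc funcs[MAX_LIB_FUNCS];
};

static struct libent libtab[MAX_LIBS];

/* Look a function up by name across all loaded libraries. */
void *lib_lookup(const char *name)
{
    for (int i = 0; i < MAX_LIBS; i++) {
        if (!libtab[i].used) continue;
        for (int j = 0; j < libtab[i].nfuncs; j++) {
            if (strcmp(libtab[i].funcs[j].name, name) == 0)
                return libtab[i].funcs[j].addr;
        }
    }
    return NULL;
}

/* Record a library and its functions. Returns 0 on success, -1 when one
 * of the rules is violated; nothing is stored on failure. */
int lib_register(const char *path, const char *names[], void *addrs[], int n)
{
    int slot = -1;

    if (n < 0 || n > MAX_LIB_FUNCS || strlen(path) >= NAME_LEN)
        return -1;                           /* too many functions */
    for (int i = 0; i < MAX_LIBS; i++) {
        if (libtab[i].used && strcmp(libtab[i].path, path) == 0)
            return -1;                       /* same library twice */
        if (!libtab[i].used && slot < 0)
            slot = i;
    }
    if (slot < 0) return -1;                 /* already 3 libraries */
    for (int k = 0; k < n; k++) {
        if (strlen(names[k]) >= NAME_LEN || lib_lookup(names[k]) != NULL)
            return -1;                       /* name too long or duplicate */
    }
    strcpy(libtab[slot].path, path);
    for (int k = 0; k < n; k++) {
        strcpy(libtab[slot].funcs[k].name, names[k]);
        libtab[slot].funcs[k].addr = addrs[k];
    }
    libtab[slot].nfuncs = n;
    libtab[slot].used = 1;
    return 0;
}
```

All checks run before anything is written, so a rejected library leaves the table exactly as it was.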
2. Setup
In /homes/cs503/xinu there is a file called xinu-fall2016-lab7.tar.gz that contains a start to the code. Unpack:
tar zxvf /u/u3/cs503/xinu/xinu-fall2016-lab7.tar.gz
This will create a directory called xinu-fall2016-lab7.
Along with the main code for XINU, this tarball contains the following files (additional explanation of the contents of the files is in the following sections).
- system/load_program.c - function declaration for the load_program system call.
- system/load_library.c - function declaration for the load_library system call.
- system/find_library_function.c - function declaration for the find_library_function call.
- rfserver/* - The source code for the rfserver (see above)
- programs/* - The directory for your programs with an example hello world and Makefile
- include/prototypes.h - modified prototypes.h file which includes the load_program, load_library, and find_library_function system calls.
First things to be done when you untar the lab tarball:
- Change the macro RF_SERVER_PORT in file include/rfilesys.h to the port that is assigned specifically to you.
- Change the macro RF_SERVER_IP to the IP address of the XINU frontend on which you will run the RFS server.
- Go into directory rfserver and run make. This will place the executable “rfserver” in the directory programs. You will have to do this step every time you change the RF_SERVER_IP.
To run the RFS server, cd into directory programs and run the following:
$ ./rfserver
NOTE: Whenever you make any changes to the programs or to XINU, you will have to restart the RFS server.
3. Modifications to the XINU Shell
In the shell directory you will find a modified version of the XINU shell which has been expanded to use dynamically loadable commands. As part of this lab, you will be required to implement the following two new commands for the XINU shell. Place the code for the new commands in the programs directory.
- semdump - this command takes no parameters and prints the contents of the semaphore table to the user
- ls - list the contents of a remote file system directory
Provide the implementation for the new shell commands as loadable programs from the programs directory. The details for the commands are as follows:
semdump - this command takes no parameters and prints the contents of the semaphore table to the user. You must provide a readable table format for the output including a table header and the contents of each semaphore table entry. See include/semaphore.h for a description of the contents of the semaphore table. For example:
xsh $ semdump
Entry  State   Count  Queue
0      S_USED  3      3
1      S_FREE  0      0
ls - list the contents of a remote file system directory. This command takes zero or one parameter. If a parameter specifying a directory is given, the contents of that directory within the remote file system are printed to the user or an error is printed indicating that the directory does not exist. For each entry within a directory that is also a directory put a '/' at the end. If no parameter is specified, the contents of the remote file system root directory are printed. For example:
xsh $ ls mydir
mydir1/
mydir2/
myfile1
myfile2
Note: A directory is a special file in Linux, which maintains a list of files in the directory. For this command, you will need to open a directory as a file using RFS. The mode must be “ro” when opening a directory as a file. For example: you can open the directory in which the RFS server is running using the open system call like so:
int32 dirfd = open(RFILESYS, ".", "ro");
Each read performed on an opened directory will return the next entry in the directory, which will be in the following format (defined in include/rfilesys.h):
struct rfdirent {
    byte d_type;
    byte d_name[256];
};
The d_type field can have the following two values (defined in include/rfilesys.h):
#define RF_DIRENT_FILE 1
#define RF_DIRENT_DIR  2
You can use this value to differentiate between a file and a directory.
A read on a directory will look like this:
struct rfdirent rfdentry;
int32 rc;

rc = read(dirfd, (char *)&rfdentry, sizeof(struct rfdirent));
if(rc == SYSERR) {
    /* Error occurred while reading */
} else if(rc == 0) {
    /* Reached end of list */
} else {
    /* Handle next entry in the directory */
}
You will have to make multiple calls to read() to get all the entries in the directory. When you reach the end of the list, read() will return a zero.
NOTE: Do not forget to close the opened directory once you are done with all the reading.
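For the ls output, each entry only needs its name plus a trailing '/' for directories. The sketch below repeats the rfdirent definitions shown above so that it compiles on a host; the helper name format_dirent is hypothetical, and it assumes d_name is NUL-terminated.

```c
#include <stdio.h>

typedef unsigned char byte;     /* XINU's byte type, repeated for host builds */

#define RF_DIRENT_FILE 1
#define RF_DIRENT_DIR  2

struct rfdirent {
    byte d_type;
    byte d_name[256];
};

/* Write the display form of one entry into 'out': the name, followed by
 * '/' when the entry is a directory. Returns snprintf's result, which is
 * the length of the formatted text when it fits in 'outlen' bytes. */
int format_dirent(const struct rfdirent *ent, char *out, int outlen)
{
    const char *suffix = (ent->d_type == RF_DIRENT_DIR) ? "/" : "";

    return snprintf(out, (size_t)outlen, "%s%s",
                    (const char *)ent->d_name, suffix);
}
```

Inside XINU the same decision can be made directly in the read loop, printing the name and the optional '/' with kprintf or printf.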
4. Additional Requirements
- Make sure to remove all debug output from your system calls. When the TAs run your submitted code, calling a system call directly should not produce any output.
- Provide a set of test cases to ensure that your code works as required. Put these test cases in main.c.
- Make sure your test cases not only test the new functionality of the new system calls, but that existing system calls still function correctly.
- Don't forget to include test cases for the extra credit (see below) if you choose to implement the extra credit requirements. No extra credit will be awarded if test cases are not included.
- The TAs will be replacing main.c with their own test cases after running your submitted test cases. Make sure you do not define any dependent variables in main.c. You are free to modify any other file(s) to implement the lab requirements.
  - Make sure that there are no dependent declarations in main.c.
- If your submitted code does not compile (either the exact submitted code or the code after the TAs replace any test case files), you will receive zero (0) points for code execution. If this happens, you will be allowed to resubmit for half credit only.
- Please run “make clean” prior to submission so that you don't submit object files
- NOTE: When you make xinu for this lab the make file will generate two files in the compile directory:
- xinu - this is the file you will download to the xinu backend
- xinu.elf - this is the executable and linkable format version of the xinu binary. This is not the format to use when sending to the backends. However, you will need to use it to find the addresses of symbols and to perform relocation.
5. Updating the Global Descriptor Table
On Intel x86 there is a structure called the global descriptor table which defines characteristics for regions of memory. One of those characteristics is the segment of memory in which code resides. If the CPU attempts to fetch an instruction from a memory segment that is not designated as code, then a general protection violation is signaled. Normally, Xinu only has the .text section of memory designated in its code segment. This makes sense since normally only code (.text) should be executed and not data from the stack or getmem, etc. However for lab 7, you need to be able to execute code that is located beyond the .text section of Xinu since you're loading programs into memory allocated with getmem. To allow this, you will need to extend the code segment in the global descriptor table. In the function setsegs() in system/meminit.c the following needs to be changed:
/*------------------------------------------------------------------------
 * setsegs - Initialize the global segment table
 *------------------------------------------------------------------------
 */
void setsegs()
{
    extern int etext;
    struct sd *psd;
    uint32 np, ds_end;

    ds_end = 0xffffffff/PAGE_SIZE; /* End page number of Data segment */

    psd = &gdt_copy[1]; /* Kernel code segment: identity map from address 0 to etext */

    /* Change the following line which sets np */
    //np = ((int)&etext - 0 + PAGE_SIZE-1) / PAGE_SIZE; /* Number of code pages */
    np = ds_end; /* End of memory */
    psd->sd_lolimit = np;
    psd->sd_hilim_fl = FLAGS_SETTINGS | ((np >> 16) & 0xff);

    psd = &gdt_copy[2]; /* Kernel data segment */
    psd->sd_lolimit = ds_end;
    psd->sd_hilim_fl = FLAGS_SETTINGS | ((ds_end >> 16) & 0xff);

    psd = &gdt_copy[3]; /* Kernel stack segment */
    psd->sd_lolimit = ds_end;
    psd->sd_hilim_fl = FLAGS_SETTINGS | ((ds_end >> 16) & 0xff);

    memcpy(gdt, gdt_copy, sizeof(gdt_copy));
}
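The arithmetic in setsegs can be checked in isolation. An x86 segment descriptor stores a 20-bit limit: the low 16 bits go in one field and the remaining 4 bits share a byte with flag bits. The sketch below assumes PAGE_SIZE is 4096 and uses hypothetical helper names; it masks the high part with 0xf, which gives the same result as the 0xff mask in the listing for any limit that fits in 20 bits.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumed: 4 KB pages */

/* Low 16 bits of a 20-bit page-granular segment limit (sd_lolimit). */
uint16_t limit_low(uint32_t pages)
{
    return (uint16_t)(pages & 0xffff);
}

/* High 4 bits of the limit, to be OR-ed with the flag bits. */
uint8_t limit_high(uint32_t pages)
{
    return (uint8_t)((pages >> 16) & 0xf);
}

/* Number of pages needed to cover addresses 0 .. end-1, as in the
 * commented-out etext computation. */
uint32_t pages_up_to(uint32_t end)
{
    return (end + PAGE_SIZE - 1) / PAGE_SIZE;
}
```

With these assumptions, 0xffffffff / PAGE_SIZE is 0xfffff, which splits into a low part of 0xffff and a high part of 0xf, so the code segment covers the same range as the data segment.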
6. Extra Credit
Unloading a library
Since you can only load at most 3 libraries at any given time, there is a limit to the number of dynamically loaded functions that a program can use. For extra credit, implement a system call that allows a process to unload a library so that it can later load the same or a different library using load_library. To do this implement the following system call:
syscall unload_library(char* path);
This system call unloads the library specified by the path name. Any functions that are contained in the library will no longer be usable (find_library_function will return SYSERR for those functions). You may assume that any processes using functions from the library to be unloaded have coordinated to ensure that no process is using the library while it is being unloaded. If the library to be unloaded has not been loaded, then return SYSERR.
In your report, consider what might happen if a process asks for a library to be unloaded while another process is using the library. What mechanism could you add to prevent such a problem? (Hint: consider a reference count).
Loading a library archive
In operating systems such as Unix and Linux, libraries can be created and loaded from groups of files instead of just one. Statically linked libraries (also called archive libraries) can be created with the Unix ar command (see man ar for more information on the ar command). When used, the compiler statically links the symbols that your program uses and makes the code part of your program. You can see this by creating a helloworld program and compiling with gcc -static. The resulting binary will be much larger than if you don't use -static. Dynamically linked libraries (also called shared object libraries) can be created using gcc with the -shared flag. When used, the compiler just checks for the existence of the symbol at compile time and the library itself is loaded by the operating system automatically at run time.
As extra credit, expand the load_library system call to allow it to load archive libraries into the XINU address space. To do this you will need to review the standard for the .a archive library format created by the Unix ar command. This file consists of a global header and a file header for each file within the archive.
NOTE: the GNU variant of ar archives also contains a symbol table in the list of files. You will need to be able to support this.
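The archive layout can be walked with very little code. In the common ar format the file starts with the 8-byte magic "!<arch>\n", and each member is preceded by a 60-byte text header whose decimal size field occupies bytes 48 to 57 and whose last two bytes are a backquote and a newline; member data is padded to an even offset. In GNU archives the symbol table is a member named "/" and the long-name table is a member named "//". The sketch below counts the ordinary members; the function names are hypothetical and the archive is assumed to be fully in memory.

```c
#include <stdint.h>
#include <string.h>

#define AR_MAGIC     "!<arch>\n"
#define AR_MAGIC_LEN 8
#define AR_HDR_LEN   60

/* Parse the space-padded decimal size field (bytes 48..57 of a header). */
static uint32_t ar_size(const uint8_t *hdr)
{
    uint32_t n = 0;

    for (int i = 48; i < 58 && hdr[i] >= '0' && hdr[i] <= '9'; i++)
        n = n * 10 + (uint32_t)(hdr[i] - '0');
    return n;
}

/* Count the object-file members in an archive image, skipping the GNU
 * special members "/" (symbol table) and "//" (long-name table).
 * Returns the count, or -1 if the image is malformed. */
int ar_count_objects(const uint8_t *img, uint32_t len)
{
    uint32_t off = AR_MAGIC_LEN;
    int count = 0;

    if (len < AR_MAGIC_LEN || memcmp(img, AR_MAGIC, AR_MAGIC_LEN) != 0)
        return -1;

    while (off + AR_HDR_LEN <= len) {
        const uint8_t *hdr = img + off;
        uint32_t size = ar_size(hdr);

        if (hdr[58] != '`' || hdr[59] != '\n') return -1;  /* bad header end */
        if (size > len - off - AR_HDR_LEN)     return -1;  /* runs past image */

        /* "/ " is the symbol table, "//" the long-name table. */
        int special = (hdr[0] == '/' && (hdr[1] == ' ' || hdr[1] == '/'));
        if (!special)
            count++;            /* member data starts at hdr + AR_HDR_LEN */

        off += AR_HDR_LEN + size;
        off += off & 1;         /* members are 2-byte aligned */
    }
    return count;
}
```

A loader would replace the count++ line with a call that treats the member data as one ELF object file and adds its functions to the same library entry.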
NOTE: a single .a file, even though it contains multiple files within it, is still considered (for the purposes of load_library) to be a single file. You will have to be able to support the loading of up to three .a files that each consist of multiple object files.
NOTE: For the purposes of this lab, a single .a file will still contain only a maximum of 10 functions within the files (as is stated in the load_library description above). For example, a single .a file might contain the following 2 object files:
- myadd - containing the add1 and add2 functions
- mysub - containing sub1 and sub2, which behave the same as add1 and add2 yet subtract instead of add
This is to be considered a single library file containing 4 functions.
To create your own .a for testing use the following:
ar rcs LIBRARY_NAME FILE1 FILE2 ...
Where:
LIBRARY_NAME represents the library file to be created
FILE1, FILE2, etc. are the object files
A more concrete example:
ar rcs mymath.a myadd mysub
This creates a single archive library called mymath.a with the two object files myadd and mysub. This file can then be loaded into XINU using the system call:
load_library("mymath.a");
More information on the Unix ar command and the file format can be found here and here.
7. Useful Commands
There are a couple of commands available on the xinu lab frontend machines that you might find useful for this lab. For example, you can use the readelf command to print out the symbols in a xinu.elf file:
$ readelf -a xinu.elf
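The symbol information that readelf prints (readelf -s limits the output to the symbol table) can also be walked in C once the file is in memory. The sketch below declares the ELF32 structures from the ELF specification and searches .symtab for a name; elf_find_symbol is a hypothetical helper, bounds checks are omitted for brevity, and a little-endian host is assumed.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t  e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version, e_entry, e_phoff, e_shoff, e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum;
    uint16_t e_shentsize, e_shnum, e_shstrndx;
} Elf32_Ehdr;

typedef struct {
    uint32_t sh_name, sh_type, sh_flags, sh_addr, sh_offset;
    uint32_t sh_size, sh_link, sh_info, sh_addralign, sh_entsize;
} Elf32_Shdr;

typedef struct {
    uint32_t st_name;    /* offset of the name in the string table */
    uint32_t st_value;   /* address (or section offset) of the symbol */
    uint32_t st_size;
    uint8_t  st_info;
    uint8_t  st_other;
    uint16_t st_shndx;
} Elf32_Sym;

#define SHT_SYMTAB 2

/* Find 'name' in the symbol table of an ELF32 image held in memory.
 * On success stores st_value in *value and returns 0; returns -1 if
 * there is no symbol table or no such symbol. */
int elf_find_symbol(const uint8_t *img, const char *name, uint32_t *value)
{
    Elf32_Ehdr eh;

    memcpy(&eh, img, sizeof(eh));
    for (uint32_t i = 0; i < eh.e_shnum; i++) {
        Elf32_Shdr sh, str;

        memcpy(&sh, img + eh.e_shoff + i * eh.e_shentsize, sizeof(sh));
        if (sh.sh_type != SHT_SYMTAB) continue;

        /* sh_link names the section holding this table's strings. */
        memcpy(&str, img + eh.e_shoff + sh.sh_link * eh.e_shentsize,
               sizeof(str));
        const char *strtab = (const char *)(img + str.sh_offset);

        for (uint32_t k = 0; k < sh.sh_size / sizeof(Elf32_Sym); k++) {
            Elf32_Sym sym;

            memcpy(&sym, img + sh.sh_offset + k * sizeof(sym), sizeof(sym));
            if (strcmp(strtab + sym.st_name, name) == 0) {
                *value = sym.st_value;
                return 0;
            }
        }
    }
    return -1;
}
```

Since xinu.elf only needs to be read once per run, the names and values found this way could be cached in a table for later relocation passes.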
Lab Submission
What to turn in
Midway Submission
You will be required to perform a midway submission (see due date above). In your midway submission include all of your source code for XINU, any programs and/or libraries you have written, and a midway progress report in PDF format in the system directory called lab7_progress.pdf containing the following:
- What have you been able to implement and get working up until now?
- What do you plan on implementing before the final due date?
- What problems have you run into while performing your implementation? What did you do to solve those problems?
To turn in your lab for midway submission, use the following command:
turnin -c cs503 -p lab7_mid xinu-fall2016-lab7
assuming xinu-fall2016-lab7 is the name of the directory containing your code.
If you wish to, you can verify your submission by typing the following command:
turnin -v -c cs503 -p lab7_mid
Do not forget the -v above, as otherwise your earlier submission will be erased (it is overwritten by a blank submission).
Note that resubmitting overwrites any earlier submission and erases any record of the date/time of any such earlier submission.
We will check that the submission time stamp is before the due date; any submission past the due date will have the appropriate number of grace days deducted. If the submission is beyond your remaining number of grace days, your work will not be accepted.
Final Submission
Using the turnin command, submit your complete source code (all of XINU) including any files you added to complete the lab. In the system directory include a PDF file called lab7_analysis.pdf with a report discussing:
- The details behind your implementation. As part of this discussion write answers to the following questions:
- The separation between a Xinu ELF file and the running image leads to a potential problem: if the ELF file is changed (Xinu sources are recompiled) after an image starts to run, symbol table addresses in the ELF file may no longer match the locations of items in the running image.
- How can you ensure the ELF file read at run time matches the image that is executing?
- What is the most difficult aspect of the project? Why?
To turn in your lab use the following command:
turnin -c cs503 -p lab7_final xinu-fall2016-lab7
assuming xinu-fall2016-lab7 is the name of the directory containing your code.
If you wish to, you can verify your submission by typing the following command:
turnin -v -c cs503 -p lab7_final
Do not forget the -v above, as otherwise your earlier submission will be erased (it is overwritten by a blank submission).
Note that resubmitting overwrites any earlier submission and erases any record of the date/time of any such earlier submission.
We will check that the submission time stamp is before the due date; any submission past the due date will have the appropriate number of grace days deducted. If the submission is beyond your remaining number of grace days, your work will not be accepted.