
Lab02 : Transfer large blocks of text

  1. Design a protocol to send arbitrary size blocks of text
  2. Handle partially received strings
  3. Handle endianness

Due on: Jan 28th, 11:59pm

In this lab, you will implement a protocol that sends a sequence of messages between two applications. Each message consists of an arbitrary block of text. The protocol must handle partially received strings, and must be able to interoperate across arbitrary (heterogeneous) computers. Think carefully about interoperation when designing your code.

You will implement both a client application and a server application. You can use last week's echo client and echo server as an example. We encourage you to examine the code for the echo client and echo server carefully before beginning this week's lab. If you do not understand the code, you will not be able to build the required extension.

You will also use the simplified API for this lab. Further details about the API are described in Appendix 1 of the textbook.

  1. To connect to a remote system: ssh [Put Purdue Account Name Here]@[Put Target Computer Here]
    1. For example: ssh arastega@xinu01.cs.purdue.edu
    2. Note: The target computer should be one of the xinu machines (xinu01-xinu21)
  2. You will be prompted for your password
  3. Once verified, you gain access to the remote system
  4. You should already have a directory for the course under your home directory (if you completed lab01 as instructed)
  5. Change to the course directory (the lab02 directory is created when you extract the archive below):
    cd ~/cs422
  6. Download lab02.tar.gz with wget under ~/cs422/. Note that option '-O' is the letter O and must be capitalized.
    cd ~/cs422/
    wget http://courses.cs.purdue.edu/_media/cs42200:spring19:lab02.tar.gz -O lab02.tar.gz
  7. Unzip the downloaded file.
    tar xvf lab02.tar.gz
  8. Navigate to the apps directory under lab02.
     cd ~/cs422/lab02/apps 
  9. You will find two empty files in this directory: server.c and client.c. Your server and client code goes in these files. Use echoserver.c and echoclient.c as guides for writing the code for this lab.
  10. To compile the files:
    cd ~/cs422/lab02/compile_linux/
    make server client 
  11. If compilation succeeds without errors, this generates two executables: server and client.

In this lab, your client will receive a series of paragraphs as input, and will send each paragraph as a separate message. A paragraph can contain multiple sentences; each paragraph starts with a <tab> ('\t') and ends with an empty line [i.e. two newlines ('\n')]. An End-Of-File on the input will also terminate the last paragraph. On the server side, the server will extract and print each paragraph, preceding the paragraph with a line of the form “Received paragraph >”. Here are examples of the input to the client and output from the server:

# At the server
$ ./server SERVER_APPLICATION_NUMBER

# At the client
$ cat smalltext.txt
   Hello, world!
World is fresh and new.

   Testing sentence 1
Testing sentence 2
Testing sentence 3

$ cat smalltext.txt | ./client SERVER_IP SERVER_APPLICATION_NUMBER

# At the server
Received paragraph >
   Hello, world!
World is fresh and new.

Received paragraph >
   Testing sentence 1
Testing sentence 2
Testing sentence 3

# At the client
$ cat longtext.txt
   VERY LONG PARAGRAPH1 .....
   
   VERY LONG PARAGRAPH2 .....
   
$ cat longtext.txt | ./client SERVER_IP SERVER_APPLICATION_NUMBER

# At the server
Received paragraph >
   VERY LONG PARAGRAPH1 .....
   
Received paragraph >
   VERY LONG PARAGRAPH2 .....


# At the client
$ cat testtext.txt
   Para1 line1
Para1 line2
Para1 line3

Para2 line1
Para2 line2

   Para3 line1
Para3 line2


$ cat testtext.txt | ./client SERVER_IP SERVER_APPLICATION_NUMBER


# At the server

Received paragraph >
   Para1 line1
Para1 line2
Para1 line3

Received paragraph >
   Para3 line1
Para3 line2

Note that in the last example, the two lines starting with Para2 are ignored because a paragraph must start with a tab.
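
The paragraph rules illustrated by these examples can be sketched as a small state machine over the input characters. This is only one possible approach: the function name read_paragraph is hypothetical, and the choice to leave the terminating empty line out of the returned text is an assumption, not something the lab specification requires.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Read the next paragraph from `in` into buf (capacity cap bytes).
 * Returns the paragraph length in bytes, or 0 when no paragraph is left.
 * Assumptions (not from the lab text): the empty line that ends a
 * paragraph is consumed but not stored, and bytes beyond cap are dropped.
 */
size_t read_paragraph(FILE *in, char *buf, size_t cap)
{
    size_t len = 0;
    int in_para = 0;        /* currently inside a paragraph? */
    int at_line_start = 1;  /* is the next character the first on its line? */
    int c;

    while ((c = fgetc(in)) != EOF) {
        if (!in_para) {
            if (at_line_start && c == '\t') {
                in_para = 1;                  /* a tab at line start opens a paragraph */
            } else {
                at_line_start = (c == '\n');
                continue;                     /* skip text outside any paragraph */
            }
        } else if (at_line_start && c == '\n') {
            return len;                       /* an empty line closes the paragraph */
        }
        if (len < cap)
            buf[len++] = (char)c;
        at_line_start = (c == '\n');
    }
    return len;  /* End-Of-File also terminates the last paragraph */
}
```

Calling read_paragraph(stdin, buf, sizeof buf) in a loop until it returns 0 would yield one paragraph per call and silently skip lines such as the Para2 lines above.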

In lab01, the echo server and echo client used a NEWLINE ('\n') to mark the end of each sentence. The same trick will not work this time, because a paragraph may contain multiple lines, each ending with a NEWLINE. Therefore, the server and the client must agree on how to specify the boundaries of each block (that is, each paragraph). For lab02, you will design a protocol that sends the following, so that the server knows how many bytes it must read:

  1. the number of bytes in the paragraph, and
  2. the actual paragraph text.

You may be wondering why we need to send the number of bytes – doesn't recv() return the whole received text at once? Unfortunately it does not. If you send a very long message, recv() may return fragmented text. Therefore, you must use a loop to receive the remaining bytes of the message. However, if you don't know how many bytes remain, you won't know whether you have received all the bytes for a paragraph or not. Therefore, the receiver must know how many bytes remain to be read.
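
A receive loop of the kind described here might look like the following minimal sketch. The helper name recv_all is hypothetical, and the code assumes the connection is an ordinary stream socket descriptor that can be passed to recv(); check how the simplified API from the textbook represents a connection before reusing it.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Receive exactly len bytes into buf, calling recv() as many times as needed.
 * Returns 0 on success, or -1 if recv() fails or the peer closes the
 * connection before all len bytes have arrived.
 */
int recv_all(int sock, void *buf, size_t len)
{
    char *p = buf;

    while (len > 0) {
        ssize_t n = recv(sock, p, len, 0);  /* may return fewer than len bytes */
        if (n <= 0)
            return -1;                      /* error (n < 0) or connection closed (n == 0) */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}
```

The loop only works because the caller passes in the number of bytes still expected, which is exactly why the protocol sends the length first.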

So the client will send each paragraph in the following format:

  1. an unsigned 32-bit integer (the length of the paragraph in bytes)
  2. the actual characters of the paragraph

For example, if a client wants to send

    Hello, world!
World is fresh and new.

the client will compute the length of the paragraph (15 + 24 = 39 bytes), send the length as a 32-bit integer (0x00 0x00 0x00 0x27), and then send the actual text:

0x00 0x00 0x00 0x27 '\t' 'H' 'e' 'l' 'l' 'o' ',' ' ' 'w' 'o' 'r' 'l' 'd' '!' '\n' 'W' 'o' 'r' 'l' 'd'
' ' 'i' 's' ' ' 'f' 'r' 'e' 's' 'h' ' ' 'a' 'n' 'd' ' ' 'n' 'e' 'w' '.' '\n'

On the other side, the server will read the first 4 bytes as a 32-bit integer, and then call recv() repeatedly until it has received all the remaining bytes of the paragraph.
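
The message layout from this example can be checked with a short sketch that builds the bytes in memory. The names encode_message, decode_length and HEADER_SIZE are hypothetical, and the code relies on htonl() and ntohl(), the byte-order conversion functions covered in the endianness discussion of this lab.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HEADER_SIZE 4  /* bytes occupied by the length prefix */

/*
 * Write the 4-byte length prefix followed by the paragraph text into out.
 * out must have room for HEADER_SIZE + len bytes.
 * Returns the total number of bytes written.
 */
size_t encode_message(const char *text, uint32_t len, unsigned char *out)
{
    uint32_t net_len = htonl(len);        /* length in Network Byte Order */

    memcpy(out, &net_len, HEADER_SIZE);
    memcpy(out + HEADER_SIZE, text, len);
    return (size_t)HEADER_SIZE + len;
}

/* Recover the paragraph length from the first 4 bytes of a message. */
uint32_t decode_length(const unsigned char *msg)
{
    uint32_t net_len;

    memcpy(&net_len, msg, HEADER_SIZE);   /* copy first to avoid unaligned access */
    return ntohl(net_len);
}
```

For the 39-byte paragraph above, encode_message() produces 43 bytes whose first four are 0x00 0x00 0x00 0x27, and decode_length() on that buffer returns 39.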

You must be careful when sending a 32-bit integer, because different machines may store multibyte integers in different byte orders. This property is called endianness. If you send() a two-byte short int from an Intel machine to a PowerPC, what one computer thinks is the number 1, the other will think is the number 256, and vice versa. To make integers compatible between different machines, you must use Network Byte Order, which is big-endian, when sending multibyte integers. Intel machines are little-endian, so on them multibyte integers must be converted. There are four functions that perform the conversion:

#include <netinet/in.h>
 
uint32_t htonl(uint32_t hostlong);
uint16_t htons(uint16_t hostshort);
uint32_t ntohl(uint32_t netlong);
uint16_t ntohs(uint16_t netshort);

In the function names, “h” stands for host, which is your machine, and “n” stands for network, meaning Network Byte Order. “l” and “s” stand for “long” (32 bits) and “short” (16 bits) respectively. So if you want to convert your host machine's uint32_t to Network Byte Order, use htonl(). Note that htonl() actually swaps the bytes on Intel machines, but does nothing on PowerPC machines because their integers are already in Network Byte Order. So if you use htonl() before sending integers and ntohl() after receiving them, your application will work correctly across different machines.
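
The "1 versus 256" problem can be reproduced on a single machine by reading the same two bytes in both orders. The helper names read_le16, read_be16 and wire_bytes16 are made up for this illustration; only htons() comes from the system headers.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Interpret two bytes the way a little-endian (e.g. Intel) host would. */
uint16_t read_le16(const unsigned char b[2])
{
    return (uint16_t)(b[0] | (b[1] << 8));
}

/* Interpret two bytes the way a big-endian (e.g. PowerPC) host would. */
uint16_t read_be16(const unsigned char b[2])
{
    return (uint16_t)((b[0] << 8) | b[1]);
}

/* Store the bytes of htons(value) in out, in the order they would be sent. */
void wire_bytes16(uint16_t value, unsigned char out[2])
{
    uint16_t net = htons(value);

    memcpy(out, &net, sizeof net);
}
```

On any host, wire_bytes16(1, out) yields the bytes 0x00 0x01. Reading them with read_be16() gives 1, while read_le16() gives 256, which is the value a little-endian receiver would see if it skipped ntohs().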

Client requirements:

  1. Client takes two arguments: the server IP address and the application number.
    ./client SERVER_IP_ADDR SERVER_APPLICATION_NUMBER
  2. Client reads multiple paragraphs of text from standard input. Each paragraph starts with a <tab> and ends with an empty line (two newlines).
  3. Keep sending until the socket fails or standard input reaches End-Of-File.
  4. You may need a fixed-size buffer to store text before sending it. You can safely assume that the size of each paragraph will not exceed 1,000,000 bytes.
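
On the sending side, send() can also accept fewer bytes than requested, so a loop is a reasonable precaution. The sketch below uses hypothetical helper names (send_all, send_paragraph) and assumes the connection is an ordinary stream socket descriptor; adapt it to whatever connection type the simplified API provides.

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Call send() repeatedly until all len bytes are written. Returns 0 or -1. */
int send_all(int sock, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = send(sock, p, len, 0);  /* may write fewer than len bytes */
        if (n <= 0)
            return -1;                      /* error, or no progress was made */
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send one paragraph: 4-byte length in Network Byte Order, then the text. */
int send_paragraph(int sock, const char *text, uint32_t len)
{
    uint32_t net_len = htonl(len);

    if (send_all(sock, &net_len, sizeof net_len) < 0)
        return -1;
    return send_all(sock, text, len);
}
```

A 1,000,000-byte buffer fits comfortably in a uint32_t length, so the paragraph size limit stated above does not constrain this header format.
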

Server requirements:

  1. Server takes one argument: the application number.
    ./server SERVER_APPLICATION_NUMBER
  2. Server will recv() each paragraph using the specified protocol and will print it, preceded by a “Received paragraph >” header line.
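
Printing a received paragraph needs some care because the received bytes are not NUL-terminated. The following sketch uses a hypothetical helper, print_paragraph, and assumes the message does not include the terminating empty line, so the server prints the blank separator line itself, as in the sample output.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Print one received paragraph to out, preceded by the header line.
 * fwrite() is used with an explicit length because the received
 * bytes carry no terminating NUL character.
 */
void print_paragraph(FILE *out, const char *text, size_t len)
{
    fputs("Received paragraph >\n", out);
    fwrite(text, 1, len, out);
    fputc('\n', out);  /* blank line between paragraphs (assumed not part of the message) */
    fflush(out);       /* make output visible immediately, even when piped */
}
```

If your protocol does include the empty line in the message, drop the fputc() call so the output is not double-spaced.
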

Testing large paragraphs:

  1. You must test sending large paragraphs (over 100K bytes) from xinuXX to another machine (e.g. mooreXX).
  2. Here are some example commands that will help you create large text files:
    echo -e -n "\t" > testfile_1K; for ((i=0;i<100;i++)); do echo "$i-123456789" >> testfile_1K; done; echo "" >> testfile_1K
    echo -e -n "\t" > testfile_10K; for ((i=0;i<1000;i++)); do echo "$i-123456789" >> testfile_10K; done; echo "" >> testfile_10K
    echo -e -n "\t" > testfile_100K; for ((i=0;i<10000;i++)); do echo "$i-123456789" >> testfile_100K; done; echo "" >> testfile_100K
    for((i=0;i<100;i++)); do echo -e -n "\t" >> testfile_1K_100par; for((j=0;j<100;j++)); do echo "$j-123456789" >> testfile_1K_100par; done; echo -e "--para$i--\n" >> testfile_1K_100par; done

Testing endianness:

  1. You can use lore.cs.purdue.edu and xinuXX.cs.purdue.edu to test endianness; the two machines use different byte orders.
  2. To compile the code on the lore machine, navigate to the compile_solaris directory:
    cd ~/cs422/lab02/compile_solaris/
    make server client
  3. DO NOT SEND A LARGE PARAGRAPH BETWEEN lore.cs.purdue.edu and xinuXX.cs.purdue.edu! The lore machine is not designed to support such tests. Test only short paragraphs.

Use the turnin command to submit your whole lab02 directory.

cd ~/cs422
turnin -c cs422 -p lab02 lab02

You can verify your submission with turnin -v:

turnin -c cs422 -p lab02 -v

Grading Rubric (tentative)                      Points
TA Test Cases                                   +15
Organization, Coding style, Commenting, etc.    +3
Code compiles without errors and runs           +2
Total                                           +20

  • cs42200/spring19/labs/lab02.txt
  • Last modified: 2019/01/24 10:49
  • by arastega