Build your own web server in a few simple steps

Self Made

© Pavel Ignatov, 123RF.com

Article from Issue 262/2022
Author(s):

If you want to learn a little bit more about the communication between a web browser and an HTTP server, why not build your own web server and take a closer look?

Programming your own web server might seem like a difficult and unnecessary undertaking. Any number of freely available web servers exist in the Linux space, from popular all-rounders like Apache or NGINX to lightweight alternatives like Cherokee or lighttpd (pronounced "lighty").

But sometimes you don't need a full-blown web server. If you just want to share a couple of HTML pages locally on your own network or offer people the ability to upload files, Linux on-board tools are all it takes. A simple shell script is fine as a basic framework that controls existing tools from the GNU treasure chest. Network communication is handled by Netcat [1], aka the Swiss army knife of TCP/IP.

Getting Ready

With a project like this, the best place to start is at the root. Because a web server is still a server at the end of the day, it needs to constantly listen on a given port and respond appropriately to requests. Web servers usually listen on port 80 for plain HTTP requests without encryption; encrypted HTTPS traffic conventionally uses port 443. The web server I'll describe in this article listens on ports 8080 and 8081 and communicates without encryption. If you are using a firewall and want to test the server on the local network, remember to allow these two ports in the firewall.

A web server needs a root folder from which it loads the requested HTML files. It also needs a directory in which it can store uploaded files. Your first step is to define a configuration using a series of simple variables at the start of the server script (Listing 1). And you need to create the directories, along with a FIFO file, either manually or using the Bash test builtin. The server6.sh script, which is included with the code from this article [2], offers a solution.

Listing 1

Configuration

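# Root folder for HTML files and the upload directory below it
HTTP_HOME=http_home
HTTP_UPLOAD=${HTTP_HOME}/upload
CACHE_DATEI=${HTTP_UPLOAD}/filetoprocess
# Named pipe (FIFO) that connects Netcat and the response function
FIFO_GET=fifo_get
HTTP_GET_PORT=8080
HTTP_POST_PORT=8081
# First IPv4 address of the interface; replace enp2s0 with your own device
MEINE_IP=$(ip addr show enp2s0 | grep -Eo "([0-9]{1,3}\.){3}[0-9]+" | sed 1q)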
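The directories and the FIFO file could be created with the test builtin roughly as follows. This is only a minimal sketch that reuses the variable names from Listing 1; the helper name prepare_environment is hypothetical and is not taken from server6.sh.

```bash
#!/usr/bin/env bash
# Names follow Listing 1; prepare_environment is a hypothetical helper.
HTTP_HOME=http_home
HTTP_UPLOAD=${HTTP_HOME}/upload
FIFO_GET=fifo_get

prepare_environment () {
  # -d tests for an existing directory; mkdir -p also creates the parent
  [ -d "$HTTP_UPLOAD" ] || mkdir -p "$HTTP_UPLOAD"
  # -p tests for an existing named pipe (FIFO)
  [ -p "$FIFO_GET" ] || mkfifo "$FIFO_GET"
}

prepare_environment
```

Because each step is guarded by a test, the function can be called on every start of the server without producing errors for objects that already exist.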

In the last line of Listing 1, you can see that your own IP address is also important. You will need to modify the network device specification (the Ethernet interface enp2s0 in this example) to suit your own system. The address matters because a web browser that submits a file via a web form needs a target address. The simplest requests to handle, however, are GET requests. When a browser sends a GET request, it expects the content of a web page in response, and it displays this content in the browser window.

You'll also need to create some sample HTML files for testing your homegrown server. (See the box entitled "Sample Files.")

Sample Files

Files for testing the web server are easily scripted. The function in Listing 2 runs through a for loop seven times. The routine uses a here document (heredoc) to support the entry of HTML code almost 1:1 (third line). Heredocs let you refer to the variable set in the for statement, which then simply contains the sequence number.

Heredocs help to define sections of text in many programming languages. Unlike conventional output via echo or printf, line breaks, indents, and some special characters are preserved in the text. Bash also supports the use of variables in heredocs.

In this way, you can create as many HTML files as you need with just a few lines of code. You could optionally integrate additional dynamic content that you generate with a script within the heredoc.

Listing 2

Creating Sample Files

function create_files () {
  for x in {1..7}; do
    cat <<FILE > "${HTTP_HOME}/datei${x}.html"
<html><head><meta charset="utf-8">
<title>Page ${x}</title>
</head><body>
<p> $( date ) </p>
<p> Page ${x} </p>
</body></html>
FILE
  done
}
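To see how variable expansion in a heredoc behaves, the following short sketch compares an unquoted delimiter with a quoted one. The variable names title, expanded, and literal are made up for this illustration.

```bash
#!/usr/bin/env bash
title="Page 3"

# Unquoted delimiter: variables and command substitutions are expanded.
expanded=$(cat <<EOF
<title>${title}</title>
EOF
)

# Quoted delimiter: the text is passed through literally.
literal=$(cat <<'EOF'
<title>${title}</title>
EOF
)

printf '%s\n' "$expanded" "$literal"
```

The quoted form is useful when the HTML itself contains dollar signs or backticks that Bash should not interpret.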

GET Requests

Responding to a GET request entails much more than just sending the content of a file. HTTP and HTTPS require that additional information be sent along with the transmission. If you want to know what a response from a genuine web server looks like, type the following command:

wget --spider -S "https://www.zeit.de/index"

The wget utility downloads a web page from the terminal. The --spider option tells wget to behave like a web spider; in other words, it won't download the actual content but will check that the content is there and will receive the transmission information associated with an HTTP request.

In the first line, the server confirms that it is happy to take the HTTP request: HTTP/1.1 200 OK. Further lines in the form of key-value pairs (such as Connection: keep-alive or Content-Length: 300) are used to send back additional information or instructions.

It also appears that this service is a well-secured web server, because it does not reveal precisely what kind of server program it is. Many servers identify themselves at this point with a line such as Server: nginx, which is not advisable, because such disclosures make things easier for attackers. If you want Netcat to behave like a genuine web server, you'll need a way to generate this HTTP header information.
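One possible way to generate such a header in Bash is a small function built on printf. This is a sketch, not the approach used in the listings of this article: the function name http_header is hypothetical, and it ends each header line with CRLF as the HTTP/1.1 specification asks, whereas the listings rely on browsers tolerating plain newlines.

```bash
#!/usr/bin/env bash
# http_header STATUS LENGTH  (hypothetical helper)
http_header () {
  local status=$1 length=$2
  printf 'HTTP/1.1 %s\r\n' "$status"
  printf 'Connection: close\r\n'
  printf 'Content-Length: %s\r\n' "$length"
  printf '\r\n'          # empty line separates header and body
}

body='<p>Hello</p>'
# ${#body} counts characters, which equals the byte count for ASCII text only
http_header "200 OK" "${#body}"
printf '%s' "$body"
```

For content with non-ASCII characters, the length would have to be determined in bytes, for example with wc --bytes as in the listings.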

Netcat

Netcat is available on virtually any Linux system and can be used for many purposes given a little creativity on the user's part, although it admittedly has some limitations. You can emulate basic network operations using Netcat, but complex interactions are difficult or impossible. You definitely don't want to try to compete with Apache or NGINX just using Netcat.

If you want Netcat to permanently listen on a port and also send different responses, you have to combine it in a loop with a FIFO file. FIFO refers to the "first in, first out" principle. This means that the information comes back out of the file in the same order in which it was sent in [3]. Listing 3 shows an example.

Listing 3

Netcat Response

# Some Netcat variants expect the port as: netcat -l -p PORT
while true; do
  respond < "$FIFO_GET" | netcat -l "$HTTP_GET_PORT" > "$FIFO_GET"
done
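The first in, first out behavior of a FIFO file can be tried out without Netcat. The following sketch is an illustration only; the file name demo_fifo and the variable names are assumptions. It writes three lines into a named pipe in the background and reads them back in the same order.

```bash
#!/usr/bin/env bash
workdir=$(mktemp -d)
fifo="$workdir/demo_fifo"
mkfifo "$fifo"

# The writer blocks until a reader opens the pipe, so it runs in the background.
printf '%s\n' first second third > "$fifo" &

# The reader receives the lines in exactly the order they were written.
received=$(cat "$fifo")
wait
rm -rf "$workdir"

printf '%s\n' "$received"
```

Unlike a regular file, the FIFO stores nothing on disk; the data exists only while a writer and a reader are connected.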

The FIFO file closes the loop between Netcat and the respond function, which is shown in Listing 4. Netcat listens on the specified port and writes the incoming request to the FIFO file. On the left side of the pipe, you can see the call to the function that reads the browser request from the FIFO. It evaluates the request and then sends a matching response, containing an HTTP header and HTML data, back through the pipe to Netcat. The respond function decides what to return to the browser.

Listing 4

FIFO File

01 function respond () {
02   read -r get_or_post address httpversion
03   if [ "${#address}" -eq 1 ]; then
04     list_dir
05   elif [ "${#address}" -gt 1 ]; then
06     return_file "$address"
07   fi
08 }

This variant is already a fairly powerful solution. If the requested address has a length of 1 (line 3), it can only be /, and the server returns a directory listing. If the address is longer than one character, the server returns the content of the matching file from the root directory. To get the web server to return a list of the files contained in the root folder, a very simple ls directory_name is all that is needed. However, the results then need to be embedded in suitable HTML code so that the links work and the browser can actually use them (Figure 1). The sed [4] stream editor is recommended for converting a directory listing into HTML code.
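The way read splits the first line of a request can be checked in isolation. The sketch below feeds a sample request line to read; the sample line and the variable names method, path, version, and action are assumptions made for this illustration. Browsers end each line with a carriage return and a newline, so the last field carries a trailing carriage return that is stripped here.

```bash
#!/usr/bin/env bash
# Sample request line as a browser would send it (with trailing carriage return)
request_line=$'GET /datei1.html HTTP/1.1\r'

# read splits on whitespace: method, requested path, protocol version
read -r method path version <<<"$request_line"
version=${version%$'\r'}    # remove the trailing carriage return

if [ "${#path}" -eq 1 ]; then
  action="list directory"
else
  action="return file ${path:1}"   # ${path:1} drops the leading slash
fi

printf '%s %s %s -> %s\n' "$method" "$path" "$version" "$action"
```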

Figure 1: The DIY web server returns a listing of the root directory content.

Listing 5 shows the functions referenced in Listing 4. In the list_dir function, the directory content is output with a simple ls command. Sed then converts the results into plain vanilla HTML. The files generated by the function from Listing 2, which reside in the root directory, already contain HTML code. The server uses the return_file function in line 19 of Listing 5 to send a file back to the browser with a matching header.

Listing 5

Output

01 function list_dir () {
02   local output=$( ls --hide=upload -1 "$HTTP_HOME" | sed -r '
03   1 i<html><head><meta charset="utf-8"><title>Content</title></head>\
04   <body style="margin: 45px; font-family: sans-serif">
05   s#(.*)#<li><a href="\1">\1</a></li>#
06   $ a</body></html>
07   ' )
08
09   local content_length="Content-Length: $( cat <<<"$output" | wc --bytes )"
10
11   cat <<<"$output" | sed '
12   1 i HTTP/1.1 200 OK
13   1 i Server: Your GET SERVER
14   1 i Connection: close
15   1 i '"$content_length"'\n
16   '
17 }
18
19 function return_file () {
20   content=$( cat "${HTTP_HOME}/${1:1}" )
21   if [[ $? -eq 0 ]]; then
22     laenge=$( cat <<<"${content}" | wc --bytes )
23     cat <<<"${content}" | sed -r '
24       1 i HTTP/1.1 200 OK
25       1 i Server: Your GET SERVER
26       1 i Connection: close
27       1 i Content-Length: '"$laenge"'\n'
28   else
29     cat <<-ERROR
30       HTTP/1.1 404 Not Found
31       Connection: close
32       Content-Length: 42
33
34       The requested page does not exist, sorry!
35   ERROR
36   fi
37 }
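The substitution that turns file names into links can be tested on its own. This sketch feeds two sample file names, chosen to match the files from Listing 2, through the same sed expression; it uses -E, which GNU sed treats as equivalent to -r.

```bash
#!/usr/bin/env bash
# Two sample file names stand in for the output of ls
listing=$( printf '%s\n' datei1.html datei2.html |
  sed -E 's#(.*)#<li><a href="\1">\1</a></li>#' )

printf '%s\n' "$listing"
```

The # character serves as the delimiter of the s command, so the slashes in the closing HTML tags do not need to be escaped.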

Because Netcat is continuously available for requests in the loop and sends a header and the corresponding HTML, a browser in the local network thinks it is dealing with a real web server.

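However, it can also happen that the user manually requests a page in the browser that does not exist. This leads to the infamous 404 error, which you have probably seen on the web before [5]. The custom web server can handle this case, too. If the cat command in the first line of the return_file function (line 20) throws an error, the else branch starting at line 28 is executed. The web browser then displays a message that the requested page does not exist. Note that the <<- heredoc operator strips only leading tab characters, so the lines of the heredoc must be indented with tabs rather than spaces; otherwise the closing ERROR delimiter is not recognized. The fixed Content-Length of 42 corresponds to the 41 characters of the message plus the final newline.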
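Instead of hard-coding the length of the error message, it could also be computed. The following is a minimal sketch under that assumption; the function name not_found is hypothetical and does not appear in the listings.

```bash
#!/usr/bin/env bash
# not_found: hypothetical helper that prints a complete 404 response
not_found () {
  local body='The requested page does not exist, sorry!'
  # wc -c counts the bytes of the text; +1 covers the newline printed after it
  local length=$(( $(printf '%s' "$body" | wc -c) + 1 ))
  printf 'HTTP/1.1 404 Not Found\n'
  printf 'Connection: close\n'
  printf 'Content-Length: %s\n\n' "$length"
  printf '%s\n' "$body"
}

not_found
```

With this variant, the message text can be changed without having to recount the characters by hand.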
