amccormack.net

Things I've learned and suspect I'll forget.

Improving Justin Seitz's proxy.py from Black Hat Python

Over the holiday I've been reading Justin Seitz's book Black Hat Python. The book has a lot of great ideas on how to incorporate Python into pentesting work. While the ideas in the code are good and the examples run properly, I was disappointed that Seitz doesn't use more of the goodies baked into Python's standard library, which can make code much more readable and extensible. This post shows how I would have written the proxy.py tool Seitz introduces in chapter 2.

As someone who routinely uses Python in my day-to-day work, I find myself continuously updating and improving scripts I wrote a while ago. Two of my favorite modules for making code extensible are argparse and logging.

argparse was introduced in Python 2.7, so I understand that it can make portability to older interpreters difficult, and you may not want to use it for that reason. However, I have found it to be such a strong argument parser, and so easy to read, that I would rather write my code assuming I can use it and backport later if I have to. argparse almost always saves you lines of code and makes your argument handling easier to read and extend. Even better, it has built-in support for help and documentation, default values, and optional arguments. I also like that you never have to check the argument count by hand.

logging was introduced in Python 2.3, so there is little worry about backwards compatibility. I like the logging module because it gives fine control over printing logging or debugging information without scattering print statements and global if statements through the code. Additionally, it offers benefits like standard output formatting, the ability to output to the console or to a file, and much, much more. I tend to use logging by default, simply because it makes adding a --verbose flag extremely simple.

Overview of changes

Throughout this post we'll look at the following set of changes:

  1. Introduced argparse for Argument Parsing
  2. Move main() to launch only if actually main, not if module loaded
  3. Changed print messages to use logging module
  4. Added verbose flag to arguments

1. Introduced argparse for Argument Parsing

As I stated before, I wanted to use argparse instead of parsing sys.argv manually because it makes the code easier to read and also makes it much easier to add optional arguments or parameters should I need to extend the code in the future. Here is the diff produced when I added argparse to the original proxy.py:

diff --git a/proxy.py b/proxy.py
index 87ac686..c599111 100644
--- a/proxy.py
+++ b/proxy.py
@@ -1,6 +1,7 @@
 import sys
 import socket
 import threading
+import argparse



@@ -148,31 +149,15 @@ def server_loop(local_host,local_port,remote_host,remote_port,receive_first):

 def main():

-    # no fancy command line parsing here
-    if len(sys.argv[1:]) != 5:
-        print "Usage: ./proxy.py [localhost] [localport] [remotehost] [remoteport] [receive_first]"
-        print "Example: ./proxy.py 127.0.0.1 9000 10.12.132.1 9000 True"
-        sys.exit(0)
-
-    # setup local listening parameters
-    local_host  = sys.argv[1]
-    local_port  = int(sys.argv[2])
-
-    # setup remote target
-    remote_host = sys.argv[3]
-    remote_port = int(sys.argv[4])
-
-    # this tells our proxy to connect and receive data
-    # before sending to the remote host
-    receive_first = sys.argv[5]
-
-    if "True" in receive_first:
-           receive_first = True
-    else:
-           receive_first = False
-
+    parser = argparse.ArgumentParser()
+    parser.add_argument('localhost')
+    parser.add_argument('localport',type=int)
+    parser.add_argument('remotehost')
+    parser.add_argument('remoteport',type=int)
+    parser.add_argument('--receivefirst', action='store_true')
+    args = parser.parse_args()
     # now spin up our listening socket
-    server_loop(local_host,local_port,remote_host,remote_port,receive_first)
+    server_loop(args.localhost, args.localport, args.remotehost, args.remoteport, args.receivefirst)

 main()

The entire argument parsing operation takes place in a few very easy to read lines (shown below). Notice the following:

  1. Arguments are named by the add_argument method. After adding the argument to the parser, the parsed values can be accessed as attributes of the args variable.
  2. Argument types are defined when the argument is added. By default, an argument is parsed as a string, but by specifying type=int we can tell argparse to call int(value) when parsing the argument.
  3. Optional flags can be specified and given a default value. The -- in --receivefirst tells argparse that receivefirst is an optional flag. action='store_true' means that if the flag is not present the value is False; otherwise, it is True.

import argparse
parser = argparse.ArgumentParser()
parser.add_argument('localhost')
parser.add_argument('localport',type=int)
parser.add_argument('remotehost')
parser.add_argument('remoteport',type=int)
parser.add_argument('--receivefirst', action='store_true')
args = parser.parse_args()

# now spin up our listening socket
server_loop(args.localhost, args.localport, args.remotehost, args.remoteport, args.receivefirst)
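
To try the parser without opening any sockets, you can hand parse_args an explicit list instead of letting it read sys.argv. The following is only a minimal sketch and not part of the original proxy.py: build_parser is a hypothetical helper name, and the addresses are sample values.

```python
import argparse


def build_parser():
    # Hypothetical helper: wraps the same five arguments shown above.
    parser = argparse.ArgumentParser()
    parser.add_argument('localhost')
    parser.add_argument('localport', type=int)
    parser.add_argument('remotehost')
    parser.add_argument('remoteport', type=int)
    parser.add_argument('--receivefirst', action='store_true')
    return parser


parser = build_parser()

# Passing a list to parse_args() bypasses sys.argv, which is handy for testing.
args = parser.parse_args(['127.0.0.1', '9000', '10.12.132.1', '9000'])
print(args.localport + 1)   # localport is already an int, so this prints 9001
print(args.receivefirst)    # flag absent, so this prints False

args = parser.parse_args(['127.0.0.1', '9000', '10.12.132.1', '9000', '--receivefirst'])
print(args.receivefirst)    # flag present, so this prints True
```

Because the type conversion and the flag default both happen inside argparse, the rest of the script never needs to touch the raw strings.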

The last thing I wanted to point out about argparse is that you get help by default. Look at what happens when you run python proxy.py with no arguments:

$ python proxy.py
usage: proxy.py [-h] [--receivefirst] localhost localport remotehost remoteport
proxy.py: error: too few arguments

Let's say the term localhost is confusing. We can add a help string to remind ourselves of what we meant:

parser.add_argument('localhost', help='The local interface to listen on. Usually 127.0.0.1 or 0.0.0.0')

Then when you run python proxy.py --help:

$ python proxy.py --help
usage: proxy.py [-h] [--receivefirst] localhost localport remotehost remoteport

positional arguments:
  localhost       The local interface to listen on. Usually 127.0.0.1 or 0.0.0.0
  localport
  remotehost
  remoteport

optional arguments:
  -h, --help      show this help message and exit
  --receivefirst
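
The same idea extends to the other arguments. As a rough sketch, here is the parser with a help string on every argument; apart from the localhost text above, the description and help wording are my own placeholders rather than anything from the book. format_help() returns the help text as a string, so you can inspect it without running the script with --help.

```python
import argparse

# prog is set explicitly so the usage line does not depend on how the script is run.
parser = argparse.ArgumentParser(
    prog='proxy.py',
    description='Simple TCP proxy (placeholder description).')
parser.add_argument('localhost',
                    help='The local interface to listen on. Usually 127.0.0.1 or 0.0.0.0')
parser.add_argument('localport', type=int, help='The local port to listen on')
parser.add_argument('remotehost', help='The remote host to forward traffic to')
parser.add_argument('remoteport', type=int, help='The remote port to forward traffic to')
parser.add_argument('--receivefirst', action='store_true',
                    help='Receive data from the remote host before sending')

# format_help() builds the same text that --help would print.
help_text = parser.format_help()
print(help_text)
```

The exact layout depends on the Python version. For example, newer Python 3 releases head the flag section with "options:" instead of "optional arguments:".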

2. Move main() to launch only if actually main, not if module loaded

This is a simple change, but it addresses a pet peeve of mine:

diff --git a/proxy.py b/proxy.py
index c599111..97475d1 100644
--- a/proxy.py
+++ b/proxy.py
@@ -160,4 +160,5 @@ def main():
     # now spin up our listening socket
     server_loop(args.localhost, args.localport, args.remotehost, args.remoteport, args.receivefirst)

-main()
+if __name__ == '__main__':
+    main()

As you can see, all I did was move the main() call inside an if statement. This if statement checks how the module was loaded. If proxy.py was run from the command line (./proxy.py or python proxy.py), then __name__ is set to '__main__' and main() is executed. However, if another Python module calls import proxy, then main() won't execute on load. This change makes the code more reusable, since other modules can import it, at no additional cost.
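
Here is a minimal, self-contained sketch of the pattern. build_banner is a hypothetical stand-in for the real proxy functions, used only so the example has something harmless to call.

```python
def build_banner(host, port):
    # Hypothetical helper standing in for real work such as server_loop().
    return "[*] Listening on %s:%d" % (host, port)


def main():
    print(build_banner('127.0.0.1', 9000))


# True only when the file is run directly, not when it is imported.
if __name__ == '__main__':
    main()
```

Running the file directly prints the banner. Importing it from another module prints nothing, but build_banner is still available to the importer.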

3. Changed print messages to use logging module

Python's logging module lets us distinguish what kind of message we are sending to the user. logging defines five standard levels, in increasing order of severity: DEBUG, INFO, WARNING, ERROR, and CRITICAL. The default logging level is WARNING, so only messages logged at WARNING or above are displayed. You can change the level with logging.basicConfig, which we will do in the last section of this post. In the following diff, I change most print statements to logging.info, which means they are no longer printed to the screen by default. Whenever an error occurs, I use logging.error.

diff --git a/proxy.py b/proxy.py
index 97475d1..723b613 100644
--- a/proxy.py
+++ b/proxy.py
@@ -2,6 +2,7 @@ import sys
 import socket
 import threading
 import argparse
+import logging



@@ -69,11 +70,11 @@ def proxy_handler(client_socket, remote_host, remote_port, receive_first):
                 hexdump(remote_buffer)

                 # send it to our response handler
-       remote_buffer = response_handler(remote_buffer)
+                remote_buffer = response_handler(remote_buffer)

                 # if we have data to send to our local client send it
                 if len(remote_buffer):
-                        print "[<==] Sending %d bytes to localhost." % len(remote_buffer)
+                        logging.info(" [<==] Sending %d bytes to localhost" % len(remote_buffer))
                         client_socket.send(remote_buffer)

    # now let's loop and reading from local, send to remote, send to local
@@ -85,8 +86,8 @@ def proxy_handler(client_socket, remote_host, remote_port, receive_first):


        if len(local_buffer):
-
-           print "[==>] Received %d bytes from localhost." % len(local_buffer)
+
+           logging.info("[==>] Received %d bytes from localhost." % len(local_buffer))
            hexdump(local_buffer)

            # send it to our request handler
@@ -94,7 +95,7 @@ def proxy_handler(client_socket, remote_host, remote_port, receive_first):

            # send off the data to the remote host
            remote_socket.send(local_buffer)
-           print "[==>] Sent to remote."
+           logging.info("[==>] Sent to remote.")


        # receive back the response
@@ -102,7 +103,7 @@ def proxy_handler(client_socket, remote_host, remote_port, receive_first):

        if len(remote_buffer):

-           print "[<==] Received %d bytes from remote." % len(remote_buffer)
+           logging.info("[<==] Received %d bytes from remote." % len(remote_buffer))
            hexdump(remote_buffer)

            # send to our response handler
@@ -111,13 +112,13 @@ def proxy_handler(client_socket, remote_host, remote_port, receive_first):
            # send the response to the local socket
            client_socket.send(remote_buffer)

-           print "[<==] Sent to localhost."
+           logging.info("[<==] Sent to localhost.")

        # if no more data on either side close the connections
        if not len(local_buffer) or not len(remote_buffer):
            client_socket.close()
            remote_socket.close()
-           print "[*] No more data. Closing connections."
+           logging.info("[*] No more data. Closing connections.")

            break

@@ -128,11 +129,11 @@ def server_loop(local_host,local_port,remote_host,remote_port,receive_first):
         try:
                 server.bind((local_host,local_port))
         except:
-                print "[!!] Failed to listen on %s:%d" % (local_host,local_port)
-                print "[!!] Check for other listening sockets or correct permissions."
+                logging.error("[!!] Failed to listen on %s:%d" % (local_host,local_port))
+                logging.error("[!!] Check for other listening sockets or correct permissions.")
                 sys.exit(0)

-        print "[*] Listening on %s:%d" % (local_host,local_port)
+        logging.info("[*] Listening on %s:%d" % (local_host,local_port))


         server.listen(5)
@@ -141,7 +142,7 @@ def server_loop(local_host,local_port,remote_host,remote_port,receive_first):
                 client_socket, addr = server.accept()

                 # print out the local connection information
-                print "[==>] Received incoming connection from %s:%d" % (addr[0],addr[1])
+                logging.info("[==>] Received incoming connection from %s:%d" % (addr[0],addr[1]))

                 # start a thread to talk to the remote host
                 proxy_thread = threading.Thread(target=proxy_handler,args=(client_socket,remote_host,remote_port,receive_first))
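
To see the level filtering in isolation, here is a small sketch written for Python 3. It uses a named logger ('proxy_demo', a name I made up) that writes to an in-memory buffer, so it does not disturb any other logging configuration. The diff above calls the module-level logging.info and logging.error functions instead, but the thresholds work the same way.

```python
import io
import logging

# A named logger keeps this demo separate from the root logger.
logger = logging.getLogger('proxy_demo')
logger.propagate = False
stream = io.StringIO()
logger.addHandler(logging.StreamHandler(stream))

# WARNING is the threshold the root logger uses by default.
logger.setLevel(logging.WARNING)
logger.info("[*] Listening on %s:%d", '127.0.0.1', 9000)           # suppressed
logger.error("[!!] Failed to listen on %s:%d", '127.0.0.1', 9000)  # emitted

# Lowering the threshold to INFO lets informational messages through.
logger.setLevel(logging.INFO)
logger.info("[==>] Sent to remote.")                               # emitted

print(stream.getvalue())
```

Only the error and the final info message end up in the buffer. Passing the values as extra arguments, rather than formatting with % up front, lets logging skip the string formatting entirely when a message is suppressed.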

4. Added verbose flag to arguments

The last change hides a lot of information from the user while the program is running and keeps the screen relatively clean. But what if the user wants to see that information? This is why I love argparse and logging so much. With argparse I can easily add a verbose flag, and with logging, all I have to do is change the logging level whenever that verbose flag is set. Here is the diff:

diff --git a/proxy.py b/proxy.py
index 723b613..c936f62 100644
--- a/proxy.py
+++ b/proxy.py
@@ -151,12 +151,16 @@ def server_loop(local_host,local_port,remote_host,remote_port,receive_first):
 def main():

     parser = argparse.ArgumentParser()
+    parser.add_argument('--verbose','-v', action='store_true')
     parser.add_argument('localhost')
     parser.add_argument('localport',type=int)
     parser.add_argument('remotehost')
     parser.add_argument('remoteport',type=int)
     parser.add_argument('--receivefirst', action='store_true')
     args = parser.parse_args()
+
+    if args.verbose:
+        logging.basicConfig(level=logging.INFO)

     # now spin up our listening socket
     server_loop(args.localhost, args.localport, args.remotehost, args.remoteport, args.receivefirst)

That's it! A three-line addition and we can control the verbosity of the output. We can even use the short -v flag instead of spelling out --verbose. If we wanted to, we could pass a format to basicConfig to add things like timestamps and log levels to each message. It is even possible to write different verbosity levels to a file and the console at the same time.
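
As a rough sketch of that last idea, the following attaches two handlers with different levels to one logger. configure_logging, the logger name, and the temporary log path are all hypothetical choices of mine, not part of the original proxy.py.

```python
import logging
import os
import sys
import tempfile


def configure_logging(logfile, verbose=False):
    # Hypothetical helper, not part of the original proxy.py.
    logger = logging.getLogger('proxy_demo_file')
    logger.setLevel(logging.DEBUG)   # let each handler decide what to keep
    logger.propagate = False

    # Timestamp and level name are prepended to every message.
    formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s')

    # The file receives everything, including DEBUG messages.
    file_handler = logging.FileHandler(logfile)
    file_handler.setLevel(logging.DEBUG)
    file_handler.setFormatter(formatter)
    logger.addHandler(file_handler)

    # The console stays quiet unless verbose is set.
    console_handler = logging.StreamHandler(sys.stderr)
    console_handler.setLevel(logging.INFO if verbose else logging.WARNING)
    console_handler.setFormatter(formatter)
    logger.addHandler(console_handler)
    return logger


logfile = os.path.join(tempfile.mkdtemp(), 'proxy.log')   # sample path
logger = configure_logging(logfile, verbose=True)
logger.debug("raw buffer length: %d", 512)                # file only
logger.info("[*] Listening on %s:%d", '127.0.0.1', 9000)  # file and console
```

In main() this could replace the basicConfig call with something like configure_logging('proxy.log', verbose=args.verbose), with the rest of the script logging through the returned logger.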

Conclusion

In conclusion, when you're on an engagement, the most important aspect of a piece of code is that it achieves its objective. However, knowing some of these modules is going to make whipping up and modifying code much easier the next time around. While advanced log handling is probably not necessary for a small proxy script, it will come in handy when you start writing advanced plugins for Burp or start fooling around with custom protocols in Scapy. You can download Seitz's code here and buy his book here. You can download my version here or grab the git revision-tracked version here.

Greetz and thanks to F4C3 for checking this post for errors!

published on 2015-01-02 20:00:00 by alex