Writing Your Own Flask(like) Framework

I have long wanted to demystify what goes on behind the Python Flask framework. How does something as simple as app.route handle HTTP requests? How does app.run create a server and keep it running?

To demystify Flask, I had two options: read the Flask code end to end, or reverse engineer Flask by building one of my own. I chose the latter, and this blog is a step-by-step log of how it went.

Side Note

If you are new to Flask, then How to build your 1st flask app might be a good place to start.

Reverse Engineering

Reverse engineering started in my head. I am going to be working with just two files, ownflask.py and demo.py.

Here is how a simple Flask application looks:

# demo.py
from flask import Flask

app = Flask(__name__)


@app.route("/", methods=["GET", "POST"])
def hello():
    return "hello"

if __name__ == "__main__":
    app.run()

Looking at this sample snippet, I want to mimic the same interface. Taking a pass from top to bottom, here is what we need:

We need a class Flask which initializes an app object
The Flask class has a run method that starts a server
The Flask class also has a route method that registers the endpoints

Let's lay them down

#ownflask.py

class Flask:
    def __init__(self, name):
        self.name = name
    
    def run(self):
        pass
        
    def route(self, path, methods):
        def wrapper(f):
            pass
        return wrapper

That gives us the basic skeleton. Let's add the functionality one by one. Python's http.server module provides an HTTPServer class; let's use that.

Starting a Server

In Flask, app.run is responsible for starting a development webserver. The server then listens to all HTTP requests and responds to them.

#ownflask.py

from http.server import HTTPServer, BaseHTTPRequestHandler

class Flask:
    ...
    
    def run(self, server_class=HTTPServer, handler_class=BaseHTTPRequestHandler, port=8000):
        server_address = ('', port)
        print(f"Running server on port {port}")
        httpd = server_class(server_address, handler_class)
        httpd.serve_forever()

In demo.py, change from flask to from ownflask to use the module we just created, then run demo.py. On hitting http://127.0.0.1:8000, you get a 501 error in the browser, since we haven't implemented anything to handle the incoming request.
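If you would rather confirm the 501 from a script than from the browser, here is a rough sketch. The helper name probe_unhandled_get is my own invention, and it binds to port 0 (let the OS pick a free port) instead of 8000 so it doesn't clash with a running demo.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def probe_unhandled_get():
    """Start a bare HTTPServer, send one GET, and return the HTTP status code."""
    # Port 0 asks the OS for any free port (an assumption made for this sketch).
    httpd = HTTPServer(("127.0.0.1", 0), BaseHTTPRequestHandler)
    port = httpd.server_address[1]
    threading.Thread(target=httpd.serve_forever, daemon=True).start()

    # An empty ProxyHandler keeps the request local even if a proxy is configured.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(f"http://127.0.0.1:{port}/") as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # BaseHTTPRequestHandler has no do_GET, so the server answers with an error.
        return err.code
    finally:
        httpd.shutdown()
        httpd.server_close()


print(probe_unhandled_get())  # prints: 501
```

The 501 (Not Implemented) comes from BaseHTTPRequestHandler itself: it looks for a do_GET method, finds none, and reports the method as unsupported.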

Mapping Requests

The app.route method in Flask registers an endpoint. When an HTTP request comes, it maps it to the associated function call. These routes are maintained in a global object so that the request handler can refer to it. For our ownflask, let's use a global dictionary.

Here I have two dictionaries: routes, which maps each path to its function, and route_methods, which maps each path to its allowed HTTP methods.

routes = {}
route_methods = {}

class Flask:
    ...
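    # methods defaults to GET so that @app.route("/") works without arguments
    def route(self, path, methods=("GET",)):
        def wrapper(f):
            routes[path] = f
            route_methods[path] = methods
            # Return f so the decorated name still refers to the function
            return f
        return wrapper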
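To see what the decorator actually leaves behind in the two dictionaries, here is a small self-contained sketch. It repeats the registry from above with two assumptions of mine: methods defaults to GET, and wrapper returns f so the decorated name is not replaced by None.

```python
routes = {}
route_methods = {}


class Flask:
    def __init__(self, name):
        self.name = name

    def route(self, path, methods=("GET",)):
        # route() runs first and returns wrapper; wrapper then receives the function.
        def wrapper(f):
            routes[path] = f
            route_methods[path] = list(methods)
            return f  # assumption: hand the function back unchanged
        return wrapper


app = Flask(__name__)


@app.route("/", methods=["GET", "POST"])
def hello():
    return "hello"


# The decorator line is equivalent to: hello = app.route("/", methods=[...])(hello)
print(routes["/"]())       # prints: hello
print(route_methods["/"])  # prints: ['GET', 'POST']
```

Nothing is served yet at this stage. The decorator only records which function and which methods belong to each path, so the request handler can look them up later.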

Handling GET Request

When running our server, we have used a BaseHTTPRequestHandler. From the Python documentation, it is clear that we have to extend it to support handling requests.

By itself, it cannot respond to any actual HTTP requests; it must be subclassed to handle each request method (e.g., GET or POST).

class RequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-type", "application/json")
        self.end_headers()
        self.wfile.write(str.encode("Handling GET"))
    
    def do_POST(self):
        pass

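The above snippet sends Handling GET as the response regardless of what the route function returns. Let's change that. For the server to use this class, make RequestHandler the handler_class that run passes to the server.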

Inspecting dir(self) shows that self.path holds the requested URL. By looking it up in the routes dict, we can call the respective function.

class RequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        resp = routes[self.path]()
        ...
        ...
        self.wfile.write(str.encode(resp))

Handling URL Params

Flask lets you pass parameters either as part of the URL path or as a query string.

/book/<int:id>
/book?id=10

The first one would require some form of regex in the routes and in the way we store them, so let's handle that later. For now, let's handle a hello world that takes a name: http://127.0.0.1:8000/?name=Joe

The current code fails with a KeyError since the query string is also part of self.path.

KeyError: '/?name=Joe'

To parse this and separate the URL path and the query params, we will use urllib.

import urllib.parse as urlparse

class RequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        path = urlparse.urlparse(self.path).path
        qs = urlparse.parse_qs(urlparse.urlparse(self.path).query)
        resp = routes[path]()
        
        self.send_response(200)
        self.send_header("Content-type", "application/json")
        self.end_headers()
        self.wfile.write(str.encode(resp))
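Here is what that split looks like on its own, outside the handler. This is a small sketch, and split_target is a hypothetical helper name I made up for illustration.

```python
from urllib.parse import parse_qs, urlparse


def split_target(raw_path):
    """Split a request target like '/?name=Joe' into (path, query dict)."""
    parsed = urlparse(raw_path)
    return parsed.path, parse_qs(parsed.query)


path, qs = split_target("/?name=Joe")
print(path)  # prints: /
print(qs)    # prints: {'name': ['Joe']}
```

Note that parse_qs returns a list for every key, because a query string may repeat a key (for example ?id=10&id=11). That is why a route ends up reading a single value with an index such as qs["name"][0].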

In Flask, the routing function can access the request params via the global request object. In our case, for the hello route to access the query params, we need a way to pass them in.

Request Class

class Request:
    def __init__(self, request, method):
        self.request = request
        self.method = method
        self.path = urlparse.urlparse(request.path).path
        self.qs = urlparse.parse_qs(urlparse.urlparse(request.path).query)
        self.headers = request.headers

Let's pass this Request object to the route.

class RequestHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        request = Request(self, "GET")
        resp = routes[request.path](request)
        
        self.send_response(200)
        self.send_header("Content-type", "application/json")
        self.end_headers()
        self.wfile.write(str.encode(resp))

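With the current state, calling the route fails with "hello() takes 0 positional arguments but 1 was given". Let's accept the request in hello:

@app.route("/")
def hello(request):
    # Single quotes inside the f-string: reusing double quotes is a syntax error before Python 3.12
    return f"hello {request.qs['name'][0]}"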

Handling JSON Response

If we modify the hello endpoint to return a dict instead of a str, we will receive an error:

the descriptor 'encode' for 'str' objects doesn't apply to a 'dict' object.

It happens because str.encode only works on a str, and we are handing it a dict. To fix this, we serialize the dict to a JSON string first and then encode it.

import json

def do_GET(self):
    request = Request(self, "GET")
    resp = routes[request.path](request)
    # Serialize dict responses to a JSON string before encoding to bytes
    if isinstance(resp, dict):
        resp = json.dumps(resp)
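The two-step conversion (dict to str, then str to bytes) can be tried in isolation. A minimal sketch, where to_body is a hypothetical helper name and UTF-8 is my assumption for the encoding:

```python
import json


def to_body(resp):
    """Turn a route's return value (str or dict) into bytes for wfile.write()."""
    if isinstance(resp, dict):
        resp = json.dumps(resp)  # dict -> JSON string
    return resp.encode("utf-8")  # str -> bytes


print(to_body("hello"))                # prints: b'hello'
print(to_body({"greeting": "hello"}))  # prints: b'{"greeting": "hello"}'
```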

Handling POST Request

For handling POST requests, you need to access the request body along with other parameters. Let's update the request class to support the same.

class Request:
    def __init__(self, request, method):
        ...
        ...
        self.content_length = int(self.headers.get('content-length', 0))
        self.body = request.rfile.read(self.content_length)
        try:
            self.json = json.loads(self.body)
        except json.decoder.JSONDecodeError: 
            self.json = {}

Let's consume the same via a POST API

@app.route("/todo", methods=["POST"])
def todo(request):
    return {"status": "success", "data": request.json}
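To check the body parsing without starting a server, the Request class can be fed a stand-in for the handler. The sketch below repeats the Request class from this post; fake_handler is a hypothetical helper of mine that only provides the three attributes Request reads (path, headers, rfile). A plain dict is case-sensitive, unlike the real headers object, so the key is written in lower case.

```python
import io
import json
from types import SimpleNamespace
from urllib.parse import parse_qs, urlparse


class Request:
    def __init__(self, request, method):
        self.request = request
        self.method = method
        parsed = urlparse(request.path)
        self.path = parsed.path
        self.qs = parse_qs(parsed.query)
        self.headers = request.headers
        # Read exactly as many bytes as the client announced.
        self.content_length = int(self.headers.get("content-length", 0))
        self.body = request.rfile.read(self.content_length)
        try:
            self.json = json.loads(self.body)
        except json.decoder.JSONDecodeError:
            self.json = {}  # empty or non-JSON body


def fake_handler(path, body=b""):
    """Hypothetical stand-in for the HTTP handler, for testing only."""
    return SimpleNamespace(
        path=path,
        headers={"content-length": str(len(body))},
        rfile=io.BytesIO(body),
    )


post = Request(fake_handler("/todo?done=0", b'{"task": "write blog"}'), "POST")
print(post.path)  # prints: /todo
print(post.qs)    # prints: {'done': ['0']}
print(post.json)  # prints: {'task': 'write blog'}

get = Request(fake_handler("/"), "GET")
print(get.json)   # prints: {}
```

Against the running server, the same route can be exercised with any HTTP client that sends a JSON body to /todo.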

Handling Unsupported Request

Right now, if you hit /todo from the browser, you will get a response. This is wrong, since we have clearly defined that /todo only supports POST requests. This is where route_methods comes in really handy.

def do_GET(self):
    ...
    if "GET" not in route_methods[request.path]:
        self.send_response(405)  # 405 Method Not Allowed (401 would mean Unauthorized)
        self.send_header("Content-type", "application/json")
        self.end_headers()
        self.wfile.write(str.encode(f"{request.path} {request.method} not supported"))
        return

Looks like we are repeating ourselves a lot; let's move them to a common function

class RequestHandler(BaseHTTPRequestHandler):
    ...
    ...
    def write_response(self, response, status_code):
        self.send_response(status_code)
        self.send_header("Content-type", "application/json")
        self.end_headers()
        if isinstance(response, dict):
            response = json.dumps(response)
        self.wfile.write(str.encode(response))

The final do_GET and do_POST methods look like this.

    def do_GET(self):
        request = Request(self, "GET")
        if "GET" not in route_methods[request.path]:
            # 405 Method Not Allowed: the path exists but not for this method
            self.write_response("Method not supported", 405)
            return
        resp = routes[request.path](request)
        self.write_response(resp, 200)
    
    def do_POST(self):
        request = Request(self, "POST")
        if "POST" not in route_methods[request.path]:
            self.write_response("Method not supported", 405)
            return
        resp = routes[request.path](request)
        self.write_response(resp, 200)

We can further refactor them into:

    def not_found(self, request):
        return self.write_response(f"{request.path} 404 NOT FOUND", 404)

    def method_not_supported(self, request):
        # 405 Method Not Allowed: the path exists but not for this method
        return self.write_response(f"{request.path} {request.method} not supported", 405)
        
    def process_request(self, request):
        if request.path not in routes:
            return self.not_found(request)
        # Reject the request only when its method is NOT registered for the path
        if request.method not in route_methods[request.path]:
            return self.method_not_supported(request)
        
        resp = routes[request.path](request)
        self.write_response(resp, 200)
    
            
    def do_GET(self):
        request = Request(self, method='GET')
        return self.process_request(request)

    def do_POST(self):
        request = Request(self, method='POST')
        return self.process_request(request)
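To convince myself that the three branches behave as intended, here is a compact end-to-end sketch. It is a trimmed stand-in for the handler above, not the full ownflask code: it skips the Request class, registers a single hypothetical /todo route directly in the dictionaries, assumes 405 for the unsupported-method case, and uses helper names (status_of, run_checks) of my own.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse

# One hypothetical route, registered by hand to keep the sketch short.
routes = {"/todo": lambda handler: {"status": "success"}}
route_methods = {"/todo": ["POST"]}


class RequestHandler(BaseHTTPRequestHandler):
    def log_message(self, format, *args):
        pass  # silence the per-request log lines

    def write_response(self, response, status_code):
        self.send_response(status_code)
        self.send_header("Content-type", "application/json")
        self.end_headers()
        if isinstance(response, dict):
            response = json.dumps(response)
        self.wfile.write(response.encode("utf-8"))

    def process_request(self, method):
        # Drain the request body so the client never sees a reset connection.
        self.rfile.read(int(self.headers.get("content-length", 0)))
        path = urlparse(self.path).path
        if path not in routes:
            return self.write_response(f"{path} 404 NOT FOUND", 404)
        if method not in route_methods[path]:
            return self.write_response(f"{path} {method} not supported", 405)
        self.write_response(routes[path](self), 200)

    def do_GET(self):
        self.process_request("GET")

    def do_POST(self):
        self.process_request("POST")


def status_of(port, path, data=None):
    """Return the status for a GET (data=None) or a POST (data=bytes)."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(f"http://127.0.0.1:{port}{path}", data=data) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def run_checks():
    httpd = HTTPServer(("127.0.0.1", 0), RequestHandler)  # port 0: any free port
    threading.Thread(target=httpd.serve_forever, daemon=True).start()
    port = httpd.server_address[1]
    try:
        return [
            status_of(port, "/missing"),          # unknown path
            status_of(port, "/todo"),             # GET on a POST-only route
            status_of(port, "/todo", data=b"{}"), # POST on a POST-only route
        ]
    finally:
        httpd.shutdown()
        httpd.server_close()


results = run_checks()
print(results)  # prints: [404, 405, 200]
```

The order of the checks in process_request matters: the path lookup has to come first, otherwise route_methods[path] raises a KeyError for unknown paths.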

Introducing Multithreading

At this point, if you write a small multithreading script and hit our server, it will hang because HTTPServer handles only one request at a time. Let's replace it with ThreadingHTTPServer.

from http.server import ThreadingHTTPServer

class Flask:
    ...
    def run(self, name):
        ...
        ...
        self.server = ThreadingHTTPServer((self.host, self.port), RequestHandler)
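The difference is easy to measure. In the sketch below, SlowHandler and timed_burst are hypothetical names of mine: the handler sleeps 0.2 seconds per request to imitate slow work, and the same burst of four parallel requests is sent to each server class.

```python
import threading
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from http.server import BaseHTTPRequestHandler, HTTPServer, ThreadingHTTPServer


class SlowHandler(BaseHTTPRequestHandler):
    def log_message(self, format, *args):
        pass  # silence the per-request log lines

    def do_GET(self):
        time.sleep(0.2)  # pretend the route does slow work
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")


def timed_burst(server_class, n=4):
    """Send n parallel GETs to server_class; return (statuses, elapsed seconds)."""
    httpd = server_class(("127.0.0.1", 0), SlowHandler)  # port 0: any free port
    threading.Thread(target=httpd.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{httpd.server_address[1]}/"

    def fetch(_):
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(url) as resp:
            return resp.status

    start = time.perf_counter()
    try:
        with ThreadPoolExecutor(max_workers=n) as pool:
            statuses = list(pool.map(fetch, range(n)))
    finally:
        httpd.shutdown()
        httpd.server_close()
    return statuses, time.perf_counter() - start


serial_statuses, serial_time = timed_burst(HTTPServer)
threaded_statuses, threaded_time = timed_burst(ThreadingHTTPServer)
print(f"HTTPServer:          {serial_time:.2f}s")
print(f"ThreadingHTTPServer: {threaded_time:.2f}s")
```

With HTTPServer the four requests are answered one after another, so the burst takes at least four sleeps. ThreadingHTTPServer hands each connection to its own thread, so the sleeps overlap. ThreadingHTTPServer is available from Python 3.7 onwards.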

WSGI vs. HTTP

At this point, I was happy with what I had accomplished and had already posted a tweet, when ArunMozhi nudged me to explore WSGIServer.

from wsgiref.simple_server import WSGIServer

class Flask:
    ...
    def run(self, name):
        ...
        ...
        self.server = WSGIServer((self.host, self.port), RequestHandler)
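For context while exploring this: wsgiref's WSGIServer is a subclass of HTTPServer, which is why the constructor looks identical. What changes is the other side of the contract. Instead of do_GET and do_POST methods, a WSGI server expects an application, a callable that takes the request environ and a start_response callback. The sketch below is not ownflask code; hello_app and call_app are hypothetical names, and the app is invoked directly, without opening a socket.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A WSGI application: receives the request as a dict, returns body chunks."""
    body = f"hello from {environ['PATH_INFO']}".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Call the app the way a WSGI server would, but in-process."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


print(call_app(hello_app, "/todo"))  # prints: ('200 OK', b'hello from /todo')

# To serve it over HTTP instead (blocks until interrupted):
# from wsgiref.simple_server import make_server
# make_server("127.0.0.1", 8000, hello_app).serve_forever()
```

make_server wires a WSGIServer to a WSGIRequestHandler and registers the application on it, so the framework only has to provide the callable. Production WSGI servers call the same kind of callable.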

What started as an experiment to demystify Flask and understand it better led me into a rabbit hole of new questions.

How is WSGIServer different from HTTPServer, when the interface looks the same?
How can we plug ownflask into Gunicorn?
How do we add async to ownflask?
Going one step further, how does Gunicorn work?
What are my unknown unknowns?

If you know the answer to any of these, you can send them to me via Twitter.

Also published here.
