Notes on Python Starlette
Sun Aug 18 2024, E.W.Ayers
Starlette is a Python library for writing HTTP servers. This is a set of notes on how it works. If you just want to use Starlette, the docs are very good. I wrote this because I found myself trawling the source code a lot.
1. ASGI
The first piece of the puzzle is ASGI. This is a protocol that the server (e.g. Uvicorn) uses to communicate with the application.
1.1. Transport
An ASGI Application is a Python callable with the signature:
ASGI Application signature.
from typing import Awaitable, Callable, Protocol

type Value = bytes | str | int | float | list[Value] | dict[str, Value] | bool | None
type Scope = dict[str, Value]
type Event = dict[str, Value]
type ReceiveFn = Callable[[], Awaitable[Event]]
type SendFn = Callable[[Event], Awaitable[None]]

class ASGIApp(Protocol):
    def __call__(
        self,
        scope: Scope,
        receive: ReceiveFn,
        send: SendFn,
    ) -> Awaitable[None]: ...
Each new connection the server receives will invoke this app function.
The receive and send functions amount to defining a transport for communicating with your connection peer.
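To make the signature concrete, here is a minimal sketch of an ASGI app being called by hand. The names hello_app and call_once are made up for illustration; call_once is a hypothetical stand-in for what a real server like Uvicorn would do, not how any server is actually implemented.

```python
import asyncio


async def hello_app(scope, receive, send):
    """A minimal ASGI app: ignores the request body and replies with plain text."""
    assert scope["type"] == "http"
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"hello", "more_body": False})


async def call_once(app, scope):
    """Hypothetical stand-in for a server: one empty request in, sent events out."""
    sent = []

    async def receive():
        # Pretend the client sent an empty request body.
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(event):
        # A real server would write to the socket; we just record the event.
        sent.append(event)

    await app(scope, receive, send)
    return sent


events = asyncio.run(call_once(hello_app, {"type": "http", "method": "GET", "path": "/"}))
```

The point is that the app never touches a socket: everything goes through the two callables it is handed.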
Middleware means a wrapper around an ASGI app that is itself an ASGI app. E.g. you could have a middleware that does auth.
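As a rough sketch of the auth example, here is a middleware written as a plain function from ASGI app to ASGI app. The names require_token, inner_app and status_for, and the bearer-token scheme, are all assumptions for illustration; this is not Starlette's own authentication middleware.

```python
import asyncio


def require_token(app, token: bytes):
    """Hypothetical auth middleware: wraps an ASGI app and returns another ASGI app."""
    expected = b"Bearer " + token

    async def wrapped(scope, receive, send):
        if scope["type"] == "http":
            # ASGI headers are (name, value) pairs of bytes with lowercased names.
            headers = dict(scope.get("headers", []))
            if headers.get(b"authorization") != expected:
                await send({"type": "http.response.start", "status": 401, "headers": []})
                await send({"type": "http.response.body", "body": b"unauthorized", "more_body": False})
                return
        # Authorised (or not an HTTP scope): defer to the wrapped app.
        await app(scope, receive, send)

    return wrapped


async def inner_app(scope, receive, send):
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"secret", "more_body": False})


async def status_for(app, headers):
    """Call the app once with the given headers and return the response status."""
    sent = []

    async def receive():
        return {"type": "http.disconnect"}

    async def send(event):
        sent.append(event)

    await app({"type": "http", "headers": headers}, receive, send)
    return sent[0]["status"]


protected = require_token(inner_app, b"s3cret")
denied = asyncio.run(status_for(protected, []))
allowed = asyncio.run(status_for(protected, [(b"authorization", b"Bearer s3cret")]))
```

Because the wrapper has the same signature as the thing it wraps, middlewares can be stacked in any order.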
Then on top of this, there are a load of standards for the contents passed in the various dictionaries. This is called the protocol.
The "type" value in scope defines the protocol. Every Event dict must have a "type" key saying what kind of message it is. Afaict, the type is always prefixed with the protocol name.
1.2. HTTP protocol
The scope value in the case of HTTP is given here. This contains all of the information you need to handle the HTTP request. The only weird thing is the "state" key, which is to do with lifespans.
The types that follow use my made-up syntax for typed Python dictionaries. They are non-exhaustive, but give a flavour of what the data looks like.
HTTP ASGI protocol dictionary types.
type HttpConnectionScope = {
    "type": Literal["http"],
    "scheme": Literal["http", "https"],
    "path": str,  # HTTP request target excluding any query string
    "method": Literal["POST", "GET", ...],
    "query_string": bytes,  # raw, still percent-encoded
    "headers": list[tuple[bytes, bytes]],  # (name, value) pairs as bytes
    "state"?: dict[str, Any],
    ...  # other stuff
}

type HttpReceiveEvent = {
    "type": Literal["http.request"],
    "body": bytes,  # the body of the request
    "more_body": bool,  # whether there is more body to come
} | {
    "type": Literal["http.disconnect"],
}

type HttpSendEvent = {
    "type": Literal["http.response.start"],
    "status": int,  # http status code
    "headers": list[tuple[bytes, bytes]],  # http headers
} | {
    "type": Literal["http.response.body"],
    "body": bytes,  # the body of the response
    "more_body": bool,  # whether there is more body to come
}
There is a similar protocol for websockets.
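The "more_body" flag is the bit that is easy to get wrong, so here is a small sketch of reading a request body that arrives in several "http.request" events. The helpers read_body, echo_app and drive are hypothetical names; drive just fakes a server feeding chunks to the app.

```python
import asyncio


async def read_body(receive) -> bytes:
    """Collect body chunks until more_body is false or the client disconnects."""
    chunks = []
    while True:
        event = await receive()
        if event["type"] == "http.disconnect":
            break
        chunks.append(event.get("body", b""))
        if not event.get("more_body", False):
            break
    return b"".join(chunks)


async def echo_app(scope, receive, send):
    """Echo the request body back to the client."""
    body = await read_body(receive)
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-length", str(len(body)).encode())],
    })
    await send({"type": "http.response.body", "body": body, "more_body": False})


async def drive(app, chunks):
    """Fake server: delivers the body as one http.request event per chunk."""
    pending = [
        {"type": "http.request", "body": chunk, "more_body": i < len(chunks) - 1}
        for i, chunk in enumerate(chunks)
    ]
    sent = []

    async def receive():
        return pending.pop(0) if pending else {"type": "http.disconnect"}

    async def send(event):
        sent.append(event)

    await app({"type": "http", "method": "POST", "path": "/"}, receive, send)
    return sent


echo_events = asyncio.run(drive(echo_app, [b"hel", b"lo"]))
```

Starlette's Request class hides this loop from you, which is one of the reasons to use it instead of raw ASGI.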
1.3. Lifespans
There is a special protocol for managing startup and shutdown of the server, called the lifespan protocol.
Lifespan ASGI protocol dictionary types.
type LifespanScope = {
    "type": Literal["lifespan"],
    "state"?: dict[str, Any],
}

type LifespanReceiveEvent = {
    "type": Literal["lifespan.startup"],
} | {
    "type": Literal["lifespan.shutdown"],
}

type LifespanSendEvent = {
    "type": Literal["lifespan.startup.complete"],
} | {
    "type": Literal["lifespan.startup.failed"],
    "message"?: str,
} | {
    "type": Literal["lifespan.shutdown.complete"],
} | {
    "type": Literal["lifespan.shutdown.failed"],
    "message"?: str,
}
How it works:
The server will call asgi_app with a scope dictionary set to {"type": "lifespan", "state": {}}, and it keeps a reference to the "state" dictionary.
The app function asgi_app will await receive() a "lifespan.startup" event.
The app function will then do whatever it needs to do to start up, including setting values on the scope['state'] dictionary.
The app function will await send({"type": "lifespan.startup.complete"}). The server will then start processing requests. Each request will call asgi_app with scope dictionaries that have a shallow copy of the state dictionary.
The app function will await receive() a "lifespan.shutdown" event, which will resolve when the server is shutting down.
The app function will then do whatever it needs to do to shut down.
The app function will await send({"type": "lifespan.shutdown.complete"}) (or a "lifespan.shutdown.failed" message if it failed), and the server will exit.
The "state" dictionary is a good place to put things like database connections, so that they can be shared between requests.
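The lifespan steps above can be sketched end to end with a fake server. Everything here is illustrative: lifespan_app and fake_server are made-up names, the "greeting" entry stands in for something like a connection pool, and a real server does a lot more (timeouts, failure events, many concurrent requests).

```python
import asyncio


async def lifespan_app(scope, receive, send):
    if scope["type"] == "lifespan":
        event = await receive()
        assert event["type"] == "lifespan.startup"
        # Start-up work goes here; stash shared resources on the state dict.
        scope["state"]["greeting"] = "hello"
        await send({"type": "lifespan.startup.complete"})
        # This receive() only resolves when the server is shutting down.
        event = await receive()
        assert event["type"] == "lifespan.shutdown"
        await send({"type": "lifespan.shutdown.complete"})
    elif scope["type"] == "http":
        body = scope["state"]["greeting"].encode()
        await send({"type": "http.response.start", "status": 200, "headers": []})
        await send({"type": "http.response.body", "body": body, "more_body": False})


async def fake_server(app):
    """Hypothetical server loop: startup, one request, shutdown."""
    state = {}
    to_app, from_app = asyncio.Queue(), asyncio.Queue()
    lifespan_task = asyncio.create_task(
        app({"type": "lifespan", "state": state}, to_app.get, from_app.put)
    )
    await to_app.put({"type": "lifespan.startup"})
    log = [(await from_app.get())["type"]]

    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(event):
        sent.append(event)

    # Each request scope gets a shallow copy of the state dictionary.
    await app({"type": "http", "state": dict(state)}, receive, send)
    log.append(sent[1]["body"])

    await to_app.put({"type": "lifespan.shutdown"})
    log.append((await from_app.get())["type"])
    await lifespan_task
    return log


lifespan_log = asyncio.run(fake_server(lifespan_app))
```

Note that the lifespan call stays suspended for the whole life of the server, which is why it runs as a separate task alongside the request handling.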
There is a caveat to using the lifespan protocol though, which is that it is only called once for the lifetime of the server, not for each worker thread/process in the server. This can cause nasty bugs with DB connections. For example, SQLAlchemy connections are not multiprocess-safe, so you can't keep a connection object on the state dictionary, because it will be copied between uvicorn worker processes.
2. Starlette
Now we are ready to talk about Starlette. Starlette is a Python library for creating ASGI applications to map HTTP requests to handler functions.
2.1. Routers and Routes
Starlette does this using the Router and BaseRoute classes.
Here is a simplified version of the code for these classes.
Simplified excerpt for Starlette routing. source
class Match(Enum):
    NONE = 0
    """The route does not match the scope"""
    PARTIAL = 1
    """The route matches the scope, but it should be given
    lower priority if any other routes are a full match."""
    FULL = 2
    """The route matches the scope"""


class BaseRoute:
    def matches(self, scope: Scope) -> tuple[Match, Scope]:
        """A predicate function to determine whether the
        request scope will match with this route.

        Returns:
            Match: whether the route matches the scope.
            Scope: a new scope that will be merged with the
                original scope and passed to the handler function.
        """
        raise NotImplementedError()

    def url_path_for(self, name: str, **path_params) -> URLPath:
        """Generate a URL from a route name and path parameters.
        The route name is some string that internally identifies the route."""
        raise NotImplementedError()

    async def handle(self, scope: Scope, receive: ReceiveFn, send: SendFn):
        """Handle the request using ASGI protocol."""
        raise NotImplementedError()

    async def __call__(self, scope, receive, send):
        match, child_scope = self.matches(scope)
        if match == Match.NONE:
            return await not_found(scope, receive, send)
        scope.update(child_scope)
        return await self.handle(scope, receive, send)


type Lifespan = Callable[[ASGIApp], AsyncContextManager[dict[str, Any]]]

# I've changed the signature of middleware slightly to
# make it clear it's just a function on ASGI apps
type Middleware = Callable[[ASGIApp], ASGIApp]


@dataclass
class Router:
    routes: list[BaseRoute]
    lifespan: Lifespan
    middleware: list[Middleware]

    async def app(self, scope, receive, send):
        """ASGI app function _before_ middleware is applied."""
        if scope["type"] == "lifespan":
            # lifespan() is as described in the lifespan protocol,
            # using Lifespan type as you would expect
            return self.lifespan(scope)
        partial = None
        for route in self.routes:
            match, child_scope = route.matches(scope)
            if match == Match.FULL:
                scope.update(child_scope)
                return await route.handle(scope, receive, send)
            elif match == Match.PARTIAL and partial is None:
                partial = route
                partial_scope = child_scope
        if partial is not None:
            scope.update(partial_scope)
            await partial.handle(scope, receive, send)
            return
        # ... some extra logic here so that if the route path ends
        # with a slash we redirect to the same path without the slash
        # not_found will pump out a 404 error
        return await self.not_found(scope, receive, send)

    def __call__(self, scope, receive, send):
        """ASGI app function _after_ middleware is applied."""
        app = self.app
        for middleware in reversed(self.middleware):
            app = middleware(app)
        return app(scope, receive, send)
That's all Starlette is doing at its core.
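To see why PARTIAL exists, here is a toy version of the selection loop that you can run without Starlette. ToyRoute and pick_route are made-up names, and treating "path matches but method does not" as a partial match is my assumption about the typical use of PARTIAL, not a quote from the Starlette source.

```python
from dataclasses import dataclass
from enum import Enum


class Match(Enum):
    NONE = 0
    PARTIAL = 1
    FULL = 2


@dataclass
class ToyRoute:
    path: str
    method: str
    name: str

    def matches(self, scope) -> Match:
        if scope["path"] != self.path:
            return Match.NONE
        if scope["method"] != self.method:
            # Assumption: right path, wrong method counts as a partial match.
            return Match.PARTIAL
        return Match.FULL


def pick_route(routes, scope):
    """Return the first FULL match, else the first PARTIAL match, else None."""
    partial = None
    for route in routes:
        match = route.matches(scope)
        if match is Match.FULL:
            return route
        if match is Match.PARTIAL and partial is None:
            partial = route
    return partial


routes = [
    ToyRoute("/items", "GET", "list_items"),
    ToyRoute("/items", "POST", "create_item"),
]
chosen_post = pick_route(routes, {"path": "/items", "method": "POST"})
chosen_put = pick_route(routes, {"path": "/items", "method": "PUT"})
chosen_none = pick_route(routes, {"path": "/nope", "method": "GET"})
```

The POST request skips past the GET route even though it comes first, because a later FULL match beats an earlier PARTIAL one. The PUT request falls back to the first partial match, which is then free to reply with something more helpful than a 404.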
There is a class Starlette that afaict is just a wrapper around a Router instance, with some convenience methods for adding routes and default middleware for exception handling.
You could just use the bare Router class as the ASGI app and everything would still work.
2.2. Route implementations
Starlette comes with some implementations of BaseRoute that you can use to create routes.
Route takes a path string like "/foo/{bar}", an http method "GET" and a callback function, and will invoke the callback if the path matches the request path (parsing out parameters like {bar}). You can also pass regex patterns to match the path. See the compile_path function in routing.py for more details. Route's callback function will not use ASGI and instead use convenience classes Request and Response for working with HTTP requests and responses. These manage details of HTTP such as streaming responses and requests, forms, and making sure headers are set properly.
WebSocketRoute is the same as Route but for websockets instead of HTTP requests.
Mount takes a path prefix, an ASGI app or a list of routes and makes a 'sub-app' that will match requests with the path prefix.
Host will look at the Host: http header and match the request to a sub-app based on the host.
StaticFiles will serve static files from a directory.
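To give a feel for how a path string like "/foo/{bar}" can be turned into something matchable, here is a toy sketch using only the standard library. compile_template and match_path are hypothetical names; this is not Starlette's compile_path, which also handles things like typed parameters.

```python
import re


def compile_template(template: str) -> re.Pattern:
    """Turn "/foo/{bar}" into a regex with one named group per parameter."""
    pattern = ""
    pos = 0
    for m in re.finditer(r"\{([a-zA-Z_][a-zA-Z0-9_]*)\}", template):
        # Literal text between parameters is escaped so it matches verbatim.
        pattern += re.escape(template[pos:m.start()])
        # Each parameter matches one path segment (anything except a slash).
        pattern += f"(?P<{m.group(1)}>[^/]+)"
        pos = m.end()
    pattern += re.escape(template[pos:])
    return re.compile(f"^{pattern}$")


def match_path(template: str, path: str):
    """Return the extracted path parameters, or None if the path does not match."""
    m = compile_template(template).match(path)
    return None if m is None else m.groupdict()


params = match_path("/foo/{bar}", "/foo/42")
no_match = match_path("/foo/{bar}", "/foo/42/extra")
```

In this sketch the parameters come out as strings. The extracted dictionary is the kind of thing a route would put in the child scope it returns from matches.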
2.3. Extra things Starlette does
Middleware library for things like auth, cors, gzip, error management etc.
A simple OpenAPI schema generator
Response classes for forms, streaming, jinja templates, etc.
A test client for unit tests.
3. FastAPI
FastAPI is another library that builds on top of Starlette to provide some convenience features:
It will use type annotations on router handlers to automatically perform validation and OpenAPI documentation generation.
It has a type-annotation-based dependency injection system that can automatically inject things like database connections into handler functions.
That is it.
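The type-annotation trick can be illustrated with a toy that uses only the standard library. This is emphatically not how FastAPI is implemented (it leans on Pydantic for validation); call_with_validated_params and get_cheese are made-up names, and the sketch only handles annotations that are callable on a string, like int or str.

```python
import inspect


def call_with_validated_params(handler, raw_params: dict[str, str]):
    """Convert raw string parameters using the handler's type annotations."""
    kwargs = {}
    for name, param in inspect.signature(handler).parameters.items():
        if name not in raw_params:
            if param.default is inspect.Parameter.empty:
                raise ValueError(f"missing parameter: {name}")
            continue  # fall back to the handler's default value
        value = raw_params[name]
        if param.annotation is not inspect.Parameter.empty:
            try:
                value = param.annotation(value)  # e.g. int("10")
            except (TypeError, ValueError) as exc:
                raise ValueError(f"invalid value for {name}: {value!r}") from exc
        kwargs[name] = value
    return handler(**kwargs)


def get_cheese(strength: int, sort: str = "asc"):
    return {"strength": strength, "sort": sort}


result = call_with_validated_params(get_cheese, {"strength": "10"})
```

The handler itself never sees a raw string for strength; the wrapper reads the signature, converts, and rejects bad input before the handler runs.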
4. Appendix: URLs
An example URL.
https://hello.example.com:443/cheese/cheddar?strength=10&sort=asc#nutritional-info
And here are the bits you have to know:
scheme = the bit before ://, i.e. https
authority or host = hello.example.com:443. The // indicates the next part is the authority.
  domain = hello.example.com. Alternatively to a domain we can have a straight-up IP address.
    subdomain = hello
    domain name = example.com
    top-level domain = com
  port = 443
path = /cheese/cheddar
query = ?strength=10&sort=asc, a set of key-value pairs separated by & and starting with ?
anchor = #nutritional-info, an identifier that points to a specific part of the page (e.g. in html, it's the id attribute of an element)
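The same breakdown can be checked with Python's standard urllib.parse module. One naming difference to watch for: the standard library calls the authority "netloc" and the anchor "fragment", and it strips the leading ? and # from the query and fragment.

```python
from urllib.parse import parse_qs, urlsplit

url = "https://hello.example.com:443/cheese/cheddar?strength=10&sort=asc#nutritional-info"
parts = urlsplit(url)

# Scheme, authority, path, query, fragment.
scheme = parts.scheme
authority = parts.netloc
path = parts.path
fragment = parts.fragment

# The authority splits further into host and port.
hostname = parts.hostname
port = parts.port

# parse_qs turns the query string into a dict of lists,
# since a key may legitimately appear more than once.
query = parse_qs(parts.query)
```

Splitting the hostname into subdomain, domain name and top-level domain is not something urllib does for you.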
So when we write an http router for the server, we can condition on:
the url subdomain
the url path
the url query
the http headers
the http method
the http request body
So there are 6 ways of passing information to the router, and they are all used in different ways on different servers.