@ -257,7 +257,7 @@ And now in the file `sql_app/main.py` let's integrate and use all the other part
In a very simplistic way create the database tables:
```Python hl_lines="8 9 10"
```Python hl_lines="10 11 12"
{!./src/sql_databases_peewee/sql_app/main.py!}
```
@ -265,7 +265,7 @@ In a very simplistic way create the database tables:
Create a dependency that will connect the database right at the beginning of a request and disconnect it at the end:
```Python hl_lines="15 16 17 18 19 20 21 22"
```Python hl_lines="19 20 21 22 23 24 25"
{!./src/sql_databases_peewee/sql_app/main.py!}
```
@ -277,15 +277,54 @@ And then, in each *path operation function* that needs to access the database we
But we are not using the value given by this dependency (it actually doesn't give any value, as it has an empty `yield`). So, we don't add it to the *path operation function* but to the *path operation decorator* in the `dependencies` parameter:
```Python hl_lines="25 33 40 52 58 65"
```Python hl_lines="36 44 51 63 69 76"
{!./src/sql_databases_peewee/sql_app/main.py!}
```
### Context Variable Middleware
For all the `contextvars` parts to work, we need to make sure there's a new "context" each time there's a new request, so that we have a specific context variable Peewee can use to save its state (database connection, transactions, etc).
For that, we need to create a middleware.
Right before the request, we are going to reset the database state. We will "set" a value to the context variable and then we will ask the Peewee database state to "reset" (this will create the default values it uses).
And then the rest of the request is processed with that new context variable we just set, all automatically and more or less "magically".
For the **next request**, as we will reset that context variable again in the middleware, that new request will have its own database state (connection, transactions, etc).
```Python hl_lines="28 29 30 31 32 33"
{!./src/sql_databases_peewee/sql_app/main.py!}
```
!!! tip
    As FastAPI is an async framework, one request could start being processed, and before finishing, another request could be received and start processing as well, and it all could be processed in the same thread.
    But context variables are aware of these async features, so, a Peewee database state set in the middleware will keep its own data throughout the entire request.
    And at the same time, the other concurrent request will have its own database state that will be independent for the whole request.
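As a rough illustration of that independence, here is a minimal sketch that uses only the standard library. It does not use FastAPI or Peewee: `db_state` and `handle_request()` are hypothetical stand-ins for the Peewee state variable and for a request that goes through the middleware, and the two "requests" are plain asyncio tasks running in the same thread.

```Python
import asyncio
import threading
from contextvars import ContextVar

# Hypothetical stand-in for the context variable that holds the Peewee state
db_state = ContextVar("db_state")


async def handle_request(name, delay):
    # What the middleware would do first: give this request a fresh state
    db_state.set({"request": name, "closed": True})
    # Simulate opening a connection, then let other requests run
    db_state.get()["closed"] = False
    await asyncio.sleep(delay)
    # After the other request has run, this one still sees only its own state
    state = db_state.get()
    return {"request": state["request"], "thread": threading.get_ident()}


async def main():
    # asyncio.gather() runs each coroutine as a task with its own context copy
    return await asyncio.gather(
        handle_request("first", 0.02),
        handle_request("second", 0.01),
    )


results = asyncio.run(main())
```

Both simulated requests run in the same thread, but each one reads back the state it set for itself.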
#### Peewee Proxy
If you are using a [Peewee Proxy](http://docs.peewee-orm.com/en/latest/peewee/database.html#dynamically-defining-a-database){.external-link target=_blank}, the actual database is at `db.obj`.
@ -313,9 +352,9 @@ def read_users(skip: int = 0, limit: int = 100):
## Testing Peewee with async
This example includes an extra *path operation* that simulates a long processing request with `time.sleep(15)`.
This example includes an extra *path operation* that simulates a long processing request with `time.sleep(sleep_time)`.
It will have the database connection open at the beginning and will just wait 15 seconds before replying back.
It will have the database connection open at the beginning and will just wait some seconds before replying back. And each new request will wait one second less.
This will easily let you test that your app with Peewee and FastAPI is behaving correctly with all the stuff about threads.
@ -325,6 +364,17 @@ If you want to check how Peewee would break your app if used without modificatio
# db._state = PeeweeConnectionState()
```
And in the file `sql_app/main.py`, comment out the middleware:
@ -341,15 +391,19 @@ The tabs will wait for a bit and then some of them will show `Internal Server Er
### What happens
The first tab will make your app create a connection to the database and wait for 15 seconds before replying back and closing the connection.
The first tab will make your app create a connection to the database and wait for some seconds before replying back and closing the connection.
Then one of the other tabs will try to open a database connection, but as one of those requests for the other tabs will probably be handled in the same thread as the first one, it will have the same database connection that is already open, and Peewee will throw an error and you will see it in the terminal, and the response will have an `Internal Server Error`.
Then, for the request in the next tab, your app will wait for one second less, and so on.
This means that it will end up finishing some of the last tabs' requests earlier than some of the previous ones.
Then one of the last requests, which wait fewer seconds, will try to open a database connection. But as one of those previous requests for the other tabs will probably be handled in the same thread as the first one, it will have the same database connection that is already open. Peewee will throw an error, you will see it in the terminal, and the response will have an `Internal Server Error`.
This will probably happen for more than one of those tabs.
If you had multiple clients talking to your app exactly at the same time, this is what could happen.
And as your app starts to handle more and more clients at the same time, the waiting time in a single requests needs to be shorter and shorter to trigger the error.
And as your app starts to handle more and more clients at the same time, the waiting time in a single request needs to be shorter and shorter to trigger the error.
### Fix Peewee with FastAPI
@ -359,6 +413,17 @@ Now go back to the file `sql_app/database.py`, and uncomment the line:
db._state = PeeweeConnectionState()
```
And in the file `sql_app/main.py`, uncomment the middleware:
Repeat the same process with the 10 tabs. This time all of them will wait and you will get all the results without errors.
@ -367,7 +432,7 @@ Repeat the same process with the 10 tabs. This time all of them will wait and yo
## Review all the files
Remember you should have a directory named `my_super_project` that contains a sub-directory called `sql_app`.
Remember you should have a directory named `my_super_project` (or whatever you want to call it) that contains a sub-directory called `sql_app`.
`sql_app` should have the following files:
@ -405,4 +470,65 @@ Repeat the same process with the 10 tabs. This time all of them will wait and yo
## Technical Details
If you want to go deeper into the technical details related to Peewee with FastAPI, you can <a href="https://github.com/coleifer/peewee/pull/2072" target="_blank">read more about it here</a>.
!!! warning
    These are very technical details that you probably don't need.
### The problem
Peewee uses [`threading.local`](https://docs.python.org/3/library/threading.html#thread-local-data){.external-link target=_blank} by default to store its database "state" data (connection, transactions, etc).
`threading.local` creates a value exclusive to the current thread, but an async framework would run all the "tasks" (e.g. requests) in the same thread, and possibly not in order.
On top of that, an async framework could run some sync code in a threadpool (using the event loop's `loop.run_in_executor()`), but belonging to the same "task" (e.g. to the same request).
This means that, with Peewee's current implementation, multiple tasks could be using the same `threading.local` variable and end up sharing the same connection and data, and at the same time, if they execute sync IO-blocking code in a threadpool (as with normal `def` functions in FastAPI, in *path operations* and dependencies), that code won't have access to the database state variables, even while it's part of the same "task" (request) and it should be able to get access to that.
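Both halves of the problem can be reproduced with the standard library alone. The following is only a sketch, not Peewee's code: `state` is a hypothetical stand-in for Peewee's thread-local state, and the two "requests" are asyncio tasks running in the same thread.

```Python
import asyncio
import threading

# Hypothetical stand-in for state kept in thread-local storage
state = threading.local()


def read_in_threadpool():
    # Sync code run in a worker thread sees a different thread-local namespace
    return getattr(state, "connection", None)


async def request(name, delay, seen):
    state.connection = f"connection for {name}"
    await asyncio.sleep(delay)  # other requests run in the meantime
    loop = asyncio.get_running_loop()
    seen[name] = {
        # Read in the event loop thread, shared by all the tasks
        "same_thread": state.connection,
        # Read from a worker thread, as sync code in a threadpool would
        "threadpool": await loop.run_in_executor(None, read_in_threadpool),
    }


async def main():
    seen = {}
    await asyncio.gather(
        request("first", 0.02, seen),
        request("second", 0.01, seen),
    )
    return seen


seen = asyncio.run(main())
```

Here the first request ends up reading the value written by the second one, because both tasks share the event loop's thread. The code run in the threadpool sees no value at all, even though it belongs to the same simulated request.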
### Context variables
Python 3.7 has [`contextvars`](https://docs.python.org/3/library/contextvars.html){.external-link target=_blank} that can create a local variable very similar to `threading.local`, but also supporting these async features.
There are several things to keep in mind.
The `ContextVar` has to be created at the top of the module, like `some_var = ContextVar("some_var", default="default value")`.
To set a value used in the current "context" (e.g. for the current request) use `some_var.set("new value")`.
To get a value anywhere inside of the context (e.g. in any part handling the current request) use `some_var.get()`.
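Put together, those three steps look roughly like this minimal sketch, which reuses the `some_var` name from above (`before` and `after` are only illustrative names):

```Python
from contextvars import ContextVar

# Created once, at the top of the module
some_var = ContextVar("some_var", default="default value")

before = some_var.get()    # nothing set yet, so the default is returned
some_var.set("new value")  # applies to the current context
after = some_var.get()     # reads the value set for the current context
```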
### Set context variables in middleware
If some part of the async code sets the value with `some_var.set("updated in function")` (e.g. the middleware), the rest of the code in it will see that new value.
And if it calls any other function with `await some_function()` (e.g. `response = await call_next(request)` in our middleware), that internal function and everything it calls inside will see that same new value `"updated in function"`.
So, in our case, if we set the Peewee state variable in the middleware and then call `response = await call_next(request)` all the rest of the internal code in our app (that is called by `call_next()`) will see this value we set in the middleware and will be able to reuse it.
But if the value is set in an internal function (e.g. in `get_db()`) that value will be seen only by that internal function and any code it calls, not by the parent function nor by any sibling function. So, we can't set the Peewee database state in `get_db()`, or the *path operation functions* wouldn't see the new Peewee database state for that "context".
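The direction of that visibility can be checked with a small standard-library sketch. It rests on an assumption that is not taken from this page: the isolation appears when the inner code runs in its own copy of the context, which is forced here with `asyncio.create_task()`. A coroutine awaited directly, without a task, runs in the caller's context. The function names are hypothetical, and how FastAPI actually runs a given dependency is not shown.

```Python
import asyncio
from contextvars import ContextVar

some_var = ContextVar("some_var", default="default value")


async def inner_reads():
    # Only reads: sees whatever the caller set before calling
    return some_var.get()


async def inner_sets():
    # Sets a value inside its own copy of the context
    some_var.set("set in inner task")
    return some_var.get()


async def outer():
    some_var.set("updated in function")
    seen_by_inner = await inner_reads()
    # Assumption: the inner code runs as its own task, so in a context copy
    seen_inside_task = await asyncio.create_task(inner_sets())
    seen_by_outer_afterwards = some_var.get()
    return seen_by_inner, seen_inside_task, seen_by_outer_afterwards


result = asyncio.run(outer())
```

The value set by the outer function reaches the inner code, but the value set inside the inner task does not travel back to the outer function.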
### But `get_db` is an async context manager
You might be thinking that `get_db()` is actually not used as a function, it's converted to a context manager.
So the *path operation function* is part of it.
But the code after the `yield`, in the `finally` is not executed in the same "context".
So, if you reset the state in `get_db()`, the *path operation function* would see the database connection set there. But the `finally` block would not see the same context variable value, and so, as the database object would not have the same context variable for its state, it would not have the same connection, so you couldn't close it in the `finally` in `get_db()` after the request is done.
In the middleware we are setting the Peewee state to a context variable that holds a `dict`. So, it's set for every new request.
And as the database state variables are stored inside of that `dict` instead of new context variables, when Peewee sets the new database state (connection, transactions, etc) in any part of the internal code, underneath, all that will be set as keys in that `dict`. But the `dict` would still be the same we set in the middleware. That's what allows the `get_db()` dependency to make Peewee create a new connection (that is stored in that `dict`) and allows the `finally` block to still have access to the same connection.
Because the context variable is set outside all that, in the middleware.
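The trick of storing a mutable `dict` in the context variable can be sketched with the standard library. This is not Peewee's implementation: `db_state`, `dependency_connects()` and `middleware_like()` are hypothetical names, and the inner code is run as a task so that it gets its own copy of the context.

```Python
import asyncio
from contextvars import ContextVar

# Hypothetical stand-in for the context variable holding Peewee's state
db_state = ContextVar("db_state")


async def dependency_connects():
    # Mutates the dict set by the outer code instead of calling .set()
    db_state.get()["connection"] = "open connection"


async def middleware_like():
    state = {"connection": None}
    db_state.set(state)
    # The task runs in a copy of the context, but the copy holds the same dict
    await asyncio.create_task(dependency_connects())
    return state, db_state.get()


original, current = asyncio.run(middleware_like())
```

Because the inner code only mutates the `dict` and never calls `.set()`, the outer code still holds the same object and can see the connection stored in it.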
### Connect and disconnect in dependency
Then the next question would be, why not just connect and disconnect the database in the middleware itself, instead of `get_db()`?
First, the middleware has to be `async`, and creating and closing the database connection is potentially blocking, so it could degrade performance.
But more importantly, the middleware returns a `response`, and this `response` is actually an awaitable function that will do all the work in your code, including background tasks.
If you closed the connection in the middleware right before returning the `response`, some of your code would not have the chance to use the database connection set in the context variable.
Because some other code will call that `response` with `await response(...)`. And it's inside of that `await response(...)` that, for example, background tasks are run. But if the connection was already closed before `response` is awaited, then that code won't be able to access it.