explicitly start ASGI run with empty context#2742
Conversation
|
This does make sense, but do you know why it happens? |
|
@Kludex I was wrong. This is a bug in |
|
Ping @Kludex |
|
We need to fix it in httptools as well, and do you think you can create a test that reproduces the issue? |
Can you explain that? Are you talking about Lines 41 to 44 in 8ae0bcb https://github.com/MagicStack/httptools
Yeah, likely. I haven't written one, because the solution is not the only one that is possible to fix the issue. See #2167 (comment). Can you decide if you want to go with the solution before I put in more work? |
It's a bit unfortunate we need to do it here, but I guess it's fine.
We have a |
There is an open PR python/cpython#141158 that'll hopefully fix this for Python >=3.15. But until that there is no way around doing this in
I've added a test that should reproduce the issue. However, it also won't pass right now I don't know why yet. Trying my fix on reproducers outside of the test suite works just fine. Thus, my working assumption is that one of the mock components does not behave like the real component would. I'll try to find out which one. |
pmeier
left a comment
There was a problem hiding this comment.
@Kludex I opted to avoid using the mock components, because in order to reproduce the issue, we'd need to implement the buggy behavior for them. Instead, the regression test now uses the outer asyncio loop.
The CI for d099a87 (just the test without the changes to the task handling) shows nicely that
- The regression test correctly fails if we do not start the ASGI task with an empty context
- Windows is not affected by this bug (see #2167 (comment))
|
|
||
|
|
||
| @contextlib.asynccontextmanager | ||
| async def server(*, app, port, http_protocol_cls): |
There was a problem hiding this comment.
Naming TBD. Just wanted to get feedback first if the test architecture is ok.
There was a problem hiding this comment.
You should use run_server from utils.py.
This reverts commit d099a87.
|
Ping @Kludex |
| # For the asyncio loop, we need to explicitly start with an empty context | ||
| # as it can be polluted from previous ASGI runs. | ||
| # See https://github.com/python/cpython/issues/140947 for details. | ||
| task = contextvars.Context().run(self.loop.create_task, self.cycle.run_asgi(app)) | ||
| # TODO: Replace the line above with the line below for Python >= 3.11 | ||
| # task = self.loop.create_task(self.cycle.run_asgi(app), context=contextvars.Context()) |
There was a problem hiding this comment.
Currently, the lifespan is a sibling task, but that's not the case in Hypercorn. It may be the case in the future that we refactor the server to make the lifespan task a parent of the whole process instead of a sibling task. Which means that the context would need to come from there. We still don't want the context to leak between sibling tasks as it's currently happening in asyncio...
Since this is going to be fixed in Python 3.15, can we have a note about it? I would like to revert the context= when it lands, or at least remember that we can do it if we decide to make the lifespan task a parent from this one.
There was a problem hiding this comment.
I'm aware of python/cpython#141158, but since this is not merged or ready to be merged, do we already want to assume that 3.15 will fix this?
| await task | ||
|
|
||
|
|
||
| async def test_no_contextvars_pollution(http_protocol_cls: type[HTTPProtocol], unused_tcp_port: int): |
There was a problem hiding this comment.
| async def test_no_contextvars_pollution(http_protocol_cls: type[HTTPProtocol], unused_tcp_port: int): | |
| async def test_no_contextvars_pollution_asyncio(http_protocol_cls: type[HTTPProtocol], unused_tcp_port: int): |
|
|
||
|
|
||
| @contextlib.asynccontextmanager | ||
| async def server(*, app, port, http_protocol_cls): |
There was a problem hiding this comment.
| async def server(*, app, port, http_protocol_cls): | |
| async def server(*, app: ASGIApplication, port: int, http_protocol_cls: type[HTTPProtocol]): |
| await task | ||
|
|
||
|
|
||
| async def test_no_contextvars_pollution(http_protocol_cls: type[HTTPProtocol], unused_tcp_port: int): |
There was a problem hiding this comment.
I would like to remove this test when 3.15 reaches EOF, given it's the event loop's job to make sure this is the case. 😎
Can we have a mark to not run this on >=3.15? 🙏
There was a problem hiding this comment.
We can depending on the answer of #2742 (comment).
Sorry. I was on a short vacation. |
|
hey @Kludex @pmeier I dont have much expertise on asyncio stuff, but is this a temporary fix or the path this project will go? I'm asking because I'm working on implementing automatic context propagation for python asyncio with eBPF and I found out that this PR breaks the use case of FasAPI+uvicorn. The conflicting bit is where "we need to explicitly start with an empty context". We are using eBPF to track the execution of thanks! |
I can't answer for the project / @Kludex, but IMO this is just a temporary fix. This is only required, because I'm interested in your use case though: are you setting a "global" context var, i.e. not on a per-request basis, and by starting with an empty context this global context is hidden? |
|
Hey @pmeier thanks for quick response!
We hook Before 0.39, this worked because the request handler inherited whatever context was active. The context pointer we captured at request start was the same one (or a child of it) when the handler made HTTP calls. So we're not setting a "global" contextvar. Instead, we intercept the C functions in libpython that manage context objects and track the raw memory addresses (pointers) of those |
|
I think we should make lifespan a parent task, and then use the context from there. We can also make this fix only for |
|
Hey thanks also for quick response!
Unless I'm doing something wrong or I missunderstood your message, it's also broken for uvicorn + uvloop. I never tested with uvloop before and I had to tweak the code, but it works for versions prior 0.39. |
What I meant is to "revert" this PR for uvloop, since the problem this PR tried to solve was not an issue with uvloop. |
|
By "revert" here I mean wrap the line that runs the app, check what event loop is running, and apply a behavior accordingly. |
|
I see, I understand now, thanks for clarifying! |
Co-authored-by: Marcelo Trylesinski <marcelotryle@gmail.com>
|
Hi, this fix has actually broken some metaprogramming features I was relying on - I was accessing a single variable for multiple uvicorn apps in my monorepo and then changing the variable depending on the context it was in. This context is no longer accessible inside ASGI function calls. Any idea what I should do? Pinging @Kludex since this is a rather old thread. Also I realize this might seem like an insane use of context vars but from my point of view it made a lot of sense that I could use a FastAPI lifespan to inject whatever I wanted into the context. Classic Hyrum's Law 😅:
|
|
@pmeier I know it's just one user complaining (for now), but I think it's a fair point. I'm inclined to revert the changes (given that I've seen progress on the CPython side for 3.15).
It does, but my reasoning was that the current bug in CPython (and the non-predictability behavior) overweights that. |
|
Yeah, I get it. TBH when I started this I was 90%+ sure that this was a bug in But I guess your reasoning to revert is sound. We just need to make sure to also push this to downstream projects. For example, I originally stumbled over this bug when instrumenting with opentelemetry. And after my fix here was merged and released, I posted there that the bug reports can be closed: open-telemetry/opentelemetry-python-contrib#2852 (comment) |
Reintroduces the behavior from #2742 as an opt-in flag. When set, each ASGI request runs in a fresh contextvars.Context, which works around the asyncio context leak in python/cpython#140947 (expected to be fixed upstream for Python >=3.15 via python/cpython#141158). Default is off so context set in the lifespan or by external instrumentation (e.g. OpenTelemetry eBPF propagation) remains visible to ASGI handlers.
|
I've reverted this PR and merged #2912. To have this behavior, you need to opt-in with I'm sorry for the delay. I'll post a message on related closed issues. |
Summary
One possible solution for #2167.
Checklist