This document proposes a way to complement docstrings to add additional documentation metadata to Python symbols (classes, functions, methods, variables, parameters), like marking something as "deprecated", based on the same Python language and structure, using type annotations with `Annotated`and decorators.
This document proposes a way to complement docstrings to add additional documentation to Python symbols using type annotations with `Annotated`(in class attributes, function and method parameters, return values, and variables).
## Motivation
The current standard method of documenting code APIs in Python is using docstrings. But there's no standard way to add additional metadata like marking something as "deprecated", and there's no standard for how to document parameters in docstrings.
The current standard method of documenting code APIs in Python is using docstrings. But there's no standard way to document parameters in docstrings.
There are several pseudo-standards for the format in these docstrings, and new pseudo-standards can appear easily: numpy, Google, Keras, reST, etc.
All these formats are some specific syntax put inside a string. Because of this, when editing those docstrings, editors can't easily provide support for autocompletion, inline errors for broken syntax, etc.
All these formats are some specific syntax inside a string. Because of this, when editing those docstrings, editors can't easily provide support for autocompletion, inline errors for broken syntax, etc.
Editors don't have a way to support all the possible micro languages in the docstrings and show nice user interfaces when developers use libraries with those different formats. They could even add support for some of the syntaxes, but probably not all.
Editors don't have a way to support all the possible micro languages in the docstrings and show nice user interfaces when developers use libraries with those different formats. They could even add support for some of the syntaxes, but probably not all, or not completely.
Because the docstring is in a different place in the code than the actual parameters and it requires duplication of information (the parameter name) the information about a parameter is easily in a place in the code quite far away from the declaration of the actual parameter. This means it's easy to refactor a function, remove a parameter, and forget to remove its docs. The same happens when adding a new parameter, it's easy to forget to add the docstring for it.
Editors can't check or ensure the consistency of the parameters and their docs.
It's not possible to programatically test these docstrings, for example to ensure documentation consistency for the parameters across several similar functions, at least not without implementing a custom syntax parser.
Some of these formats tried to account for the lack of type annotations in older Python versions, but now that information doesn't need to be there.
## Rationale
This proposal intends to address these shortcomings by extending and complementing the information in docstrings, keeping backwards compatibility with existing docstrings, and doing it in a way that leverages the Python language and structure, via type annotations with `Annotated` and decorators, and a new function in `typing`.
This proposal intends to address these shortcomings by extending and complementing the information in docstrings, keeping backwards compatibility with existing docstrings, and doing it in a way that leverages the Python language and structure, via type annotations with `Annotated`, and a new function in `typing`.
The reason why this would belong in the standard Python library instead of an external package is because although the implementation would be quite trivial, the actual power and benefit from it would come from being a standard, so that editors and other tools could implement support for it.
The reason why this would belong in the standard Python library is because although the implementation would be quite trivial, the actual power and benefit from it would come from being a standard, so that editors and other tools could implement support for it.
This doesn't deprecate current usage of docstrings, it's transparent to common developers (library users), and it's only opt-in for library authors that would like to adopt it.
## Specification
### `typing.doc`
The main proposal is to have a new function `doc()` in the `typing` module. Even though this is not strictly related to the type annotations, it's expected to go in `Annotated` type annotations, and to interact mainly with type annotations.
The main proposal is to have a new function `doc()` in the `typing` module. Even though this is not strictly related to the type annotations, it's expected to go in `Annotated` type annotations, and to interact with type annotations.
There's also the particular benefit to having it in `typing` that it could be implemented in the `typing_extensions` package to have support for older versions of Python.
This `doc()` function would receive several parameters for metadata and documentation. It could be used inside of `Annotated`, or also as a decorator on top of classes, functions, and methods.
This `doc()` function would receive one single parameter `documentation` with a documentation string.
* `description: str`: in a parameter, this would be the description of the parameter. In a class, function, or method, it would replace the docstring.
* This could probably contain markup, like Markdown or reST. As that could be highly debated, that decision is left for a future proposal, to focus here on the main functionality.
* `deprecated: bool`: this would mark a parameter, class, function, or method as deprecated. Editors could display it with a strike-through or other appropriate formatting.
* `discouraged: bool`: this would mark a parameter, class, function, or method as discouraged. Editors could display them similar to `deprecated`. The reason why having a `discouraged` apart from `deprecated` is that there are cases where something is not gonna be removed for backward compatibility, but it shouldn't be used in new code. An example of this is `datetime.utcnow()`.
* `raises: Mapping[Type[BaseException], str | None]`: in a class, function, or method, this indicates the types of exceptions that could be raised by calling it in they keys of the dictionary. Each value would be the description for each exception, possibly being `None` when the exception has no description. Editors and tooling could show a warning (e.g. a colored underline) if the call is not wrapped in a `try` block or the parent caller doesn't include the same exceptions in its `raises` parameter.
* `extra: dict`: a dictionary containing any additional metadata that could be useful for developers or library authors.
* An `extra` parameter instead of `**kwargs` is proposed to allow adding future standard parameters.
* `**kwargs: Any`: allows arbitrary additional keyword args. This gives type checkers the freedom to support experimental parameters without needing to wait for changes in `typing.py`. Type checkers should report errors for any unrecognized parameters. This follows the same pattern designed in [PEP 681 – Data Class Transforms](https://peps.python.org/pep-0681/).
This string could be a multi-line string, in which case, when extracted by tools, should be interpreted cleaning up indentation as if using `inspect.cleandoc()`, the same procedure used for docstrings.
This string could probably contain markup, like Markdown or reST. As that could be highly debated, that decision is left for a future proposal, to focus here on the main functionality.
This specification targets static analysis tools and editors, and as such, the values passed to these parameters should allow static evaluation and analysis. If a developer passes as the value to one of these parameters something that requires runtime execution (e.g. a function call) the behavior of static analysis tools is unspecified and they could omit that parameter from their process and results. For static analysis tools to be conformant with this specification they need only to support statically accessible values.
This specification targets static analysis tools and editors, and as such, the value passed to `doc()` should allow static evaluation and analysis. If a developer passes as the value something that requires runtime execution (e.g. a function call) the behavior of static analysis tools is unspecified and they could omit it from their process and results. For static analysis tools to be conformant with this specification they need only to support statically accessible values.
An example documenting the attributes of a class, or in this case, the keys of a `TypedDict` could look like this:
An example documenting the attributes of a class, or in this case, the keys of a `TypedDict`, could look like this:
```python
from typing import Annotated, TypedDict, NotRequired, doc
class User(TypedDict):
firstname: Annotated[str, doc(description="The user's first name")]
lastname: Annotated[str, doc(description="The user's last name")]
firstname: Annotated[str, doc("The user's first name")]
lastname: Annotated[str, doc("The user's last name")]
name: Annotated[
NotRequired[str],
doc(
description="The user's full name, this field is deprecated",
deprecated=True,
),
doc("The user's full name, this field is deprecated"),
]
```
An example documenting a function could look like this:
An example documenting the parameters of a function could look like this:
```python
from typing import Annotated, doc
@doc(
description="Create a new user in the system",
raises={
InvalidUserError: "Raised when the user name is not allowed by the system",
UserExistsError: "Raised when the user already exists in the system",
},
)
def create_user(
lastname: Annotated[
str, doc(description="The **last name** of the newly created user")
],
lastname: Annotated[str, doc("The **last name** of the newly created user")],
*,
firstname: Annotated[
str | None,
doc(description="The **first name** of the newly created user"),
doc("The **first name** of the newly created user"),
] = None,
name: Annotated[
str | None,
doc(
descrption="The user's full name, this argument is deprecated.",
deprecated=True,
),
doc("The user's full name, this argument is deprecated."),
] = None,
) -> Annotated[User, doc(description="The created user after saving in the database")]:
) -> Annotated[User, doc("The created user after saving in the database")]:
"""
Create a new user in the system, it needs the database connection to be already
initialized.
@ -101,28 +86,19 @@ def create_user(
pass
```
In this example, the description in the `@doc()` decorator would override the docstring.
Also notice that the parameter `name` is marked as `deprecated`.
This could also be used to add more documentation to symbols in the standard library. For example:
```python
from datetime import datetime
from typing import doc
The return of the `doc()` function is an instance of a class that can be checked and used at runtime, defined as:
```Python
from dataclasses import dataclass
@doc(discouraged=True)
def utcnow() -> datetime:
"""
Return the current UTC date and time, with tzinfo None.
This function is not deprecated for backwards compatibility with legacy code, but
is discouraged. For new code you should use: `datetime.now(timezone.utc)`.
"""
return datetime.utcnow()
@dataclass
class DocInfo:
documentation: str
```
where the attribute `documentation` contains the same value string passed to the function `doc()`.
## Alternate Form
To avoid delaying adoption of this proposal until after the `doc()` function has been added to the typing module, type checkers and tools may support an alternative form `__typing_doc__`. This form can be defined locally without any reliance on the `typing` or `typing_extensions` modules. It allows immediate adoption of the specification by library authors. Type checkers that have not yet adopted this specification will retain their current behavior.
@ -130,22 +106,18 @@ To avoid delaying adoption of this proposal until after the `doc()` function has
To use this alternate form, library authors should include the following declaration within their type stubs or source files.
```Python
from typing import Any, Callable, Mapping, Type, TypeVar
# If used within a stub file, the following implementation can be
# replaced with "...".
return lambda a: a
documentation: str,
) -> DocInfo:
return DocInfo(documentation)
```
And then they can use it in the same places they would use `doc()`.
@ -156,42 +128,78 @@ This alternate form can be used by early adopters including libraries, linters,
## Rejected Ideas
### Standardize Current Docstrings
A possible alternative would be to support and try to push as a standard one of the existing docstring formats. But that would only solve the standardization.
It wouldn't solve any of the other problems, like getting editor support (syntax checks) for library authors, the distance and duplication of information between a parameter definition and its documentation in the docstring, etc.
### Extra Metadata and Decorator
An earlier version of this proposal included several parameters for additional metadata, including a way to deprecate parameters. To allow also deprecating functions and classes, it was expected to also be used as a decorator. But this functionality is covered by `typing.deprecated()` in [PEP 702](https://peps.python.org/pep-0702/), so it was dropped from this proposal.
The previously proposed parameters were:
* `description: str`: in a parameter, this would be the description of the parameter. In a class, function, or method, it would replace the docstring.
* This could probably contain markup, like Markdown or reST. As that could be highly debated, that decision is left for a future proposal, to focus here on the main functionality.
* `deprecated: bool`: this would mark a parameter, class, function, or method as deprecated. Editors could display it with a strike-through or other appropriate formatting.
* `discouraged: bool`: this would mark a parameter, class, function, or method as discouraged. Editors could display them similar to `deprecated`. The reason why having a `discouraged` apart from `deprecated` is that there are cases where something is not gonna be removed for backward compatibility, but it shouldn't be used in new code. An example of this is `datetime.utcnow()`.
* `raises: Mapping[Type[BaseException], str | None]`: in a class, function, or method, this indicates the types of exceptions that could be raised by calling it in they keys of the dictionary. Each value would be the description for each exception, possibly being `None` when the exception has no description. Editors and tooling could show a warning (e.g. a colored underline) if the call is not wrapped in a `try` block or the parent caller doesn't include the same exceptions in its `raises` parameter.
* `extra: dict`: a dictionary containing any additional metadata that could be useful for developers or library authors.
* An `extra` parameter instead of `**kwargs` is proposed to allow adding future standard parameters.
* `**kwargs: Any`: allows arbitrary additional keyword args. This gives type checkers the freedom to support experimental parameters without needing to wait for changes in `typing.py`. Type checkers should report errors for any unrecognized parameters. This follows the same pattern designed in [PEP 681 – Data Class Transforms](https://peps.python.org/pep-0681/).
A way to declare some of these parameters could still be useful in the future, but taking early feedback on this document, all that was postponed to future proposals. This also shifts the focus from an all-encompasing function `doc()` with multiple parameters to multiple composable functions, having `doc()` handle one single use case: additional documentation in `Annotated`.
This design change also allows better interoperability with other proposals like `typing.deprecated()`, as in the future it could be considered to allow having `typing.deprecated()` also in `Annotated` to deprecate individual parameters, coexisting with `doc()`.
## Open Issues
### Runtime Behavior
### Verbosity
The main argument against this would be the increased verbosity.
Nevertheless, this verbosity would not affect end users as they would not see the internal code using `typing.doc()`.
And the cost of dealing with the additional verbosity would only be carried by those library maintainers that decide to opt-in into this feature.
Any authors that decide not to adopt it, are free to continue using docstrings with any particular format they decide, no docstrings at all, etc.
This argument could be analogous to the argument against type annotations in general, as they do indeed increase verbosity, in exchange for their features. But again, as with type annotations, this would be optional and only to be used by those that are willing to take the extra verbosity in exchange for the benefits.
### Doc is not Typing
It could also be argued that documentation is not really part of typing, or that it should live in a different module. Or that this information should not be part of the signature but live in another place (like the docstring).
Nevertheless, type annotations in Python could already be considered, by default, mainly documentation: they carry additional information about variables, parameters, return types, and by default they don't have any runtime behavior.
It could be argued that this proposal extends the type of information that type annotations carry, the same way as [PEP 702](https://peps.python.org/pep-0702/) extends them to include deprecation information.
And as described above, being able to include this in `typing_extensions` to support older versions of Python is a very simple and practical reason why it would make sense to have this as part of `typing`.
Runtime behavior is still not defined. It would probably make sense to have an attribute `__typing_doc__` on the function, method, or class. `__doc__` is already reserved for docstrings.
### Multiple Standards
For parameters, it could include the same object in the same `Annotated` type annotations.
Another argument against this would be that it would create another standard, and that there are already several pseudo-standards for docstrings. It could seem better to formalize one of the currently existing standards.
### Additional Parameters
Nevertheless, as stated above, none of those standards cover the general drawbacks of a doctsring-based approach that this proposal solves naturally.
Other possible future parameters could include:
None of the editors have full docstring editing support (even when they have rendering support). Again, this is solved by this proposal just by using standard Python syntax and structures instead of a docstring microsyntax.
* `version_added: str`: the version of the library where this parameter, class, method, or function was added.
* `blocks: bool`: this would mark a callable as a synchronous blocking call. This way, editors could use it to warn about using it in async contexts directly.
* `example: Any`: an example value for a parameter, or an example of the usage of the class, function, or method.
The effort required to implement support for this proposal by tools would be minimal compared to that required for alternative docstring-based pseudo-standards, as for this proposal, editors would only need to access an already existing value in their ASTs, instead of writing a parsers for a new string microsyntax.
### Additional Features
In the same way, it can be seen that, in many cases, a new standard that takes advantage of new features and solves several problems from previous methods can be worth having. As is the case with the new `pyproject.toml`, `dataclass_transform`, the new typing pipe/union (`|`) operator, and other cases.
Other possible features that could be added in the future, probably in the form of additional parameters, could include:
### Adoption
* `**kwargs`: some form of documenting additional keyword arguments defined with `**kwargs` instead of using the independent parameters in the function/callable signature.
* Generators: a way to document the types of values yielded by a generator.
* Warnings: a way to document the possible warnings that a function or method can create.
* Section titles: a way to separate groups of parameters into sections (or tags), mainly for documentation purposes, if that was included, it could make sense to also support a way to define the order of sections and their descriptions. Although this might be achievable using the current `extra` parameter for extensions.
As this is a new standard and proposal, it would only make sense if it had interest from the community.
### Duplicated Effort
Fortunately there's already interest from several mainstream libraries from several developers and teams, including FastAPI, Typer, SQLModel, Asyncer (from the author of this proposal), Pydantic, Strawberry, and others, from other teams.
There are some cases where there's already some type of similar alternative in place, sometimes applying only at runtime. For example, for the standard library, there's [`warnings._deprecated()`](https://github.com/python/cpython/blob/e9e63ad8653296c199446d6f7cdad889e492a34e/Lib/warnings.py#L498-L514).
There's also interest and support from documentation tools, like [mkdocstrings](https://github.com/mkdocstrings/mkdocstrings), which added support even for an earlier version of this proposal.
This new proposal would possibly involve some effort duplication for library maintainers adopting it if they already use another custom solution. Nevertheless, this duplication would only affect library authors, final users would not be affected by the duplication and would still benefit from the proposed standard plus any additional benefits provided by the current custom solution.
All the CPython core developers contacted for early feedback (at least 4) have shown interest and support for this proposal.
It's also possible it would be wanted to have some connection from the documented information (e.g. a deprecation) to the runtime exception being raised or the warning generated. This document doesn't propose any solution for that, but it's something that could be considered in the future, for example, tools like mypy could warn if the code uses `doc()` but doesn't document exceptions being raised or warnings being generated.
Editor developers (VS Code and PyCharm) have shown some interest, while showing concerns about the verbosity of the proposal, although not about the implementation (which is what would affect them the most). And they have shown they would consider adding support for this if it were to become an official standard. In that case, they would only need to add support for rendering, as support for editing, which is normally non-existing for other standards, is already there, as they already support editing standard Python syntax.