elninotech/discourse-reader

discourse-reader

A typed, read-only Python client for the Discourse forum API.

Respect the forum's Terms of Service

Always check the Terms of Service of any forum you use this client against, especially if you don't own it. Discourse's default ToS template explicitly prohibits automated access except for public search engine indexing:

You may not automate access to the forum, or monitor the forum, such as with a web crawler, browser plug-in or add-on, or other computer program that is not a web browser. You may crawl the forum to index it for a publicly available search engine, if you run one.

Most sites likely keep this default, so unauthenticated automated reading may not be permitted. Authenticating with an API key may be preferable, though there are currently no plans to add that functionality to this package. If you have an API key, pydiscourse is an alternative: it offers more functionality, but it is not wrapped with Pydantic models.

Install

pip install discourse-reader

Quick start

from discourse_reader import DiscourseClient

client = DiscourseClient("https://meta.discourse.org")

# Browse categories
for cat in client.categories():
    print(f"{cat.name}: {cat.topic_count} topics")

# Get a topic with all its posts
topic = client.topics.get(12345)
print(topic.title)
print(topic.opening_post.cooked)       # the original post (HTML)
print(topic.accepted_answer)           # accepted answer or None
for reply in topic.posts.replies():
    print(reply.username, reply.cooked)

API

Site-level (flat on client)

client.about()                         # About
client.statistics()                    # SiteStatistics
client.categories()                    # list[Category]
client.tags()                          # list[TagDetail]
client.user("username")                # User
client.search("query", limit=50)       # Iterator[SearchPost]

Topics (client.topics)

client.topics.latest(limit=100)        # Iterator[Topic]
client.topics.top(period="monthly")    # Iterator[Topic]
client.topics.by_category(cat)         # Iterator[Topic]  (pass a Category)
client.topics.by_tag("tag-name")       # Iterator[Topic]
client.topics.get(topic_id)            # TopicResult

All listing methods return lazy iterators and accept an optional limit.
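To illustrate what "lazy with an optional limit" means in practice, here is a minimal sketch. It is not the package's implementation: `iter_pages`, `limited`, and `fake_fetch_page` are hypothetical names, and the fake fetcher stands in for an HTTP request. The point is that a page is requested only when the consumer actually needs an item from it.

```python
from itertools import islice
from typing import Callable, Iterator, List, Optional


def iter_pages(fetch_page: Callable[[int], List[dict]]) -> Iterator[dict]:
    """Yield items page by page, requesting the next page only when needed."""
    page = 0
    while True:
        items = fetch_page(page)
        if not items:  # an empty page marks the end of the listing
            return
        yield from items
        page += 1


def limited(items: Iterator[dict], limit: Optional[int] = None) -> Iterator[dict]:
    """Stop after `limit` items; None means no cap."""
    return islice(items, limit)


# Hypothetical stand-in for an HTTP call: three pages of two topics each.
calls: List[int] = []


def fake_fetch_page(page: int) -> List[dict]:
    calls.append(page)  # record which pages were actually requested
    data = [
        [{"id": 1}, {"id": 2}],
        [{"id": 3}, {"id": 4}],
        [{"id": 5}, {"id": 6}],
    ]
    return data[page] if page < len(data) else []


first_three = list(limited(iter_pages(fake_fetch_page), limit=3))
print([topic["id"] for topic in first_three])
print(calls)  # the third page is never requested
```

With a limit of 3, only the first two pages are fetched. A small limit therefore also keeps the number of requests against the forum small.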

TopicResult

topics.get() returns a TopicResult, which delegates attributes such as title, category_id, and views to the underlying TopicDetail.

topic = client.topics.get(12345)
topic.title                            # str (delegated to TopicDetail)
topic.opening_post                     # Post  -- the original post
topic.accepted_answer                  # Post | None
topic.detail                           # raw TopicDetail model
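One common way to get this kind of delegation in Python is a `__getattr__` fallback on the wrapper. The sketch below shows the pattern only; `Detail` and `Result` are hypothetical stand-ins, and the package may implement TopicResult differently.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Detail:
    """Hypothetical stand-in for the TopicDetail model."""
    title: str
    views: int


class Result:
    """Hypothetical stand-in for TopicResult: wraps a Detail and adds helpers."""

    def __init__(self, detail: Detail) -> None:
        self.detail = detail

    def __getattr__(self, name: str) -> Any:
        # __getattr__ runs only when normal lookup fails, so attributes and
        # methods defined on Result itself never reach this fallback.
        if name == "detail":
            # Guard against infinite recursion if `detail` was never set.
            raise AttributeError(name)
        return getattr(self.detail, name)


result = Result(Detail(title="Welcome", views=42))
print(result.title)         # read through the wrapper
print(result.detail.views)  # read from the wrapped model directly
```

The trade-off of this pattern is that static type checkers cannot see the delegated attributes, which is one reason a wrapper might also expose the raw model, as `topic.detail` does here.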

Posts (topic.posts)

Discourse delivers ~20 posts with the topic detail. The rest are fetched lazily in batches when you iterate.

topic.posts.all()                      # Iterator[Post] -- everything
topic.posts.replies()                  # Iterator[Post] -- everything except OP
len(topic.posts)                       # total post count
for post in topic.posts:               # same as .all()
    ...
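The behaviour described above can be sketched as follows, assuming the full list of post IDs is known after the first response while only the first batch of post bodies is loaded. `LazyPosts` and `fake_fetch_batch` are hypothetical names used for illustration, not the package's internals.

```python
from typing import Callable, Iterator, List


class LazyPosts:
    """Hypothetical stand-in for topic.posts."""

    def __init__(
        self,
        post_ids: List[int],
        loaded: List[dict],
        fetch_batch: Callable[[List[int]], List[dict]],
        batch_size: int = 20,
    ) -> None:
        self._post_ids = post_ids        # every post ID in the topic, in order
        self._loaded = loaded            # posts that arrived with the topic detail
        self._fetch_batch = fetch_batch  # callable: list of IDs -> list of posts
        self._batch_size = batch_size

    def __len__(self) -> int:
        return len(self._post_ids)       # known up front, no extra requests

    def __iter__(self) -> Iterator[dict]:
        yield from self._loaded
        remaining = self._post_ids[len(self._loaded):]
        for start in range(0, len(remaining), self._batch_size):
            batch_ids = remaining[start:start + self._batch_size]
            yield from self._fetch_batch(batch_ids)


# Demo: a 45-post topic whose first 20 posts are already loaded.
batch_sizes: List[int] = []


def fake_fetch_batch(ids: List[int]) -> List[dict]:
    batch_sizes.append(len(ids))  # record each simulated request
    return [{"id": i} for i in ids]


posts = LazyPosts(
    post_ids=list(range(1, 46)),
    loaded=[{"id": i} for i in range(1, 21)],
    fetch_batch=fake_fetch_batch,
)
total = len(posts)                      # no request needed
first = next(iter(posts))               # served from the preloaded posts
requests_before_full_read = list(batch_sizes)
all_ids = [post["id"] for post in posts]
```

In this sketch, reading only the first posts costs no extra requests, while a full iteration issues one simulated request per batch of 20.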

Single post

client.posts.get(post_id)             # Post by global ID

Extra fields

All models use extra="allow": core fields are typed, and plugin fields land in model_extra:

topic.detail.model_extra.get("accepted_answer")   # solved plugin data
post.model_extra.get("accepted_answer")            # per-post flag
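For readers unfamiliar with this Pydantic v2 setting, here is a minimal standalone sketch. The `Post` model below is a hypothetical, trimmed-down model defined only for this example, not the package's own, and the payload is made up.

```python
from pydantic import BaseModel, ConfigDict


class Post(BaseModel):
    """Hypothetical model with two typed fields; unknown keys are kept."""
    model_config = ConfigDict(extra="allow")

    id: int
    username: str


# Made-up payload: "accepted_answer" is not declared on the model.
payload = {"id": 7, "username": "sam", "accepted_answer": True}
post = Post.model_validate(payload)

print(post.username)     # typed core field
print(post.model_extra)  # undeclared keys are collected here
is_accepted = post.model_extra.get("accepted_answer", False)
```

Because extras are untyped, reading them with `.get()` and a default is a reasonable habit: the key is absent on forums that do not run the plugin that adds it.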

Rate limiting

The client defaults to 4 requests per second, configurable via the constructor. On an HTTP 429 response it retries automatically, honoring the Retry-After header.

client = DiscourseClient("https://...", requests_per_second=2)    # slower
client = DiscourseClient("https://...", requests_per_second=None)  # no limit
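As a rough picture of what client-side pacing and a Retry-After wait involve, here is a minimal sketch. It is not the package's implementation: `MinIntervalLimiter` and `retry_delay` are hypothetical names, and the clock and sleep functions are injectable only so the demo runs without actually sleeping.

```python
import time
from typing import Callable, List, Optional


class MinIntervalLimiter:
    """Space calls at least 1/requests_per_second apart; None disables pacing."""

    def __init__(
        self,
        requests_per_second: Optional[float],
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._interval = 1.0 / requests_per_second if requests_per_second else 0.0
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> None:
        """Block until the next request is allowed, then reserve a slot."""
        now = self._clock()
        delay = self._next_allowed - now
        if delay > 0:
            self._sleep(delay)
            now += delay
        self._next_allowed = now + self._interval


def retry_delay(retry_after: Optional[str], default: float = 1.0) -> float:
    """Seconds to wait after a 429, for a Retry-After header given in seconds.

    A missing header or an HTTP-date value falls back to `default`.
    """
    try:
        return max(0.0, float(retry_after))
    except (TypeError, ValueError):
        return default


# Demo with a fake clock so nothing actually sleeps.
fake_now = [100.0]
slept: List[float] = []


def fake_sleep(seconds: float) -> None:
    slept.append(seconds)
    fake_now[0] += seconds


limiter = MinIntervalLimiter(4, clock=lambda: fake_now[0], sleep=fake_sleep)
for _ in range(3):
    limiter.wait()  # call this before each request
print(slept)        # the first call passes immediately, later ones are spaced
```

At 4 requests per second, consecutive calls are spaced 0.25 seconds apart. Lowering the rate is the considerate choice when reading a forum you do not operate.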

Development

uv sync
uv run pre-commit install
uv run pre-commit run --all-files
uv run pytest
