
update from Gluejar#94

Open
eshellman wants to merge 1130 commits into EbookFoundation:master from Gluejar:master

Conversation

@eshellman


No description provided.

and don't call distinct unless needed
update tests, fix slow OPDS, optimize queryset access
based on work done for doab-check
rdhyee and others added 30 commits June 11, 2026 10:32
Fix #1155: TypeError in Work.publication_date from two-query race
Fix #1156: widget endpoint 500s on unknown/non-numeric/invalid ids
Staging boxes restored from a prod snapshot keep prod's Site row
(domain='unglue.it'), so every emailed link (password-reset, notices,
etc.) points at prod instead of the staging box's own host.

This command updates Site.objects.get_current() (the SITE_ID row) to
the supplied domain (and optional name).  It is idempotent: if the row
already matches, it prints a no-op message and exits cleanly.

Used by the provisioning repo's post-deploy Ansible task to localise
the Site to the box's own server_name on every deploy.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mechanical, no-meaning-change corrections to the FAQ page surfaced by the
CC+Codex copy review on #1165. Typos, a broken URL, proper-noun/acronym
fixes, subject-verb agreement, and site-name casing — nothing factual.

- "that why" → "that's why"; "an non-profit" → "a non-profit"
- "do well be selling" → "by selling"; "the the copyright" → "the copyright"
- "right holder tools" → "rights holder tools"; "some interested" → "some interest"
- broken Facebook URL "facebook/com" → "facebook.com"
- "Wikisources/Hathi Trust/Github" → "Wikisource/HathiTrust/GitHub"
- "page.You'll" → "page. You'll"; mid-sentence "Let" → "let"
- "cannot not be obtained" → "cannot be obtained"; "They does" → "They do"
- "Authors' Guild" → "Authors Guild"; CC license styling "NoDerivatives, NonCommercial"
- site-name casing unglue.it → Unglue.it; "Thanks for Ungluing" → "Thanks-for-Ungluing"

Factual/staleness/voice items (fees, payouts, sender email, campaign
retirement, etc.) are handled separately in the judgment-call PR.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
FAQ copy: objective typo & grammar fixes (#1165)
(cherry picked from commit 0c43233)
… not a hard fail

Codex review of #1164 fix: a fresh/scrubbed DB without a Site row for SITE_ID
would crash the post-deploy task with Site.DoesNotExist. get_or_create makes it
self-healing while staying idempotent on existing rows.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add set_site_domain management command (fix #1164)
…1170 (#1171)

Align master with deployed Django 4.2 (prod-green @ 751f781) — refs #1170
libraryauth defined its AppConfig (with ready() -> `from . import signals`) in
__init__.py and relied on `default_app_config`. Django 4.1 REMOVED
default_app_config, and an AppConfig in __init__.py is not auto-discovered
(Django only scans <app>/apps.py). So since the 2026-06-17 Django 4.2 cutover,
LibraryAuthConfig.ready() never ran, signals.py was never imported, and the
`@receiver(user_activated) handle_same_email_account` (same-email account dedup
on registration activation) was silently disconnected in production.

Fix: move LibraryAuthConfig to libraryauth/apps.py (auto-discovered) and drop the
dead default_app_config. Backward-compatible (valid on 4.2 and 5.2).

Proven empirically (minimal repro): with the config in __init__.py + no apps.py,
ready() does NOT fire on either Django 4.2.21 or 5.2.15; adding apps.py restores
it on both.

Scope: swept all first-party apps — only `core` and `libraryauth` define ready();
core already has apps.py (fine). libraryauth was the sole casualty.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
Per Codex review of #1176: assert LibraryAuthConfig is the active app config
(so ready() runs) and that handle_same_email_account is connected to the
user_activated signal. Guards against the config drifting back out of apps.py.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
Fix libraryauth signals: move AppConfig to apps.py so ready() fires
recommended_user (frontend/views/__init__.py:637) is a QuerySet
(User.objects.filter(...)). Passing it to the exact related lookup
wishlists__user=recommended_user raised, since Django 4.x:
  ValueError: The QuerySet value for an exact lookup must be limited to
  one result using slicing.
Django 1.11 tolerated a QuerySet here, so /lists/recommended has returned
HTTP 500 since the 1.11->4.2 cutover (2026-06-17).

Fix: wishlists__user__in=recommended_user. Behavior-preserving for the
intended single 'unglueit' user, and degrades gracefully (empty result,
not 500) if that user is absent. Valid on both Django 4.2 and 5.2.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
…et-lookup

Fix /lists/recommended 500: use __in for QuerySet-valued lookup (fixes #1179)
…lean_email in django-registration 3.x)

RegistrationFormNoDisposableEmail.clean_email called
super().clean_email(), but django-registration 3.x removed clean_email
from RegistrationForm/RegistrationFormUniqueEmail (unique-email check is
now a field validator added in __init__). So every POST to
/accounts/register/ raised:
  AttributeError: 'super' object has no attribute 'clean_email'
Registration has been fully broken since the 1.11->4.2 cutover.

Fix: read self.cleaned_data['email'] directly. Django's _clean_fields
populates cleaned_data[name] (running field validators, incl. the unique
and confusable-email checks) BEFORE calling clean_<name>, so the disposable
check still runs on the already-validated value. Behavior-preserving;
valid on django-registration 3.4 under both Django 4.2 and 5.2.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
…-super

Fix /accounts/register/ 500: read cleaned_data['email'] (django-registration 3.x) — fixes #1182
…1185)

Acq/Campaign/UserProfile .objects.get(<int>) raised
'TypeError: cannot unpack non-iterable int object' (verified live) on every
call, and the except DoesNotExist could not catch it. These tasks are actively
invoked, so the features failed silently:
- watermark_acq  -> ebook watermarking on borrow/acquire
- process_ebfs   -> rights-holder ebook processing
- ml_subscribe_task -> mailing-list subscribe

Fix: .get(id=...). Pre-existing (not cutover-specific); surfaced by the
post-cutover sweep.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
…(refs #1185)

A1 MsgForm.full_clean: (1) used bare ValidationError, not imported in this module
-> NameError -> 500; AND (2) it raised from an overridden full_clean(), which
propagates out of is_valid() as a 500 even with a proper ValidationError (verified
empirically). Fixed by using self.add_error(None, ...) so the form is marked
invalid cleanly, and by catching ValueError/TypeError so a non-numeric supporter/
work id from POST doesn't crash the int lookup. Triggers on the 'message a
supporter' POST (frontend/views:1722) with missing/invalid supporter or work.

A2 EbookForm.set_provider: read self.cleaned_data['url'] unconditionally; when
clean_url() raises (e.g. duplicate URL) that key is removed -> KeyError -> 500 on
ebook add/edit with a duplicate URL. Use .get('url') and skip provider inference
when url is absent so the field error is reported normally.

Behavior-preserving for valid input; valid on Django 4.2 and 5.2.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
core/tasks.py: positional .get() -> get(id=...) (3 tasks) — refs #1185
frontend/forms: stop 500s on invalid input (MsgForm, EbookForm) — refs #1185
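
The two defensive patterns described for MsgForm and EbookForm can be sketched outside Django on plain values. This is an illustration only, not the code in frontend/forms: `parse_id` and `infer_provider` are hypothetical helper names, and treating the URL's host as the "provider" is an assumption made for the example.

```python
from urllib.parse import urlparse


def parse_id(raw):
    """Return raw as an int, or None when it is missing or non-numeric."""
    try:
        return int(raw)
    except (ValueError, TypeError):
        # ValueError: non-numeric string; TypeError: None or another non-number.
        return None


def infer_provider(cleaned_data):
    """Return the URL's host, or None when 'url' is absent from cleaned_data."""
    # .get() instead of ['url']: the key is missing once the field's
    # own validation has failed, so indexing would raise KeyError.
    url = cleaned_data.get("url")
    if not url:
        return None
    return urlparse(url).netloc or None
```

In a real form, a `None` result from `parse_id` would be the point at which to call `self.add_error(None, ...)` rather than raise, so that `is_valid()` returns False instead of propagating an exception.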
django-registration 3.4's activation backend renders
django_registration/activation_email_body.txt, but the cutover wrapper commit
(ce1faa4) created the body under the OLD name activation_email.txt, so the
body template never resolved -> TemplateDoesNotExist -> every registration POST
500s (broken since the Jun 17 cutover). The subject wrapper was named correctly.

Add the body wrapper mirroring the working subject wrapper; reuses the intact
canonical body registration/activation_email.txt. Fixes #1190.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
Django's default SafeExceptionReporterFilter cleanses names matching
API|TOKEN|KEY|SECRET|PASS|SIGNATURE — which lets STRIPE_SK (live Stripe secret)
and EMAIL_HOST_USER (SES SMTP username / AWS access-key id) through in the
settings dump emailed on every 500. Add a custom filter broadening the pattern
and wire it via DEFAULT_EXCEPTION_REPORTER_FILTER in common.py (inherited by all
envs incl. prod via prod.py.j2's `from .common import *`).

Refs EbookFoundation/security-private#22.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
Mask STRIPE_SK / EMAIL_HOST_USER in error-report settings dumps
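
A small standard-library sketch of why the two names slip through. The default pattern is the one quoted in the commit message above; the broadened pattern and the `leaked_names` helper are hypothetical and are not the filter added in this commit.

```python
import re

# Default pattern as quoted in the commit message (matched case-insensitively).
DEFAULT_HIDDEN = re.compile("API|TOKEN|KEY|SECRET|PASS|SIGNATURE", flags=re.I)

# Hypothetical broadened pattern; the extra alternatives are illustrative only.
BROADENED_HIDDEN = re.compile(
    "API|TOKEN|KEY|SECRET|PASS|SIGNATURE|STRIPE|_SK$|HOST_USER", flags=re.I
)


def leaked_names(names, pattern):
    """Return the setting names that the pattern would NOT cleanse."""
    return [name for name in names if not pattern.search(name)]


SETTINGS = ["STRIPE_SK", "EMAIL_HOST_USER", "EMAIL_HOST_PASSWORD", "SECRET_KEY"]
```

In Django, a pattern like this would typically be assigned to the `hidden_settings` attribute of a `SafeExceptionReporterFilter` subclass, which is then selected through `DEFAULT_EXCEPTION_REPORTER_FILTER`.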
Fix registration 500: add missing django_registration/activation_email_body.txt (#1190)
The mechanical, no-judgment half of the FAQ copy review (#1165), re-applied
onto current master. Split out of the stale PR #1168 (63 commits behind);
all campaign-content removals + factual Q&A deletions are deferred to the
deliberate campaign-retirement work in #1195.

Why split: ebookfiles.html is {% include %}'d into terms.html (the legal
Terms of Service), and the ToS still defines Pledge / Buy-to-Unglue as
binding campaign types. Removing the B2U/Pledge file-requirement lines would
silently alter contractual criteria, so that change belongs with #1195 where
terms.html is updated coherently — not in a copyedit PR.

This PR therefore touches ONLY voice/grammar/casing, no structure, no
content removal:
- faq.html: "free … in the sense of freedom"; "stucked-ness" -> "constraints"
- faq_t4u.html: "Free Ebooks"; "no charge for creators"; "charges";
  "Unglue.it" casing (x2)
- ebookfiles.html: "Thanks-for-Ungluing" + "rights holder" spelling
- libraries.html: tightened "Starting an account is free"

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
FAQ copy: safe copyedits (voice/grammar/casing) — re #1165
Two bugs, both introduced in 03e71be (2024-10-28, 18+ months ago):
1. Missing `if not success:` guard -- logged an ERROR on every single
   cover-thumbnail attempt, success or failure.
2. Format string had two %s placeholders but only one arg (url) was
   ever passed, so Python's own logging module threw
   "TypeError: not enough arguments for format string" on every call
   -- a "--- Logging error ---" traceback cascade, not the intended
   error message.

Found while investigating why /var/log/celery/w1.log had grown to
~800MB: this single bug accounts for 93,914+ occurrences in just the
last few days. Confirmed via git blame + live log inspection, not
guessed.

Codex-reviewed: LGTM. covers.make_cover_thumbnail (core/covers.py)
returns a strict bool, so `if not success:` is the correct guard.
Fix make_cover_thumbnail: unconditional error log spamming celery logs
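
A minimal sketch of the corrected logging pattern, using only the standard library. The `make_cover_thumbnail` stub, its two-argument signature, the logger name, and the message wording are all assumptions for illustration; the real function lives in core/covers.py and is only known here to return a bool.

```python
import logging

logger = logging.getLogger("covers_sketch")  # hypothetical logger name


def make_cover_thumbnail(url, edition_id):
    """Stand-in stub: pretend thumbnailing fails only for an empty URL."""
    return bool(url)


def log_thumbnail_result(url, edition_id):
    """Log an error only on failure, with one argument per %s placeholder."""
    success = make_cover_thumbnail(url, edition_id)
    if not success:
        # Two placeholders, two arguments. With a missing argument the
        # logging module prints a "--- Logging error ---" traceback
        # instead of the intended message.
        logger.error(
            "unable to make cover thumbnail for %s (edition %s)", url, edition_id
        )
    return success
```

Passing the values as arguments, rather than pre-formatting the string, keeps formatting lazy: the message is only built if a handler actually emits the record.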