update from Gluejar#94
Open
eshellman wants to merge 1130 commits into
update handling of DOAB covers
fix all the null doab covers!
and don't call distinct unless needed
update tests, fix slow OPDS, optimize queryset access
muse, ubiquity hosts
tecnum, update de gruyter
springer, scielo and cmp
Maintenance 2024
LT key, several omp sites
one more omp site
based on work done for doab-check
Fix #1155: TypeError in Work.publication_date from two-query race
Fix #1156: widget endpoint 500s on unknown/non-numeric/invalid ids
Mechanical, no-meaning-change corrections to the FAQ page surfaced by the CC+Codex copy review on #1165. Typos, a broken URL, proper-noun/acronym fixes, subject-verb agreement, and site-name casing; nothing factual.
- "that why" → "that's why"; "an non-profit" → "a non-profit"
- "do well be selling" → "by selling"; "the the copyright" → "the copyright"
- "right holder tools" → "rights holder tools"; "some interested" → "some interest"
- broken Facebook URL "facebook/com" → "facebook.com"
- "Wikisources/Hathi Trust/Github" → "Wikisource/HathiTrust/GitHub"
- "page.You'll" → "page. You'll"; mid-sentence "Let" → "let"
- "cannot not be obtained" → "cannot be obtained"; "They does" → "They do"
- "Authors' Guild" → "Authors Guild"; CC license styling "NoDerivatives, NonCommercial"
- site-name casing unglue.it → Unglue.it; "Thanks for Ungluing" → "Thanks-for-Ungluing"
Factual/staleness/voice items (fees, payouts, sender email, campaign retirement, etc.) are handled separately in the judgment-call PR.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
FAQ copy: objective typo & grammar fixes (#1165)
Mechanical, no-meaning-change corrections to the FAQ page surfaced by the CC+Codex copy review on #1165. Typos, a broken URL, proper-noun/acronym fixes, subject-verb agreement, and site-name casing; nothing factual.
- "that why" → "that's why"; "an non-profit" → "a non-profit"
- "do well be selling" → "by selling"; "the the copyright" → "the copyright"
- "right holder tools" → "rights holder tools"; "some interested" → "some interest"
- broken Facebook URL "facebook/com" → "facebook.com"
- "Wikisources/Hathi Trust/Github" → "Wikisource/HathiTrust/GitHub"
- "page.You'll" → "page. You'll"; mid-sentence "Let" → "let"
- "cannot not be obtained" → "cannot be obtained"; "They does" → "They do"
- "Authors' Guild" → "Authors Guild"; CC license styling "NoDerivatives, NonCommercial"
- site-name casing unglue.it → Unglue.it; "Thanks for Ungluing" → "Thanks-for-Ungluing"
Factual/staleness/voice items (fees, payouts, sender email, campaign retirement, etc.) are handled separately in the judgment-call PR.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
(cherry picked from commit 0c43233)
Add set_site_domain management command (fix #1164)
Staging boxes restored from a prod snapshot keep prod's Site row (domain='unglue.it'), so every emailed link (password-reset, notices, etc.) points at prod instead of the staging box's own host. This command updates Site.objects.get_current() (the SITE_ID row) to the supplied domain (and optional name). It is idempotent: if the row already matches, it prints a no-op message and exits cleanly. Used by the provisioning repo's post-deploy Ansible task to localise the Site to the box's own server_name on every deploy. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… not a hard fail Codex review of #1164 fix: a fresh/scrubbed DB without a Site row for SITE_ID would crash the post-deploy task with Site.DoesNotExist. get_or_create makes it self-healing while staying idempotent on existing rows. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
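The idempotent, self-healing behaviour described for set_site_domain can be sketched without Django. This is only an illustration of the control flow, not the command itself: the `Site` dataclass and the `sites` dict are hypothetical in-memory stand-ins for the Site table, and the returned strings stand in for the messages the command prints.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical in-memory stand-in for a django.contrib.sites Site row."""
    domain: str
    name: str


def set_site_domain(sites, site_id, domain, name=None):
    """Point the row for site_id at domain and report what happened."""
    site = sites.get(site_id)
    if site is None:
        # Self-healing path: a fresh or scrubbed table has no row yet,
        # so create one instead of failing (the get_or_create idea).
        sites[site_id] = Site(domain=domain, name=name or domain)
        return "created"
    wanted_name = name if name is not None else site.name
    if site.domain == domain and site.name == wanted_name:
        # Idempotent path: nothing to change, so report a no-op.
        return "unchanged"
    site.domain = domain
    site.name = wanted_name
    return "updated"


sites = {1: Site(domain="unglue.it", name="unglue.it")}
print(set_site_domain(sites, 1, "staging.example.org"))  # updated
print(set_site_domain(sites, 1, "staging.example.org"))  # unchanged
print(set_site_domain({}, 1, "staging.example.org"))     # created
```

Running the same call twice is what a post-deploy task does in practice, which is why the second call has to be a clean no-op rather than an error.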
libraryauth defined its AppConfig (with ready() -> `from . import signals`) in __init__.py and relied on `default_app_config`. Django 4.1 REMOVED default_app_config, and an AppConfig in __init__.py is not auto-discovered (Django only scans <app>/apps.py). So since the 2026-06-17 Django 4.2 cutover, LibraryAuthConfig.ready() never ran, signals.py was never imported, and the `@receiver(user_activated) handle_same_email_account` (same-email account dedup on registration activation) was silently disconnected in production. Fix: move LibraryAuthConfig to libraryauth/apps.py (auto-discovered) and drop the dead default_app_config. Backward-compatible (valid on 4.2 and 5.2). Proven empirically (minimal repro): with the config in __init__.py + no apps.py, ready() does NOT fire on either Django 4.2.21 or 5.2.15; adding apps.py restores it on both. Scope: swept all first-party apps — only `core` and `libraryauth` define ready(); core already has apps.py (fine). libraryauth was the sole casualty. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
Per Codex review of #1176: assert LibraryAuthConfig is the active app config (so ready() runs) and that handle_same_email_account is connected to the user_activated signal. Guards against the config drifting back out of apps.py. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
Fix libraryauth signals: move AppConfig to apps.py so ready() fires
recommended_user (frontend/views/__init__.py:637) is a QuerySet (User.objects.filter(...)). Passing it to the exact related lookup wishlists__user=recommended_user raised, since Django 4.x: ValueError: The QuerySet value for an exact lookup must be limited to one result using slicing. Django 1.11 tolerated a QuerySet here, so /lists/recommended has returned HTTP 500 since the 1.11->4.2 cutover (2026-06-17). Fix: wishlists__user__in=recommended_user. Behavior-preserving for the intended single 'unglueit' user, and degrades gracefully (empty result, not 500) if that user is absent. Valid on both Django 4.2 and 5.2. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
…et-lookup Fix /lists/recommended 500: use __in for QuerySet-valued lookup (fixes #1179)
…lean_email in django-registration 3.x) RegistrationFormNoDisposableEmail.clean_email called super().clean_email(), but django-registration 3.x removed clean_email from RegistrationForm/RegistrationFormUniqueEmail (unique-email check is now a field validator added in __init__). So every POST to /accounts/register/ raised: AttributeError: 'super' object has no attribute 'clean_email' Registration has been fully broken since the 1.11->4.2 cutover. Fix: read self.cleaned_data['email'] directly. Django's _clean_fields populates cleaned_data[name] (running field validators, incl. the unique and confusable-email checks) BEFORE calling clean_<name>, so the disposable check still runs on the already-validated value. Behavior-preserving; valid on django-registration 3.4 under both Django 4.2 and 5.2. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
…-super Fix /accounts/register/ 500: read cleaned_data['email'] (django-registration 3.x) — fixes #1182
…1185) Acq/Campaign/UserProfile .objects.get(<int>) raised 'TypeError: cannot unpack non-iterable int object' (verified live) on every call, and the except DoesNotExist could not catch it. These tasks are actively invoked, so the features failed silently: - watermark_acq -> ebook watermarking on borrow/acquire - process_ebfs -> rights-holder ebook processing - ml_subscribe_task -> mailing-list subscribe Fix: .get(id=...). Pre-existing (not cutover-specific); surfaced by the post-cutover sweep. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
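Why the `except DoesNotExist` handler could not help is easy to reproduce outside Django. The sketch below is a toy stand-in, not the ORM: `get`, `ROWS`, `load`, and the local `DoesNotExist` class are all hypothetical, and the only assumption carried over is that positional lookup arguments get unpacked as (field, value) pairs.

```python
class DoesNotExist(Exception):
    """Stand-in for a model's DoesNotExist exception."""


ROWS = {7: "acq-7"}  # hypothetical table keyed by id


def get(*args, **kwargs):
    """Toy lookup: positional args are unpacked as (field, value) pairs."""
    conditions = dict(kwargs)
    for expr in args:
        field, value = expr  # a bare int cannot be unpacked: TypeError
        conditions[field] = value
    try:
        return ROWS[conditions["id"]]
    except KeyError:
        raise DoesNotExist(conditions) from None


def load(lookup):
    """Mimics a task body that only anticipates a missing row."""
    try:
        return lookup()
    except DoesNotExist:
        return None


print(load(lambda: get(id=7)))   # acq-7
print(load(lambda: get(id=99)))  # None
try:
    load(lambda: get(7))         # positional int, as in the buggy tasks
except TypeError as exc:
    print("escaped the handler:", type(exc).__name__)
```

The keyword form reaches the "missing row" path the handler was written for; the positional form fails before any lookup happens, so the exception type is one the handler never lists.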
…(refs #1185)
A1 MsgForm.full_clean: (1) used bare ValidationError, not imported in this module -> NameError -> 500; AND (2) it raised from an overridden full_clean(), which propagates out of is_valid() as a 500 even with a proper ValidationError (verified empirically). Fixed by using self.add_error(None, ...) so the form is marked invalid cleanly, and by catching ValueError/TypeError so a non-numeric supporter/work id from POST doesn't crash the int lookup. Triggers on the 'message a supporter' POST (frontend/views:1722) with missing/invalid supporter or work.
A2 EbookForm.set_provider: read self.cleaned_data['url'] unconditionally; when clean_url() raises (e.g. duplicate URL) that key is removed -> KeyError -> 500 on ebook add/edit with a duplicate URL. Use .get('url') and skip provider inference when url is absent so the field error is reported normally.
Behavior-preserving for valid input; valid on Django 4.2 and 5.2.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
core/tasks.py: positional .get() -> get(id=...) (3 tasks) — refs #1185
frontend/forms: stop 500s on invalid input (MsgForm, EbookForm) — refs #1185
django-registration 3.4's activation backend renders django_registration/activation_email_body.txt, but the cutover wrapper commit (ce1faa4) created the body under the OLD name activation_email.txt, so the body template never resolved -> TemplateDoesNotExist -> every registration POST 500s (broken since the Jun 17 cutover). The subject wrapper was named correctly. Add the body wrapper mirroring the working subject wrapper; reuses the intact canonical body registration/activation_email.txt. Fixes #1190. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
Django's default SafeExceptionReporterFilter cleanses names matching API|TOKEN|KEY|SECRET|PASS|SIGNATURE — which lets STRIPE_SK (live Stripe secret) and EMAIL_HOST_USER (SES SMTP username / AWS access-key id) through in the settings dump emailed on every 500. Add a custom filter broadening the pattern and wire it via DEFAULT_EXCEPTION_REPORTER_FILTER in common.py (inherited by all envs incl. prod via prod.py.j2's `from .common import *`). Refs EbookFoundation/security-private#22. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
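The gap in the default pattern can be checked with the standard library alone. In this sketch `DEFAULT_PATTERN` copies the pattern quoted above, while `BROAD_PATTERN` and the `masked` helper are hypothetical; the filter actually added in this commit may use a different expression.

```python
import re

# Pattern as quoted in the commit message, matched case-insensitively.
DEFAULT_PATTERN = re.compile(r"API|TOKEN|KEY|SECRET|PASS|SIGNATURE", re.IGNORECASE)

# Hypothetical broadened pattern: also catch a trailing "_SK" and any "USER".
BROAD_PATTERN = re.compile(
    r"API|TOKEN|KEY|SECRET|PASS|SIGNATURE|_SK$|USER", re.IGNORECASE
)


def masked(pattern, names):
    """Return the setting names whose values the pattern would cleanse."""
    return [name for name in names if pattern.search(name)]


names = ["STRIPE_SK", "EMAIL_HOST_USER", "EMAIL_HOST_PASSWORD", "DEBUG"]
print(masked(DEFAULT_PATTERN, names))
# ['EMAIL_HOST_PASSWORD']
print(masked(BROAD_PATTERN, names))
# ['STRIPE_SK', 'EMAIL_HOST_USER', 'EMAIL_HOST_PASSWORD']
```

A broadened pattern trades some over-masking (any setting containing "USER") for not leaking credentials, which is usually the right side to err on in an emailed error report.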
Mask STRIPE_SK / EMAIL_HOST_USER in error-report settings dumps
Fix registration 500: add missing django_registration/activation_email_body.txt (#1190)
The mechanical, no-judgment half of the FAQ copy review (#1165), re-applied onto current master. Split out of the stale PR #1168 (63 commits behind); all campaign-content removals + factual Q&A deletions are deferred to the deliberate campaign-retirement work in #1195.
Why split: ebookfiles.html is {% include %}'d into terms.html (the legal Terms of Service), and the ToS still defines Pledge / Buy-to-Unglue as binding campaign types. Removing the B2U/Pledge file-requirement lines would silently alter contractual criteria, so that change belongs with #1195 where terms.html is updated coherently, not in a copyedit PR.
This PR therefore touches ONLY voice/grammar/casing, no structure, no content removal:
- faq.html: "free … in the sense of freedom"; "stucked-ness" -> "constraints"
- faq_t4u.html: "Free Ebooks"; "no charge for creators"; "charges"; "Unglue.it" casing (x2)
- ebookfiles.html: "Thanks-for-Ungluing" + "rights holder" spelling
- libraries.html: tightened "Starting an account is free"
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011dumTGwdDfpJJMGC4ThisJ
FAQ copy: safe copyedits (voice/grammar/casing) — re #1165
Two bugs, both introduced in 03e71be (2024-10-28, 18+ months ago):
1. Missing `if not success:` guard -- logged an ERROR on every single cover-thumbnail attempt, success or failure.
2. Format string had two %s placeholders but only one arg (url) was ever passed, so Python's own logging module threw "TypeError: not enough arguments for format string" on every call -- a "--- Logging error ---" traceback cascade, not the intended error message.
Found while investigating why /var/log/celery/w1.log had grown to ~800MB: this single bug accounts for 93,914+ occurrences in just the last few days. Confirmed via git blame + live log inspection, not guessed.
Codex-reviewed: LGTM. covers.make_cover_thumbnail (core/covers.py) returns a strict bool, so `if not success:` is the correct guard.
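Both fixes can be shown in a few lines of standard-library logging. This is a sketch under assumptions, not the patched task: the `make_cover_thumbnail` stand-in, the `make_thumbnail_task` wrapper, the `work_id` argument, and the message text are all hypothetical; only the guard and the rule that placeholders must match arguments come from the commit message.

```python
import io
import logging

logger = logging.getLogger("covers_sketch")


def make_cover_thumbnail(url):
    """Hypothetical stand-in that returns a strict bool: only https succeeds."""
    return url.startswith("https://")


def make_thumbnail_task(url, work_id):
    success = make_cover_thumbnail(url)
    if not success:
        # Log only on failure, and pass one argument per %s placeholder.
        logger.error("thumbnail failed for work %s: %s", work_id, url)
    return success


# Capture log output in memory so the effect is visible.
stream = io.StringIO()
logger.addHandler(logging.StreamHandler(stream))
logger.setLevel(logging.ERROR)
logger.propagate = False

make_thumbnail_task("https://example.org/a.jpg", 1)  # succeeds, logs nothing
make_thumbnail_task("http://example.org/b.jpg", 2)   # fails, logs one line
print(stream.getvalue(), end="")
# thumbnail failed for work 2: http://example.org/b.jpg
```

Passing the values as arguments rather than pre-formatting the string keeps formatting lazy, and a placeholder/argument mismatch then shows up as a logging error instead of the intended message, which is the symptom described above.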
Fix make_cover_thumbnail: unconditional error log spamming celery logs
No description provided.