PT download: check "just use a link" with HEAD request by carylwyatt · Pull Request #192 · hathitrust/babel

carylwyatt · 2026-03-06T20:55:57Z

@aelkiss Question for you regarding the new "just use a link" choice for "simple" downloads that don't use the callback function from imgsrv.

If the link is broken/malformed or imgsrv gives a 403/404/500, there isn't anything we can do with that if it's "just a link." There isn't a way to catch an error or make a suggestion to the user to try again because it's a link, not a fetch.

We probably don't want to assume that if the link is correct that imgsrv will always return what is requested. I think we pursued this path because my implementation was fetching and returning a blob, and we were concerned about memory/performance if the download/images were large.

In my little bit of research on this, I've seen an intermediary option that checks the link via fetch with a HEAD request instead of GET request to just see if the request is valid. In the small bit of testing I did on this locally, the HEAD request is slowwww (like 3-4 seconds), but I think that could maybe work since we're using the download modal to "build" the download. In the current implementation, the "build" is instantaneous since it builds the link for the download button on the fly, but if we need a slight delay to send a HEAD request and check if the request is valid before allowing the download (and setting an error in the modal if we get an error), then the download modal is the perfect place to do that.

Will an extra request to imgsrv add significant problems? Any other gotchas I haven't thought through here?

Just some thoughts on this Friday afternoon. I know this process has already taken some time, but this is my last lingering question.

P.S. This is staged on dev-3 so I could test the download.

…load

carylwyatt · 2026-03-06T20:57:46Z

pt/web/firebird/src/js/components/DownloadPanel/index.svelte

+		return `${request}${newAction}?${params.toString()}`
+	})
+
+  async function checkFetch(url) {


@moseshll already reviewed the rest of the code, so I'll just comment on the parts that are new on this branch. Here's an async function with the HEAD request that also sends the error to matomo and updates some variables that will later make sure the user gets an error message and can't click the download button.

carylwyatt · 2026-03-06T20:59:04Z

pt/web/firebird/src/js/components/DownloadPanel/index.svelte

+    if (isSimpleDownload()) {
+      downloadInProgress = true;
+
+      checkFetch(simpleUrl)


Adding the checkFetch functionality to the submitDownload function fired by the Download button in the form in the sidebar.

carylwyatt · 2026-03-06T21:00:04Z

pt/web/firebird/src/js/components/DownloadPanel/index.svelte

    <div xxstyle="width: 30rem">
      <div>
-        {#if status.percent < 100}
+        {#if simpleDownload && downloadInProgress}


New error handling in the modal.

carylwyatt · 2026-03-06T21:00:27Z

pt/web/firebird/src/js/components/DownloadPanel/index.svelte

+        {/if}
        {#if downloadInProgress}
          <span class="btn btn-primary disabled"> Download </span>
+        {:else if simpleDownload && downloadError} 


Adding a "cancel" button if the download has an error.

carylwyatt · 2026-03-06T21:03:02Z

The UI on this needs some additional polishing, but I didn't want to go too far down that path if this isn't a viable option. Thanks!

aelkiss · 2026-03-10T13:19:18Z

Thoughts on general implementation:

What's the behavior in the interface if the GET returns a 403/404/500? Does the download just silently fail / download an empty file?
How does imgsrv handle a HEAD request? Ideally, it might check some things like if that volume/page exists and the user is authorized to get it?

Trying to think through what we'd learn about failures if we implement the HEAD option:

if imgsrv is in a bad state or generally timing out (although there's no guarantee it would be the same as a subsequent GET request)
if there's some problem with session & authentication (although I think pageturner already tries to compensate for this)
if requests to imgsrv are just getting blocked entirely (throttling, cloudflare, etc)

However:

The HEAD request should tell us if imgsrv thinks the download should succeed, but generally if the user is being presented a download link then at least pageturner thinks it should succeed.
We could certainly end up with cases where the HEAD request succeeds but the GET request ultimately fails.
We want to know about scenarios where the download ultimately doesn't succeed.

I'll think about this some more.

carylwyatt · 2026-03-10T15:44:07Z

What's the behavior in the interface if the GET returns a 403/404/500? Does the download just silently fail / download an empty file?

Before I added the download attribute to the link, the browser would load the page, which, in my case looked like a single word on an unstyled page: Restricted. (I also occasionally got an unstyled page with "Your download is in progress. Please try again later." message from _in_progress_alert. But that's a status 200, I just haven't seen it again since I added the download attribute).

After I added the download attribute to the link, yes, just a silent fail with an empty download (which looks like this in Brave-flavored Chrome).

With the download attribute on the link, it's not immediately obvious that anything went wrong with the download because the user is still presented with the "Save As" dialog (or at least I was presented with that window).

How does imgsrv handle a HEAD request? Ideally, it might check some things like if that volume/page exists and the user is authorized to get it?

For a valid request, it returned a 200 and all the response headers it would return with a GET request.

On this request, I removed the id param, so now I'm getting a 500:

Trying to think through what we'd learn about failures if we implement the HEAD option

For me, it's more about letting the user know that something went wrong, and the "just a link" failures don't make that obvious since something downloaded (kind of). We get complaints from people that they tried to download and "nothing happened" and we can't recreate that on our end. With this HEAD request step, I can update the UI to tell the user that something went wrong and to try again. This also gives me a place to push the matomo tag with the error code and the URL that was attempted for download so we can track when downloads fail. Without this step, I'm sure there's a way to tell from the logs that someone attempted to reach this URL but was denied... right?

We could certainly end up with cases where the HEAD request succeeds but the GET request ultimately fails.

I didn't think of that. But of course that's true! Hmm...

We want to know about scenarios where the download ultimately doesn't succeed.

Yes. And in the best case scenario also let the user know the download didn't succeed.

aelkiss · 2026-03-10T16:31:29Z

OK -- I think it probably makes sense to go ahead and try the HEAD option, probably combined with some work in imgsrv to optimize HEAD requests. I'm looking at some other things in imgsrv anyway so I can take a look at options there.

aelkiss · 2026-03-10T17:54:31Z

For imgsrv I think what we want to do is bail out before generating the actual result if it's a HEAD request -- that is, do all the setup & check access, but don't do the actual generation, for example:

https://github.com/hathitrust/babel/blob/main/imgsrv/lib/SRV/Volume/Base.pm#L129

We probably need to put such a check in every run method?

carylwyatt added 12 commits March 3, 2026 16:19

new tests, debugging download process

aa79dc1

rewrite download using fetch API

5ddfa5d

fix failing tests

dbefdca

add matomo custom data push for error logging

d7c4a1d

add event listener to handle download button state while buiding down…

c3ff217

…load

refactor and update UI; fix failing tests

92b6cfe

add download attribute to link so we don't redirect on error

19e90bf

clean up, attempt to fix edge cases

7545257

update error message UI and text

2ae94f8

fix failing tests

43d10e5

fix download modal title conditional

adccbe8

simple download check link with HEAD request

02e7c8b

carylwyatt requested a review from aelkiss March 6, 2026 20:56

carylwyatt commented Mar 6, 2026

View reviewed changes

carylwyatt marked this pull request as draft March 9, 2026 13:09

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PT download: check "just use a link" with HEAD request#192

PT download: check "just use a link" with HEAD request#192
carylwyatt wants to merge 12 commits intomainfrom
pt-download-head-request

carylwyatt commented Mar 6, 2026

Uh oh!

carylwyatt Mar 6, 2026

Uh oh!

carylwyatt Mar 6, 2026

Uh oh!

carylwyatt Mar 6, 2026

Uh oh!

carylwyatt Mar 6, 2026

Uh oh!

carylwyatt commented Mar 6, 2026

Uh oh!

aelkiss commented Mar 10, 2026 •

edited

Loading

Uh oh!

carylwyatt commented Mar 10, 2026

Uh oh!

aelkiss commented Mar 10, 2026

Uh oh!

aelkiss commented Mar 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

carylwyatt commented Mar 6, 2026

Uh oh!

carylwyatt Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

carylwyatt Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

carylwyatt Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

carylwyatt Mar 6, 2026

Choose a reason for hiding this comment

Uh oh!

carylwyatt commented Mar 6, 2026

Uh oh!

aelkiss commented Mar 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

carylwyatt commented Mar 10, 2026

Uh oh!

aelkiss commented Mar 10, 2026

Uh oh!

aelkiss commented Mar 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

aelkiss commented Mar 10, 2026 •

edited

Loading