
Conversation

@bhawkins
Contributor

@bhawkins bhawkins commented Sep 17, 2025

There are options in the focus.py configuration file for an azimuth apodization window, but the feature was never actually implemented because the azimuth antenna pattern was considered sufficient. However, the feature is useful in off-nominal scenarios such as processing at reduced resolution or imaging high-contrast scenes.

DRAFT STATUS: This PR is currently an ugly hack that only implements a raised cosine window. A Kaiser window would take more work to implement, especially considering that polymorphism is sort of annoying in CUDA. It's also important not to add a bunch of extra calculations to the innermost backprojection loop, though I guess it could be implemented as a post-processing step instead. I'm thinking using a ChebyKernel to represent the window might be a reasonably generic and efficient solution, but that'll have to wait a bit.

Edit: I changed the implementation to use a polynomial approximation to the window. It also skips the window calculation when possible, so there should be negligible impact on runtime in nominal processing (verified in my limited testing but TBC on a bigger scene).
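
To sketch the idea (a standalone example with placeholder coefficients; the actual code uses isce3's ChebyKernel), the window is evaluated as a Chebyshev series via the Clenshaw recurrence:

#include <array>

// Evaluate a Chebyshev-series approximation of the window at x in
// [-0.5, 0.5] using the Clenshaw recurrence. The coefficients below are
// placeholders; in practice they'd be fit to the desired raised-cosine
// (or Kaiser) window.
float evalWindow(float x)
{
    // c = {0.6, 0, -0.4, 0} gives 1 - 0.8*t^2, crudely resembling a
    // raised cosine with a 0.2 pedestal.
    constexpr std::array<float, 4> c = {0.6f, 0.0f, -0.4f, 0.0f};
    const float t = 2.0f * x;  // map [-0.5, 0.5] onto the Chebyshev domain [-1, 1]
    float b1 = 0.0f, b2 = 0.0f;
    for (int n = static_cast<int>(c.size()) - 1; n > 0; --n) {
        const float b0 = c[n] + 2.0f * t * b1 - b2;
        b2 = b1;
        b1 = b0;
    }
    return c[0] + t * b1 - b2;
}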

@bhawkins bhawkins marked this pull request as ready for review September 17, 2025 18:39
@gmgunter gmgunter self-requested a review September 18, 2025 16:11
Member

@gmgunter gmgunter left a comment


Thanks, this is a nice improvement to the backprojection code! A few comments below, but overall it looks good.

Comment on lines +467 to +473
// assume branch prediction works better than an unnecessary multiply
if (window.has_value()) {
const auto x = (k - kstart) * kscale - 0.5f;
batch_sum += window.value()(x) * z;
} else {
batch_sum += z;
}
Member


This implementation seems reasonable to me, but FYI GPUs don't have branch predictors.

https://chatgpt.com/share/68d426b9-befc-8011-9e95-9c34d82a22b1
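
If the divergence ever turned out to matter, the branchless form is exactly the "unnecessary multiply" that comment weighs against, e.g.:

// Sketch: every iteration pays one extra multiply, using a weight of
// 1.0f when no window is configured, so all threads take the same path.
const auto x = (k - kstart) * kscale - 0.5f;
const float w = window.has_value() ? window.value()(x) : 1.0f;
batch_sum += w * z;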

Member


If we want to over-engineer this to avoid a conditional inside the loop, I could imagine a possible approach that templates sumCoherentBatch on the window type, kinda like this:

struct NullWindow {};

template<class DeviceWindowView>
__global__ void sumCoherentBatch(..., DeviceWindowView window, ...)
{
    ...
    for (int k = batch_start; k < batch_stop; ++k) {
        ...
        if constexpr (std::is_same_v<DeviceWindowView, NullWindow>) {
            batch_sum += z;
        } else {
            const auto x = (k - kstart) * kscale - 0.5f;
            batch_sum += window(x) * z;
        }
    }
    ...
}

Then, on the host side, you'd have to call it in a conditional like this:

ErrorCode backproject(..., const std::optional<HostWindow> h_window, ...)
{
    ...
    if (h_window.has_value()) {
        const auto d_window = DeviceWindow(*h_window);
        const auto dv_window = DeviceWindowView(d_window);
        sumCoherentBatch<<<...>>>(..., dv_window, ...);
    } else {
        sumCoherentBatch<<<...>>>(..., NullWindow{}, ...);
    }
    ...
}

Not sure if that'd cause a noticeable improvement in runtime to be worth the additional code complexity.

Comment on lines +469 to +470
const auto x = (k - kstart) * kscale - 0.5f;
batch_sum += window.value()(x) * z;
Member


It's not obvious to me that the support of the window function is necessarily $[-\frac{1}{2}, \frac{1}{2}]$.

Based on reading the implementation of ChebyKernel, it seems like the support domain is $[-\frac{w}{2}, \frac{w}{2}]$, where $w$ is the width of the kernel. Is that correct?

In that case, should we define kscale like this instead?

-    const float kscale = 1.0f / (kstop - kstart - 1);
+    const float w = window.has_value() ? window.value().width() : 1.0f;
+    const float kscale = w / (kstop - kstart - 1);

And then define x as:

-            const auto x = (k - kstart) * kscale - 0.5f;
+            const auto x = (k - kstart) * kscale - window.value().halfwidth();

(Note that we'd have to add the halfwidth() member function to Kernel.)
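
A sketch of what that could look like (assuming Kernel stores a half-width member consistent with the existing width() accessor):

// Hypothetical addition to isce3::core::Kernel (member names assumed):
template<typename T>
class Kernel {
public:
    double width() const { return 2.0 * _halfwidth; }
    double halfwidth() const { return _halfwidth; }  // proposed new accessor
protected:
    double _halfwidth = 0.0;
};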

Contributor Author


Good point. How about making width==1 part of the interface and throwing an exception in the host code if it's not satisfied?
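
Something like this on the host side (exception type and message are just placeholders):

// Sketch of the proposed validation.
if (window.has_value() && window.value().width() != 1.0) {
    throw isce3::except::InvalidArgument(ISCE_SRCINFO(),
            "azimuth apodization window must have width == 1");
}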

Member


Sure, I guess that'd be OK with me if that's your preference.

I'm not sure why it'd be preferable to only support kernel width == 1, though. It doesn't seem like it'd be significantly less work to implement. Are you just thinking that the code will be cleaner if we require width == 1?

DryTroposphereModel dry_tropo_model,
const Rdr2GeoBracketParams& rdr2geo_params,
const Geo2RdrBracketParams& geo2rdr_params, int batch,
const std::optional<HostWindow> window,
Member


General guidance is to pass inputs by const reference unless they're cheap to copy (typically 2-3 words in size).

https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#f16-for-in-parameters-pass-cheaply-copied-types-by-value-and-others-by-reference-to-const

HostWindow stores a std::vector internally so it's potentially expensive to copy. Let's pass it by const reference here.

Suggested change
const std::optional<HostWindow> window,
const std::optional<HostWindow>& window,

Contributor Author


I agree it'd be nice to avoid a copy. I would've thought you'd want an optional reference std::optional<HostWindow&> window rather than a reference to optional, but apparently that's not allowed. I guess I need to spend some time learning about optional semantics (and other C++ nuances).
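
For posterity, a tiny sketch of the options (type names illustrative only):

#include <functional>
#include <optional>

struct HostWindow { /* holds a std::vector of coefficients */ };

// std::optional<HostWindow&> w;  // ill-formed: optional can't hold a
//                                // reference (at least prior to C++26)

// Option 1: reference to the optional, as suggested above (no copy).
void backproject(const std::optional<HostWindow>& window);

// Option 2: an optional reference via reference_wrapper (or a raw pointer).
void backproject(std::optional<std::reference_wrapper<const HostWindow>> window);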

const Geo2RdrBracketParams& geo2rdr_params = {},
int batch = 1024, float* height = nullptr);
int batch = 1024,
const std::optional<isce3::core::ChebyKernel<float>> window = std::nullopt,
Member

@gmgunter gmgunter Sep 25, 2025


Hmm, I'm not seeing the same change in the corresponding .cu file. This overload seems to take a const std::optional<isce3::cuda::core::ChebyKernelView<float>> in that file, which doesn't match the type of the argument here in the header file. Did I miss something?

const std::optional<DeviceWindowView> window,

using DeviceRadarGeometry = isce3::cuda::container::RadarGeometry;

using HostWindow = isce3::core::ChebyKernel<float>;
using DeviceWindow= isce3::cuda::core::ChebyKernel<float>;
Member


Suggested change
using DeviceWindow= isce3::cuda::core::ChebyKernel<float>;
using DeviceWindow = isce3::cuda::core::ChebyKernel<float>;


def get_azimuth_window(cfg: Struct):
    kind, shape = check_window_input(cfg.processing.azimuth_window,
                                     msg="Azimuth window ")
Member


Should we update the log message for the range window to specify "Range window" now that there are two of them?

    win_kind, win_shape = check_window_input(cfg.processing.range_window)

-    win_kind, win_shape = check_window_input(cfg.processing.range_window) 
+    win_kind, win_shape = check_window_input(cfg.processing.range_window, msg="Range window: ")


def get_azimuth_window(cfg: Struct):
    kind, shape = check_window_input(cfg.processing.azimuth_window,
                                     msg="Azimuth window ")
Member


Suggested change
msg="Azimuth window ")
msg="Azimuth window: ")

    return swaths


def get_azimuth_window(cfg: Struct):
Member


Let's annotate the return type as well.

Suggested change
def get_azimuth_window(cfg: Struct):
def get_azimuth_window(cfg: Struct) -> isce3.core.ChebyKernelF32 | None:

Comment on lines -159 to +160
-kind: Kaiser
-shape: 0.0
+kind: Cosine
+shape: 1.0
Member


Aren't these both boxcars? Can you comment on why you changed the default azimuth window here?
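
(For reference, assuming the usual definitions: the Kaiser window $w(x) = I_0\big(\beta \sqrt{1 - (2x)^2}\big) / I_0(\beta)$ reduces to $w(x) = 1$ when $\beta = 0$, and the raised-cosine window with pedestal $p$, $w(x) = \frac{1+p}{2} + \frac{1-p}{2} \cos(2\pi x)$, likewise reduces to $w(x) = 1$ when $p = 1$.)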

@unique
class WindowKind(str, Enum):
    KAISER = "kaiser"
    COSINE = "cosine"
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NBD, but I like to add something like this to str/Enum classes so that they behave a little bit more like Python 3.11 StrEnums.

Suggested change
    COSINE = "cosine"
    COSINE = "cosine"

    def __str__(self) -> str:
        return self.value

Without overriding the __str__ method, you'll get something like this when converting the enum to a string:

>>> class WindowKind(str, Enum):
...     KAISER = "kaiser"
...     COSINE = "cosine"
...
>>> str(WindowKind.KAISER)
'WindowKind.KAISER'

With this change, you'll instead get this:

>>> class WindowKind(str, Enum):
...     KAISER = "kaiser"
...     COSINE = "cosine"
...
...     def __str__(self):
...         return self.value
...
>>> str(WindowKind.KAISER)
'kaiser'

@gmgunter gmgunter added the GPU Enable GPU builds for this PR label Sep 25, 2025
@gmgunter
Member

Just remembered to add the "GPU" label to this PR so that the CI system runs an additional job that builds & tests the CUDA code!
