feat: Support connector_builder_project_id in get_custom_source_definition#830
feat: Support connector_builder_project_id in get_custom_source_definition#830aaronsteers wants to merge 6 commits intomainfrom
Conversation
Original prompt from AJ Steers |
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
📝 WalkthroughWalkthroughThe CloudWorkspace.get_custom_source_definition API now requires exactly one identifier: either Changes
Sequence DiagramsequenceDiagram
participant Caller
participant CloudWorkspace
participant API_Util
participant ConfigAPI
Caller->>CloudWorkspace: get_custom_source_definition(definition_id?, definition_type, connector_builder_project_id?)
alt invalid parameters
CloudWorkspace->>CloudWorkspace: validate exactly one identifier
CloudWorkspace-->>Caller: PyAirbyteInputError
else connector_builder_project_id with non-yaml
CloudWorkspace-->>Caller: PyAirbyteInputError
else YAML + connector_builder_project_id
CloudWorkspace->>API_Util: get_definition_id_for_connector_builder_project(workspace_id, builder_project_id)
API_Util->>ConfigAPI: POST /connector_builder_projects/get_with_manifest
ConfigAPI-->>API_Util: { sourceDefinitionId: id | null }
API_Util-->>CloudWorkspace: sourceDefinitionId or AirbyteError
alt id found
CloudWorkspace->>ConfigAPI: get_custom_yaml_source_definition(resolved_id)
ConfigAPI-->>CloudWorkspace: YAML manifest
CloudWorkspace-->>Caller: CustomCloudSourceDefinition
else not found / error
CloudWorkspace-->>Caller: AirbyteError
end
else YAML + definition_id
CloudWorkspace->>ConfigAPI: get_custom_yaml_source_definition(definition_id)
ConfigAPI-->>CloudWorkspace: YAML manifest
CloudWorkspace-->>Caller: CustomCloudSourceDefinition
else Docker
CloudWorkspace-->>Caller: NotImplementedError
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related PRs
Would you like me to suggest combining the new validation tests into a single parametrized test to reduce duplication, wdyt? Pre-merge checks and finishing touches✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
👋 Greetings, Airbyte Team Member!Here are some helpful tips and reminders for your convenience. Testing This PyAirbyte VersionYou can test this version of PyAirbyte using the following: # Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1760655651-custom-source-definition-builder-project-id' pyairbyte --help
# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1760655651-custom-source-definition-builder-project-id'Helpful ResourcesPR Slash CommandsAirbyte Maintainers can execute the following slash commands on your PR:
Community SupportQuestions? Join the #pyairbyte channel in our Slack workspace. |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
airbyte/cloud/workspaces.py (1)
637-665: Consider documenting the two-call overhead for builder_project_id lookups.The implementation resolves
connector_builder_project_idtodefinition_id(lines 638-644) then fetches the definition (lines 658-664), resulting in two API calls. This is necessary since no direct endpoint exists, but might be worth noting in the docstring for users concerned about performance, wdyt?Optional addition to docstring:
Args: definition_id: The definition ID. Mutually exclusive with `connector_builder_project_id`. definition_type: Connector type ("yaml" or "docker"). Required. connector_builder_project_id: The connector builder project ID. Mutually exclusive with - `definition_id`. + `definition_id`. Note: Lookups by builder_project_id require two API calls + (resolve ID + fetch definition).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
airbyte/cloud/workspaces.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/cloud/workspaces.py (2)
airbyte/exceptions.py (2)
PyAirbyteInputError(201-210)AirbyteError(432-447)airbyte/_util/api_util.py (2)
get_custom_yaml_source_definition(1050-1084)get_definition_id_for_connector_builder_project(1199-1234)
🪛 GitHub Actions: Run Linters
airbyte/cloud/workspaces.py
[error] 594-594: Command 'poetry run ruff check .' failed with lint error: E501 Line too long (101 > 100).
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.11, Windows)
- GitHub Check: Pytest (No Creds)
- GitHub Check: Pytest (All, Python 3.10, Windows)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (Fast)
🔇 Additional comments (4)
airbyte/cloud/workspaces.py (4)
584-590: The signature doesn't enforce keyword-only fordefinition_id.The PR summary mentions changing to keyword-only (breaking change), but
definition_idappears before the*marker, so it's still positional. Was the intent to make both parameters keyword-only?If so, consider moving
definition_idafter the*:def get_custom_source_definition( self, - definition_id: str | None = None, *, + definition_id: str | None = None, definition_type: Literal["yaml", "docker"], connector_builder_project_id: str | None = None, ) -> CustomCloudSourceDefinition:This would make all three parameters keyword-only, which aligns with the breaking change mentioned in the PR objectives. Wdyt?
606-615: LGTM! XOR validation is correct.The validation logic correctly ensures exactly one of
definition_idorconnector_builder_project_idis provided using the XOR pattern(definition_id is None) == (connector_builder_project_id is None).
617-624: LGTM! Type constraint validation is correct.The check appropriately restricts
connector_builder_project_idto YAML definitions only, with clear error messaging.
646-656: LGTM! Error handling for missing builder_project_id is clear.The error message and context clearly indicate when a
connector_builder_project_iddoesn't resolve to a definition, making debugging straightforward.
|
Fixed! Changed |
There was a problem hiding this comment.
Actionable comments posted: 0
♻️ Duplicate comments (1)
airbyte/cloud/workspaces.py (1)
594-595: Line length issue already flagged in previous review.The line length violation (E501) on line 594 has already been reported. Please address the fix suggested in the previous review comment.
🧹 Nitpick comments (1)
airbyte/cloud/workspaces.py (1)
627-654: Consider reducing code duplication by refactoring the fetch logic, wdyt?The implementation correctly handles both lookup paths, but lines 629-636 and 647-654 duplicate the fetch-and-return logic. This could be simplified by resolving
definition_idfirst (whenconnector_builder_project_idis provided), then having a single fetch path.Here's a refactored version that reduces duplication:
if definition_type == "yaml": + # Resolve definition_id if using connector_builder_project_id + resolved_definition_id = definition_id + if connector_builder_project_id is not None: + resolved_definition_id = api_util.get_definition_id_for_connector_builder_project( + workspace_id=self.workspace_id, + builder_project_id=connector_builder_project_id, + api_root=self.api_root, + client_id=self.client_id, + client_secret=self.client_secret, + ) + + # Fetch definition using resolved ID - if definition_id is not None: - result = api_util.get_custom_yaml_source_definition( - workspace_id=self.workspace_id, - definition_id=definition_id, - api_root=self.api_root, - client_id=self.client_id, - client_secret=self.client_secret, - ) - return CustomCloudSourceDefinition._from_yaml_response(self, result) # noqa: SLF001 - - if connector_builder_project_id is not None: - definition_id = api_util.get_definition_id_for_connector_builder_project( - workspace_id=self.workspace_id, - builder_project_id=connector_builder_project_id, - api_root=self.api_root, - client_id=self.client_id, - client_secret=self.client_secret, - ) - - result = api_util.get_custom_yaml_source_definition( - workspace_id=self.workspace_id, - definition_id=definition_id, - api_root=self.api_root, - client_id=self.client_id, - client_secret=self.client_secret, - ) - return CustomCloudSourceDefinition._from_yaml_response(self, result) # noqa: SLF001 + result = api_util.get_custom_yaml_source_definition( + workspace_id=self.workspace_id, + definition_id=resolved_definition_id, + api_root=self.api_root, + client_id=self.client_id, + client_secret=self.client_secret, + ) + return CustomCloudSourceDefinition._from_yaml_response(self, result) # noqa: SLF001
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
airbyte/_util/api_util.py(1 hunks)airbyte/cloud/workspaces.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- airbyte/_util/api_util.py
🧰 Additional context used
🧬 Code graph analysis (1)
airbyte/cloud/workspaces.py (3)
airbyte/cloud/connectors.py (3)
connector_builder_project_id(369-387)CustomCloudSourceDefinition(261-591)_from_yaml_response(527-539)airbyte/exceptions.py (1)
PyAirbyteInputError(201-210)airbyte/_util/api_util.py (2)
get_custom_yaml_source_definition(1050-1084)get_definition_id_for_connector_builder_project(1199-1246)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Pytest (All, Python 3.11, Windows)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Windows)
- GitHub Check: Pytest (Fast)
- GitHub Check: Pytest (No Creds)
🔇 Additional comments (3)
airbyte/cloud/workspaces.py (3)
603-606: Nice documentation of error cases!The Raises section clearly documents both validation scenarios. This helps users understand what inputs are valid.
607-616: LGTM! XOR validation is correctly implemented.The logic correctly enforces that exactly one of the two parameters must be provided. The error message is clear and the context aids debugging.
618-625: Appropriate type restriction validation.Correctly enforces that
connector_builder_project_idis only valid for YAML definitions. The error message clearly explains the constraint.
PyTest Results (Full)382 tests +1 364 ✅ - 1 26m 49s ⏱️ +23s For more details on these failures, see this check. Results for commit 44309a8. ± Comparison against base commit 813d0d0. ♻️ This comment has been updated with latest results. |
|
I've added the integration test as requested, but it's currently failing. The issue is that the It appears that when a YAML source definition is published via
However, these are separate entities and the reverse lookup (builder_project_id → definition_id) doesn't seem to be available through this endpoint. The existing test in builder_project_id = fetched.connector_builder_project_id
if builder_project_id:
# Only test if builder_project_id existsShould we:
Let me know which direction you'd like to take! |
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
airbyte/cloud/workspaces.py(1 hunks)tests/integration_tests/cloud/test_custom_definitions.py(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
tests/integration_tests/cloud/test_custom_definitions.py (4)
airbyte/cloud/connectors.py (3)
name(113-121)name(305-309)connector_builder_project_id(369-387)airbyte/cloud/workspaces.py (2)
get_custom_source_definition(584-650)publish_custom_source_definition(461-552)airbyte/_util/text_util.py (1)
generate_random_suffix(14-22)airbyte/exceptions.py (1)
PyAirbyteInputError(201-210)
airbyte/cloud/workspaces.py (3)
airbyte/cloud/connectors.py (3)
connector_builder_project_id(369-387)CustomCloudSourceDefinition(261-591)_from_yaml_response(527-539)airbyte/exceptions.py (1)
PyAirbyteInputError(201-210)airbyte/_util/api_util.py (2)
get_definition_id_for_connector_builder_project(1199-1246)get_custom_yaml_source_definition(1050-1084)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
🔇 Additional comments (6)
tests/integration_tests/cloud/test_custom_definitions.py (3)
80-80: LGTM!Good catch updating this to use keyword arguments—aligns perfectly with the method signature change.
88-95: Nice addition to exercise the builder project lookup!This round-trip validation (fetch by definition_id → get builder_project_id → fetch by builder_project_id) is a great way to ensure consistency. The conditional check for
builder_project_idis also appropriate.
173-203: Excellent validation coverage!This unit test nicely covers the XOR validation cases and the type constraint for
connector_builder_project_id. Clean and focused.airbyte/cloud/workspaces.py (3)
586-589: Method signature looks good.The optional parameters with XOR validation are correctly implemented. Note that making
definition_idoptional is indeed a breaking change (as documented in the PR), but the updated tests demonstrate the migration path.
591-606: Clear documentation.The docstring accurately describes the new parameters and validation rules. The line breaking at 594-595 also resolves the previous lint concern.
641-650: LGTM!The fetch logic is clean and correct. The assert at line 642 is safe given the validation above ensures exactly one parameter is provided and definition_id is resolved from connector_builder_project_id when needed.
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
airbyte/cloud/workspaces.py(1 hunks)tests/integration_tests/cloud/test_custom_definitions.py(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
airbyte/cloud/workspaces.py (3)
airbyte/cloud/connectors.py (1)
connector_builder_project_id(369-387)airbyte/exceptions.py (1)
PyAirbyteInputError(201-210)airbyte/_util/api_util.py (2)
get_definition_id_for_connector_builder_project(1199-1246)get_custom_yaml_source_definition(1050-1084)
tests/integration_tests/cloud/test_custom_definitions.py (4)
airbyte/cloud/connectors.py (3)
name(113-121)name(305-309)connector_builder_project_id(369-387)airbyte/cloud/workspaces.py (1)
get_custom_source_definition(584-650)airbyte/_util/text_util.py (1)
generate_random_suffix(14-22)airbyte/exceptions.py (1)
PyAirbyteInputError(201-210)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Pytest (All, Python 3.11, Ubuntu)
- GitHub Check: Pytest (All, Python 3.10, Ubuntu)
| if definition_type != "yaml": | ||
| raise NotImplementedError( | ||
| "Docker custom source definitions are not yet supported. " | ||
| "Only YAML manifest-based custom sources are currently available." | ||
| ) | ||
|
|
||
| if (definition_id is None) == (connector_builder_project_id is None): | ||
| raise exc.PyAirbyteInputError( | ||
| message=( | ||
| "Must specify EXACTLY ONE of definition_id or connector_builder_project_id" | ||
| ), | ||
| context={ | ||
| "definition_id": definition_id, | ||
| "connector_builder_project_id": connector_builder_project_id, | ||
| }, | ||
| ) | ||
|
|
||
| if connector_builder_project_id: | ||
| if definition_type != "yaml": | ||
| raise exc.PyAirbyteInputError( | ||
| message="connector_builder_project_id is only valid for yaml definition_type", | ||
| context={ | ||
| "definition_type": definition_type, | ||
| "connector_builder_project_id": connector_builder_project_id, | ||
| }, | ||
| ) | ||
| definition_id = api_util.get_definition_id_for_connector_builder_project( |
There was a problem hiding this comment.
Reorder validation so Docker calls hit the intended PyAirbyteInputError
Right now the definition_type != "yaml" guard fires before we evaluate the mutually exclusive parameter check, so calling with definition_type="docker" and a connector_builder_project_id raises NotImplementedError instead of the documented PyAirbyteInputError. Could we move the XOR validation (and the builder-project/type check) ahead of the NotImplementedError so we emit the expected input error? Something along these lines might help, wdyt?
- if definition_type != "yaml":
- raise NotImplementedError(
- "Docker custom source definitions are not yet supported. "
- "Only YAML manifest-based custom sources are currently available."
- )
-
- if (definition_id is None) == (connector_builder_project_id is None):
+ if (definition_id is None) == (connector_builder_project_id is None):
raise exc.PyAirbyteInputError(
message=(
"Must specify EXACTLY ONE of definition_id or connector_builder_project_id"
),
context={
"definition_id": definition_id,
"connector_builder_project_id": connector_builder_project_id,
},
)
- if connector_builder_project_id:
- if definition_type != "yaml":
- raise exc.PyAirbyteInputError(
- message="connector_builder_project_id is only valid for yaml definition_type",
- context={
- "definition_type": definition_type,
- "connector_builder_project_id": connector_builder_project_id,
- },
- )
+ if connector_builder_project_id and definition_type != "yaml":
+ raise exc.PyAirbyteInputError(
+ message="connector_builder_project_id is only valid for yaml definition_type",
+ context={
+ "definition_type": definition_type,
+ "connector_builder_project_id": connector_builder_project_id,
+ },
+ )
+
+ if definition_type != "yaml":
+ raise NotImplementedError(
+ "Docker custom source definitions are not yet supported. "
+ "Only YAML manifest-based custom sources are currently available."
+ )
+
+ if connector_builder_project_id:
definition_id = api_util.get_definition_id_for_connector_builder_project(
workspace_id=self.workspace_id,
builder_project_id=connector_builder_project_id,
api_root=self.api_root,
client_id=self.client_id,
client_secret=self.client_secret,
)🤖 Prompt for AI Agents
In airbyte/cloud/workspaces.py between lines 607 and 633, reorder the validation
logic so that the XOR check for mutually exclusive definition_id and
connector_builder_project_id parameters happens before the definition_type
"yaml" check. This means first validate that exactly one of definition_id or
connector_builder_project_id is provided and that connector_builder_project_id
is only valid with "yaml" definition_type. Only after these validations pass,
check if definition_type is not "yaml" to raise NotImplementedError. This way,
calls with definition_type "docker" and connector_builder_project_id correctly
raise PyAirbyteInputError as expected.
|
Devin, Is this PR still worth working on? Compare with main to see if these features are still needed or already added. Then, if worth continuing, please reconcile the merge conflicts. |
…ition Allow get_custom_source_definition to accept either definition_id or connector_builder_project_id. Adds validation to ensure exactly one parameter is provided and that connector_builder_project_id is only used with yaml definition_type. - Modified method signature to accept optional definition_id and connector_builder_project_id parameters - Added validation for mutually exclusive parameters - Added validation for yaml-only connector_builder_project_id usage - Implemented lookup by connector_builder_project_id via list and match - Added test coverage for new parameter validation - Updated existing tests to use keyword arguments Requested by: AJ Steers (@aaronsteers) Co-Authored-By: AJ Steers <aj@airbyte.io>
Replace list-and-filter approach with direct API call to /connector_builder_projects/get_with_manifest endpoint. - Added get_definition_id_for_connector_builder_project() API function - Updated get_custom_source_definition to use the new API function - More efficient reverse lookup from builder_project_id to definition_id Suggested by: AJ Steers (@aaronsteers) Co-Authored-By: AJ Steers <aj@airbyte.io>
- Fix line length issue in docstring (E501) - Changed get_definition_id_for_connector_builder_project to raise AirbyteError instead of returning None when definition not found - Simplified caller code in get_custom_source_definition to remove None check since API function now raises directly Fixes lint error and addresses feedback from @aaronsteers Co-Authored-By: AJ Steers <aj@airbyte.io>
Mypy wasn't able to infer that definition_id is guaranteed to be set, so added an assertion to help the type checker. Co-Authored-By: AJ Steers <aj@airbyte.io>
d9f9aba to
44309a8
Compare
|
PR rebased on latest main! ✅ Summary:
Remaining question:
Let me know how you'd like to proceed with the integration test! |
feat: Support connector_builder_project_id in get_custom_source_definition
This PR is blocked by https://github.com/airbytehq/airbyte-platform-internal/pull/18082 which investigates API endpoint improvements needed for the reverse lookup functionality (builder_project_id → definition_id).
The current implementation uses
/connector_builder_projects/get_with_manifestbut this endpoint returnssourceDefinitionId: nullfor published YAML connectors, preventing the intended functionality from working reliably.Summary
Modified
CloudWorkspace.get_custom_source_definition()to accept eitherdefinition_idORconnector_builder_project_idas lookup parameters, with validation to ensure exactly one is provided.Key changes:
connector_builder_project_idparameter (yaml-only)Implementation note: Uses
/connector_builder_projects/get_with_manifestAPI endpoint to resolve builder_project_id to definition_id. However, this endpoint currently returns null for published connectors (see blocker above).Review & Testing Checklist for Human
get_custom_source_definition()Recommended Test Plan
definition_idonly (should work as before)connector_builder_project_idonly for a yaml definition (currently may fail due to API returning null)connector_builder_project_idfor docker type (should raise PyAirbyteInputError)connector_builder_project_id(should raise AirbyteError)Notes
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Important
Auto-merge enabled.
This PR is set to merge automatically when all requirements are met.