Coding agent rules#1937

Open
MaxHalford wants to merge 1 commit into main from ai-guidelines

Conversation

@MaxHalford

Member

Following the discussion in #1430, and an internal chat, we have agreed to introduce rules for using AI in a way that feels comfortable for most people. We want the project to keep a human touch.

@MaxHalford MaxHalford requested a review from smastelini as a code owner June 30, 2026 12:23
@smastelini

Member

I feel this is a small but really important step to take.

I would love to hear more voices on this matter. Although not massive by any standards, River has a nice and active community of users and contributors.

In an age of rapid change, with new best-in-class LLMs arriving every week and so many open questions from anthropological, sociological and so many other important "*logical" standpoints (take that silly dad joke, agents and humans reading this), I believe we have a unique chance to design our own view on the matter together and share it with the world.

Who knows who could take inspiration from that? We might even make a difference for the better :)

@MaxHalford

Member Author

@raphaelsty @VaysseRobin @kulbachcedric @jacobmontiel @hoanganhngo610 @gbolmier @e10e3 @Dennis1989 @AdilZouitine @FBruzzesi @MarekWadinger please feel encouraged to share your thoughts :)

It's important to me that we can find a consensus. I also want to point out that these are not strong beliefs that I would follow in all contexts. We're all entitled to our beliefs and ways of working. But I do think it's important for open-source projects to provide an opinionated set of rules when it comes to AI. For instance, I do not want to produce too much AI output if it makes people uncomfortable. @e10e3's feedback helped me take a step back on this.

@FBruzzesi FBruzzesi left a comment

Contributor

Hey @MaxHalford - first and foremost thanks for tagging me, that put a good smile on my face!

In Narwhals we are (very slowly) trying to address similar questions (see narwhals-dev/narwhals#3632).

For what it's worth, it's likely we will end up with something similar to what you wrote, though with a bit more focus on:

  • accountability (your 6th point)
  • disclosure (your 3rd point)
  • engagement with reviews and feedback

For your 2nd point, I have seen cases where an LLM helps non-native English speakers (well... myself included) rephrase text. But don't get me wrong: I 100% agree on avoiding delegating thinking and understanding (which should be the most fun part 😅).

Another tactic could be a checklist in the PR template for a human to fill in (I am just sharing ideas at this point).

Comment thread CONTRIBUTING.md

Of course, you're welcome to propose and contribute new ideas. We encourage you to [open a discussion](https://github.com/online-ml/river/discussions/new) so that we can have a chat and align.

## Rules for coding agents

Contributor

I feel like I'm getting a mixed bag of statements in this section: some refer to what agents should do (the same points as AGENTS.md), while others point to how a contributor should use the tools and the possible consequences. I would either split it into two sections or completely leave out those referring to the agents, as they belong in the AGENTS.md file anyway.

Contributor

I am of the same opinion: CONTRIBUTING is information for humans, but the section title implies the rules apply only to machines and not to humans, and the rules themselves can be ambiguous as to whether they target humans or LLMs.

The rest of the file is mostly in the second person singular (addressing the reader as "you"), whereas this section is less consistent, with many sentences addressing a collective ("we") or written in the passive voice.

We may want to rephrase the rules so that they clearly target humans.

@raphaelsty

raphaelsty commented Jun 30, 2026

Member

I’m learning from you on this topic.

In my opinion, it should be clear that AI is welcome. We shouldn’t close the door on new contributors who are less familiar with programming or simply don't have the time to contribute without a coding agent. From my experience, it's for the better (experience acquired from colgrep, 23+ contributors in a few months).

While welcoming new contributors, it should be clear that we expect a lot more from MR descriptions and from the motivation behind the choices that have been made, with clear benchmarks and charts (shifting effort from coding to explaining). All of this should be human-written, or only slightly reformulated / translated by AI for people who are not native English speakers.

Maybe the templates for pull requests and issues should be clearer on this point @MaxHalford

For now, in colgrep I choose not to include AI guidelines in order to spot people who know what they are doing vs. people who don't. It might not work for long 😉

@e10e3

e10e3 commented Jul 1, 2026

Contributor

Thank you for starting this set of rules.

This policy is part of CONTRIBUTING.md, and the document is already reasonably long. Should this be its own file or do we prefer it being integrated into this file?

To prepare for the moment it happens, should we add that other types of AI-generated media (images, datasets, etc.) are not allowed?

We should make it explicit that fully automated contributions are not allowed. This is implied by various sections about a human having to write justifications and being accountable, but it is better to say it explicitly.

In keeping with making things explicit, and following what @raphaelsty said, we may want to add a partial exception, for proofreading, to the rule that prose be written by humans. However, it must be only proofreading and must not lose the voice of the writer; personally I'd prefer reading an imperfect text by a non-native speaker rather than grammatically correct marketing from an LLM.

Following the sentence "We don't want coding agents to do the high-level thinking for us.", I wonder if we should put an emphasis on humans making the programming decisions, not just justifying the LLMs' decisions post hoc. Do you have an opinion on that?

@AdilZouitine

Member

I agree with the direction of this PR.

For me, AI is a great tool to move things forward and explore a larger solution space. It can help us ship features, investigate alternatives, or improve code in ways that would have been hard to do without it, mostly because of time constraints. I think antirez’s recent post is a good example of this: coding agents can expand what a dev is able to explore, as long as the human remains deeply involved in the work.

I think these tools are here to stay, and honestly I see that as a good thing.

The most important principle, imo, is that the human must stay in the loop. A contributor should be able to explain every line they submit, understand the trade-offs, and be accountable for the final result. AI can help write code and tests or suggest alternatives, but it should not replace human understanding or ownership.

I also agree that we should allow LLMs for proofreading, translation, and small corrections. This can remove an entry barrier for non-native English speakers, people with dyslexia, or anyone who struggles with writing, grammar, or syntax. We should not create an a priori bias against a contribution because the writing is imperfect, and LLMs tend to even that out. What matters is whether the contributor understands the work and can engage with feedback.

I also think the bottleneck is shifting. Producing code is becoming easier; reviewing it is now the scarce resource. We should respect reviewers' time, because reviewing requires human attention, context, and judgment.

There are two big red flags for me.

First, I would avoid AI-only code review. At least one human should review every contribution. This matters because humans need to keep knowledge of the system, and human reviewers are still usually more concise, less sloppy, and better at understanding context than LLMs.

Second, I think we should be strict with fully automated or low-effort LLM pull requests (especially those made with tools like Claw or Hermes agent). When a PR is poorly motivated, low quality, or floods the PR channel, it creates work for maintainers instead of helping. Those PRs should be closed quickly.

Comment thread CONTRIBUTING.md

Of course, you can use a coding agent to run a benchmark and produce a summary table. But you should editorialize and insert it into a message you've written yourself.

> 3. Code written by agents should be disclosed as such.

Member

This tool is great for disclosing code written by agents: https://usegitai.com/


6 participants