1 change: 1 addition & 0 deletions README.md
@@ -14,6 +14,7 @@ A sample family of reusable [GitHub Agentic Workflows](https://githubnext.github

### Depth Triage & Analysis Workflows
- [🏷️ Issue Triage](docs/issue-triage.md) - Triage issues and pull requests
- [🔁 Issue Duplication Detector](docs/issue-duplication-detector.md) - Detect and comment on duplicate issues automatically
- [🏥 CI Doctor](docs/ci-doctor.md) - Monitor CI workflows and investigate failures automatically
- [🔍 Repo Ask](docs/repo-ask.md) - Intelligent research assistant for repository questions and analysis
- [🔍 Daily Accessibility Review](docs/daily-accessibility-review.md) - Review application accessibility by automatically running and using the application
52 changes: 52 additions & 0 deletions docs/issue-duplication-detector.md
@@ -0,0 +1,52 @@
# 🔁 Issue Duplication Detector

> For an overview of all available workflows, see the [main README](../README.md).

The [Issue Duplication Detector workflow](../workflows/issue-duplication-detector.md?plain=1) automatically scans for newly created or recently updated issues every 5 minutes and flags likely duplicates with a helpful comment.

## Installation

```bash
# Install the 'gh aw' extension
gh extension install githubnext/gh-aw

# Add the Issue Duplication Detector workflow to your repository
gh aw add githubnext/agentics/issue-duplication-detector --pr
```

This creates a pull request to add the workflow to your repository.

You must also [choose a coding agent](https://githubnext.github.io/gh-aw/reference/engines/) and add an API key secret for the agent to your repository.

After merging the PR and syncing to main, you can run the workflow manually if desired:

```bash
gh aw run issue-duplication-detector
```

## Configuration

This workflow works out of the box. You can customize detection strictness, comment tone, or batching window via a local config file at `.github/workflows/agentics/issue-duplication-detector.config.md`.

After editing, run `gh aw compile` to update the workflow, then commit all changes to the default branch.

## What it reads from GitHub

- Repository issues (open and closed)
- Recent issues created or updated in the last 10 minutes
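
The 10-minute window can be expressed as a GitHub issue search query. The following is a minimal sketch, not part of the workflow itself: the helper name `build_recent_issues_query` and the repository `octo-org/octo-repo` are hypothetical, and the relative-time syntax assumes GNU `date` (the `-d` flag). The query shape mirrors the one given in the workflow prompt.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of gh-aw): builds a search query for issues
# updated within the last N minutes. Assumes GNU date for the -d flag.
build_recent_issues_query() {
  local repo="$1"
  local minutes="${2:-10}"   # defaults to the 10-minute window described above
  local since
  # UTC timestamp in ISO 8601 form, e.g. 2025-01-01T12:00:00Z
  since="$(date -u -d "${minutes} minutes ago" +%Y-%m-%dT%H:%M:%SZ)"
  printf 'repo:%s is:issue updated:>=%s\n' "$repo" "$since"
}

# Example: print the query for a placeholder repository.
build_recent_issues_query "octo-org/octo-repo" 10
```

The window is twice the 5-minute schedule interval, so consecutive runs overlap; this is what lets a run pick up issues created or edited since the previous run even if it starts slightly late.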

## What it creates

- Adds comments to issues that appear to be duplicates, including links to the matching issues
- Requires `issues: write` permission

## Human in the loop

- Review duplicate comments for accuracy and tone
- Close or link issues as appropriate
- Disable or uninstall the workflow if it is not valuable

## Activity duration

- By default this workflow will trigger for at most 30 days, after which it will stop triggering.
- This allows you to experiment with the workflow for a limited time before deciding whether to keep it active.
102 changes: 102 additions & 0 deletions workflows/issue-duplication-detector.md
@@ -0,0 +1,102 @@
---
description: Detect duplicate issues and suggest next steps (batched every 5 minutes)
on:
  schedule:
    - cron: "*/5 * * * *" # Every 5 minutes
  workflow_dispatch:

permissions: read-all

tools:
  github:
    toolsets: [default]
  bash:
    - "*"

safe-outputs:
  add-comment:
    max: 10 # Allow multiple comments in batch mode

timeout-minutes: 15
---

# Issue Duplication Detector

You are an AI agent that detects duplicate issues in the repository `${{ github.repository }}`.

## Your Task

Analyze recently created or updated issues to determine if they are duplicates of existing issues. This workflow runs every 5 minutes to batch-process issues, providing cost control and natural request batching.

## Instructions

1. **Find recent issues to check**:
   - Use GitHub tools to search for issues in this repository that were created or updated in the last 10 minutes
   - Query: `repo:${{ github.repository }} is:issue updated:>=$(date -u -d '10 minutes ago' +%Y-%m-%dT%H:%M:%SZ)`
   - This captures any issues that might have been created or edited since the last run
   - If no recent issues are found, exit successfully without further action

2. **For each recent issue found**:
   - Fetch the full issue details using GitHub tools
   - Note the issue number, title, and body content

3. **Search for duplicate issues**:
   - For each recent issue, use GitHub tools to search for similar existing issues
   - Search using keywords from the issue's title and body
   - Look for issues that describe the same problem, feature request, or topic
   - Consider both open and closed issues (closed issues might have been resolved)
   - Focus on semantic similarity, not just exact keyword matches
   - Exclude the current issue itself from the duplicate search

4. **Analyze and compare**:
   - Review the content of potentially duplicate issues
   - Determine if they are truly duplicates or just similar topics
   - A duplicate means the same underlying problem, request, or discussion
   - Consider that different wording might describe the same issue

5. **For issues with duplicates found**:
   - Use the `output.add-comment` safe output to post a comment on the issue
   - In your comment:
     - Politely inform that this appears to be a duplicate
     - List the duplicate issue(s) with their numbers and titles using markdown links (e.g., "This appears to be a duplicate of #123")
     - Provide a brief explanation of why they are duplicates
     - Suggest next steps, such as:
       - Reviewing the existing issue(s) to see if they already address the concern
       - Adding any new information to the existing issue if this one has additional context
       - Closing this issue as a duplicate if appropriate
     - Keep the tone helpful and constructive

6. **For issues with no duplicates**:
   - Do not add any comment
   - The issue is unique and can proceed normally

## Important Guidelines

- **Batch processing**: Process multiple issues in a single run when available
- **Read-only analysis**: You are only analyzing and commenting, not modifying issues
- **Be thorough**: Search comprehensively to avoid false negatives
- **Be accurate**: Only flag clear duplicates to avoid false positives
- **Be helpful**: Provide clear reasoning and actionable suggestions
- **Use safe-outputs**: Always use `output.add-comment` for commenting, never try to use GitHub write APIs directly
- **Cost control**: The 5-minute batching window provides a natural upper bound on costs

## Example Comment Format

When you find duplicates, structure your comment like this:

```markdown
👋 Hi! It looks like this issue might be a duplicate of existing issue(s):

- #123 - [Title of duplicate issue]

Both issues describe [brief explanation of the common problem/request].

**Suggested next steps:**
- Review issue #123 to see if it addresses your concern
- If this issue has additional context not covered in #123, consider adding it there
- If they are indeed the same, this issue can be closed as a duplicate

Let us know if you think this assessment is incorrect!
```

Remember: Only comment if you have high confidence that duplicates exist.