React to CLI update and Anthropic response format changes #27
Merged
SteveSandersonMS merged 3 commits into main on Jan 16, 2026
…ots for extended thinking
Pull request overview
This PR introduces two improvements to the E2E test infrastructure: preventing corrupted snapshots on test failure and updating snapshots to reflect CLI changes for Anthropic extended thinking compatibility.
Changes:
- Implements test failure detection in each language's test framework to skip writing snapshots when tests fail
- Updates snapshot files to reflect new CLI behavior that coalesces tool calls into single assistant messages for Anthropic extended thinking
Reviewed changes
Copilot reviewed 26 out of 27 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| test/harness/replayingCapiProxy.ts | Adds support for skipWritingCache parameter to prevent snapshot writes on test failure |
| python/e2e/testharness/proxy.py | Adds skip_writing_cache parameter to stop() method |
| python/e2e/testharness/context.py | Passes test_failed flag to proxy teardown to conditionally skip snapshot writes |
| python/e2e/conftest.py | Implements pytest hook to track test failures and skip snapshot writes |
| nodejs/test/e2e/harness/sdkTestContext.ts | Uses Vitest's onTestFailed hook to track failures and adds COPILOT_CLI_PATH env var support |
| nodejs/test/e2e/harness/CapiProxy.ts | Adds skipWritingCache parameter to stop() method |
| go/e2e/testharness/proxy.go | Adds StopWithOptions method with skipWritingCache parameter |
| go/e2e/testharness/context.go | Uses Go's t.Failed() to detect test failures and skip snapshot writes |
| dotnet/test/Harness/E2ETestContext.cs | Checks CI environment variable to skip snapshot writes (xUnit limitation) |
| dotnet/test/Harness/CapiProxy.cs | Adds skipWritingCache parameter to StopAsync method |
| test/snapshots/tools/invokes_built_in_tools.yaml | Updated snapshot reflecting coalesced tool calls format |
| test/snapshots/session/*.yaml | Updated snapshots with minor variations in assistant responses |
| test/snapshots/permissions/*.yaml | Updated snapshots with coalesced tool calls and response variations |
| test/snapshots/mcp-and-agents/*.yaml | Updated snapshots with response variations |
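
The skip-on-failure flow summarized in the table can be sketched roughly as follows. This is a minimal illustration, not the harness code from this PR: `RecordingProxy`, `record`, and `run_with_proxy` are hypothetical names, and the snapshot is written as JSON only to keep the example free of third-party dependencies (the real snapshots are YAML). Only the `skip_writing_cache` parameter name and the `test_failed` flag mirror the descriptions above.

```python
import json
from pathlib import Path
from typing import Callable


class RecordingProxy:
    """Hypothetical stand-in for the recording proxy; not the repo's class."""

    def __init__(self, snapshot_path: Path) -> None:
        self.snapshot_path = snapshot_path
        self.exchanges: list[dict] = []

    def record(self, request: str, response: str) -> None:
        # Keep exchanges in memory; nothing touches disk until stop().
        self.exchanges.append({"request": request, "response": response})

    def stop(self, skip_writing_cache: bool = False) -> bool:
        """Write the snapshot unless asked to skip; return True if written."""
        if skip_writing_cache or not self.exchanges:
            return False
        self.snapshot_path.write_text(json.dumps(self.exchanges, indent=2))
        return True


def run_with_proxy(
    snapshot_path: Path, test_body: Callable[[RecordingProxy], None]
) -> bool:
    """Run a test body, then tear down the proxy with the failure flag."""
    proxy = RecordingProxy(snapshot_path)
    test_failed = False
    try:
        test_body(proxy)
    except AssertionError:
        # A real harness would get this from its framework's failure hook.
        test_failed = True
    return proxy.stop(skip_writing_cache=test_failed)
```

The design point is that the proxy itself stays unaware of the test framework: each language's harness only has to compute one boolean at teardown and pass it through.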
jmoseley approved these changes on Jan 16, 2026
After updating to the latest CLI, the E2E tests failed because of response format changes.
1. Update snapshots for Anthropic extended thinking
The CLI now coalesces tool calls into single assistant messages for Anthropic extended thinking compatibility. This updates all affected snapshots to match the new format.
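
To make the format change concrete, here is a rough sketch of what coalescing might look like. It is not the CLI's implementation: the message shape (`role`, `content`, `tool_calls` keys) and the function name `coalesce_tool_calls` are assumptions chosen for illustration, and the sketch only merges a follow-up assistant message when it carries tool calls and no text content.

```python
from typing import Any

Message = dict[str, Any]


def coalesce_tool_calls(messages: list[Message]) -> list[Message]:
    """Merge runs of adjacent assistant tool-call messages into one message.

    Assumed shape (hypothetical): each message is a dict with a "role" key
    and optional "content" and "tool_calls" keys.
    """
    merged: list[Message] = []
    for msg in messages:
        prev = merged[-1] if merged else None
        can_merge = (
            prev is not None
            and prev.get("role") == "assistant"
            and msg.get("role") == "assistant"
            and bool(prev.get("tool_calls"))
            and bool(msg.get("tool_calls"))
            and not msg.get("content")
        )
        if can_merge:
            # Append this message's tool calls to the previous assistant turn.
            prev["tool_calls"].extend(msg["tool_calls"])
        else:
            # Copy the message (and its tool-call list) so the caller's
            # input is never mutated by later merges.
            copied = dict(msg)
            if "tool_calls" in copied:
                copied["tool_calls"] = list(copied["tool_calls"])
            merged.append(copied)
    return merged
```

Under these assumptions, two back-to-back assistant messages with one tool call each become a single assistant message with two tool calls, which is the kind of difference the updated snapshots record.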
2. Skip writing snapshots on test failure
Prevents corrupted snapshots from being written when tests fail. Each language uses its native test framework hooks: