chromium / germanium / HEAD
65cda24
Add a new evaluation for class refactoring.
by Jie Sheng
· 3 weeks ago
main
5d780c0
Set system prompts correctly
by Struan Shrimpton
· 3 weeks ago
f3e3b91
Revert "Update promptfoo cipd tags"
by Struan Shrimpton
· 3 weeks ago
2609ee7
Update promptfoo cipd tags
by Struan Shrimpton
· 3 weeks ago
607be8b
Include the prompt/response in the rdb test result
by Struan Shrimpton
· 4 weeks ago
e55a048
Support tool call checks in gemini cli eval framework.
by Jie Sheng
· 4 weeks ago
46d60b6
Make owners explicit
by Struan Shrimpton
· 5 weeks ago
776b148
Improve the test log output in gemini cli eval framework.
by Jie Sheng
· 5 weeks ago
3d5bab8
Include the test tags in the rdb tags
by Struan Shrimpton
· 5 weeks ago
81c4be3
Update prompt eval metric reporting
by Brian Sheedy
· 5 weeks ago
4fabd70
Create a new gemini cli eval to build a file.
by Jie Sheng
· 5 weeks ago
e856061
Tag rdb results with their metrics
by Struan Shrimpton
· 5 weeks ago
f343053
Update run_tests_in_file
by Struan Shrimpton
· 5 weeks ago
0c04b80
Clean workdirs regardless of forced
by Struan Shrimpton
· 5 weeks ago
a135030
[agents][eval] Store system prompt in `//GEMINI.md`
by Jonathan Lee
· 6 weeks ago
17cd0aa
Create a new gemini cli eval to verify run tests in file.
by Jie Sheng
· 6 weeks ago
c092057
Move perf uploading to handler
by Brian Sheedy
· 6 weeks ago
2930733
eval: Add negative tag filtering for tests
by James Woo
· 6 weeks ago
65da9b6
Move ResultDB reporting to handler
by Brian Sheedy
· 6 weeks ago
2389503
Support user-provided result handlers
by Brian Sheedy
· 6 weeks ago
ba2d83a
Separate reporting for each iteration
by Struan Shrimpton
· 6 weeks ago
04cbefc
eval: Add test filtering on metadata
by James Woo
· 6 weeks ago
d2778a5
Adjust prompt eval perf upload location
by Brian Sheedy
· 6 weeks ago
3a58f91
Reland "Eval Prompts: Add custom node support and cipd option for gcli"
by Struan Shrimpton
· 6 weeks ago
e634b2f
Revert "Eval Prompts: Add custom node support and cipd option for gcli"
by Struan Shrimpton
· 6 weeks ago
4a0111a
Fix perf uploading location
by Brian Sheedy
· 6 weeks ago
91b5c41
Eval Prompts: Add custom node support and cipd option for gcli
by Struan Shrimpton
· 6 weeks ago
0e4368b
Adjust perf upload format
by Brian Sheedy
· 6 weeks ago
c401587
Add prompt eval metric uploading
by Brian Sheedy
· 6 weeks ago
228eed9
Prompt Evals: specify precompile_targets on the test
by Struan Shrimpton
· 6 weeks ago
934b4d7
Eval Prompts: Remove promptfoo npm/src
by Struan Shrimpton
· 6 weeks ago
18d80c2
Prompt Eval: Improve test reliability
by Struan Shrimpton
· 6 weeks ago
3641d75
Eval Prompts: Improve readability of test results
by Struan Shrimpton
· 6 weeks ago
bbec6df
Refactor test/iteration results
by Brian Sheedy
· 6 weeks ago
a09ab25
Add pass@k support to prompt evals
by James Woo
· 6 weeks ago
d2c74f5
Adjust args for perf dashboard uploading
by Brian Sheedy
· 6 weeks ago
2fa6db2
Prevent unit tests from pulling cipd packages
by Struan Shrimpton
· 6 weeks ago
498a5bd
Eval prompts: fix default gemini bin for 1P
by Struan Shrimpton
· 6 weeks ago
4dbddcb
Eval Prompts: Add trusted folders for temp HOME dir
by Struan Shrimpton
· 6 weeks ago
02bdd58
Add and use by default cipd promptfoo
by Struan Shrimpton
· 6 weeks ago
356a0f3
Use settings.json for gemini-cli telemetry
by Brian Sheedy
· 6 weeks ago
8b4da35
Extract prompt eval scores
by Brian Sheedy
· 6 weeks ago
4220cc7
Update //agents/testing/README
by Struan Shrimpton
· 6 weeks ago
eb5f995
Surface gemini-cli token usage
by Brian Sheedy
· 6 weeks ago
f37d63c
Update eval_prompts for structured test ids
by Struan Shrimpton
· 6 weeks ago
f9cc421
agents: Add pass@k configuration parsing
by James Woo
· 6 weeks ago
183dad9
Fix gemini_provider console width parsing
by Brian Sheedy
· 6 weeks ago
cc91996
Refactor gemini_provider's call_api
by Brian Sheedy
· 6 weeks ago
fad9ca4
Docs: Improve btrfs setup instructions
by James Woo
· 6 weeks ago
52c65dc
agents: Centralize Gemini helper functions
by James Woo
· 6 weeks ago
9a59afd
feat(eval): Set sandbox PATH from container image
by James Woo
· 6 weeks ago
e5537eb
Add an unrestricted parallel option
by Struan Shrimpton
· 6 weeks ago
069ef68
[agents][eval] Introduce reusable `check_gtests.py` assert
by Jonathan Lee
· 6 weeks ago
935f99d
[agents][eval] Enable `use_remoteexec` to speed up builds
by Jonathan Lee
· 6 weeks ago
f7162c1
feat(eval): Mount depot_tools in sandbox
by James Woo
· 6 weeks ago
583c967
Add missing prompt eval arg
by Brian Sheedy
· 6 weeks ago
cb6d335
refactor(eval): Directly fetch sandbox image
by James Woo
· 6 weeks ago
458560c
Add remaining isolated script args + validation
by Brian Sheedy
· 6 weeks ago
e692bd3
Add prompt eval --isolated-script-test-repeat
by Brian Sheedy
· 6 weeks ago
850bc85
feat(eval): Use temp home for test environments
by James Woo
· 6 weeks ago
45403f3
Update prompt eval test filtering
by Brian Sheedy
· 6 weeks ago
858723a
Rename example and test extensions with underscores
by Struan Shrimpton
· 6 weeks ago
76e79dc
Refactor prompt eval argument parsing
by Brian Sheedy
· 6 weeks ago
ad78f25
Update build_information MCP to build-information
by Struan Shrimpton
· 6 weeks ago
f91b762
Add OWNERS to extensions and testing
by Struan Shrimpton
· 6 weeks ago
74e36b1
Add missing //agents/testing tests
by Brian Sheedy
· 6 weeks ago
cd484d4
feat(eval): Add support for local dev binaries
by James Woo
· 6 weeks ago
8e5c311
Reland "Add presubmit checks for `promptfoo.yaml` files."
by Jiamei Liu
· 6 weeks ago
8f6a2b5
[6/6] Parallel worker cleanup
by Brian Sheedy
· 6 weeks ago
3ced775
[5/6?] Support multiple parallel workers
by Brian Sheedy
· 6 weeks ago
6cba49a
feat(testing): Add test retries for flaky tests
by James Woo
· 6 weeks ago
68b0b0d
[4/?] Move promptfoo installation code to new file
by Brian Sheedy
· 6 weeks ago
3572449
[3/?] Move WorkDir to new file
by Brian Sheedy
· 6 weeks ago
124fe1b
feat(agents): Add flag for including test extensions
by James Woo
· 6 weeks ago
ecd6686
fix(eval): Fix input prompt when fetching sandbox
by James Woo
· 6 weeks ago
a352339
[2/?] Move result reporting to separate thread
by Brian Sheedy
· 6 weeks ago
0ba62c1
[1/?] Move agents result-related code
by Brian Sheedy
· 6 weeks ago
2415af3
Revert "Add presubmit checks for `promptfoo.yaml` files."
by Wenbo Jie
· 6 weeks ago
82d4265
Add presubmit checks for `promptfoo.yaml` files.
by Jiamei Liu
· 6 weeks ago
5f8c6d4
Consolidate RunPromptEvalTestsUnittest mocking
by Brian Sheedy
· 6 weeks ago
fd967e2
Add prompt eval ResultDB integration
by Brian Sheedy
· 6 weeks ago
8fb4620
feat(eval): Add test_landmines extension
by James Woo
· 6 weeks ago
9b5374f
feat(testing): Enable sandboxed prompt evaluations
by James Woo
· 6 weeks ago
93a1f19
Refactor eval_prompts.py main() and add test coverage
by Brian Sheedy
· 6 weeks ago
279456f
Fix build_information tests for multiple platforms
by Struan Shrimpton
· 6 weeks ago
2731573
Add //agents/testing helper function unittests
by Brian Sheedy
· 6 weeks ago
ca60729
Add WorkDir unittests
by Brian Sheedy
· 6 weeks ago
d5ab987
Add promptfoo installation unittests
by Brian Sheedy
· 6 weeks ago
676d4e3
feat(testing): Add custom promptfoo.yaml options
by James Woo
· 6 weeks ago
8e5d0d5
Add prompt eval unittests
by Brian Sheedy
· 6 weeks ago
6f76b5b
Print provider output to console
by Struan Shrimpton
· 6 weeks ago
5a3dba7
Build out/Default in eval_prompts
by Struan Shrimpton
· 6 weeks ago
9253315
Add //agents pylint coverage
by Brian Sheedy
· 6 weeks ago
1c2a112
Add sharding support for prompt evaluation
by Brian Sheedy
· 6 weeks ago
dbcd0d4
Add a source check to the eval prompts
by Struan Shrimpton
· 6 weeks ago
4867887
Improve eval_prompt workdirs
by Struan Shrimpton
· 6 weeks ago
36630d7
Automate prompt eval test discovery
by Brian Sheedy
· 6 weeks ago
8ea3f96
Check for stale workdirs and add --force flag
by Struan Shrimpton
· 6 weeks ago
e6475a2
Fix cwd for promptfoo from source
by Struan Shrimpton
· 6 weeks ago
9c2b75c
Update readme for btrfs setup
by Struan Shrimpton
· 6 weeks ago