Name | Modified | Size | Downloads / Week |
---|---|---|---|
Berkeley Function Calling Leaderboard Updates (v1.2) source code.tar.gz | 2025-01-04 | 44.5 MB | |
Berkeley Function Calling Leaderboard Updates (v1.2) source code.zip | 2025-01-04 | 44.8 MB | |
README.md | 2025-01-04 | 19.4 kB | |
Totals: 3 Items | | 89.4 MB | 0 |
Highlights
🏆 Berkeley Function Calling Leaderboard V3 with Multi-step and Multi-turn function call evaluation
What's Changed
- [BFCL] Package the Codebase by @devanshamin in https://github.com/ShishirPatil/gorilla/pull/565
- Added python script named as raft_local.py to raft directory to run script completely locally using HF models by @himanshushukla12 in https://github.com/ShishirPatil/gorilla/pull/605
- RAFT Enhancements: Improved robustness, logging, checkpointing, threading, Llama support, Azure auth and eval by @cedricvidal in https://github.com/ShishirPatil/gorilla/pull/604
- Fix/merge commit [#605] and [#604] by @ShishirPatil in https://github.com/ShishirPatil/gorilla/pull/609
- Fix issue [#614]: [BFCL] ModuleNotFoundError after commit 70d6722 by @kobe0938 in https://github.com/ShishirPatil/gorilla/pull/615
- Fix some bugs in test case prompts/ground truths by @aw632 in https://github.com/ShishirPatil/gorilla/pull/608
- [BFCL] Dataset and Possible Answer Fix by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/600
- Add Salesforce xLAM model series by @zuxin666 in https://github.com/ShishirPatil/gorilla/pull/616
- Update gemini_handler.py to better handle NL+FC model output by @vandyxiaowei in https://github.com/ShishirPatil/gorilla/pull/617
- [BFCL] Fix Decoding Issue in Nvidia Handler by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/623
- [BFCL] Fix Llama Handler by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/626
- [BFCL] add MadeAgents/Hammer-7b handler by @linqq9 in https://github.com/ShishirPatil/gorilla/pull/627
- [BFCL] Refactor Model Handler into OSS and Proprietary Components by @devanshamin in https://github.com/ShishirPatil/gorilla/pull/612
- [BFCL] Hot Fix to Remove Extra Parameters for NoAPIKeyError by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/636
- fix: bug for glm prompt format by @zhangch-ss in https://github.com/ShishirPatil/gorilla/pull/638
- [BFCL] Add New Model `o1-preview-2024-09-12` and `o1-mini-2024-09-12` by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/635
- [BFCL] BFCL v3 by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/644
- removed unnecessary comments in raft/raft_local.py by @himanshushukla12 in https://github.com/ShishirPatil/gorilla/pull/654
- [BFCL] Chore: Separate Change Log. by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/648
- [BFCL] Bug Fix inference_single_turn_FC function for base_handler by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/656
- [BFCL] Bug Fix parse_nested_value function for model_handler utils by @VishnuSuresh27 in https://github.com/ShishirPatil/gorilla/pull/660
- added Phi-3 handlers by @AndyChenYH in https://github.com/ShishirPatil/gorilla/pull/640
- Update agent arena frontend and evals by @NithikYekollu in https://github.com/ShishirPatil/gorilla/pull/666
- [BFCL] Speed Up Locally-hosted Model Inference Process by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/671
- [BFCL] Fix Hanging Inference for OSS Models on GPU Platforms by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/663
- [BFCL] Add gemini-1.5-pro-002, gemini-1.5-pro-002-FC, gemini-1.5-pro-001, gemini-1.5-pro-001-FC, gemini-1.5-flash-002, gemini-1.5-flash-002-FC, gemini-1.0-pro-002, gemini-1.0-pro-002-FC by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/658
- [BFCL] Add Llama-3.2-1B-Instruct, Llama-3.2-3B-Instruct, Llama-3.1-8B-Instruct, Llama-3.1-70B-Instruct by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/657
- [BFCL] Add ToolACE handler for BFCL-v3 by @XuHwang in https://github.com/ShishirPatil/gorilla/pull/653
- Add Qwen handler and fix mean_latency calculation error for OSS models by @zhangch-ss in https://github.com/ShishirPatil/gorilla/pull/642
- update README.md by @leosun12 in https://github.com/ShishirPatil/gorilla/pull/669
- [BFCL] Chore: Various Improvements and Adjustments by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/673
- [BFCL] Chore: Refactor File Path Handling and Automate apply_function_credential_config.py by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/675
- docs: update README.md by @eltociear in https://github.com/ShishirPatil/gorilla/pull/676
- [BFCL-v3] Multi-Turn Possible Answer Order Change by @Fanjia-Yan in https://github.com/ShishirPatil/gorilla/pull/679
- update hammer handler and add Hammer2.0 model by @linqq9 in https://github.com/ShishirPatil/gorilla/pull/667
- [BFCL] Chore: Improve Multi Turn Error Logs by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/689
- Update google-cloud-aiplatform dependency by @jieru-hu in https://github.com/ShishirPatil/gorilla/pull/677
- add minicpm3 4b by @Cppowboy in https://github.com/ShishirPatil/gorilla/pull/633
- [BFCL-v2] Dataset and Possible Answer Fix by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/661
- [BFCL] Add Gemma-2 models by @jacovkim in https://github.com/ShishirPatil/gorilla/pull/696
- add a basic bfcl command-line interface by @mattf in https://github.com/ShishirPatil/gorilla/pull/621
- Fixing BFCL-v3 multi-turn apps by @virginie-do in https://github.com/ShishirPatil/gorilla/pull/701
- [BFCL v1] Update Executable Ground Truth for REST Category by @CharlieJCJ in https://github.com/ShishirPatil/gorilla/pull/708
- [BFCL v1] Rephrase Question for Better Clarity for Java & JavaScript Categories by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/709
- [BFCL] Add SGLang Backend Support for OSS Local Inference by @hnyls2002 in https://github.com/ShishirPatil/gorilla/pull/587
- (typo):I've made some corrections to your repository to improve clarity by @PrathameshSPawar in https://github.com/ShishirPatil/gorilla/pull/713
- docs: Centered the Image by @bhargavshirin in https://github.com/ShishirPatil/gorilla/pull/680
- [BFCL] Multi Turn Dataset and Possible Answer Fix by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/683
- [BFCL] Chore: Separate out Func Doc for Multi-Turn Categories by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/717
- [BFCL] Multi Turn Dataset and Possible Answer Fix (Base Category) by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/719
- [BFCL] Multi Turn Dataset Fix (Function Doc) by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/722
- [BFCL] Multi Turn Dataset Fix (Base Category) by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/723
- [BFCL] Multi Turn Pipeline Robustness Patch by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/724
- [BFCL] Small typo in variable name in travel_booking.py by @daanaea in https://github.com/ShishirPatil/gorilla/pull/731
- [BFCL] Patch [#724] by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/730
- [BFCL] Multi Turn Dataset Fix (Miss Func & Long Context) by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/728
- [BFCL] Multi Turn Dataset Fix (Miss Param) by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/732
- [BFCL] Update Eval Metric for Multi Turn Irrelevance Scenarios by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/725
- [BFCL] Remove duplicate in eval_runner.py by @ThomasRochefortB in https://github.com/ShishirPatil/gorilla/pull/735
- [BFCL] Support Dynamic max_tokens for Locally-Hosted Models by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/712
- [BFCL] Refine Evaluation Metric for Multi Turn Categories by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/733
- [BFCL] Adding New Model GoGoAgent by @RogueTensor in https://github.com/ShishirPatil/gorilla/pull/720
- [BFCL] Chore: Improve Inference Log Readability by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/746
- [BFCL Dataset Revamp 1/n] Multi-Turn (Part 1) by @Fanjia-Yan in https://github.com/ShishirPatil/gorilla/pull/740
- [BFCL] Robustness Patch for `_multi_threaded_inference` by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/754
- [BFCL] Prompt Caching for Claude Models by @VishnuSuresh27 in https://github.com/ShishirPatil/gorilla/pull/751
- [BFCL Dataset Revamp 2/n] Live Dataset Fix (Simple, Parallel, Parallel Multiple) by @Fanjia-Yan in https://github.com/ShishirPatil/gorilla/pull/737
- [BFCL Dataset Revamp 3/n] Live Dataset Fix (Multiple) by @Fanjia-Yan in https://github.com/ShishirPatil/gorilla/pull/739
- Update google-cloud-aiplatform version to 1.72.0 by @gabrielibagon in https://github.com/ShishirPatil/gorilla/pull/760
- [BFCL] Minor Grammatical Corrections to DEFAULT_SYSTEM_PROMPT by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/747
- [BFCL] Remove `Llama-3.2-3B-Instruct-FC` and `Llama-3.2-1B-Instruct-FC` from Leaderboard by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/749
- [BFCL Chore] Supply `data_multi_turn.csv` for Multi-Turn Evaluation Results by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/762
- [BFCL] Remove Workaround Patch for Vertex AI Package by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/761
- Add exponential retry logic for gemini models by @gabrielibagon in https://github.com/ShishirPatil/gorilla/pull/764
- [BFCL] Remove Duplicate Line in `record_cost_latency` by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/767
- Fix handling of examples with no tools in Gemini by @gabrielibagon in https://github.com/ShishirPatil/gorilla/pull/770
- Remove stop condition in gemini retry logic by @gabrielibagon in https://github.com/ShishirPatil/gorilla/pull/769
- Skip adding empty content from gemini by @gabrielibagon in https://github.com/ShishirPatil/gorilla/pull/768
- [BFCL] Add the option to log to WandB during bfcl evaluate by @ThomasRochefortB in https://github.com/ShishirPatil/gorilla/pull/736
- [BFCL] Add `claude-3-5-haiku-20241022`, `claude-3-5-haiku-20241022-FC`, `claude-3-5-sonnet-20241022`, `claude-3-5-sonnet-20241022-FC` by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/750
- [BFCL Dataset Revamp 4/n] Live Irrelevance by @Fanjia-Yan in https://github.com/ShishirPatil/gorilla/pull/763
- [BFCL Dataset Revamp 5/n] Multi-Turn Base WrapUp by @Fanjia-Yan in https://github.com/ShishirPatil/gorilla/pull/772
- [BFCL] Add Unit Test to Check for Illegal Python Parameter Name by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/777
- [BFCL] Dataset and Possible Answer Fix (Live Categories) for Illegal Python Parameter Name by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/778
- [BFCL] Add Support for Regeneration, Specific Test Entry IDs, and Custom Directory Locations by @Raymond112514 in https://github.com/ShishirPatil/gorilla/pull/743
- [BFCL] some tiny fix in possible_answer by @zhangch-ss in https://github.com/ShishirPatil/gorilla/pull/786
- [RAFT] Add link to Azure RAFT Distillation Recipe by @cedricvidal in https://github.com/ShishirPatil/gorilla/pull/758
- [BFCL] Add New Model `Qwen/Qwen2.5-72B-Instruct` by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/787
- [BFCL] Add DeepSeek-V2.5, DeepSeek-Coder-V2-Instruct-0724, DeepSeek-Coder-V2-Lite-Instruct, DeepSeek-V2-Chat-0628, DeepSeek-V2-Lite-Chat by @moonlight1431 in https://github.com/ShishirPatil/gorilla/pull/697
- Add minicpm3 4b FC model handler by @Cppowboy in https://github.com/ShishirPatil/gorilla/pull/718
- [BFCL] Add support for Writer models and Palmyra X 004 by @samjulien in https://github.com/ShishirPatil/gorilla/pull/755
- [BFCL Chore] Add `@final` and `@overrides` Decorators to Class Methods in Model Handler by @VishnuSuresh27 in https://github.com/ShishirPatil/gorilla/pull/790
- [BFCL Chore] Support Multiple Models and Test Category Input for BFCL CLI by @vsvaidya27 in https://github.com/ShishirPatil/gorilla/pull/795
- [BFCL] Fix Irrelevance Category Performance for DeepSeek Coder Handler by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/796
- [BFCL Chore] Quick fix change of decorators from `@overrides` to `@override` by @VishnuSuresh27 in https://github.com/ShishirPatil/gorilla/pull/797
- [BFCL Chore] Add Retry Mechanism with Backoff for Rate Limit Handling Across Proprietary Models by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/781
- [BFCL] Bug Fix for Execution_Result_Message Construction for Prompt Caching Feature in Claude Handler by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/805
- [BFCL Dataset Revamp 7/n] Augmented Multi-turn Dataset Fix by @Fanjia-Yan in https://github.com/ShishirPatil/gorilla/pull/804
- [BFCL Dataset Revamp 6/n] Live Relevance Data Fix by @Fanjia-Yan in https://github.com/ShishirPatil/gorilla/pull/789
- Add Weaviate APIs to Gorilla API Zoo by @CShorten in https://github.com/ShishirPatil/gorilla/pull/783
- [BFCL] Improve Latency Measurement Accuracy and Enable Default State Logging by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/808
- [BFCL] Replace 'class' with '_class' to Avoid Function Calling Formatting Error by @Fanjia-Yan in https://github.com/ShishirPatil/gorilla/pull/811
- [BFCL] Added Grok Handler by @amitojsingh2022 in https://github.com/ShishirPatil/gorilla/pull/810
- [BFCL] Resolve Issue in Gemini Model When No Model Output by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/809
- [BFCL] Add Amazon Models `nova-pro-v1.0`, `nova-lite-v1.0`, and `nova-micro-v1.0` by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/815
- [BFCL Chore] Revamp `README.md` for Clearer Instructions by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/819
- [BFCL] Update gpt-4o Snapshot Version from 2024-08-06 to 2024-11-20 by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/822
- fix some enum type errors in datasets by @zhangch-ss in https://github.com/ShishirPatil/gorilla/pull/826
- Fix Merge Conflict From [#826] by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/829
- [BFCL Chore] Add Unit Test for Valid Func Doc Format by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/828
- update hammer handler and add Hammer2.1 model by @linqq9 in https://github.com/ShishirPatil/gorilla/pull/832
- [BFCL] Add New Model `Llama-3.3-70B-Instruct`, `Llama-3.3-70B-Instruct-FC` by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/837
- [BFCL] Add `o1-2024-12-17` and `o1-2024-12-17-FC` by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/840
- Add Cohere Command R7B, replace older Command R+ handler by @harry-cohere in https://github.com/ShishirPatil/gorilla/pull/835
- [BFCL Dataset] Ground Truth Error Fix by @Fanjia-Yan in https://github.com/ShishirPatil/gorilla/pull/846
- [BFCL] Add `Qwen2.5-0.5B-Instruct`, `Qwen2.5-3B-Instruct`, `Qwen2.5-14B-Instruct`, `Qwen2.5-32B-Instruct` by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/842
- [BFCL] Add New Model `watt-tool-8B` and `watt-tool-70B` by @zhanghanduo in https://github.com/ShishirPatil/gorilla/pull/847
- [BFCL] Skip Executable Categories When API Keys Missing by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/848
- [BFCL] Add `gemini-2.0-flash-exp-FC`, `gemini-2.0-flash-exp`, `gemini-exp-1206-FC`, `gemini-exp-1206` by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/843
- Check and fix some parameter type errors in possible answers by @zhangch-ss in https://github.com/ShishirPatil/gorilla/pull/838
- [BFCL] Use `N/A` in Score Report for Unevaluated Categories by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/849
- [BFCL] possible answer fix: - reigion ->region by @sghyan16 in https://github.com/ShishirPatil/gorilla/pull/852
- [BFCL] Add Mistral Local Serving Handler and Add New Model `mistralai/Ministral-8B-Instruct-2410` by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/855
- [BFCL] Add New Model `DeepSeek-V3` by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/857
- [BFCL] Rename Directories: `proprietary_model` -> `api_inference`, `oss_model` -> `local_inference` for Better Clarity by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/859
- [BFCL] Support for pre-existing completion endpoint by @ThomasRochefortB in https://github.com/ShishirPatil/gorilla/pull/864
- [BFCL Chore] Ensure Correct Input Format for Eval Checker by @HuanzhiMao in https://github.com/ShishirPatil/gorilla/pull/860
New Contributors
- @devanshamin made their first contribution in https://github.com/ShishirPatil/gorilla/pull/565
- @himanshushukla12 made their first contribution in https://github.com/ShishirPatil/gorilla/pull/605
- @kobe0938 made their first contribution in https://github.com/ShishirPatil/gorilla/pull/615
- @aw632 made their first contribution in https://github.com/ShishirPatil/gorilla/pull/608
- @linqq9 made their first contribution in https://github.com/ShishirPatil/gorilla/pull/627
- @zhangch-ss made their first contribution in https://github.com/ShishirPatil/gorilla/pull/638
- @VishnuSuresh27 made their first contribution in https://github.com/ShishirPatil/gorilla/pull/660
- @AndyChenYH made their first contribution in https://github.com/ShishirPatil/gorilla/pull/640
- @XuHwang made their first contribution in https://github.com/ShishirPatil/gorilla/pull/653
- @leosun12 made their first contribution in https://github.com/ShishirPatil/gorilla/pull/669
- @jieru-hu made their first contribution in https://github.com/ShishirPatil/gorilla/pull/677
- @Cppowboy made their first contribution in https://github.com/ShishirPatil/gorilla/pull/633
- @jacovkim made their first contribution in https://github.com/ShishirPatil/gorilla/pull/696
- @mattf made their first contribution in https://github.com/ShishirPatil/gorilla/pull/621
- @virginie-do made their first contribution in https://github.com/ShishirPatil/gorilla/pull/701
- @hnyls2002 made their first contribution in https://github.com/ShishirPatil/gorilla/pull/587
- @PrathameshSPawar made their first contribution in https://github.com/ShishirPatil/gorilla/pull/713
- @bhargavshirin made their first contribution in https://github.com/ShishirPatil/gorilla/pull/680
- @daanaea made their first contribution in https://github.com/ShishirPatil/gorilla/pull/731
- @ThomasRochefortB made their first contribution in https://github.com/ShishirPatil/gorilla/pull/735
- @RogueTensor made their first contribution in https://github.com/ShishirPatil/gorilla/pull/720
- @gabrielibagon made their first contribution in https://github.com/ShishirPatil/gorilla/pull/760
- @moonlight1431 made their first contribution in https://github.com/ShishirPatil/gorilla/pull/697
- @samjulien made their first contribution in https://github.com/ShishirPatil/gorilla/pull/755
- @vsvaidya27 made their first contribution in https://github.com/ShishirPatil/gorilla/pull/795
- @CShorten made their first contribution in https://github.com/ShishirPatil/gorilla/pull/783
- @amitojsingh2022 made their first contribution in https://github.com/ShishirPatil/gorilla/pull/810
- @zhanghanduo made their first contribution in https://github.com/ShishirPatil/gorilla/pull/847
- @sghyan16 made their first contribution in https://github.com/ShishirPatil/gorilla/pull/852
Full Changelog: https://github.com/ShishirPatil/gorilla/compare/v1.1...v1.2