A simple template module for evaluating user/runtime-unknown value expressions in a safe manner, using Python's 'eval'. (Updated Oct 25, 2018)
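A minimal sketch of the restricted-`eval` idea described above, assuming a whitelist-of-variables approach; the function name, the dunder guard, and the namespace layout are illustrative, not the module's actual API:

```python
# Hypothetical sketch: evaluate a user-supplied expression with builtins
# removed, so it cannot call open(), __import__(), etc. Note this is a
# best-effort guard, not a complete sandbox.
def safe_eval(expr, variables=None):
    """Evaluate an expression against a whitelist of variables."""
    if "__" in expr:  # crude guard against dunder-based escapes
        raise ValueError("double underscores are not allowed")
    namespace = dict(variables or {})
    namespace["__builtins__"] = {}  # strip access to builtins
    return eval(expr, namespace)

print(safe_eval("price * qty + 1", {"price": 2.5, "qty": 4}))  # 11.0
```

Passing `{"__builtins__": {}}` as the globals mapping is the standard way to deny an expression access to Python's built-in functions during `eval`.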
LLM evaluation framework
Activity and Sequence Detection Performance Measures: A package to evaluate activity detection results, including the sequence of events given multiple activity types.
A tool for functional and performance testing of the Dhruva Platform.
The most popular metrics used to evaluate object detection algorithms.
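The building block behind most object detection metrics (precision/recall at a threshold, mAP) is intersection-over-union between a predicted and a ground-truth box. A self-contained sketch, with boxes assumed to be in (x1, y1, x2, y2) corner format:

```python
# Illustrative IoU computation for axis-aligned boxes (x1, y1, x2, y2).
def iou(box_a, box_b):
    """Return intersection-over-union of two boxes, in [0, 1]."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # 0 if no overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7 ≈ 0.142857
```

A detection is typically counted as a true positive when its IoU with a ground-truth box exceeds a threshold such as 0.5.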
This repository contains the code for the NLP project "In-context Learning of Pre-trained Language Models for Controlled Dialogue Summarization: A Holistic Benchmark and Empirical Analysis".
Benchmark for assessing contextual-semantic sentence models in the Brazilian legal domain.
Code, model and data for our paper: K. Tsigos, E. Apostolidis, S. Baxevanakis, S. Papadopoulos, V. Mezaris, "Towards Quantitative Evaluation of Explainable AI Methods for Deepfake Detection", Proc. ACM Int. Workshop on Multimedia AI against Disinformation (MAD’24) at the ACM Int. Conf. on Multimedia Retrieval (ICMR’24), Thailand, June 2024.
CHECKLIST-style test cases and the evaluation of three Hungarian Named Entity Recognition tools.
Flight delay prediction using machine learning.
Integrated Evaluation Framework - Front-End Web Application
Official repository for the paper *Are Models Biased on Text Without Gender-related Language?*, published in ICLR 2024.
Implementation and analysis toolkit for language models across different task types, domains, and reasoning types using multiple prompt styles.
Calculate the calibration of a model on DataSHIELD servers.
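Model calibration measures how well predicted probabilities match observed outcome frequencies. A common summary is expected calibration error (ECE): bin predictions by confidence, then average the per-bin gap between mean confidence and accuracy. A minimal sketch with equal-width bins; this is a standard formulation, not necessarily what the DataSHIELD package implements:

```python
# Illustrative expected calibration error (ECE) with equal-width bins.
def expected_calibration_error(probs, labels, n_bins=10):
    """probs: predicted probabilities in [0, 1]; labels: 0/1 outcomes."""
    total = len(probs)
    ece = 0.0
    for i in range(n_bins):
        lo, hi = i / n_bins, (i + 1) / n_bins
        # Half-open bins (lo, hi], with p == 0 assigned to the first bin.
        in_bin = [(p, y) for p, y in zip(probs, labels)
                  if lo < p <= hi or (i == 0 and p == 0)]
        if not in_bin:
            continue
        conf = sum(p for p, _ in in_bin) / len(in_bin)  # mean confidence
        acc = sum(y for _, y in in_bin) / len(in_bin)   # observed frequency
        ece += len(in_bin) / total * abs(acc - conf)
    return ece
```

An ECE of 0 means every confidence bin's predictions match the observed outcome rate exactly; larger values indicate over- or under-confidence.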
Drug repositioning method evaluation.
N-Compariw: an end-to-end workflow for neural network comparison.
A hybrid search engine based on the BM25 and VSM retrieval models.
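For context on the retrieval side, BM25 ranks a document by summing, over query terms, an IDF weight times a saturating term-frequency factor normalized by document length. A self-contained sketch using the common defaults k1 = 1.5 and b = 0.75; the function and corpus here are illustrative, not this engine's API:

```python
import math
from collections import Counter

# Illustrative BM25 scorer over a corpus of tokenized documents.
def bm25_score(query_terms, doc_terms, corpus, k1=1.5, b=0.75):
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N  # average document length
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)  # document frequency
        if df == 0:
            continue
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)
        f = tf[term]
        # Term-frequency saturation with length normalization.
        score += idf * f * (k1 + 1) / (
            f + k1 * (1 - b + b * len(doc_terms) / avgdl))
    return score

corpus = [["fast", "cat"], ["slow", "dog"]]
print(bm25_score(["cat"], ["fast", "cat"], corpus))  # positive score
```

A hybrid engine would typically combine such lexical scores with vector-space (VSM) cosine similarities, e.g. via a weighted sum of the two rankings.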