A simple template module for evaluating user/runtime-unknown value expressions in a safe manner, using Python's 'eval'. (Updated Oct 25, 2018)
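A minimal sketch of the restricted-`eval` idea described above, assuming a whitelist-of-variables approach; the function name, the dunder guard, and the namespace layout are illustrative, not the module's actual API:

```python
# Hypothetical sketch: evaluate a user-supplied expression with builtins
# removed, so it cannot call open(), __import__(), etc. Note this is a
# best-effort guard, not a complete sandbox.
def safe_eval(expr, variables=None):
    """Evaluate an expression against a whitelist of variables."""
    if "__" in expr:  # crude guard against dunder-based escapes
        raise ValueError("double underscores are not allowed")
    namespace = dict(variables or {})
    namespace["__builtins__"] = {}  # strip access to builtins
    return eval(expr, namespace)

print(safe_eval("price * qty + 1", {"price": 2.5, "qty": 4}))  # 11.0
```

Passing `{"__builtins__": {}}` as the globals mapping is the standard way to deny an expression access to Python's built-in functions during `eval`.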
LLM evaluation framework
Activity and Sequence Detection Performance Measures: A package to evaluate activity detection results, including the sequence of events given multiple activity types.
A tool for functional and performance testing of the Dhruva Platform.
The most popular metrics used to evaluate object detection algorithms.
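The building block behind most object detection metrics (precision/recall at a threshold, mAP) is intersection-over-union between a predicted and a ground-truth box. A self-contained sketch, with boxes assumed to be in (x1, y1, x2, y2) corner format:

```python
# Illustrative IoU computation for axis-aligned boxes (x1, y1, x2, y2).
def iou(box_a, box_b):
    """Return intersection-over-union of two boxes, in [0, 1]."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # 0 if no overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7 ≈ 0.142857
```

A detection is typically counted as a true positive when its IoU with a ground-truth box exceeds a threshold such as 0.5.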
This repository contains the code for the NLP project "In-context Learning of Pre-trained Language Models for Controlled Dialogue Summarization: A Holistic Benchmark and Empirical Analysis".
Benchmark for assessing contextual-semantic sentence models in the Brazilian legal domain.
Code, model and data for our paper: K. Tsigos, E. Apostolidis, S. Baxevanakis, S. Papadopoulos, V. Mezaris, "Towards Quantitative Evaluation of Explainable AI Methods for Deepfake Detection", Proc. ACM Int. Workshop on Multimedia AI against Disinformation (MAD’24) at the ACM Int. Conf. on Multimedia Retrieval (ICMR’24), Thailand, June 2024.
CHECKLIST-style test cases and the evaluation of three Hungarian Named Entity Recognition tools.
Flight delay prediction using machine learning.
Integrated Evaluation Framework - Front-End Web Application
Official repository for the paper *Are Models Biased on Text Without Gender-related Language?*, published in ICLR 2024.
Implementation and analysis toolkit for language models across different task types, domains, and reasoning types using multiple prompt styles.
Calculate the calibration of a model on DataSHIELD servers.
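Model calibration measures how well predicted probabilities match observed outcome frequencies. A common summary is expected calibration error (ECE): bin predictions by confidence, then average the per-bin gap between mean confidence and accuracy. A minimal sketch with equal-width bins; this is a standard formulation, not necessarily what the DataSHIELD package implements:

```python
# Illustrative expected calibration error (ECE) with equal-width bins.
def expected_calibration_error(probs, labels, n_bins=10):
    """probs: predicted probabilities in [0, 1]; labels: 0/1 outcomes."""
    total = len(probs)
    ece = 0.0
    for i in range(n_bins):
        lo, hi = i / n_bins, (i + 1) / n_bins
        # Half-open bins (lo, hi], with p == 0 assigned to the first bin.
        in_bin = [(p, y) for p, y in zip(probs, labels)
                  if lo < p <= hi or (i == 0 and p == 0)]
        if not in_bin:
            continue
        conf = sum(p for p, _ in in_bin) / len(in_bin)  # mean confidence
        acc = sum(y for _, y in in_bin) / len(in_bin)   # observed frequency
        ece += len(in_bin) / total * abs(acc - conf)
    return ece
```

An ECE of 0 means every confidence bin's predictions match the observed outcome rate exactly; larger values indicate over- or under-confidence.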
Drug repositioning method evaluation.
N-Compariw: an end-to-end workflow for neural network comparison.
A hybrid search engine based on the BM25 and VSM retrieval models.
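For context on the retrieval side, BM25 ranks a document by summing, over query terms, an IDF weight times a saturating term-frequency factor normalized by document length. A self-contained sketch using the common defaults k1 = 1.5 and b = 0.75; the function and corpus here are illustrative, not this engine's API:

```python
import math
from collections import Counter

# Illustrative BM25 scorer over a corpus of tokenized documents.
def bm25_score(query_terms, doc_terms, corpus, k1=1.5, b=0.75):
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N  # average document length
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)  # document frequency
        if df == 0:
            continue
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)
        f = tf[term]
        # Term-frequency saturation with length normalization.
        score += idf * f * (k1 + 1) / (
            f + k1 * (1 - b + b * len(doc_terms) / avgdl))
    return score

corpus = [["fast", "cat"], ["slow", "dog"]]
print(bm25_score(["cat"], ["fast", "cat"], corpus))  # positive score
```

A hybrid engine would typically combine such lexical scores with vector-space (VSM) cosine similarities, e.g. via a weighted sum of the two rankings.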