MLSP2024/MLSP_LS_LLM_Baseline

MLSP_LS_LLM_Baseline

A lexical simplification (LS) baseline for the Multilingual Lexical Simplification Pipeline (MLSP) 2024 Shared Task, based on zero-shot prompting of a large language model. We use the chat-finetuned Llama 2 70B model with 4-bit quantisation. Generation uses the following zero-shot prompt template with a temperature of 0.3 and a maximum of 256 new tokens.

Context: {context}
Question: Given the above context, list ten alternative {lang_space}words for "{word}" that are easier to understand. List only the words without translations, transcriptions or explanations.
Answer:

To construct the prompt, the placeholders in curly braces are replaced by the context, the language of the instance, and the target word to be simplified. The placeholder {lang_space} stands for the language name followed by a space; for English, it is left empty, so both the language name and the subsequent space are omitted. The prompt is identical to a zero-shot prompt employed for lexical simplification using a ChatGPT model by Aumiller and Gertz (2022), except for the sentence “List only the words without translations, transcriptions or explanations.”, which we added to reduce unnecessary translations into English, transcriptions into the Latin alphabet, and explanations. Such extra output was generated frequently when we applied the original prompt to the trial data. Adding the sentence results in both faster inference and higher accuracy.
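The placeholder substitution described above can be sketched as follows. This is a minimal illustration rather than the repository's actual code: the names `PROMPT_TEMPLATE` and `build_prompt` are hypothetical, and the assumption that the language is passed as an English language name (e.g. "Spanish") is ours.

```python
# Template text copied from this README; all names here are illustrative.
PROMPT_TEMPLATE = (
    "Context: {context}\n"
    "Question: Given the above context, list ten alternative {lang_space}words "
    'for "{word}" that are easier to understand. List only the words without '
    "translations, transcriptions or explanations.\n"
    "Answer:"
)


def build_prompt(context: str, word: str, language: str) -> str:
    """Fill the zero-shot template for one instance."""
    # Assumption: `language` is an English language name such as "Spanish".
    # For English the placeholder is left empty, which drops both the
    # language name and the space that would follow it.
    lang_space = "" if language.lower() == "english" else f"{language} "
    return PROMPT_TEMPLATE.format(context=context, lang_space=lang_space, word=word)


if __name__ == "__main__":
    print(build_prompt("Attendance is compulsory.", "compulsory", "English"))
```

Keeping the trailing space inside `lang_space` means the template needs no conditional logic of its own, and the English prompt contains no doubled space.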

Our postprocessing also builds on the work by Aumiller and Gertz (2022). Based on an examination of outputs on the trial data, we made minor changes to account for a broader array of languages and scripts as well as a different model. For instance, we allow words to be separated by ideographic commas (、), which are commonly used in Japanese, and we accept lists enumerated with letters (e.g. a), b), …), which occurred in Llama 2 output.
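To illustrate the kind of parsing this involves, here is a simplified sketch that splits model output on commas, ideographic commas, semicolons and newlines, and strips numeric or letter enumerators. It is not the repository's postprocessing: the function name `parse_candidates`, the regular expressions, and the deduplication and truncation behaviour are assumptions made for the example.

```python
import re
from typing import List

# Leading list markers such as "1.", "2)", "a)", "-", "*" or "•".
_ENUM_PREFIX = re.compile(r"^\s*(?:\d+[.)]|[A-Za-z]\)|[-*•])\s*")
# Candidate separators: comma, ideographic comma, semicolon, newline.
_SEPARATORS = re.compile(r"[,、;\n]")


def parse_candidates(output: str, max_candidates: int = 10) -> List[str]:
    """Extract a deduplicated, ordered list of candidate words."""
    candidates: List[str] = []
    for chunk in _SEPARATORS.split(output):
        # Remove the enumerator first, then surrounding quotes and full stops.
        item = _ENUM_PREFIX.sub("", chunk).strip().strip("\"'.").strip()
        if item and item not in candidates:
            candidates.append(item)
    return candidates[:max_candidates]


if __name__ == "__main__":
    print(parse_candidates("a) easy\nb) simple\nc) plain"))
    print(parse_candidates("簡単、容易、平易"))
```

A real implementation would likely need further rules, for example for outputs that still contain explanations in parentheses despite the prompt.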

Reproducing the baseline

Note that the output of the baseline is already included in the repository. You can reproduce it by following the steps below.

  1. Initialise and fetch the Git submodule for MLSP_Data:

    git submodule init && git submodule update

  2. Install the requirements:

    python -m pip install -r requirements.txt

  3. Run the baseline:

    bash experiments.sh
