AI Testing
SPONSOR TRACK TALK
The Rabbit Hole of Grammar: LLMs on Trial with QA Adventures
Traditional QA pipelines struggle with brittle scripts, dynamic requirements, and the absence of reliable test oracles for AI-generated outputs. The team has developed an automated, LLM-based pipeline for functional testing of web applications. The approach applies metamorphic testing to analyse LLMs, and uses Chomsky’s universal grammar rules for language-agnostic testing of the pipeline in the presence of textual mutations. The results show that this approach provides a theoretical backbone for checking semantic fidelity in natural languages, while Gherkin offers a structured, machine-readable counterpart for robust validation.
In this talk, Jalpa Soni will demonstrate an end-to-end LLM pipeline, covering:
1. An explanation of the universal grammar rules used to validate the cross-lingual performance of LLMs in language translation.
2. Validation checks with controlled textual mutations based on Universal Grammar for natural languages (English > Spanish).
3. The effect of the length of the text to be translated on the robustness of LLMs.
4. A demonstration of the limits of realistic text mutations by humans, and how LLMs can outperform even in extreme cases of text mutation.
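To make the idea behind items 1 and 2 concrete, a metamorphic test for translation can be sketched as follows. This is a minimal illustration, not the speaker's actual pipeline: `translate` is a hypothetical stub standing in for an LLM translation call, and the metamorphic relation (a meaning-preserving mutation of the source sentence should yield a translation that still contains the translation of the original core sentence) is one simple example of the kind of check Universal Grammar motivates.

```python
# Minimal sketch of a metamorphic test for English -> Spanish translation.
# `translate` is a hypothetical stub; in a real pipeline it would call an LLM.

def translate(text: str) -> str:
    """Hypothetical English->Spanish translator (lookup stub for illustration)."""
    table = {
        "the cat sleeps": "el gato duerme",
        "the cat sleeps quietly": "el gato duerme tranquilamente",
    }
    return table.get(text.lower(), "")

def mutate(text: str) -> str:
    """A controlled, meaning-preserving textual mutation:
    append an adverbial modifier to the sentence."""
    return text + " quietly"

def metamorphic_check(source: str) -> bool:
    """Metamorphic relation: translating the mutated sentence should
    preserve the translation of the original core sentence, so no
    explicit 'correct answer' oracle is needed."""
    base = translate(source)
    mutated = translate(mutate(source))
    return bool(base) and base in mutated

print(metamorphic_check("the cat sleeps"))  # expect True under this relation
```

The key point of the pattern is that the test oracle is the *relation* between the two outputs, not a ground-truth translation, which is what makes it usable for LLM outputs that lack reliable reference answers.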
Jalpa Soni
Jalpa Soni is a Senior Data Scientist at the AI Innovation Lab at Capitole, where she leads the research components of multiple projects. She has a background in physics and extensive experience applying data science across a wide range of applications in different sectors.