Formal verification of the properties of coreferent resolution model based on decision trees

The paper examines the problem of coreference resolution, which involves identifying objects - words or phrases in a text, that refer to the same real or imaginary entity. The solution of this task is explored for Ukrainianlanguage texts using decision trees, which autonomously structure themselves...

Full description

Saved in:
Bibliographic Details
Date:2024
Main Authors: Pogorilyy, S.D., Slynko, M. S., Biletskyi, P.V.
Format: Article
Language:Ukrainian
Published: Інститут програмних систем НАН України 2024
Subjects:
Online Access:https://pp.isofts.kiev.ua/index.php/ojs1/article/view/652
Tags: Add Tag
No Tags, Be the first to tag this record!
Journal Title:Problems in programming

Institution

Problems in programming
Description
Summary:The paper examines the problem of coreference resolution, which involves identifying objects - words or phrases in a text, that refer to the same real or imaginary entity. The solution of this task is explored for Ukrainianlanguage texts using decision trees, which autonomously structure themselves based on training data. Unlike other machine learning algorithms such as neural networks, decision trees allow for analysis of their internal structure through graphical representation. This feature facilitates explaining individual results produced by the tree, significantly easing formal verification of their properties. To create decision trees, vector representations of words (such as Elmo) and other linguistic features are used. After formation, decision trees are employed for binary classification of input pairs potentially referring to the same coreferent objects. Based on the obtained binary classifier, coreferent objects are grouped into clusters, followed by an evaluation of the clustering accuracy using specialized metrics. The paper provides a detailed description of the implemented application and the structure of the formed decision tree, which serves as the basis for further analysis. Additionally, the use of transition systems is proposed to construct a high-level specification model for coreference resolution. The transition system-based model enables analysis of application behavior on infinite state sequences, ensuring errorfree execution. Formalization is carried out, and automata models along with linear-temporal logic are used to verify a set of properties of the obtained specification. Büchi automata are created to accept words confirming the properties, and examples as well as counterexamples of the analyzed properties are found. The method defined in the paper serves as the foundation for creating automated analyzers for coreference resolution applications based on decision trees.Prombles in programming 2024; 2-3: 319-325