WiMarka Documentation

WiMarka is a Python library and CLI tool for evaluating machine translations of Philippine languages. It combines syntactic and semantic analysis with interpretable output, so each score comes with an explanation of the errors behind it.

License: MIT

Overview

WiMarka addresses the critical need for accurate machine translation evaluation in Philippine languages. It goes beyond simple metrics by providing:

  • Error Detection: Identifies specific translation errors between source and target texts

  • Multi-dimensional Scoring: Evaluates translations across fluency, adequacy, and overall quality

  • Explainability: Generates human-readable explanations for detected errors

  • Correction Suggestions: Provides corrected translation alternatives

  • Philippine Language Focus: Specialized support for Cebuano (CEB), Ilocano (ILO), and Tagalog (TGT)

Key Features

Advanced Error Detection

Sophisticated algorithms identify translation inconsistencies and errors

📊 Multi-dimensional Scoring
  • Fluency Score: Measures how natural the translation reads

  • Adequacy Score: Evaluates semantic completeness and accuracy

  • Overall Quality Score: Comprehensive translation quality assessment

💡 Explainable Results

Detailed explanations for each detected error

🔧 Correction Suggestions

AI-powered suggestions for improving translations

🖥️ Dual Interface

Both Python library and CLI for flexible integration

🌏 Philippine Language Support

Specialized models for CEB, ILO, and TGT

Quick Start

Installation

pip install git+https://github.com/wimarka-uic/WiMarka.git

Basic Usage

Python Library:

from wimarka.main import wmk_eval

wmk_eval(
    src_file_path='source_file.txt',
    src_lang='EN',
    tgt_file_path='target_file.txt',
    tgt_lang='CEB'
)
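Before calling wmk_eval, it can help to sanity-check the inputs. The sketch below is not part of WiMarka: check_inputs and run_checked are hypothetical helpers, the set of target codes is copied from the language list in this documentation, and the "one segment per line, aligned across both files" layout is an assumption about the expected file format, not something the library documents here.

```python
from pathlib import Path

# Target language codes as listed in this documentation. Whether the
# library accepts other codes is not stated, so treat this set as an assumption.
DOCUMENTED_TARGET_LANGS = {"CEB", "ILO", "TGT"}


def check_inputs(src_file_path, tgt_file_path, tgt_lang):
    """Return a list of problems found before evaluation (empty if none)."""
    problems = []
    if tgt_lang.upper() not in DOCUMENTED_TARGET_LANGS:
        problems.append(f"unrecognized target language code: {tgt_lang}")

    paths = [Path(src_file_path), Path(tgt_file_path)]
    missing = [p for p in paths if not p.is_file()]
    for p in missing:
        problems.append(f"file not found: {p}")

    if not missing:
        # Assumption: one segment per line, source and target aligned by line.
        n_src, n_tgt = (
            len(p.read_text(encoding="utf-8").splitlines()) for p in paths
        )
        if n_src != n_tgt:
            problems.append(
                f"line count mismatch: {n_src} source vs {n_tgt} target"
            )
    return problems


def run_checked(src_file_path, src_lang, tgt_file_path, tgt_lang):
    """Call wmk_eval only if the hypothetical pre-checks pass."""
    problems = check_inputs(src_file_path, tgt_file_path, tgt_lang)
    if problems:
        raise ValueError("; ".join(problems))
    # Imported lazily so the checks above run even without WiMarka installed.
    from wimarka.main import wmk_eval
    return wmk_eval(
        src_file_path=str(src_file_path),
        src_lang=src_lang,
        tgt_file_path=str(tgt_file_path),
        tgt_lang=tgt_lang,
    )
```

Calling run_checked('source_file.txt', 'EN', 'target_file.txt', 'CEB') would then behave like the example above, but raise a ValueError early if the files are missing or misaligned.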

Command Line:

wimarka --src_file_path source_file.txt \
        --src_lang EN \
        --tgt_file_path target_file.txt \
        --tgt_lang CEB
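To evaluate several file pairs in one go, the CLI can be driven from a short script. This is a minimal sketch under stated assumptions: it uses only the four flags shown above, build_wimarka_command and run_batch are hypothetical helper names, and the CLI's exit codes and output format are not documented here, so the script simply collects whatever return code each run produces.

```python
import subprocess


def build_wimarka_command(src_file_path, src_lang, tgt_file_path, tgt_lang):
    """Build the argument list for the CLI invocation shown above."""
    return [
        "wimarka",
        "--src_file_path", str(src_file_path),
        "--src_lang", src_lang,
        "--tgt_file_path", str(tgt_file_path),
        "--tgt_lang", tgt_lang,
    ]


def run_batch(jobs):
    """Run one evaluation per (src_path, src_lang, tgt_path, tgt_lang) tuple.

    Returns the process return code of each run, in order.
    """
    return_codes = []
    for job in jobs:
        # check=False: keep going even if one evaluation fails.
        completed = subprocess.run(build_wimarka_command(*job), check=False)
        return_codes.append(completed.returncode)
    return return_codes
```

For example, run_batch([('source_file.txt', 'EN', 'ceb_output.txt', 'CEB'), ('source_file.txt', 'EN', 'ilo_output.txt', 'ILO')]) would evaluate two translations of the same source; the output file names are placeholders. Passing the arguments as a list avoids shell quoting issues with paths that contain spaces.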

Documentation Structure

This documentation is organized into two main sections:

User Manual

Complete guide for using WiMarka, including installation, usage examples, and best practices.

Technical Manual

In-depth technical documentation covering architecture, API reference, and development guidelines.

Support & Contributing

For questions, issues, or suggestions:

We welcome contributions! See the Development Guide for details.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Citation

If you use WiMarka in your research, please cite:

@software{wimarka2025,
  title={WiMarka: A Reference-free Evaluation Metric for Machine Translation of Philippine Languages},
  author={University of the Immaculate Conception},
  year={2025},
  url={https://github.com/wimarka-uic/WiMarka}
}
