Extending WiMarka

This guide covers extending WiMarka with new features, languages, and custom evaluation logic.

Adding New Languages

To add support for a new Philippine language:

Step 1: Update Language Codes

Add the new language code to SUPPORTED_LANGUAGES in config.py:

SUPPORTED_LANGUAGES = ['EN', 'CEB', 'ILO', 'TGT', 'HIL']  # Add HIL
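
Once the code is in the list, any check against SUPPORTED_LANGUAGES will accept it. A minimal sketch of such a check follows; validate_language is a hypothetical helper written for illustration, not a documented WiMarka function, and the list is copied inline so the example runs on its own.

```python
# Hypothetical sketch: validate_language is illustrative, not a WiMarka API.
SUPPORTED_LANGUAGES = ['EN', 'CEB', 'ILO', 'TGT', 'HIL']


def validate_language(code: str) -> str:
    """Return the normalized language code, or raise ValueError if unsupported."""
    normalized = code.strip().upper()
    if normalized not in SUPPORTED_LANGUAGES:
        supported = ', '.join(SUPPORTED_LANGUAGES)
        raise ValueError(f"Unsupported language '{code}'. Supported: {supported}")
    return normalized


if __name__ == '__main__':
    print(validate_language('hil'))  # prints: HIL
```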

Step 2: Update Helper Functions

Modify utils/helper.py to recognize the new language:

LANGUAGE_NAMES = {
    'EN': 'English',
    'CEB': 'Cebuano',
    'ILO': 'Ilocano',
    'TGT': 'Tagalog',
    'HIL': 'Hiligaynon'  # Add new language
}
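
Because the code list and the name mapping live in different files, they can drift apart. The sketch below shows one way to catch that; missing_language_names and get_language_name are hypothetical helpers, and both structures are copied inline for the sake of a runnable example.

```python
# Hypothetical sketch: these helpers are illustrative, not part of utils/helper.py.
from typing import Dict, List

SUPPORTED_LANGUAGES = ['EN', 'CEB', 'ILO', 'TGT', 'HIL']
LANGUAGE_NAMES = {
    'EN': 'English',
    'CEB': 'Cebuano',
    'ILO': 'Ilocano',
    'TGT': 'Tagalog',
    'HIL': 'Hiligaynon',
}


def missing_language_names(codes: List[str], names: Dict[str, str]) -> List[str]:
    """Return the supported codes that have no display name yet."""
    return [code for code in codes if code not in names]


def get_language_name(code: str) -> str:
    """Return the display name, falling back to the upper-cased code itself."""
    key = code.upper()
    return LANGUAGE_NAMES.get(key, key)
```

An empty list from missing_language_names(SUPPORTED_LANGUAGES, LANGUAGE_NAMES) means Steps 1 and 2 agree.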

Step 3: Add Language-Specific Models

If you use language-specific models, register the model for the new language in model.py:

LANGUAGE_MODELS = {
    'CEB': 'model-id-for-cebuano',
    'ILO': 'model-id-for-ilocano',
    'TGT': 'model-id-for-tagalog',
    'HIL': 'model-id-for-hiligaynon'  # Add new model
}
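
How the mapping is consumed is not shown here. As a sketch under that caveat, a lookup with an optional fallback could look like this; resolve_model and DEFAULT_MODEL are hypothetical names, and the model IDs are the same placeholders used above.

```python
# Hypothetical sketch: resolve_model and DEFAULT_MODEL are illustrative names.
from typing import Dict, Optional

LANGUAGE_MODELS: Dict[str, str] = {
    'CEB': 'model-id-for-cebuano',
    'ILO': 'model-id-for-ilocano',
    'TGT': 'model-id-for-tagalog',
    'HIL': 'model-id-for-hiligaynon',
}
DEFAULT_MODEL = 'model-id-default'  # placeholder, not a real model ID


def resolve_model(lang: str, default: Optional[str] = DEFAULT_MODEL) -> str:
    """Return the model ID for a language, or the default if none is registered."""
    model_id = LANGUAGE_MODELS.get(lang.upper(), default)
    if model_id is None:
        raise KeyError(f"No model configured for language '{lang}'")
    return model_id
```

Passing default=None turns a missing entry into an error, which is useful while testing that the new language was registered.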

Step 4: Test

Create a source file and a target file for the new language, then run an evaluation on them:

wimarka --src_file_path test_en.txt \
        --src_lang EN \
        --tgt_file_path test_hil.txt \
        --tgt_lang HIL
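
The test files can be generated with a short script. The sketch below assumes the files are line-aligned, one sentence per line (suggested by the src_line and tgt_line names used later, but not confirmed here); write_parallel_files and count_lines are hypothetical helpers, and the sentence pairs are sample data only.

```python
# Hypothetical sketch: assumes one sentence per line, aligned across both files.
import tempfile
from pathlib import Path
from typing import List, Tuple


def write_parallel_files(pairs: List[Tuple[str, str]], directory: Path) -> Tuple[Path, Path]:
    """Write source sentences and target sentences to two line-aligned files."""
    src_path = directory / 'test_en.txt'
    tgt_path = directory / 'test_hil.txt'
    src_path.write_text('\n'.join(src for src, _ in pairs) + '\n', encoding='utf-8')
    tgt_path.write_text('\n'.join(tgt for _, tgt in pairs) + '\n', encoding='utf-8')
    return src_path, tgt_path


def count_lines(path: Path) -> int:
    """Count the lines in a UTF-8 text file."""
    return len(path.read_text(encoding='utf-8').splitlines())


if __name__ == '__main__':
    sample_pairs = [('Good morning.', 'Maayong aga.'), ('Thank you.', 'Salamat.')]
    with tempfile.TemporaryDirectory() as tmp:
        src, tgt = write_parallel_files(sample_pairs, Path(tmp))
        # A mismatch here would mean the two files are no longer aligned.
        assert count_lines(src) == count_lines(tgt)
```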

Adding Custom Tasks

To add a new evaluation task:

Step 1: Create Task Module

Create wimarka/tasks/sentiment_analysis.py:

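def analyze_sentiment(src_line: str, tgt_line: str) -> str:
    """Analyze sentiment preservation in translation."""
    # Implementation: build the report from src_line and tgt_line
    sentiment_report = ""  # placeholder until the analysis is implemented
    return sentiment_report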
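
One way to fill in the implementation is to classify each line and compare the labels. The following is a toy sketch only: toy_classify, its tiny word lists, and the report format are all assumptions made for illustration, and a real task would plug in a proper sentiment model through the classify parameter.

```python
# Toy sketch: the lexicon and report format are illustrative assumptions.
from typing import Callable

POSITIVE = {'good', 'great', 'maayo'}
NEGATIVE = {'bad', 'terrible', 'malain'}


def toy_classify(text: str) -> str:
    """Label text by counting matches against two tiny word lists."""
    words = {word.strip('.,!?').lower() for word in text.split()}
    positive_hits = len(words & POSITIVE)
    negative_hits = len(words & NEGATIVE)
    if positive_hits > negative_hits:
        return 'positive'
    if negative_hits > positive_hits:
        return 'negative'
    return 'neutral'


def analyze_sentiment(src_line: str, tgt_line: str,
                      classify: Callable[[str], str] = toy_classify) -> str:
    """Report whether the target line keeps the sentiment label of the source."""
    src_label = classify(src_line)
    tgt_label = classify(tgt_line)
    status = 'preserved' if src_label == tgt_label else 'shifted'
    return f'{status}: source={src_label}, target={tgt_label}'
```

Keeping the classifier as a parameter leaves the two-argument call used in the pipeline unchanged while allowing a stronger model to be swapped in.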

Step 2: Integrate into Pipeline

Modify main.py:

from wimarka.tasks import sentiment_analysis

# In wmk_eval function
sentiment = sentiment_analysis.analyze_sentiment(src_line, tgt_line)
results['sentiment'].append(sentiment)

Step 3: Update Results Structure

Add the new field to the results dictionary so that the append in Step 2 has a list to write to:

results = {
    'source': [],
    'target': [],
    # ... existing fields ...
    'sentiment': []  # New field
}
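
Since results appears to be a dictionary of parallel lists, a new field has to stay the same length as the others. A minimal sketch of that invariant, assuming the structure above; add_field and is_aligned are hypothetical helpers, not WiMarka functions.

```python
# Hypothetical sketch: assumes results is a dict of equally long lists.
from typing import Any, Dict, List


def add_field(results: Dict[str, List[Any]], name: str) -> None:
    """Add a list field, back-filled with None for rows that already exist."""
    existing_rows = len(results['source'])
    # setdefault leaves the field untouched if it is already present.
    results.setdefault(name, [None] * existing_rows)


def is_aligned(results: Dict[str, List[Any]]) -> bool:
    """Return True when every field holds the same number of rows."""
    return len({len(values) for values in results.values()}) <= 1


if __name__ == '__main__':
    results = {'source': ['Good morning.'], 'target': ['Maayong aga.']}
    add_field(results, 'sentiment')
    assert is_aligned(results)
```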

Custom Scoring Algorithms

To implement custom scoring:

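from typing import List, Tuple


def custom_scoring(src: str, tgt: str, errors: List[str]) -> Tuple[float, float, float]:
    """Custom scoring logic."""
    # Your algorithm: calculate_custom_fluency and calculate_custom_adequacy
    # are placeholders that you define yourself.
    fluency = calculate_custom_fluency(src, tgt)
    adequacy = calculate_custom_adequacy(src, tgt, errors)
    overall = (fluency + adequacy) / 2

    return fluency, adequacy, overall

# Replace in pipeline. The patch only affects code that calls
# scoring.scoring(...) through the module; a caller that imported the
# function by name keeps the original.
from wimarka.tasks import scoring
scoring.scoring = custom_scoring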
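
The two helper functions are left for you to define. As a sketch only, assuming scores on a 0 to 1 scale (the scale WiMarka uses is not stated here), they could be simple heuristics like the ones below. The length-ratio and per-error penalty rules are toy stand-ins, not WiMarka's scoring method.

```python
# Toy sketch: both heuristics and the 0-to-1 scale are illustrative assumptions.
from typing import List, Tuple


def calculate_custom_fluency(src: str, tgt: str) -> float:
    """Toy proxy: ratio of the shorter token count to the longer one."""
    src_len, tgt_len = len(src.split()), len(tgt.split())
    if src_len == 0 or tgt_len == 0:
        return 0.0
    return min(src_len, tgt_len) / max(src_len, tgt_len)


def calculate_custom_adequacy(src: str, tgt: str, errors: List[str],
                              penalty: float = 0.25) -> float:
    """Toy proxy: start at 1.0 and subtract a fixed penalty per reported error."""
    return max(0.0, 1.0 - penalty * len(errors))


def custom_scoring(src: str, tgt: str, errors: List[str]) -> Tuple[float, float, float]:
    """Combine the two toy scores into (fluency, adequacy, overall)."""
    fluency = calculate_custom_fluency(src, tgt)
    adequacy = calculate_custom_adequacy(src, tgt, errors)
    overall = (fluency + adequacy) / 2
    return fluency, adequacy, overall


if __name__ == '__main__':
    print(custom_scoring('a b c d', 'x y', ['mistranslation']))  # prints: (0.5, 0.75, 0.625)
```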

Creating Plugins

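To package an extension for distribution, expose a registration function from the plugin package:

# wimarka_plugin_myfeature/__init__.py

def register_plugin():
    """Register plugin with WiMarka."""
    from wimarka import main
    # my_task_function is the task callable defined in this plugin package
    main.register_task('myfeature', my_task_function)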

Alternative Interfaces

Web API Example

from flask import Flask, request, jsonify
from wimarka.main import wmk_eval, results

app = Flask(__name__)

@app.route('/evaluate', methods=['POST'])
def evaluate():
    data = request.json  # parsed JSON body of the POST request
    # Run wmk_eval on the submitted data here, then return the populated results
    return jsonify(results)
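
Before handing the request body to the evaluation, the handler would normally check it. A minimal sketch of that check, kept free of Flask so it runs on its own; validate_payload and the field names are hypothetical, chosen to mirror the command-line flags rather than taken from a documented API.

```python
# Hypothetical sketch: field names mirror the CLI flags and are assumptions.
from typing import Any, List

REQUIRED_FIELDS = ('src_lang', 'tgt_lang', 'src_lines', 'tgt_lines')


def validate_payload(data: Any) -> List[str]:
    """Return a list of problems; an empty list means the payload is usable."""
    if not isinstance(data, dict):
        return ['request body must be a JSON object']
    problems = [f'missing field: {name}' for name in REQUIRED_FIELDS if name not in data]
    if problems:
        return problems
    src_lines, tgt_lines = data['src_lines'], data['tgt_lines']
    if not isinstance(src_lines, list) or not isinstance(tgt_lines, list):
        problems.append('src_lines and tgt_lines must be lists of strings')
    elif len(src_lines) != len(tgt_lines):
        problems.append('src_lines and tgt_lines must have the same length')
    return problems
```

In the route, a non-empty list could be returned to the client with a 400 status instead of running the evaluation.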

GUI Example

import tkinter as tk
from wimarka.main import wmk_eval

# Create GUI application
# ...

Best Practices

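Modular Design: Keep each extension in its own module so it can be added or removed without changing core code
Documentation: Document every custom task, result field, and score you add
Testing: Test each extension on sample source and target files before relying on its results
Compatibility: Preserve existing result fields and function signatures so that existing callers keep working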