API Reference
=============

This page provides complete API documentation for WiMarka's public and internal interfaces.

Main Module (``wimarka.main``)
-------------------------------

.. automodule:: wimarka.main
   :members:
   :undoc-members:
   :show-inheritance:

wmk_eval
~~~~~~~~

.. autofunction:: wimarka.main.wmk_eval

Main evaluation function that orchestrates the entire pipeline.

**Function Signature**:

.. code-block:: python

   def wmk_eval(src_file_path: str, src_lang: str,
                tgt_file_path: str, tgt_lang: str) -> None

**Parameters**:

* ``src_file_path`` (str): Absolute or relative path to source text file
* ``src_lang`` (str): Source language code (EN, CEB, ILO, TGT)
* ``tgt_file_path`` (str): Absolute or relative path to target translation file
* ``tgt_lang`` (str): Target language code (CEB, ILO, TGT)

**Returns**: None (results stored in global ``results`` dictionary and printed)

**Raises**:

* ``ValueError``: If source and target files have different line counts
* ``FileNotFoundError``: If input files don't exist

**Example**:

.. code-block:: python

   from wimarka.main import wmk_eval

   wmk_eval(
       src_file_path='data/english.txt',
       src_lang='EN',
       tgt_file_path='data/cebuano.txt',
       tgt_lang='CEB'
   )

results Dictionary
~~~~~~~~~~~~~~~~~~

Global dictionary storing evaluation results.

**Structure**:

.. code-block:: python

   results = {
       'source': List[str],                # Source sentences (with tags)
       'target': List[str],                # Target sentences (with tags)
       'errors': List[List[str]],          # Detected errors per sentence
       'fluency_score': List[float],       # Fluency scores (0-100)
       'adequacy_score': List[float],      # Adequacy scores (0-100)
       'overall_score': List[float],       # Overall scores (0-100)
       'explanation': List[str],           # Human-readable explanations
       'corrected_translation': List[str]  # Suggested corrections
   }

**Access Pattern**:

.. code-block:: python

   from wimarka.main import wmk_eval, results

   wmk_eval('src.txt', 'EN', 'tgt.txt', 'CEB')

   # Access results
   for i in range(len(results['source'])):
       print(f"Score: {results['overall_score'][i]}")

CLI Module (``wimarka.cli``)
-----------------------------

.. automodule:: wimarka.cli
   :members:
   :undoc-members:

main
~~~~

.. code-block:: python

   @click.command()
   @click.option('--src_file_path', required=True, help='Path to source text file')
   @click.option('--src_lang', required=True, help='Source language code')
   @click.option('--tgt_file_path', required=True, help='Path to target text file')
   @click.option('--tgt_lang', required=True, help='Target language code')
   def main(src_file_path, src_lang, tgt_file_path, tgt_lang):
       """Command-line interface for WiMarka evaluation."""

Entry point for CLI execution. Wraps ``wmk_eval()`` with Click decorators.

Task Modules
------------

error_detection Module
~~~~~~~~~~~~~~~~~~~~~~

.. automodule:: wimarka.tasks.error_detection
   :members:
   :undoc-members:

**Main Function**:

.. code-block:: python

   def error_detection(src_line: str, tgt_line: str) -> List[str]

Detects translation errors between source and target sentences.

**Parameters**:

* ``src_line``: Source sentence with language tag
* ``tgt_line``: Target sentence with language tag

**Returns**: List of detected error descriptions

**Error Types**:

* Lexical errors (wrong word choice)
* Syntactic errors (grammar issues)
* Semantic errors (meaning loss)
* Morphological errors (wrong affixes)
* Omissions/additions

scoring Module
~~~~~~~~~~~~~~

.. automodule:: wimarka.tasks.scoring
   :members:
   :undoc-members:

**Main Function**:

.. code-block:: python

   def scoring(src_line: str, tgt_line: str,
               errors: List[str]) -> Tuple[float, float, float]

Calculates quality scores for the translation.

**Parameters**:

* ``src_line``: Source sentence with tag
* ``tgt_line``: Target sentence with tag
* ``errors``: List of detected errors from error_detection

**Returns**: Tuple of (fluency_score, adequacy_score, overall_score)

* All scores are floats in range [0, 100]
* Overall score = (fluency + adequacy) / 2

explanation Module
~~~~~~~~~~~~~~~~~~

.. automodule:: wimarka.tasks.explanation
   :members:
   :undoc-members:

**Main Function**:

.. code-block:: python

   def generate_explanation(src_line: str, tgt_line: str, errors: List[str],
                            fluency: float, adequacy: float,
                            overall: float) -> str

Generates human-readable explanation of the evaluation.

**Parameters**:

* ``src_line``: Source sentence
* ``tgt_line``: Target sentence
* ``errors``: Detected errors
* ``fluency``: Fluency score
* ``adequacy``: Adequacy score
* ``overall``: Overall score

**Returns**: Natural language explanation string

correction Module
~~~~~~~~~~~~~~~~~

.. automodule:: wimarka.tasks.correction
   :members:
   :undoc-members:

**Main Function**:

.. code-block:: python

   def generate_correction(src_line: str, tgt_line: str,
                           errors: List[str], comments: str) -> str

Generates corrected translation suggestion.

**Parameters**:

* ``src_line``: Source sentence
* ``tgt_line``: Target sentence
* ``errors``: Detected errors
* ``comments``: Explanation from explanation module

**Returns**: Suggested corrected translation

Utility Modules
---------------

helper Module
~~~~~~~~~~~~~

.. automodule:: wimarka.utils.helper
   :members:
   :undoc-members:

**Key Functions**:

.. code-block:: python

   def check_tag(src_lang: str, tgt_lang: str) -> None:
       """Validate language codes."""

   def add_tag(sentence: str, lang: str) -> str:
       """Add language tag to sentence."""

   def printEvaluationResults(results: dict) -> None:
       """Print formatted evaluation results."""

logger Module
~~~~~~~~~~~~~

.. automodule:: wimarka.utils.logger
   :members:
   :undoc-members:

**Main Function**:

.. code-block:: python

   def setup_logger() -> logging.Logger:
       """Configure and return logger instance."""

**Usage**:

.. code-block:: python

   from wimarka.utils.logger import setup_logger

   logger = setup_logger()
   logger.info("Processing started")
   logger.error("An error occurred")

model Module
~~~~~~~~~~~~

.. automodule:: wimarka.utils.model
   :members:
   :undoc-members:

**Key Functions**:

.. code-block:: python

   def load_model(model_name: str):
       """Load model from HuggingFace Hub or cache."""

   def get_model_path(model_name: str) -> str:
       """Get local path to cached model."""

cache Module
~~~~~~~~~~~~

.. automodule:: wimarka.utils.cache
   :members:
   :undoc-members:

Caching utilities for model responses and intermediate results.

torch Module
~~~~~~~~~~~~

.. automodule:: wimarka.utils.torch
   :members:
   :undoc-members:

**Device Management**:

.. code-block:: python

   import torch

   device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

Type Hints
----------

Common type hints used throughout WiMarka:

.. code-block:: python

   from typing import List, Tuple, Dict, Optional

   # Sentence with language tag
   TaggedSentence = str  # Format: "[LANG] sentence text"

   # Error list
   ErrorList = List[str]

   # Scores tuple
   Scores = Tuple[float, float, float]  # (fluency, adequacy, overall)

   # Results structure
   ResultsDict = Dict[str, List]

Constants
---------

Language Codes
~~~~~~~~~~~~~~

.. code-block:: python

   SUPPORTED_LANGUAGES = ['EN', 'CEB', 'ILO', 'TGT']

   LANGUAGE_NAMES = {
       'EN': 'English',
       'CEB': 'Cebuano',
       'ILO': 'Ilocano',
       'TGT': 'Tagalog'
   }

Score Ranges
~~~~~~~~~~~~

.. code-block:: python

   MIN_SCORE = 0
   MAX_SCORE = 100

   # Quality thresholds
   EXCELLENT_THRESHOLD = 90
   GOOD_THRESHOLD = 75
   FAIR_THRESHOLD = 60

Configuration
-------------

Model Identifiers
~~~~~~~~~~~~~~~~~

Models are identified by HuggingFace repository names or local paths.

See ``wimarka/config.py`` for complete configuration.

Best Practices
--------------

Using the API
~~~~~~~~~~~~~

1. **Import Correctly**:

   .. code-block:: python

      from wimarka.main import wmk_eval, results  # Correct
      import wimarka  # Less efficient

2. **Handle Results**:

   .. code-block:: python

      # Copy results if needed for multiple evaluations
      from copy import deepcopy

      wmk_eval('src1.txt', 'EN', 'tgt1.txt', 'CEB')
      results1 = deepcopy(results)

      wmk_eval('src2.txt', 'EN', 'tgt2.txt', 'CEB')
      results2 = deepcopy(results)

3. **Error Handling**:

   .. code-block:: python

      try:
          wmk_eval('src.txt', 'EN', 'tgt.txt', 'CEB')
      except ValueError as e:
          print(f"Validation error: {e}")
      except FileNotFoundError as e:
          print(f"File not found: {e}")

Extending the API
~~~~~~~~~~~~~~~~~

To add custom processing:

.. code-block:: python

   from wimarka.main import wmk_eval, results

   def custom_evaluation(src, tgt, lang, custom_metric):
       """Custom evaluation with additional metric."""
       # Run standard evaluation
       wmk_eval(src, 'EN', tgt, lang)

       # Add custom processing
       custom_scores = []
       for i in range(len(results['source'])):
           score = custom_metric(
               results['source'][i],
               results['target'][i]
           )
           custom_scores.append(score)

       # Add to results
       results['custom_score'] = custom_scores
       return results

See Also
--------

* :doc:`tasks` - Detailed task module documentation
* :doc:`utils` - Utility module internals
* :doc:`architecture` - System architecture overview
* :doc:`extending` - Extension guides
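
Example: Per-Sentence Records
-----------------------------

The ``results`` dictionary documented on this page is column-oriented: each key holds one list, indexed by sentence. The sketch below shows one possible way to turn it into one record per sentence and attach a quality band based on the documented thresholds. The helpers ``to_records`` and ``quality_label`` are hypothetical and not part of WiMarka. The band names, the use of ``>=`` at each threshold, and the ``poor`` fallback are assumptions. The sample data is made up for illustration and is not real WiMarka output.

.. code-block:: python

   from typing import Dict, List

   # Thresholds copied from the documented "Score Ranges" constants.
   EXCELLENT_THRESHOLD = 90
   GOOD_THRESHOLD = 75
   FAIR_THRESHOLD = 60


   def quality_label(score: float) -> str:
       """Map a 0-100 score to a band (hypothetical helper, assumed band names)."""
       if score >= EXCELLENT_THRESHOLD:
           return "excellent"
       if score >= GOOD_THRESHOLD:
           return "good"
       if score >= FAIR_THRESHOLD:
           return "fair"
       return "poor"  # Assumed name for anything below FAIR_THRESHOLD


   def to_records(results: Dict[str, list]) -> List[dict]:
       """Turn the column-oriented results dict into one dict per sentence."""
       lengths = {len(column) for column in results.values()}
       if len(lengths) > 1:
           raise ValueError("result columns have different lengths")
       count = lengths.pop() if lengths else 0
       return [{key: results[key][i] for key in results} for i in range(count)]


   # Made-up sample in the documented shape; overall = (fluency + adequacy) / 2.
   sample = {
       "source": ["[EN] Good morning.", "[EN] Thank you."],
       "target": ["[CEB] Maayong buntag.", "[CEB] Salamat."],
       "fluency_score": [96.0, 70.0],
       "adequacy_score": [92.0, 62.0],
       "overall_score": [94.0, 66.0],
   }

   records = to_records(sample)
   labels = [quality_label(record["overall_score"]) for record in records]

Working on a converted copy rather than the global dictionary also avoids surprises if ``wmk_eval`` is called again before the records are consumed.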