API Reference
=============

This page provides complete API documentation for WiMarka's public and internal interfaces.

Main Module (``wimarka.main``)
-------------------------------

.. automodule:: wimarka.main
   :members:
   :undoc-members:
   :show-inheritance:

wmk_eval
~~~~~~~~

.. autofunction:: wimarka.main.wmk_eval

Main evaluation function. It orchestrates the full pipeline for each sentence pair: error detection, scoring, explanation, and correction.

**Function Signature**:

.. code-block:: python

   def wmk_eval(src_file_path: str, src_lang: str, 
                tgt_file_path: str, tgt_lang: str) -> None

**Parameters**:

* ``src_file_path`` (str): Absolute or relative path to source text file
* ``src_lang`` (str): Source language code (EN, CEB, ILO, TGT)
* ``tgt_file_path`` (str): Absolute or relative path to target translation file  
* ``tgt_lang`` (str): Target language code (CEB, ILO, TGT)

**Returns**: ``None``. Results are stored in the module-level ``results`` dictionary and printed.

**Raises**:

* ``ValueError``: If source and target files have different line counts
* ``FileNotFoundError``: If input files don't exist
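
The two exceptions above imply an up-front check on both input files. The following is a minimal sketch of such a check, not WiMarka's actual implementation; ``read_parallel_lines`` is a hypothetical helper name introduced only for illustration.

.. code-block:: python

   from pathlib import Path
   from typing import List, Tuple


   def read_parallel_lines(src_file_path: str,
                           tgt_file_path: str) -> Tuple[List[str], List[str]]:
       """Hypothetical helper: load both files and enforce equal line counts."""
       # Path.read_text raises FileNotFoundError when a file does not exist.
       src_lines = Path(src_file_path).read_text(encoding="utf-8").splitlines()
       tgt_lines = Path(tgt_file_path).read_text(encoding="utf-8").splitlines()

       # Sentence i of the source must pair with sentence i of the target.
       if len(src_lines) != len(tgt_lines):
           raise ValueError(
               f"Line count mismatch: {len(src_lines)} source lines "
               f"vs {len(tgt_lines)} target lines"
           )
       return src_lines, tgt_lines

Running a check like this before any model is loaded keeps input mistakes cheap to detect.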

**Example**:

.. code-block:: python

   from wimarka.main import wmk_eval

   wmk_eval(
       src_file_path='data/english.txt',
       src_lang='EN',
       tgt_file_path='data/cebuano.txt',
       tgt_lang='CEB'
   )

results Dictionary
~~~~~~~~~~~~~~~~~~

Module-level dictionary in ``wimarka.main`` that stores evaluation results. Each key maps to a list with one entry per sentence pair.

**Structure**:

.. code-block:: python

   results = {
       'source': List[str],                  # Source sentences (with tags)
       'target': List[str],                  # Target sentences (with tags)
       'errors': List[List[str]],            # Detected errors per sentence
       'fluency_score': List[float],         # Fluency scores (0-100)
       'adequacy_score': List[float],        # Adequacy scores (0-100)
       'overall_score': List[float],         # Overall scores (0-100)
       'explanation': List[str],             # Human-readable explanations
       'corrected_translation': List[str]    # Suggested corrections
   }

**Access Pattern**:

.. code-block:: python

   from wimarka.main import wmk_eval, results

   wmk_eval('src.txt', 'EN', 'tgt.txt', 'CEB')
   
   # Access results
   for score in results['overall_score']:
       print(f"Score: {score}")
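
Because ``results`` is column-oriented (one list per field), it is often convenient to view it as one record per sentence pair. The sketch below shows one way to do that; ``to_records`` is a hypothetical helper, and the sample values are made up for illustration rather than produced by WiMarka.

.. code-block:: python

   from typing import Dict, List


   def to_records(results: Dict[str, List]) -> List[dict]:
       """Turn the column-oriented results dict into one dict per sentence pair."""
       keys = list(results)
       # zip(*columns) walks all lists in lockstep: index i of every list
       # describes the same sentence pair.
       return [dict(zip(keys, row)) for row in zip(*(results[k] for k in keys))]


   # Stand-in data using the documented keys (values are illustrative only).
   sample = {
       'source': ['[EN] Good morning.'],
       'target': ['[CEB] Maayong buntag.'],
       'errors': [[]],
       'fluency_score': [90.0],
       'adequacy_score': [80.0],
       'overall_score': [85.0],
       'explanation': ['No errors detected.'],
       'corrected_translation': ['[CEB] Maayong buntag.'],
   }

   records = to_records(sample)
   print(records[0]['overall_score'])  # 85.0

Records in this shape can be passed directly to ``json.dumps`` or ``csv.DictWriter``.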

CLI Module (``wimarka.cli``)
-----------------------------

.. automodule:: wimarka.cli
   :members:
   :undoc-members:

main
~~~~

.. code-block:: python

   @click.command()
   @click.option('--src_file_path', required=True, 
                 help='Path to source text file')
   @click.option('--src_lang', required=True,
                 help='Source language code')
   @click.option('--tgt_file_path', required=True,
                 help='Path to target text file')
   @click.option('--tgt_lang', required=True,
                 help='Target language code')
   def main(src_file_path, src_lang, tgt_file_path, tgt_lang):
       """Command-line interface for WiMarka evaluation."""

Entry point for CLI execution. Wraps ``wmk_eval()`` with Click decorators.

Task Modules
------------

error_detection Module
~~~~~~~~~~~~~~~~~~~~~~

.. automodule:: wimarka.tasks.error_detection
   :members:
   :undoc-members:

**Main Function**:

.. code-block:: python

   def error_detection(src_line: str, tgt_line: str) -> List[str]

Detects translation errors between source and target sentences.

**Parameters**:

* ``src_line``: Source sentence with language tag
* ``tgt_line``: Target sentence with language tag

**Returns**: List of detected error descriptions

**Error Types**:

* Lexical errors (wrong word choice)
* Syntactic errors (grammar issues)
* Semantic errors (meaning loss)
* Morphological errors (wrong affixes)
* Omissions/additions

scoring Module
~~~~~~~~~~~~~~

.. automodule:: wimarka.tasks.scoring
   :members:
   :undoc-members:

**Main Function**:

.. code-block:: python

   def scoring(src_line: str, tgt_line: str, errors: List[str]) -> Tuple[float, float, float]

Calculates quality scores for the translation.

**Parameters**:

* ``src_line``: Source sentence with tag
* ``tgt_line``: Target sentence with tag
* ``errors``: List of detected errors returned by ``error_detection()``

**Returns**: Tuple of (fluency_score, adequacy_score, overall_score)

* All scores are floats in range [0, 100]
* Overall score = (fluency + adequacy) / 2
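
The relationship between the three scores can be sketched as follows. ``combine_scores`` is a hypothetical name, and clamping the inputs to [0, 100] is an assumption made here to honor the documented range; the real module may handle out-of-range values differently.

.. code-block:: python

   from typing import Tuple


   def combine_scores(fluency: float, adequacy: float) -> Tuple[float, float, float]:
       """Return (fluency, adequacy, overall) with overall as the plain mean."""
       # Assumption: out-of-range inputs are clamped to the documented [0, 100].
       fluency = min(max(float(fluency), 0.0), 100.0)
       adequacy = min(max(float(adequacy), 0.0), 100.0)
       overall = (fluency + adequacy) / 2
       return fluency, adequacy, overall


   print(combine_scores(80.0, 90.0))  # (80.0, 90.0, 85.0)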

explanation Module
~~~~~~~~~~~~~~~~~~

.. automodule:: wimarka.tasks.explanation
   :members:
   :undoc-members:

**Main Function**:

.. code-block:: python

   def generate_explanation(src_line: str, tgt_line: str, errors: List[str],
                           fluency: float, adequacy: float, overall: float) -> str

Generates human-readable explanation of the evaluation.

**Parameters**:

* ``src_line``: Source sentence
* ``tgt_line``: Target sentence
* ``errors``: Detected errors
* ``fluency``: Fluency score
* ``adequacy``: Adequacy score
* ``overall``: Overall score

**Returns**: Natural language explanation string

correction Module
~~~~~~~~~~~~~~~~~

.. automodule:: wimarka.tasks.correction
   :members:
   :undoc-members:

**Main Function**:

.. code-block:: python

   def generate_correction(src_line: str, tgt_line: str, 
                          errors: List[str], comments: str) -> str

Generates corrected translation suggestion.

**Parameters**:

* ``src_line``: Source sentence
* ``tgt_line``: Target sentence
* ``errors``: Detected errors
* ``comments``: Explanation text returned by ``generate_explanation()``

**Returns**: Suggested corrected translation

Utility Modules
---------------

helper Module
~~~~~~~~~~~~~

.. automodule:: wimarka.utils.helper
   :members:
   :undoc-members:

**Key Functions**:

.. code-block:: python

   def check_tag(src_lang: str, tgt_lang: str) -> None:
       """Validate language codes."""

   def add_tag(sentence: str, lang: str) -> str:
       """Add language tag to sentence."""

   def printEvaluationResults(results: dict) -> None:
       """Print formatted evaluation results."""

logger Module
~~~~~~~~~~~~~

.. automodule:: wimarka.utils.logger
   :members:
   :undoc-members:

**Main Function**:

.. code-block:: python

   def setup_logger() -> logging.Logger:
       """Configure and return logger instance."""

**Usage**:

.. code-block:: python

   from wimarka.utils.logger import setup_logger
   
   logger = setup_logger()
   logger.info("Processing started")
   logger.error("An error occurred")
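
For readers who want a picture of what such a function typically does, here is a minimal sketch built only on the standard ``logging`` module. It is not the code in ``wimarka.utils.logger``; the name ``setup_logger_sketch``, its parameters, and the message format are all assumptions.

.. code-block:: python

   import logging


   def setup_logger_sketch(name: str = "wimarka",
                           level: int = logging.INFO) -> logging.Logger:
       """Hypothetical stand-in: configure a named logger exactly once."""
       logger = logging.getLogger(name)
       logger.setLevel(level)

       # Guard against duplicate handlers when the function is called repeatedly,
       # which would otherwise print every message more than once.
       if not logger.handlers:
           handler = logging.StreamHandler()
           handler.setFormatter(
               logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
           )
           logger.addHandler(handler)
       return logger

Because ``logging.getLogger`` returns the same object for the same name, repeated calls share one configured logger.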

model Module
~~~~~~~~~~~~

.. automodule:: wimarka.utils.model
   :members:
   :undoc-members:

**Key Functions**:

.. code-block:: python

   def load_model(model_name: str):
       """Load model from HuggingFace Hub or cache."""

   def get_model_path(model_name: str) -> str:
       """Get local path to cached model."""

cache Module
~~~~~~~~~~~~

.. automodule:: wimarka.utils.cache
   :members:
   :undoc-members:

Caching utilities for model responses and intermediate results.
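
The general idea of caching model responses can be sketched as below. This is an illustration of the technique only: ``cache_key``, ``cached_call``, and the in-memory ``_cache`` dictionary are hypothetical and do not describe the actual contents of ``wimarka.utils.cache``.

.. code-block:: python

   import hashlib
   import json
   from typing import Callable, Dict

   # Hypothetical in-memory store; a real cache might persist to disk instead.
   _cache: Dict[str, str] = {}


   def cache_key(task: str, src_line: str, tgt_line: str) -> str:
       """Derive a stable key from the task name and both sentences."""
       # JSON-encoding the parts keeps their boundaries unambiguous.
       payload = json.dumps([task, src_line, tgt_line], ensure_ascii=False)
       return hashlib.sha256(payload.encode("utf-8")).hexdigest()


   def cached_call(task: str, src_line: str, tgt_line: str,
                   compute: Callable[[str, str], str]) -> str:
       """Return the cached response, calling compute() only on a miss."""
       key = cache_key(task, src_line, tgt_line)
       if key not in _cache:
           _cache[key] = compute(src_line, tgt_line)
       return _cache[key]

Keying on the task name as well as the sentence pair prevents, for example, an explanation from being returned where a correction was requested.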

torch Module
~~~~~~~~~~~~

.. automodule:: wimarka.utils.torch
   :members:
   :undoc-members:

**Device Management**:

.. code-block:: python

   import torch
   
   device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

Type Hints
----------

Common type hints used throughout WiMarka:

.. code-block:: python

   from typing import List, Tuple, Dict, Optional

   # Sentence with language tag
   TaggedSentence = str  # Format: "[LANG] sentence text"
   
   # Error list
   ErrorList = List[str]
   
   # Scores tuple
   Scores = Tuple[float, float, float]  # (fluency, adequacy, overall)
   
   # Results structure
   ResultsDict = Dict[str, List]
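
If stricter checking is wanted, the ``Dict[str, List]`` alias could be tightened with ``typing.TypedDict`` so that each key carries its own element type. The class name ``EvaluationResults`` and the ``empty_results`` helper are hypothetical; only the key names come from the ``results`` structure documented earlier.

.. code-block:: python

   from typing import List, TypedDict


   class EvaluationResults(TypedDict):
       """Per-key typing for the documented results structure."""
       source: List[str]
       target: List[str]
       errors: List[List[str]]
       fluency_score: List[float]
       adequacy_score: List[float]
       overall_score: List[float]
       explanation: List[str]
       corrected_translation: List[str]


   def empty_results() -> EvaluationResults:
       """Hypothetical helper: a fresh structure with every list empty."""
       return {
           'source': [], 'target': [], 'errors': [],
           'fluency_score': [], 'adequacy_score': [], 'overall_score': [],
           'explanation': [], 'corrected_translation': [],
       }

A static type checker can then flag a misspelled key or a score list filled with strings, which the looser alias would accept.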

Constants
---------

Language Codes
~~~~~~~~~~~~~~

.. code-block:: python

   SUPPORTED_LANGUAGES = ['EN', 'CEB', 'ILO', 'TGT']
   
   LANGUAGE_NAMES = {
       'EN': 'English',
       'CEB': 'Cebuano',
       'ILO': 'Ilocano',
       'TGT': 'Tagalog'
   }
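
Taken together with the ``"[LANG] sentence text"`` format noted under Type Hints, these codes suggest how validation and tagging might work. The sketch below is an assumption about behavior, not the code in ``wimarka.utils.helper``; the ``_sketch`` suffix marks both functions as hypothetical, and raising ``ValueError`` for an unknown code is a guess.

.. code-block:: python

   SUPPORTED_LANGUAGES = ('EN', 'CEB', 'ILO', 'TGT')


   def check_tag_sketch(src_lang: str, tgt_lang: str) -> None:
       """Raise ValueError if either code is unsupported (assumed behavior)."""
       for code in (src_lang, tgt_lang):
           if code not in SUPPORTED_LANGUAGES:
               raise ValueError(f"Unsupported language code: {code!r}")


   def add_tag_sketch(sentence: str, lang: str) -> str:
       """Prefix a sentence with its language tag in '[LANG] text' form."""
       return f"[{lang}] {sentence.strip()}"


   check_tag_sketch('EN', 'CEB')                    # passes silently
   print(add_tag_sketch('Maayong buntag.', 'CEB'))  # [CEB] Maayong buntag.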

Score Ranges
~~~~~~~~~~~~

.. code-block:: python

   MIN_SCORE = 0
   MAX_SCORE = 100
   
   # Quality thresholds
   EXCELLENT_THRESHOLD = 90
   GOOD_THRESHOLD = 75
   FAIR_THRESHOLD = 60
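
One way these thresholds could be applied is to map a score to a quality label. In this sketch, ``quality_label`` is a hypothetical function, the label strings are derived from the constant names, the ``'poor'`` label for scores below ``FAIR_THRESHOLD`` is an assumption, and thresholds are treated as inclusive lower bounds.

.. code-block:: python

   MIN_SCORE = 0
   MAX_SCORE = 100
   EXCELLENT_THRESHOLD = 90
   GOOD_THRESHOLD = 75
   FAIR_THRESHOLD = 60


   def quality_label(score: float) -> str:
       """Map a 0-100 score to a label, checking thresholds from highest down."""
       if not MIN_SCORE <= score <= MAX_SCORE:
           raise ValueError(f"Score out of range: {score}")
       if score >= EXCELLENT_THRESHOLD:
           return 'excellent'
       if score >= GOOD_THRESHOLD:
           return 'good'
       if score >= FAIR_THRESHOLD:
           return 'fair'
       return 'poor'  # Assumed label; the source defines no name for this band.


   print(quality_label(85.0))  # good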

Configuration
-------------

Model Identifiers
~~~~~~~~~~~~~~~~~

Models are identified by Hugging Face repository names or local paths.

See ``wimarka/config.py`` for complete configuration.

Best Practices
--------------

Using the API
~~~~~~~~~~~~~

1. **Import Correctly**:

   .. code-block:: python

      from wimarka.main import wmk_eval, results  # Preferred: binds the names you use
      import wimarka  # Discouraged: does not bind wmk_eval or results directly

2. **Handle Results**:

   .. code-block:: python

      # Copy results if needed for multiple evaluations
      from copy import deepcopy

      from wimarka.main import wmk_eval, results

      wmk_eval('src1.txt', 'EN', 'tgt1.txt', 'CEB')
      results1 = deepcopy(results)

      wmk_eval('src2.txt', 'EN', 'tgt2.txt', 'CEB')
      results2 = deepcopy(results)

3. **Error Handling**:

   .. code-block:: python

      from wimarka.main import wmk_eval

      try:
          wmk_eval('src.txt', 'EN', 'tgt.txt', 'CEB')
      except ValueError as e:
          print(f"Validation error: {e}")
      except FileNotFoundError as e:
          print(f"File not found: {e}")
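
The reason for copying in step 2 can be shown with a self-contained sketch. It assumes that each evaluation reuses the same module-level dictionary, which is what a global ``results`` suggests but is not confirmed here; ``fake_eval`` is a hypothetical stand-in for ``wmk_eval`` and the scores are made up.

.. code-block:: python

   from copy import deepcopy

   # Stand-in for the module-level dictionary.
   results = {'overall_score': []}


   def fake_eval(scores):
       """Hypothetical stand-in for wmk_eval: overwrites the shared lists in place."""
       results['overall_score'].clear()
       results['overall_score'].extend(scores)


   fake_eval([80.0])
   alias = results               # Same object: later runs change it too
   snapshot = deepcopy(results)  # Independent copy of the first run

   fake_eval([55.0])
   print(alias['overall_score'])     # [55.0]
   print(snapshot['overall_score'])  # [80.0]

A plain assignment keeps only a second name for the shared dictionary, so the first run's scores are lost once the second run finishes.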

Extending the API
~~~~~~~~~~~~~~~~~

To add custom processing:

.. code-block:: python

   from wimarka.main import wmk_eval, results
   
   def custom_evaluation(src, tgt, lang, custom_metric):
       """Custom evaluation with additional metric."""
       # Run standard evaluation
       wmk_eval(src, 'EN', tgt, lang)
       
       # Add custom processing: score each source/target pair
       custom_scores = [
           custom_metric(src_line, tgt_line)
           for src_line, tgt_line in zip(results['source'], results['target'])
       ]
       
       # Add to results
       results['custom_score'] = custom_scores
       
       return results
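
Any callable that accepts a source sentence and a target sentence can serve as ``custom_metric``. As an illustration, the sketch below defines a simple length-ratio metric; ``length_ratio`` is a hypothetical example, and it assumes sentences may carry a leading ``[LANG]`` tag that should not be counted as a token.

.. code-block:: python

   from typing import List


   def _tokens(sentence: str) -> List[str]:
       """Split on whitespace, dropping a leading '[LANG]' tag if present."""
       parts = sentence.split()
       if parts and parts[0].startswith('[') and parts[0].endswith(']'):
           parts = parts[1:]
       return parts


   def length_ratio(src: str, tgt: str) -> float:
       """Ratio of target to source token counts (0.0 for an empty source)."""
       src_tokens = _tokens(src)
       tgt_tokens = _tokens(tgt)
       if not src_tokens:
           return 0.0
       return len(tgt_tokens) / len(src_tokens)


   print(length_ratio('[EN] Good morning everyone .', '[CEB] Maayong buntag .'))  # 0.75

A ratio far from 1.0 can hint at omissions or additions, although languages differ in typical sentence length, so it is only a rough signal.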

See Also
--------

* :doc:`tasks` - Detailed task module documentation
* :doc:`utils` - Utility module internals
* :doc:`architecture` - System architecture overview
* :doc:`extending` - Extension guides