Understanding Output Format
===========================

This guide explains how to interpret WiMarka's evaluation output and understand the metrics it provides.

Output Structure
----------------

For each evaluated sentence pair, WiMarka provides:

.. code-block:: text

   Line X:
     Source: <source sentence>
     Target: <target sentence>
     Errors: <list of detected errors>
     Fluency Score: <0-100>/100
     Adequacy Score: <0-100>/100
     Overall Score: <0-100>/100
     Explanation: <human-readable explanation>
     Suggested Correction: <corrected translation>

Example Output
~~~~~~~~~~~~~~

.. code-block:: text

   Line 1:
     Source: Good morning!
     Target: Maayong buntag!
     Errors: []
     Fluency Score: 100/100
     Adequacy Score: 100/100
     Overall Score: 100/100
     Explanation: Perfect translation with correct meaning and natural grammar.
     Suggested Correction: Maayong buntag!
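
When only the printed report is available (for example, output captured from a terminal), the layout above can be turned back into structured records. The following is a minimal sketch and not part of WiMarka: ``parse_report`` is a hypothetical helper, and it assumes the report uses exactly the field names shown above, with any wrapped text belonging to the preceding field.

.. code-block:: python

   import re

   FIELDS = (
       "Source", "Target", "Errors", "Fluency Score", "Adequacy Score",
       "Overall Score", "Explanation", "Suggested Correction",
   )
   FIELD_RE = re.compile(r"^(" + "|".join(FIELDS) + r"):\s*(.*)$")
   LINE_RE = re.compile(r"^Line (\d+):$")


   def parse_report(text):
       """Split a printed report into one dict per 'Line N:' block."""
       records, current, last_field = [], None, None
       for raw in text.splitlines():
           line = raw.strip()
           if not line:
               continue
           header = LINE_RE.match(line)
           if header:
               current = {"line": int(header.group(1))}
               records.append(current)
               last_field = None
               continue
           if current is None:
               continue  # ignore anything before the first "Line N:" header
           field = FIELD_RE.match(line)
           if field:
               last_field = field.group(1)
               current[last_field] = field.group(2)
           elif last_field is not None:
               # Wrapped text (e.g. a long explanation) continues the previous field.
               current[last_field] += " " + line
       return records


   sample = """\
   Line 1:
     Source: Good morning!
     Target: Maayong buntag!
     Errors: []
     Fluency Score: 100/100
     Adequacy Score: 100/100
     Overall Score: 100/100
     Explanation: Perfect translation with correct meaning and natural grammar.
     Suggested Correction: Maayong buntag!
   """

   records = parse_report(sample)
   print(records[0]["Target"])         # Maayong buntag!
   print(records[0]["Overall Score"])  # 100/100

All values are kept as strings; converting scores to numbers is left to the caller because the exact score format may vary.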

Evaluation Metrics
------------------

WiMarka provides three primary metrics for each translation:

Fluency Score
~~~~~~~~~~~~~

**Range**: 0-100

**Definition**: Measures how natural, grammatical, and readable the translation is in the target language.

**What It Evaluates**:
   * Grammatical correctness
   * Natural word order
   * Proper use of particles and markers
   * Idiomatic expression
   * Overall readability

**Interpretation**:

.. list-table::
   :header-rows: 1
   :widths: 20 80

   * - Score Range
     - Interpretation
   * - **90-100**
     - Excellent fluency. Translation reads like native text with natural grammar and word choice.
   * - **75-89**
     - Good fluency. Minor grammatical issues or slightly awkward phrasing, but generally understandable.
   * - **60-74**
     - Acceptable fluency. Noticeable grammatical errors or unnatural constructions, but meaning is clear.
   * - **40-59**
     - Poor fluency. Significant grammatical problems making the text difficult to read.
   * - **0-39**
     - Very poor fluency. Severe grammatical errors; text may be incomprehensible.

**Example High Fluency** (Score: 95):

.. code-block:: text

   EN:  The weather is beautiful today.
   CEB: Nindot kaayo ang panahon karon.
   # Natural Cebuano with proper word order and particles

**Example Low Fluency** (Score: 45):

.. code-block:: text

   EN:  The weather is beautiful today.
   CEB: Ang panahon nindot ka sa karon.
   # Awkward word order, incorrect particle usage

Adequacy Score
~~~~~~~~~~~~~~

**Range**: 0-100

**Definition**: Measures how completely and accurately the translation conveys the meaning of the source text.

**What It Evaluates**:
   * Semantic completeness
   * Preservation of meaning
   * No critical omissions
   * No added information
   * Correct interpretation

**Interpretation**:

.. list-table::
   :header-rows: 1
   :widths: 20 80

   * - Score Range
     - Interpretation
   * - **90-100**
     - Excellent adequacy. All meaning fully preserved with correct interpretation.
   * - **75-89**
     - Good adequacy. Most meaning conveyed; minor details may be slightly different.
   * - **60-74**
     - Acceptable adequacy. Core meaning present but some information loss or distortion.
   * - **40-59**
     - Poor adequacy. Significant meaning loss; important information missing or wrong.
   * - **0-39**
     - Very poor adequacy. Most meaning lost or severely distorted.

**Example High Adequacy** (Score: 98):

.. code-block:: text

   EN:  I bought three red apples at the market.
   TGT: Bumili ako ng tatlong pulang mansanas sa palengke.
   # All information preserved: quantity, color, item, location

**Example Low Adequacy** (Score: 50):

.. code-block:: text

   EN:  I bought three red apples at the market.
   TGT: Bumili ako ng mansanas.
   # Missing: quantity, color, location

Overall Score
~~~~~~~~~~~~~

**Range**: 0-100

**Definition**: Combined metric representing overall translation quality.

**Calculation**:

.. code-block:: python

   # Simple arithmetic mean of the two component scores
   overall_score = (fluency_score + adequacy_score) / 2

**Interpretation**:

.. list-table::
   :header-rows: 1
   :widths: 20 80

   * - Score Range
     - Quality Level
   * - **90-100**
     - Excellent - Publication-ready, professional quality
   * - **75-89**
     - Good - Minor improvements possible, generally acceptable
   * - **60-74**
     - Fair - Usable but needs revision
   * - **40-59**
     - Poor - Significant revision required
   * - **0-39**
     - Unacceptable - Major rework needed
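
The table above can be applied in code when results need to be bucketed automatically. This is a small sketch rather than a WiMarka function: ``quality_level`` is a hypothetical helper, and it assumes that each band starts at its lower bound, so a fractional score such as 89.5 is treated as "Good" rather than "Excellent".

.. code-block:: python

   def quality_level(score):
       """Return the quality label for an overall score in the range 0-100."""
       if not 0 <= score <= 100:
           raise ValueError(f"score must be between 0 and 100, got {score}")
       # Lower bound of each band, highest first, taken from the table above.
       bands = [(90, "Excellent"), (75, "Good"), (60, "Fair"), (40, "Poor")]
       for lower_bound, label in bands:
           if score >= lower_bound:
               return label
       return "Unacceptable"


   for score in (99, 89.5, 72.5, 35):
       print(score, quality_level(score))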

**Trade-offs**:

Fluency and adequacy are scored independently, so two translations with very different strengths can receive the same overall score:

**High Fluency, Low Adequacy**:

.. code-block:: text

   EN:  I need to finish this report by Friday.
   CEB: Kinahanglan nakong human kini nga semana.
        (I need to finish this week)
   
   Fluency: 90 (grammatically correct Cebuano)
   Adequacy: 60 (loses specificity - "Friday" vs "this week")
   Overall: 75

**Low Fluency, High Adequacy**:

.. code-block:: text

   EN:  I need to finish this report by Friday.
   CEB: Ako kinahanglan finish kini report by Biyernes.
   
   Fluency: 55 (code-switching, unnatural)
   Adequacy: 95 (all meaning preserved)
   Overall: 75
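
The two cases above can be checked with a few lines of Python. The sketch below applies the averaging formula from this section; ``overall_score`` and the ``gap`` value are illustrative names, not part of WiMarka's API. The gap between the two component scores is one simple way to notice that an identical overall score hides different problems.

.. code-block:: python

   def overall_score(fluency, adequacy):
       """Average the two component scores, as described in this section."""
       return (fluency + adequacy) / 2


   cases = {
       "fluent but inaccurate": (90, 60),
       "accurate but unnatural": (55, 95),
   }

   for name, (fluency, adequacy) in cases.items():
       overall = overall_score(fluency, adequacy)
       gap = abs(fluency - adequacy)
       print(f"{name}: overall={overall}, gap={gap}")

Both cases average to 75, while the gaps (30 and 40 points) show that the component scores disagree strongly.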

Error Detection
---------------

Error Types
~~~~~~~~~~~

WiMarka detects various error categories:

.. list-table::
   :header-rows: 1
   :widths: 25 75

   * - Error Type
     - Description
   * - **Lexical errors**
     - Wrong word choice, mistranslation
   * - **Syntactic errors**
     - Incorrect grammar, word order issues
   * - **Semantic errors**
     - Meaning distortion or loss
   * - **Morphological errors**
     - Wrong affixes, inflections
   * - **Pragmatic errors**
     - Incorrect formality, register
   * - **Omissions**
     - Missing information
   * - **Additions**
     - Unnecessary extra information

Error Format
~~~~~~~~~~~~

Errors are printed as a Python list of strings, each in the form ``Category: detail``:

.. code-block:: python

   # No errors
   Errors: []
   
   # Single error
   Errors: ['Semantic mismatch: time of day']
   
   # Multiple errors
   Errors: ['Lexical error: wrong verb choice', 
            'Omission: quantity not specified']
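
Because each entry in the examples above follows a ``Category: detail`` pattern, the category can be separated from the description for filtering or counting. This is a sketch under that assumption: ``split_error`` is a hypothetical helper, and entries without a colon are placed in an "Uncategorized" group.

.. code-block:: python

   def split_error(error):
       """Split 'Category: detail' into a (category, detail) tuple."""
       category, separator, detail = error.partition(": ")
       if not separator:
           # No "Category: " prefix found; keep the full text as the detail.
           return ("Uncategorized", error)
       return (category, detail)


   errors = ['Lexical error: wrong verb choice',
             'Omission: quantity not specified']

   for category, detail in map(split_error, errors):
       print(f"{category} -> {detail}")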

Understanding Errors
~~~~~~~~~~~~~~~~~~~~

**Example 1: Semantic Mismatch**

.. code-block:: text

   Source: Good morning!
   Target: Maayong gabii!  (Good evening!)
   Errors: ['Semantic mismatch: time of day']
   Explanation: Incorrect time reference - 'morning' vs 'evening'

**Example 2: Omission**

.. code-block:: text

   Source: I bought three apples.
   Target: Bumili ako ng mansanas.  (I bought apples.)
   Errors: ['Omission: quantity not specified']
   Explanation: The number 'three' was not translated

**Example 3: Syntactic Error**

.. code-block:: text

   Source: The book is on the table.
   Target: Ang libro sa lamesa.  (The book of table.)
   Errors: ['Syntactic error: missing verb']
   Explanation: Location verb 'nasa' missing

Explanations
------------

Natural Language Explanations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

WiMarka generates human-readable explanations for the evaluation:

**Perfect Translation**:

.. code-block:: text

   Explanation: Excellent translation with accurate meaning and 
                natural grammar. No errors detected.

**Minor Issues**:

.. code-block:: text

   Explanation: Good translation overall. Minor fluency issue with 
                word order, but meaning is fully preserved.

**Significant Problems**:

.. code-block:: text

   Explanation: Translation has semantic error - wrong time of day. 
                'Morning' incorrectly translated as 'evening'. 
                Grammar is correct but meaning is incorrect.

Contextual Information
~~~~~~~~~~~~~~~~~~~~~~

Explanations may include:

* Specific error locations
* Linguistic reasoning
* Cultural or pragmatic considerations
* Alternative phrasings

Suggested Corrections
---------------------

How Corrections Work
~~~~~~~~~~~~~~~~~~~~

WiMarka provides improved translation suggestions:

.. code-block:: text

   Target: Maayong gabii!
   Errors: ['Semantic mismatch: time of day']
   Suggested Correction: Maayong buntag!

**Correction Quality**:
   * Addresses detected errors
   * Maintains semantic accuracy
   * Improves fluency when possible
   * Preserves general style

When to Use Corrections
~~~~~~~~~~~~~~~~~~~~~~~

.. note::
   Suggested corrections are **recommendations**, not authoritative fixes.
   Have a native speaker review them before using them in production.

**Good Use Cases**:
   * Quick fixes for obvious errors
   * Learning from mistakes
   * Identifying problem patterns

**Exercise Caution**:
   * Critical translations (legal, medical)
   * Cultural/contextual nuances
   * Creative or literary content

Output Examples by Quality
---------------------------

Excellent Translation (90-100)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: text

   Line 1:
     Source: Thank you for your help.
     Target: Salamat sa imong tabang.
     Errors: []
     Fluency Score: 98/100
     Adequacy Score: 100/100
     Overall Score: 99/100
     Explanation: Perfect translation with natural phrasing and 
                  complete meaning preservation.
     Suggested Correction: Salamat sa imong tabang.

Good Translation (75-89)
~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: text

   Line 1:
     Source: I will call you tomorrow.
     Target: Tawagan ko ikaw ugma.
     Errors: []
     Fluency Score: 80/100
     Adequacy Score: 98/100
     Overall Score: 89/100
     Explanation: Good translation. Slightly more natural would be 
                  'Tawagan ta ka ugma' but current form is acceptable.
     Suggested Correction: Tawagan ta ka ugma.

Fair Translation (60-74)
~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: text

   Line 1:
     Source: The meeting starts at 2 PM.
     Target: Ang meeting magsugod sa 2.
     Errors: ['Code-switching: English word "meeting"',
              'Missing: PM specification']
     Fluency Score: 70/100
     Adequacy Score: 75/100
     Overall Score: 72.5/100
     Explanation: Acceptable but could be improved. Use 'miting' 
                  instead of 'meeting'. Add 'sa hapon' for PM.
     Suggested Correction: Ang miting magsugod sa alas 2 sa hapon.

Poor Translation (Below 60)
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: text

   Line 1:
     Source: Good morning!
     Target: Magandang gabi!
     Errors: ['Wrong language: Tagalog instead of Cebuano',
              'Semantic error: evening instead of morning']
     Fluency Score: 40/100
     Adequacy Score: 30/100
     Overall Score: 35/100
     Explanation: Major errors. Wrong language used (Tagalog not 
                  Cebuano) and wrong time of day (evening not morning).
     Suggested Correction: Maayong buntag!

Best Practices for Interpretation
----------------------------------

1. **Consider Both Scores**
   
   Don't rely only on the overall score. Check both fluency and adequacy separately.

2. **Read Explanations**
   
   The explanation provides crucial context for understanding scores.

3. **Review Errors**
   
   Pay attention to the types of errors detected.

4. **Context Matters**
   
   Scores should be interpreted based on your use case (casual vs. professional).

5. **Use Corrections Wisely**
   
   Treat suggestions as guidance, not absolute fixes.

6. **Batch Analysis**
   
   For multiple sentences, look at average scores and error patterns.
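
For batch analysis, a short script can summarize the average score and the most frequent error categories. The sketch below uses only the standard library and a hand-built ``sample_results`` dictionary whose keys mirror the ``overall_score`` and ``errors`` keys used in the Programmatic Access section; the values are copied from the examples in this guide, and the assumption that the category is the text before the first colon is illustrative.

.. code-block:: python

   from collections import Counter
   from statistics import mean

   # Stand-in data; in practice these lists would come from an evaluation run.
   sample_results = {
       "overall_score": [99, 72.5, 35],
       "errors": [
           [],
           ['Code-switching: English word "meeting"',
            "Missing: PM specification"],
           ["Wrong language: Tagalog instead of Cebuano",
            "Semantic error: evening instead of morning"],
       ],
   }

   average = mean(sample_results["overall_score"])

   # Count error categories across all sentences.
   error_counts = Counter(
       error.split(":", 1)[0]
       for sentence_errors in sample_results["errors"]
       for error in sentence_errors
   )

   print(f"Average overall score: {average:.2f}")
   for category, count in error_counts.most_common():
       print(f"{category}: {count}")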

Programmatic Access
-------------------

Accessing Results in Python
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   from wimarka.main import wmk_eval, results

   # Run evaluation
   wmk_eval('source.txt', 'EN', 'target.txt', 'CEB')

   # Report every sentence whose overall score falls below 70
   scored = zip(results['overall_score'], results['errors'])
   for line_no, (score, errors) in enumerate(scored, start=1):
       if score < 70:
           print(f"Low quality at line {line_no}:")
           print(f"  Score: {score}")
           print(f"  Errors: {errors}")

Exporting for Analysis
~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

   import pandas as pd
   from wimarka.main import results

   # Create a DataFrame (run wmk_eval() first so that results is populated)
   df = pd.DataFrame(results)

   # Calculate statistics
   print(f"Average Overall Score: {df['overall_score'].mean():.2f}")
   print(f"Std Dev: {df['overall_score'].std():.2f}")

   # Export
   df.to_csv('detailed_results.csv', index=False)

Next Steps
----------

* See :doc:`examples` for complete evaluation workflows
* See :doc:`usage_library` for programmatic result processing
* See :doc:`usage_cli` for output redirection techniques