Command-Line Interface (CLI) Usage
==================================

This guide covers using WiMarka from the command line for quick and efficient translation evaluation.

Basic Command
-------------

The basic syntax for the WiMarka CLI is:

.. code-block:: bash

   wimarka --src_file_path <source_file> \
           --src_lang <source_lang> \
           --tgt_file_path <target_file> \
           --tgt_lang <target_lang>

Example:

.. code-block:: bash

   wimarka --src_file_path english.txt \
           --src_lang EN \
           --tgt_file_path cebuano.txt \
           --tgt_lang CEB

Command Options
---------------

Required Options
~~~~~~~~~~~~~~~~

.. list-table::
   :header-rows: 1
   :widths: 25 15 60

   * - Option
     - Type
     - Description
   * - ``--src_file_path``
     - String
     - Path to the source text file
   * - ``--src_lang``
     - String
     - Source language code (EN, CEB, ILO, TGT)
   * - ``--tgt_file_path``
     - String
     - Path to the target translation file
   * - ``--tgt_lang``
     - String
     - Target language code (CEB, ILO, TGT)

Optional Options
~~~~~~~~~~~~~~~~

.. list-table::
   :header-rows: 1
   :widths: 25 15 60

   * - Option
     - Type
     - Description
   * - ``-h, --help``
     - Flag
     - Show help message and exit

Getting Help
------------

Display the help message:

.. code-block:: bash

   wimarka --help

Output:

.. code-block:: text

   Usage: wimarka [OPTIONS]

     Evaluate machine translation quality using WiMarka.

   Options:
     --src_file_path TEXT  Path to source text file  [required]
     --src_lang TEXT       Source language code (EN, CEB, ILO, TGT)  [required]
     --tgt_file_path TEXT  Path to target text file  [required]
     --tgt_lang TEXT       Target language code (CEB, ILO, TGT)  [required]
     -h, --help            Show this message and exit.

CLI Examples
------------

Example 1: English to Cebuano
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: bash

   wimarka --src_file_path data/english.txt \
           --src_lang EN \
           --tgt_file_path data/cebuano.txt \
           --tgt_lang CEB

Example 2: English to Ilocano
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: bash

   wimarka --src_file_path sources/en_sentences.txt \
           --src_lang EN \
           --tgt_file_path translations/ilo_sentences.txt \
           --tgt_lang ILO

Example 3: English to Tagalog
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: bash

   wimarka --src_file_path ~/documents/english.txt \
           --src_lang EN \
           --tgt_file_path ~/documents/tagalog.txt \
           --tgt_lang TGT

Example 4: Relative Paths
~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: bash

   # Using relative paths
   wimarka --src_file_path ./test/source.txt \
           --src_lang EN \
           --tgt_file_path ./test/target.txt \
           --tgt_lang CEB

Example 5: Absolute Paths
~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: bash

   # Using absolute paths (recommended for scripts)
   wimarka --src_file_path /home/user/data/source.txt \
           --src_lang EN \
           --tgt_file_path /home/user/data/translation.txt \
           --tgt_lang CEB

Working with Output
-------------------

Console Output
~~~~~~~~~~~~~~

WiMarka prints evaluation progress and results to the console:

.. code-block:: text

   INFO - Starting evaluation...
   INFO - Evaluating line 1/3
   INFO - Detecting errors...
   INFO - Scoring translation...
   INFO - Generating explanation...
   INFO - Correcting translation...

   === Evaluation Results ===
   ----------------------------------------
   Line 1:
   Source: Good morning!
   Target: Maayong buntag!
   Errors: []
   Fluency Score: 100/100
   Adequacy Score: 100/100
   Overall Score: 100/100
   Explanation: Perfect translation with correct meaning and grammar.
   Suggested Correction: Maayong buntag!
   ----------------------------------------

Redirecting Output to File
~~~~~~~~~~~~~~~~~~~~~~~~~~

Save evaluation results to a file:

.. code-block:: bash

   wimarka --src_file_path source.txt \
           --src_lang EN \
           --tgt_file_path target.txt \
           --tgt_lang CEB > results.txt

Append to an existing file:

.. code-block:: bash

   wimarka --src_file_path source.txt \
           --src_lang EN \
           --tgt_file_path target.txt \
           --tgt_lang CEB >> all_results.txt

Suppressing Progress Messages
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To save only results without progress messages:

.. code-block:: bash

   wimarka --src_file_path source.txt \
           --src_lang EN \
           --tgt_file_path target.txt \
           --tgt_lang CEB 2>/dev/null > results.txt

Batch Processing
----------------

Process Multiple File Pairs (Bash)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: bash

   #!/bin/bash

   # List of file pairs
   pairs=(
       "file1_en.txt:file1_ceb.txt:CEB"
       "file2_en.txt:file2_ilo.txt:ILO"
       "file3_en.txt:file3_tgt.txt:TGT"
   )

   # Process each pair
   for pair in "${pairs[@]}"; do
       IFS=':' read -r src_file tgt_file tgt_lang <<< "$pair"
       echo "Evaluating $src_file -> $tgt_file"
       wimarka --src_file_path "$src_file" \
               --src_lang EN \
               --tgt_file_path "$tgt_file" \
               --tgt_lang "$tgt_lang"
       echo "---"
   done

Process All Files in Directory
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: bash

   #!/bin/bash

   # Process all English files and their Cebuano translations
   for src_file in data/en/*.txt; do
       # Get base filename
       base=$(basename "$src_file" .txt)
       tgt_file="data/ceb/${base}.txt"

       if [ -f "$tgt_file" ]; then
           echo "Evaluating: $base"
           wimarka --src_file_path "$src_file" \
                   --src_lang EN \
                   --tgt_file_path "$tgt_file" \
                   --tgt_lang CEB
       else
           echo "Warning: Translation not found for $base"
       fi
   done

Parallel Processing (GNU Parallel)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For faster processing of multiple file pairs:

.. code-block:: bash

   # Create a file list
   cat > filelist.txt < ...

.. code-block:: makefile

   	... $(RESULTS_DIR)/$$base.txt; \
   		fi; \
   	done

   clean:
   	rm -rf $(RESULTS_DIR)

Usage:

.. code-block:: bash

   make evaluate

Integration with Python Scripts
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Call WiMarka CLI from Python:

.. code-block:: python

   import subprocess
   import sys

   def run_wimarka(src_file, tgt_file, src_lang='EN', tgt_lang='CEB'):
       """Run WiMarka CLI from Python."""
       cmd = [
           'wimarka',
           '--src_file_path', src_file,
           '--src_lang', src_lang,
           '--tgt_file_path', tgt_file,
           '--tgt_lang', tgt_lang
       ]

       try:
           # check=True raises CalledProcessError on a non-zero exit code
           result = subprocess.run(
               cmd,
               capture_output=True,
               text=True,
               check=True
           )
           print(result.stdout)
           return result.returncode == 0
       except subprocess.CalledProcessError as e:
           print(f"Error: {e.stderr}", file=sys.stderr)
           return False

   # Usage
   success = run_wimarka('source.txt', 'translation.txt')
   if success:
       print("Evaluation completed")

Error Handling
--------------

Common Errors and Solutions
~~~~~~~~~~~~~~~~~~~~~~~~~~~

**File Not Found:**

.. code-block:: text

   Error: [Errno 2] No such file or directory: 'source.txt'

Solution: Check the file paths and ensure the files exist.

.. code-block:: bash

   ls -la source.txt target.txt

**Line Count Mismatch:**

.. code-block:: text

   ValueError: Source and target files must have the same number of lines.

Solution: Verify that both files have equal line counts.

.. code-block:: bash

   wc -l source.txt target.txt

**Invalid Language Code:**

.. code-block:: text

   Error: Invalid language code

Solution: Use valid codes (EN, CEB, ILO, TGT).

Exit Codes
~~~~~~~~~~

* ``0``: Success
* ``1``: Error (file not found, invalid arguments, etc.)
* ``2``: Command line usage error

Check the exit code in scripts:

.. code-block:: bash

   wimarka --src_file_path source.txt \
           --src_lang EN \
           --tgt_file_path target.txt \
           --tgt_lang CEB

   if [ $? -eq 0 ]; then
       echo "Success"
   else
       echo "Failed"
       exit 1
   fi

Best Practices
--------------

1. **Use Absolute Paths in Scripts**

   .. code-block:: bash

      # Good
      wimarka --src_file_path /home/user/data/source.txt ...

      # Avoid in scripts (relative paths can be ambiguous)
      wimarka --src_file_path ../data/source.txt ...

2. **Validate Inputs Before Running**

   .. code-block:: bash

      if [ ! -f "$src_file" ]; then
          echo "Error: Source file not found"
          exit 1
      fi

3. **Log Results for Reproducibility**

   .. code-block:: bash

      timestamp=$(date +%Y%m%d_%H%M%S)
      wimarka ... > "results_${timestamp}.txt"

4. **Use Meaningful File Names**

   .. code-block:: bash

      # Good
      wimarka --src_file_path en_news_articles.txt \
              --tgt_file_path ceb_news_articles.txt ...

      # Avoid
      wimarka --src_file_path file1.txt --tgt_file_path file2.txt ...

Tips and Tricks
---------------

Quick Evaluation of Single Sentence
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: bash

   # Create temporary files
   echo "Good morning!" > /tmp/src.txt
   echo "Maayong buntag!" > /tmp/tgt.txt

   # Evaluate
   wimarka --src_file_path /tmp/src.txt \
           --src_lang EN \
           --tgt_file_path /tmp/tgt.txt \
           --tgt_lang CEB

Comparing Translation Systems
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: bash

   # Evaluate System A
   wimarka --src_file_path source.txt \
           --src_lang EN \
           --tgt_file_path system_a.txt \
           --tgt_lang CEB > results_a.txt

   # Evaluate System B
   wimarka --src_file_path source.txt \
           --src_lang EN \
           --tgt_file_path system_b.txt \
           --tgt_lang CEB > results_b.txt

   # Compare
   diff results_a.txt results_b.txt

Next Steps
----------

* See :doc:`output_format` to understand the evaluation output
* See :doc:`examples` for more real-world scenarios
* See :doc:`usage_library` for Python library usage
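
The three common errors listed under Error Handling (missing files, mismatched line counts, invalid language codes) can be checked together before invoking the CLI. The following is a minimal sketch in Python and is not part of WiMarka: the names ``count_lines`` and ``preflight`` are hypothetical, the accepted codes are copied from the options table in this guide, and the line-counting rule is an assumption about how WiMarka compares the two files.

.. code-block:: python

   from pathlib import Path

   # Codes copied from the options table in this guide (assumption: your
   # installed version accepts the same set).
   SRC_LANGS = {"EN", "CEB", "ILO", "TGT"}
   TGT_LANGS = {"CEB", "ILO", "TGT"}


   def count_lines(path):
       """Count lines in a UTF-8 text file (a final unterminated line counts)."""
       with open(path, encoding="utf-8") as handle:
           return sum(1 for _ in handle)


   def preflight(src_file, tgt_file, src_lang, tgt_lang):
       """Return a list of problems; an empty list means the inputs look usable."""
       problems = []
       files_ok = True

       # Check that both input files exist.
       for label, path in (("source", src_file), ("target", tgt_file)):
           if not Path(path).is_file():
               problems.append(f"{label} file not found: {path}")
               files_ok = False

       # Check the language codes against the documented sets.
       if src_lang not in SRC_LANGS:
           problems.append(f"invalid source language code: {src_lang}")
       if tgt_lang not in TGT_LANGS:
           problems.append(f"invalid target language code: {tgt_lang}")

       # Compare line counts only when both files can be read.
       if files_ok:
           src_n, tgt_n = count_lines(src_file), count_lines(tgt_file)
           if src_n != tgt_n:
               problems.append(
                   f"line count mismatch: source has {src_n}, target has {tgt_n}"
               )
       return problems


   if __name__ == "__main__":
       for issue in preflight("source.txt", "target.txt", "EN", "CEB"):
           print(f"Error: {issue}")

Returning a list of problems rather than raising on the first one lets a batch script report every issue with a file pair in one pass and skip the ``wimarka`` call only when the list is non-empty.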