Bidirectional Text Demo
The Bidirectional Text Demo is an interactive tool that demonstrates how Arabic OS handles bidirectional (BiDi) text processing. This tool showcases the breakthrough UTF-8 implementation and illustrates how Arabic (RTL) and Latin (LTR) text can coexist seamlessly in a single system.
Overview
Bidirectional text processing is one of the most complex challenges in internationalization. Arabic script flows right-to-left (RTL), while Latin script flows left-to-right (LTR). When these scripts appear together, sophisticated algorithms determine the proper display order while maintaining readability for both script types.
Key Learning Objectives
By using this tool, you will understand:
The Unicode Bidirectional Algorithm (UBA) fundamentals
How mixed RTL/LTR text is processed and displayed
Character classification for directional processing
The Arabic OS text rendering pipeline
Real-world challenges in multilingual text display
Interactive Features
Live Bidirectional Text Processor
The main interface provides real-time bidirectional text processing:
Text Input: Enter any combination of Arabic and Latin text
Instant Rendering: See immediate BiDi algorithm results
Character Analysis: Detailed breakdown of each character’s properties
Direction Statistics: Metrics on text composition and flow
UTF-8 Encoding Info: Byte-level analysis of input text
Preset Text Examples
Quick access to common bidirectional text scenarios:
“مرحبا Hello” - Basic Arabic-English mixing
“Arabic OS نظام تشغيل عربي” - Technical terminology mixing
“Version 2.0 الإصدار الثاني” - Versioning with translations
“UTF-8 يدعم العربية” - Encoding demonstration
Pure Arabic - Right-to-left only text
Pure English - Left-to-right only text
Character Analysis Panel
For each character in your input, view:
Character Display: Visual representation with directional indicators
Unicode Category: Strong RTL, Strong LTR, Neutral, etc.
Script Classification: Arabic, Latin, Common, etc.
Directional Properties: How the character affects text flow
Position in Algorithm: Role in bidirectional processing
Arabic OS Text Processing Pipeline
Step-by-step visualization of the kernel’s text processing:
Input Detection: Character encoding analysis and validation
Character Classification: Assignment of directional properties
Direction Resolution: Application of the Unicode Bidirectional Algorithm
Display Output: Final rendering with proper text flow
Technical Specifications
Unicode Bidirectional Algorithm
The tool implements key UBA concepts:
Character Types:
Strong RTL: Arabic letters (ا, ب, ت, etc.)
Strong LTR: Latin letters (A, B, C, etc.)
Weak: Numbers (0-9, ٠-٩) and some punctuation
Neutral: Spaces, symbols, punctuation marks
Processing Rules:
Paragraph Direction: Determined by first strong character
Run Segmentation: Text divided into directional runs
Weak Character Resolution: Numbers and punctuation inherit direction
Neutral Character Resolution: Symbols follow surrounding text
Level Resolution: Final ordering based on embedding levels
Arabic Text Direction Rules
Specific rules for Arabic text processing:
Text Type |
Direction |
Example |
|---|---|---|
Pure Arabic |
RTL |
مرحبا بالعالم |
Arabic + Numbers |
RTL + LTR numbers |
الصفحة 123 |
Arabic + Latin |
RTL + LTR words |
مرحبا Hello عالم |
Technical Terms |
Context-dependent |
HTML في صفحة الويب |
Mixed Script Processing
Complex scenarios handled by the algorithm:
Nested Directions:
`
English sentence with مرحبا Arabic text inside.
`
Number Processing:
`
الصفحة 123 من 456 صفحة
`
Punctuation Handling:
`
"مرحبا Hello!" قال الرجل.
`
Practical Exercises
Exercise 1: Basic Direction Analysis
Experiment with simple mixed text:
Test cases: 1. “Hello مرحبا” 2. “مرحبا Hello” 3. “Hello مرحبا World”
Observations to make: * How paragraph direction is determined * Where each word appears visually * How punctuation behaves * The role of spaces in direction resolution
Exercise 2: Number and Punctuation
Analyze how numbers behave in different contexts:
Test cases: 1. “Page 123” 2. “الصفحة 123” 3. “Version 2.0 الإصدار 2.0”
Analysis points: * Number direction in LTR vs RTL context * How punctuation affects direction * Mixed numbering systems (123 vs ١٢٣)
Exercise 3: Complex Nested Text
Explore challenging bidirectional scenarios:
Test cases: 1. “The word مرحبا means hello” 2. “HTML tags like <div> في البرمجة” 3. “Email: user@domain.com البريد الإلكتروني”
Advanced concepts: * Embedding levels * Override characters * Technical term handling
Common BiDi Challenges
Text Alignment Issues
Problem: Text doesn’t align properly in mixed content
Causes: * Incorrect paragraph direction detection * CSS direction conflicts * Missing RTL styling
Solutions:
.bidi-content {
direction: rtl;
text-align: right;
unicode-bidi: bidi-override;
}
.ltr-embedded {
direction: ltr;
unicode-bidi: embed;
}
Number Display Problems
Problem: Numbers appear in wrong positions
Example: Instead of “الصفحة 123” showing as intended, it might display incorrectly
Solutions: * Use proper BiDi marks (LRM, RLM) * Apply CSS direction explicitly * Consider Unicode directional formatting
<!-- Force number direction -->
الصفحة ‭123‬
Punctuation Confusion
Problem: Punctuation appears on wrong side of text
Examples: * Question marks in wrong position * Quotes opening/closing incorrectly
Solutions:
.arabic-text {
direction: rtl;
}
.arabic-text:before {
content: "«";
}
.arabic-text:after {
content: "»";
}
Implementation Examples
Web Development
HTML with proper BiDi support:
<!DOCTYPE html>
<html lang="ar" dir="rtl">
<head>
<meta charset="UTF-8">
<title>موقع ثنائي الاتجاه</title>
</head>
<body>
<h1>Arabic OS نظام التشغيل العربي</h1>
<p>This is English text. هذا نص عربي.</p>
</body>
</html>
CSS for mixed content:
.mixed-content {
direction: rtl;
text-align: start;
unicode-bidi: plaintext;
}
.english-phrase {
direction: ltr;
unicode-bidi: embed;
display: inline;
}
JavaScript BiDi Detection
Automatic direction detection:
function detectTextDirection(text) {
const rtlChars = /[\u0590-\u08FF]/;
const ltrChars = /[A-Za-z]/;
const rtlCount = (text.match(rtlChars) || []).length;
const ltrCount = (text.match(ltrChars) || []).length;
if (rtlCount > ltrCount) {
return 'rtl';
} else if (ltrCount > rtlCount) {
return 'ltr';
}
return 'auto';
}
// Apply direction automatically
function applyBiDi(element) {
const text = element.textContent;
const direction = detectTextDirection(text);
element.style.direction = direction;
element.setAttribute('dir', direction);
}
Database Considerations
MySQL UTF-8 configuration for BiDi:
CREATE TABLE articles (
id INT PRIMARY KEY,
title TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci,
content LONGTEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci,
direction ENUM('ltr', 'rtl', 'auto') DEFAULT 'auto'
);
Arabic OS Integration
Kernel-Level BiDi Support
The Arabic OS kernel provides:
Hardware-accelerated BiDi processing
Font shaping for connected Arabic letters
Optimized memory layout for RTL text
System-wide directional consistency
Terminal and Console
Arabic OS terminal features:
RTL command prompt support
Mixed-direction file paths
Arabic command aliases
Bidirectional text editing
File System
BiDi support throughout the file system:
RTL file and folder names
Mixed-script path handling
Proper sorting of Arabic names
Metadata with directional information
API Reference
For developers working with BiDi text:
JavaScript BiDi API:
// Check if text contains RTL characters
function hasRTLCharacters(text) {
return /[\u0590-\u08FF]/.test(text);
}
// Get paragraph direction
function getParagraphDirection(text) {
const firstStrong = text.match(/[\u0590-\u08FF\u0041-\u005A\u0061-\u007A]/);
if (firstStrong) {
return /[\u0590-\u08FF]/.test(firstStrong[0]) ? 'rtl' : 'ltr';
}
return 'neutral';
}
// Apply BiDi formatting
function formatBiDiText(text, options = {}) {
return {
text: text,
direction: options.direction || getParagraphDirection(text),
hasRTL: hasRTLCharacters(text),
processedText: applyBiDiAlgorithm(text)
};
}
Python BiDi Processing:
import unicodedata
from bidi.algorithm import get_display
def analyze_bidi_text(text):
"""Analyze bidirectional properties of text."""
rtl_chars = sum(1 for c in text if unicodedata.bidirectional(c) in ['R', 'AL'])
ltr_chars = sum(1 for c in text if unicodedata.bidirectional(c) == 'L')
return {
'original': text,
'display': get_display(text),
'rtl_chars': rtl_chars,
'ltr_chars': ltr_chars,
'direction': 'rtl' if rtl_chars > ltr_chars else 'ltr'
}
Performance Considerations
BiDi Algorithm Optimization
Best practices for efficient BiDi processing:
Cache Results: Store processed text for repeated use
Segment Processing: Handle long texts in chunks
Early Detection: Quick RTL/LTR classification
Lazy Evaluation: Process only visible text portions
Memory Management
RTL text requires special memory considerations:
Buffer Allocation: Account for reordering overhead
Cache Efficiency: Optimize for BiDi access patterns
Garbage Collection: Manage temporary reordering arrays
Real-World Applications
Content Management Systems
BiDi considerations for CMS development:
Editor interfaces that handle mixed content
Template systems with automatic direction detection
Search functionality that works across scripts
URL handling for RTL domain names
E-commerce Applications
BiDi requirements for online stores:
Product names in multiple scripts
Pricing with appropriate number formatting
Address fields supporting RTL input
Review systems handling mixed-language content
Integration with Other Tools
This BiDi demo complements other Arabic OS tools:
Start with UTF-8 Byte Visualizer to understand character encoding
Use Virtual Arabic Keyboard to explore input methods
Continue to Arabic Font Renderer Demo for display rendering details
Advanced users can explore x86 Assembly Simulator for low-level implementation
Understanding bidirectional text processing is essential for creating truly multilingual applications. This demo provides hands-on experience with the algorithms and challenges involved in RTL/LTR text coexistence.
Further Learning
Continue your BiDi journey with:
Arabic Font Renderer Demo - How fonts handle bidirectional text
Virtual Arabic Keyboard - Input methods for RTL languages
../../../tutorials/advanced/bidi-algorithm - Deep dive into UBA implementation
../../../developer-guide/api/text-rendering - System-level BiDi APIs
Master bidirectional text concepts with this demo before tackling more advanced Arabic text processing challenges.
Social Media Platforms
Challenges in social media BiDi support:
User-generated content with unpredictable mixing
Hashtags and mentions in mixed scripts
Reply threading with varying directions
Embedded media with BiDi captions