Conversion System
=================

This document provides detailed information about arraybridge's conversion system, including
the algorithms, optimization strategies, and implementation details.

Overview
--------

arraybridge provides automatic conversion between six array/tensor frameworks using a combination
of zero-copy operations (via DLPack) and NumPy-based fallback conversions. The conversion
system is designed to be:

- **Fast**: Uses zero-copy when possible via DLPack
- **Reliable**: Falls back to NumPy bridge when needed
- **Type-preserving**: Maintains dtypes across conversions
- **Device-aware**: Handles CPU/GPU device management

Core Functions
--------------

detect_memory_type
~~~~~~~~~~~~~~~~~~

Detect the framework and memory type of an array or tensor.

.. code-block:: python

   from arraybridge import detect_memory_type

   # Returns: 'numpy', 'cupy', 'torch', 'tensorflow', 'jax', 'pyclesperanto', or None
   mem_type = detect_memory_type(array)

**Implementation Details:**

- Uses ``isinstance()`` checks for known types
- Returns ``None`` for unsupported types
- Handles both CPU and GPU arrays
- Thread-safe and performant

**Supported Types:**

- NumPy: ``numpy.ndarray``
- CuPy: ``cupy.ndarray``
- PyTorch: ``torch.Tensor``
- TensorFlow: ``tensorflow.Tensor``, ``tensorflow.EagerTensor``
- JAX: ``jax.Array``, ``jaxlib.xla_extension.DeviceArray``
- pyclesperanto: ``pyclesperanto_prototype._tier0._pycl.OCLArray``

convert_memory
~~~~~~~~~~~~~~

Convert arrays between different memory types and devices.

.. code-block:: python

   from arraybridge import convert_memory

   result = convert_memory(
       data,
       source_type='numpy',
       target_type='torch',
       gpu_id=0
   )

**Parameters:**

- ``data``: Input array/tensor
- ``source_type``: Source memory type (MemoryType or str)
- ``target_type``: Target memory type (MemoryType or str)
- ``gpu_id``: GPU device ID (default: 0), None for CPU

**Returns:**

- Converted array/tensor in the target format

**Raises:**

- ``MemoryConversionError``: If conversion fails

Conversion Strategies
---------------------

DLPack (Zero-Copy)
~~~~~~~~~~~~~~~~~~

DLPack enables zero-copy sharing of GPU memory between frameworks.

**Supported Paths:**

- CuPy ↔ PyTorch (GPU only)
- CuPy ↔ JAX (GPU only)
- PyTorch ↔ JAX (GPU only)

**Example:**

.. code-block:: python

   import cupy as cp
   from arraybridge import convert_memory

   # Create CuPy array on GPU
   cupy_data = cp.random.rand(1000, 1000)

   # Zero-copy conversion to PyTorch
   torch_data = convert_memory(cupy_data, 'cupy', 'torch', gpu_id=0)

   # Same memory location - zero copy!
   assert torch_data.data_ptr() == cupy_data.data.ptr

**When DLPack is Used:**

1. Source and target are both GPU-based
2. Both frameworks support DLPack
3. Arrays are contiguous in memory
4. No dtype conversion is needed

NumPy Bridge
~~~~~~~~~~~~

When DLPack is not available, arraybridge uses NumPy as an intermediate format.

**Conversion Path:**

1. Source → NumPy (using ``__array__()`` or ``.cpu().numpy()``)
2. NumPy → Target (using framework-specific constructors)

**Example:**

.. code-block:: python

   # PyTorch (GPU) → NumPy → TensorFlow (GPU)
   torch_data = torch.rand(100, 100).cuda()
   tf_data = convert_memory(torch_data, 'torch', 'tensorflow', gpu_id=0)

**When NumPy Bridge is Used:**

- Source or target is CPU-based
- DLPack not supported for the framework pair
- Dtype conversion is required
- Non-contiguous arrays

Framework-Specific Conversions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Some frameworks require special handling:

**TensorFlow:**

.. code-block:: python

   # Uses tf.constant() or tf.identity()
   # Handles eager vs graph mode
   tf_data = convert_memory(np_data, 'numpy', 'tensorflow')

**pyclesperanto:**

.. code-block:: python

   # Uses push() and pull() operations
   # Always GPU-based
   cle_data = convert_memory(np_data, 'numpy', 'pyclesperanto', gpu_id=0)

**JAX:**

.. code-block:: python

   # Uses jax.device_put() for device placement
   # Respects JAX's device management
   jax_data = convert_memory(np_data, 'numpy', 'jax', gpu_id=0)

Device Management
-----------------

GPU Device Selection
~~~~~~~~~~~~~~~~~~~~

arraybridge handles device selection for GPU operations:

.. code-block:: python

   # Move to GPU 0
   gpu0_data = convert_memory(data, 'numpy', 'cupy', gpu_id=0)

   # Move to GPU 1
   gpu1_data = convert_memory(data, 'numpy', 'cupy', gpu_id=1)

**Framework-Specific Behavior:**

- **CuPy**: Uses ``cp.cuda.Device(gpu_id)``
- **PyTorch**: Uses ``torch.cuda.device(gpu_id)``
- **JAX**: Uses ``jax.devices('gpu')[gpu_id]``
- **TensorFlow**: Uses ``tf.device(f'/GPU:{gpu_id}')``

CPU Operations
~~~~~~~~~~~~~~

When ``gpu_id=None``, operations are CPU-only:

.. code-block:: python

   # CPU-only conversion
   torch_cpu = convert_memory(np_data, 'numpy', 'torch', gpu_id=None)
   print(torch_cpu.device)  # cpu

Cross-Device Transfers
~~~~~~~~~~~~~~~~~~~~~~

Moving data between devices:

.. code-block:: python

   # GPU 0 to GPU 1
   gpu0_data = convert_memory(data, 'numpy', 'torch', gpu_id=0)
   gpu1_data = convert_memory(gpu0_data, 'torch', 'torch', gpu_id=1)

   # GPU to CPU
   cpu_data = convert_memory(gpu0_data, 'torch', 'numpy', gpu_id=None)

Dtype Handling
--------------

Dtype Preservation
~~~~~~~~~~~~~~~~~~

arraybridge preserves dtypes during conversion:

.. code-block:: python

   import numpy as np
   from arraybridge import convert_memory

   # Float32 preservation
   f32_data = np.array([1, 2, 3], dtype=np.float32)
   torch_data = convert_memory(f32_data, 'numpy', 'torch')
   assert torch_data.dtype == torch.float32

   # Int64 preservation
   i64_data = np.array([1, 2, 3], dtype=np.int64)
   torch_data = convert_memory(i64_data, 'numpy', 'torch')
   assert torch_data.dtype == torch.int64

Dtype Mapping
~~~~~~~~~~~~~

Framework-specific dtype mappings:

**NumPy → PyTorch:**

- ``float32`` → ``torch.float32``
- ``float64`` → ``torch.float64``
- ``int32`` → ``torch.int32``
- ``int64`` → ``torch.int64``

**NumPy → TensorFlow:**

- ``float32`` → ``tf.float32``
- ``float64`` → ``tf.float64``
- ``int32`` → ``tf.int32``
- ``int64`` → ``tf.int64``

**NumPy → CuPy:**

- Exact dtype mapping (CuPy follows NumPy)

Conversion Performance
----------------------

Benchmarks
~~~~~~~~~~

Relative performance of different conversion strategies:

1. **DLPack (Zero-Copy)**: ~0.001 ms (pointer sharing)
2. **NumPy Bridge (Small)**: ~0.1-1 ms (< 1 MB)
3. **NumPy Bridge (Large)**: ~10-100 ms (> 100 MB)
4. **CPU-GPU Transfer**: 10-1000 ms (depends on size and PCIe)

Optimization Tips
~~~~~~~~~~~~~~~~~

1. **Use Zero-Copy When Possible:**

   .. code-block:: python

      # Fast: DLPack zero-copy
      cupy_data = cp.random.rand(1000, 1000)
      torch_data = convert_memory(cupy_data, 'cupy', 'torch')

2. **Avoid Unnecessary Conversions:**

   .. code-block:: python

      # Bad: Convert in loop
      for i in range(100):
          torch_data = convert_memory(np_data, 'numpy', 'torch')
          result = process(torch_data)

      # Good: Convert once
      torch_data = convert_memory(np_data, 'numpy', 'torch')
      for i in range(100):
          result = process(torch_data)

3. **Batch CPU-GPU Transfers:**

   .. code-block:: python

      # Transfer multiple arrays together
      batch = np.stack([arr1, arr2, arr3])
      gpu_batch = convert_memory(batch, 'numpy', 'torch', gpu_id=0)

4. **Use Pinned Memory for Large Transfers:**

   .. code-block:: python

      # PyTorch pinned memory for faster CPU-GPU transfer
      import torch
      pinned_data = torch.from_numpy(np_data).pin_memory()
      gpu_data = pinned_data.cuda()

Error Handling
--------------

Common Errors
~~~~~~~~~~~~~

**Framework Not Available:**

.. code-block:: python

   try:
       result = convert_memory(data, 'numpy', 'torch')
   except MemoryConversionError as e:
       if "not available" in str(e):
           print("PyTorch is not installed")

**Invalid Memory Type:**

.. code-block:: python

   try:
       result = convert_memory(data, 'invalid', 'numpy')
   except MemoryConversionError as e:
       print(f"Invalid memory type: {e}")

**GPU Not Available:**

.. code-block:: python

   try:
       result = convert_memory(data, 'numpy', 'cupy', gpu_id=0)
   except MemoryConversionError as e:
       if "CUDA" in str(e):
           print("GPU not available, fallback to CPU")
           result = convert_memory(data, 'numpy', 'numpy')

Recovery Strategies
~~~~~~~~~~~~~~~~~~~

Implement fallback logic:

.. code-block:: python

   def safe_convert(data, target_type, gpu_id=0):
       """Convert with automatic fallback."""
       source_type = detect_memory_type(data)

       try:
           # Try GPU conversion
           return convert_memory(data, source_type, target_type, gpu_id=gpu_id)
       except MemoryConversionError:
           # Fallback to CPU
           print("GPU conversion failed, using CPU")
           return convert_memory(data, source_type, target_type, gpu_id=None)

Advanced Topics
---------------

Custom Memory Types
~~~~~~~~~~~~~~~~~~~

arraybridge supports a fixed set of memory types. To add custom types,
you would need to extend the ``MemoryType`` enum and implement conversion
logic in the converters module.

Thread Safety
~~~~~~~~~~~~~

The conversion functions are thread-safe, but:

- GPU contexts are thread-local
- Framework-specific thread safety applies
- Use locks when sharing GPU devices across threads

Memory Management
~~~~~~~~~~~~~~~~~

- Converted arrays are independent copies (except DLPack)
- Source arrays are not modified
- Garbage collection works normally
- GPU memory is released when arrays are deleted

Conversion Matrix
-----------------

Full conversion support matrix:

.. list-table:: Conversion Support
   :header-rows: 1
   :stub-columns: 1

   * - Source\\Target
     - NumPy
     - CuPy
     - PyTorch
     - TensorFlow
     - JAX
     - pyclesperanto
   * - NumPy
     - ✓
     - ✓
     - ✓
     - ✓
     - ✓
     - ✓
   * - CuPy
     - ✓
     - ✓
     - ✓ (DLPack)
     - ✓
     - ✓ (DLPack)
     - ✓
   * - PyTorch
     - ✓
     - ✓ (DLPack)
     - ✓
     - ✓
     - ✓ (DLPack)
     - ✓
   * - TensorFlow
     - ✓
     - ✓
     - ✓
     - ✓
     - ✓
     - ✓
   * - JAX
     - ✓
     - ✓ (DLPack)
     - ✓ (DLPack)
     - ✓
     - ✓
     - ✓
   * - pyclesperanto
     - ✓
     - ✓
     - ✓
     - ✓
     - ✓
     - ✓

✓ = Supported, DLPack = Zero-copy via DLPack

API Reference
-------------

See :doc:`api_reference` for complete function signatures and parameters.

Next Steps
----------

- Learn about :doc:`decorators` for automatic conversion
- Explore :doc:`gpu_features` for device management
- Check :doc:`advanced_topics` for optimization strategies