Conversion System

This document provides detailed information about arraybridge’s conversion system, including the algorithms, optimization strategies, and implementation details.

Overview

arraybridge provides automatic conversion between six array/tensor frameworks using a combination of zero-copy operations (via DLPack) and NumPy-based fallback conversions. The conversion system is designed to be:

  • Fast: Uses zero-copy when possible via DLPack

  • Reliable: Falls back to NumPy bridge when needed

  • Type-preserving: Maintains dtypes across conversions

  • Device-aware: Handles CPU/GPU device management

Core Functions

detect_memory_type

Detect the framework and memory type of an array or tensor.

from arraybridge import detect_memory_type

# Returns: 'numpy', 'cupy', 'torch', 'tensorflow', 'jax', 'pyclesperanto', or None
mem_type = detect_memory_type(array)

Implementation Details:

  • Uses isinstance() checks for known types

  • Returns None for unsupported types

  • Handles both CPU and GPU arrays

  • Thread-safe and performant

Supported Types:

  • NumPy: numpy.ndarray

  • CuPy: cupy.ndarray

  • PyTorch: torch.Tensor

  • TensorFlow: tensorflow.Tensor, tensorflow.EagerTensor

  • JAX: jax.Array, jaxlib.xla_extension.DeviceArray

  • pyclesperanto: pyclesperanto_prototype._tier0._pycl.OCLArray

convert_memory

Convert arrays between different memory types and devices.

from arraybridge import convert_memory

result = convert_memory(
    data,
    source_type='numpy',
    target_type='torch',
    gpu_id=0
)

Parameters:

  • data: Input array/tensor

  • source_type: Source memory type (MemoryType or str)

  • target_type: Target memory type (MemoryType or str)

  • gpu_id: GPU device ID (default: 0), None for CPU

Returns:

  • Converted array/tensor in the target format

Raises:

  • MemoryConversionError: If conversion fails

Conversion Strategies

DLPack (Zero-Copy)

DLPack enables zero-copy sharing of GPU memory between frameworks.

Supported Paths:

  • CuPy ↔ PyTorch (GPU only)

  • CuPy ↔ JAX (GPU only)

  • PyTorch ↔ JAX (GPU only)

Example:

import cupy as cp
from arraybridge import convert_memory

# Create CuPy array on GPU
cupy_data = cp.random.rand(1000, 1000)

# Zero-copy conversion to PyTorch
torch_data = convert_memory(cupy_data, 'cupy', 'torch', gpu_id=0)

# Same memory location - zero copy!
assert torch_data.data_ptr() == cupy_data.data.ptr

When DLPack is Used:

  1. Source and target are both GPU-based

  2. Both frameworks support DLPack

  3. Arrays are contiguous in memory

  4. No dtype conversion is needed

NumPy Bridge

When DLPack is not available, arraybridge uses NumPy as an intermediate format.

Conversion Path:

  1. Source → NumPy (using __array__() or .cpu().numpy())

  2. NumPy → Target (using framework-specific constructors)

Example:

# PyTorch (GPU) → NumPy → TensorFlow (GPU)
torch_data = torch.rand(100, 100).cuda()
tf_data = convert_memory(torch_data, 'torch', 'tensorflow', gpu_id=0)

When NumPy Bridge is Used:

  • Source or target is CPU-based

  • DLPack not supported for the framework pair

  • Dtype conversion is required

  • Non-contiguous arrays

Framework-Specific Conversions

Some frameworks require special handling:

TensorFlow:

# Uses tf.constant() or tf.identity()
# Handles eager vs graph mode
tf_data = convert_memory(np_data, 'numpy', 'tensorflow')

pyclesperanto:

# Uses push() and pull() operations
# Always GPU-based
cle_data = convert_memory(np_data, 'numpy', 'pyclesperanto', gpu_id=0)

JAX:

# Uses jax.device_put() for device placement
# Respects JAX's device management
jax_data = convert_memory(np_data, 'numpy', 'jax', gpu_id=0)

Device Management

GPU Device Selection

arraybridge handles device selection for GPU operations:

# Move to GPU 0
gpu0_data = convert_memory(data, 'numpy', 'cupy', gpu_id=0)

# Move to GPU 1
gpu1_data = convert_memory(data, 'numpy', 'cupy', gpu_id=1)

Framework-Specific Behavior:

  • CuPy: Uses cp.cuda.Device(gpu_id)

  • PyTorch: Uses torch.cuda.device(gpu_id)

  • JAX: Uses jax.devices('gpu')[gpu_id]

  • TensorFlow: Uses tf.device(f'/GPU:{gpu_id}')

CPU Operations

When gpu_id=None, operations are CPU-only:

# CPU-only conversion
torch_cpu = convert_memory(np_data, 'numpy', 'torch', gpu_id=None)
print(torch_cpu.device)  # cpu

Cross-Device Transfers

Moving data between devices:

# GPU 0 to GPU 1
gpu0_data = convert_memory(data, 'numpy', 'torch', gpu_id=0)
gpu1_data = convert_memory(gpu0_data, 'torch', 'torch', gpu_id=1)

# GPU to CPU
cpu_data = convert_memory(gpu0_data, 'torch', 'numpy', gpu_id=None)

Dtype Handling

Dtype Preservation

arraybridge preserves dtypes during conversion:

import numpy as np
from arraybridge import convert_memory

# Float32 preservation
f32_data = np.array([1, 2, 3], dtype=np.float32)
torch_data = convert_memory(f32_data, 'numpy', 'torch')
assert torch_data.dtype == torch.float32

# Int64 preservation
i64_data = np.array([1, 2, 3], dtype=np.int64)
torch_data = convert_memory(i64_data, 'numpy', 'torch')
assert torch_data.dtype == torch.int64

Dtype Mapping

Framework-specific dtype mappings:

NumPy → PyTorch:

  • float32torch.float32

  • float64torch.float64

  • int32torch.int32

  • int64torch.int64

NumPy → TensorFlow:

  • float32tf.float32

  • float64tf.float64

  • int32tf.int32

  • int64tf.int64

NumPy → CuPy:

  • Exact dtype mapping (CuPy follows NumPy)

Conversion Performance

Benchmarks

Relative performance of different conversion strategies:

  1. DLPack (Zero-Copy): ~0.001 ms (pointer sharing)

  2. NumPy Bridge (Small): ~0.1-1 ms (< 1 MB)

  3. NumPy Bridge (Large): ~10-100 ms (> 100 MB)

  4. CPU-GPU Transfer: 10-1000 ms (depends on size and PCIe)

Optimization Tips

  1. Use Zero-Copy When Possible:

    # Fast: DLPack zero-copy
    cupy_data = cp.random.rand(1000, 1000)
    torch_data = convert_memory(cupy_data, 'cupy', 'torch')
    
  2. Avoid Unnecessary Conversions:

    # Bad: Convert in loop
    for i in range(100):
        torch_data = convert_memory(np_data, 'numpy', 'torch')
        result = process(torch_data)
    
    # Good: Convert once
    torch_data = convert_memory(np_data, 'numpy', 'torch')
    for i in range(100):
        result = process(torch_data)
    
  3. Batch CPU-GPU Transfers:

    # Transfer multiple arrays together
    batch = np.stack([arr1, arr2, arr3])
    gpu_batch = convert_memory(batch, 'numpy', 'torch', gpu_id=0)
    
  4. Use Pinned Memory for Large Transfers:

    # PyTorch pinned memory for faster CPU-GPU transfer
    import torch
    pinned_data = torch.from_numpy(np_data).pin_memory()
    gpu_data = pinned_data.cuda()
    

Error Handling

Common Errors

Framework Not Available:

try:
    result = convert_memory(data, 'numpy', 'torch')
except MemoryConversionError as e:
    if "not available" in str(e):
        print("PyTorch is not installed")

Invalid Memory Type:

try:
    result = convert_memory(data, 'invalid', 'numpy')
except MemoryConversionError as e:
    print(f"Invalid memory type: {e}")

GPU Not Available:

try:
    result = convert_memory(data, 'numpy', 'cupy', gpu_id=0)
except MemoryConversionError as e:
    if "CUDA" in str(e):
        print("GPU not available, fallback to CPU")
        result = convert_memory(data, 'numpy', 'numpy')

Recovery Strategies

Implement fallback logic:

def safe_convert(data, target_type, gpu_id=0):
    """Convert with automatic fallback."""
    source_type = detect_memory_type(data)

    try:
        # Try GPU conversion
        return convert_memory(data, source_type, target_type, gpu_id=gpu_id)
    except MemoryConversionError:
        # Fallback to CPU
        print("GPU conversion failed, using CPU")
        return convert_memory(data, source_type, target_type, gpu_id=None)

Advanced Topics

Custom Memory Types

arraybridge supports a fixed set of memory types. To add custom types, you would need to extend the MemoryType enum and implement conversion logic in the converters module.

Thread Safety

The conversion functions are thread-safe, but:

  • GPU contexts are thread-local

  • Framework-specific thread safety applies

  • Use locks when sharing GPU devices across threads

Memory Management

  • Converted arrays are independent copies (except DLPack)

  • Source arrays are not modified

  • Garbage collection works normally

  • GPU memory is released when arrays are deleted

Conversion Matrix

Full conversion support matrix:

Conversion Support

Source\Target

NumPy

CuPy

PyTorch

TensorFlow

JAX

pyclesperanto

NumPy

CuPy

✓ (DLPack)

✓ (DLPack)

PyTorch

✓ (DLPack)

✓ (DLPack)

TensorFlow

JAX

✓ (DLPack)

✓ (DLPack)

pyclesperanto

✓ = Supported, DLPack = Zero-copy via DLPack

API Reference

See API Reference for complete function signatures and parameters.

Next Steps