Hpt
Home
GitHub
crate.io
Benchmarks

Custom Type

Because hpt is written entirely in terms of generic types, you can define your own type and use it in every computation hpt supports.

How

You can follow the steps described here.

Note

For now, your custom type must implement the Copy trait. hpt cannot support types that implement only Clone because of how conv2d is implemented: conv2d preallocates its registers as a fixed-size array [T; N], and this requires T to implement Copy.
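The Copy requirement can be illustrated with a minimal sketch. This is not hpt's conv2d code: `MyFloat`, `accumulate_lanes`, and `demo` are hypothetical names invented for the example, and the only point is that filling a fixed-size array `[T; N]` from a single value needs `T: Copy`.

```rust
use std::ops::Add;

// Hypothetical custom element type. Deriving Copy is what lets it be used
// in an array repeat expression such as `[value; N]`.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct MyFloat(f32);

impl Add for MyFloat {
    type Output = MyFloat;
    fn add(self, rhs: MyFloat) -> MyFloat {
        MyFloat(self.0 + rhs.0)
    }
}

// Hypothetical helper that mimics a kernel keeping N accumulators in a
// fixed-size array. Removing the `Copy` bound makes the `[T::default(); N]`
// line fail to compile, because the repeat expression has to duplicate
// the value N times.
fn accumulate_lanes<T, const N: usize>(inputs: &[T]) -> [T; N]
where
    T: Copy + Default + Add<Output = T>,
{
    assert!(N > 0, "need at least one accumulator");
    let mut regs = [T::default(); N]; // requires T: Copy
    for (i, &x) in inputs.iter().enumerate() {
        // Element i is added to accumulator i % N.
        regs[i % N] = regs[i % N] + x;
    }
    regs
}

// Example call with six inputs spread over four accumulators.
fn demo() -> [MyFloat; 4] {
    let inputs = [
        MyFloat(1.0),
        MyFloat(2.0),
        MyFloat(3.0),
        MyFloat(4.0),
        MyFloat(5.0),
        MyFloat(6.0),
    ];
    accumulate_lanes::<MyFloat, 4>(&inputs)
}
```

A type that only implements Clone would need each array slot initialized separately (for example with `std::array::from_fn`), which is a different code path from the repeat expression shown here.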

Backend Support

Backend    Supported
CPU        ✅
CUDA       ❌
Last updated: 2025/6/24 21:23
Contributors: Jianqoq