2 changes: 1 addition & 1 deletion rocfile/docs/async.rst → hipfile/docs/async.rst
@@ -1,7 +1,7 @@
Asynchronous API
================

Asynchronous I/O is currently not supported in rocFile.
Asynchronous I/O is currently not supported in hipFile.

API calls from the async section of the reference manual will
fail and return error codes.
2 changes: 1 addition & 1 deletion rocfile/docs/batch.rst → hipfile/docs/batch.rst
@@ -1,7 +1,7 @@
Batch API
=========

Batch IO operations are currently not supported in rocFile.
Batch IO operations are currently not supported in hipFile.

API calls from the batch section of the reference manual will
fail and return error codes.
15 changes: 15 additions & 0 deletions hipfile/docs/core.rst
@@ -29,3 +29,18 @@ Any of the parameters can be ignored by passing in a NULL pointer.
We do not provide a hipFile equivalent to cuFile's ``cuFileGetVersion()``
via the ``hipify`` tool. This is because any logic involving the obtained
version number would be platform-specific and have to be customized regardless.

Parameter Getters and Setters
-----------------------------
Like cuFile, hipFile includes a set of functions for getting and setting library
parameters. These comprise several API calls that get/set parameters
of a particular type based on an enum selector (e.g., ``hipFileGetParameterSizeT()``).
These API calls are all unsupported at this time and will return errors
if called.

* ``hipFileError_t hipFileGetParameterSizeT(hipFileSizeTConfigParameter_t param, size_t *value)``
* ``hipFileError_t hipFileGetParameterBool(hipFileBoolConfigParameter_t param, bool *value)``
* ``hipFileError_t hipFileGetParameterString(hipFileStringConfigParameter_t param, char *desc_str, int len)``
* ``hipFileError_t hipFileSetParameterSizeT(hipFileSizeTConfigParameter_t param, size_t value)``
* ``hipFileError_t hipFileSetParameterBool(hipFileBoolConfigParameter_t param, bool value)``
* ``hipFileError_t hipFileSetParameterString(hipFileStringConfigParameter_t param, const char *desc_str)``
42 changes: 21 additions & 21 deletions rocfile/docs/driver.rst → hipfile/docs/driver.rst
@@ -4,7 +4,7 @@ Driver API
Basic API Usage
---------------

The rocFile driver API is used to initialize and shut down the "driver" (i.e.,
The hipFile driver API is used to initialize and shut down the "driver" (i.e.,
the library state). The driver is reference counted, and calls to open and
close the driver simply increase and decrease the reference count. When the
reference count drops to zero, the driver's data structures are cleaned up,
@@ -19,23 +19,23 @@ drivers open in a single process.

The basic driver API calls:

* ``rocFileError_t rocFileDriverOpen(void)``
* ``rocFileError_t rocFileDriverClose(void)``
* ``int64_t rocFileUseCount(void)``
* ``hipFileError_t hipFileDriverOpen(void)``
* ``hipFileError_t hipFileDriverClose(void)``
* ``int64_t hipFileUseCount(void)``

The open call should be called before making any other rocFile API calls
and the close call should be called when all rocFile API calls are complete.
The ``rocFileUseCount()`` call can be used to check the current reference
The open call should be called before making any other hipFile API calls
and the close call should be called when all hipFile API calls are complete.
The ``hipFileUseCount()`` call can be used to check the current reference
count of the driver.

Note that some API calls will automatically open the driver and bump the
reference count if it has not yet been explicitly opened:

* ``rocFileHandleRegister()``
* ``rocFileBufRegister()``
* ``hipFileHandleRegister()``
* ``hipFileBufRegister()``

Also note that this "implicit opening" will not be tracked, so it's up to the
application to add an extra call to ``rocFileDriverClose()`` to fully
application to add an extra call to ``hipFileDriverClose()`` to fully
close the driver.

These API calls will only increment the driver count if they succeed.
@@ -44,24 +44,24 @@ if the intent is to completely shut down the driver, as simply matching open
and close calls may not reduce the reference count to zero.
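
The reference-counting behaviour described above can be illustrated with a toy model. This is only a sketch and not hipFile's implementation: ``toy_driver_t`` and every ``toy_*`` function are hypothetical names invented for this example, and the choice to report failure when closing an already-closed driver is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the library state (hypothetical, not hipFile code). */
typedef struct {
    int64_t use_count;
} toy_driver_t;

/* Models an explicit open: always bumps the reference count. */
static void toy_driver_open(toy_driver_t *d)
{
    d->use_count++;
}

/* Models a close: drops the count by one.
 * Assumption: closing an already-closed driver reports failure. */
static bool toy_driver_close(toy_driver_t *d)
{
    if (d->use_count == 0) {
        return false;
    }
    d->use_count--;
    return true;
}

/* Models a register call: it opens the driver implicitly only when the
 * driver is not open yet, and only when the call itself succeeds. */
static bool toy_register(toy_driver_t *d, bool call_succeeds)
{
    if (!call_succeeds) {
        return false;
    }
    if (d->use_count == 0) {
        d->use_count++;
    }
    return true;
}

/* Closes until the count reaches zero; returns the number of closes used. */
static int64_t toy_driver_shutdown(toy_driver_t *d)
{
    int64_t closes = 0;
    while (toy_driver_close(d)) {
        closes++;
    }
    return closes;
}
```

In this model, a register call followed by one explicit open and one close leaves the count at one, which is why an extra close (or a loop like ``toy_driver_shutdown()``) is needed to reach zero.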

These API calls have been coded to behave like cuFile's calls, regarding
initialization and errors, but there is no guarantee that rocFile's driver
initialization and errors, but there is no guarantee that hipFile's driver
calls will match unpublished cuFile API behaviour.

Driver Getters and Setters
--------------------------

These API calls exist in rocFile, but currently have no effect and will return
These API calls exist in hipFile, but currently have no effect and will return
an error:

* ``rocFileError_t rocFileDriverGetProperties(rocFileDriverProps_t *props)``
* ``rocFileError_t rocFileDriverSetPollMode(bool poll, size_t poll_threshold_size)``
* ``rocFileError_t rocFileDriverSetMaxDirectIOSize(size_t max_direct_io_size)``
* ``rocFileError_t rocFileDriverSetMaxCacheSize(size_t max_cache_size)``
* ``rocFileError_t rocFileDriverSetMaxPinnedMemSize(size_t max_pinned_size)``
* ``hipFileError_t hipFileDriverGetProperties(hipFileDriverProps_t *props)``
* ``hipFileError_t hipFileDriverSetPollMode(bool poll, size_t poll_threshold_size)``
* ``hipFileError_t hipFileDriverSetMaxDirectIOSize(size_t max_direct_io_size)``
* ``hipFileError_t hipFileDriverSetMaxCacheSize(size_t max_cache_size)``
* ``hipFileError_t hipFileDriverSetMaxPinnedMemSize(size_t max_pinned_size)``

These API calls also exist in rocFile, but have no effect, cannot be used to set
These API calls also exist in hipFile, but have no effect, cannot be used to set
driver properties, and will return an error:

* ``rocFileError_t rocFileSetParameterSizeT(rocFileSizeTConfigParameter_t param, size_t value)``
* ``rocFileError_t rocFileSetParameterBool(rocFileBoolConfigParameter_t param, bool value)``
* ``rocFileError_t rocFileSetParameterString(rocFileStringConfigParameter_t param, const char *desc_str)``
* ``hipFileError_t hipFileSetParameterSizeT(hipFileSizeTConfigParameter_t param, size_t value)``
* ``hipFileError_t hipFileSetParameterBool(hipFileBoolConfigParameter_t param, bool value)``
* ``hipFileError_t hipFileSetParameterString(hipFileStringConfigParameter_t param, const char *desc_str)``
44 changes: 22 additions & 22 deletions rocfile/docs/errors.rst → hipfile/docs/errors.rst
@@ -1,62 +1,62 @@
Errors and Error Handling
=========================

Functions that return a rocFileOpError_t struct
Functions that return a hipFileOpError_t struct
-----------------------------------------------
Errors are handled identically to cuFile. Most API calls return
a ``rocFileError_t`` struct, which includes a ``rocFileOpError_t``
field for returning rocFile error codes, and a ``hipError_t``
a ``hipFileError_t`` struct, which includes a ``hipFileOpError_t``
field for returning hipFile error codes, and a ``hipError_t``
field for returning GPU error codes. These fields should be checked
for ``rocFileSuccess`` and ``hipSuccess``, respectively. Any other
for ``hipFileSuccess`` and ``hipSuccess``, respectively. Any other
values indicate an error.

::

typedef struct __ROCFILE_NODISCARD rocFileError {
rocFileOpError_t err; //!< Errors related to rocFile or the GPU IO driver
typedef struct __ROCFILE_NODISCARD hipFileError {
hipFileOpError_t err; //!< Errors related to hipFile or the GPU IO driver
hipError_t hip_drv_err; //!< Errors related to the GPU driver
} rocFileError_t;
} hipFileError_t;

Note that the struct is marked with the ``[[nodiscard]]`` attribute
so in C++17 / C23 or greater, the compiler will complain if you do
not check error values.

If a GPU driver error occurs the ``rocFileOpError_t`` value will be
``rocFileHipDriverError`` and the ``hipError_t`` field will be set to
If a GPU driver error occurs the ``hipFileOpError_t`` value will be
``hipFileHipDriverError`` and the ``hipError_t`` field will be set to
the appropriate HIP driver error value.

When any other rocFile error is returned, the ``rocFileOpError_t`` field will be
set to the appropriate rocFile error and the ``hipError_t`` field will
When any other hipFile error is returned, the ``hipFileOpError_t`` field will be
set to the appropriate hipFile error and the ``hipError_t`` field will
be set to ``hipSuccess``.
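
The two-field convention can be sketched as follows. To keep the example compilable without the HIP or hipFile headers, it uses stand-in types: every ``demo*`` name and every numeric value below is a placeholder invented for this sketch, not a real hipFile definition.

```c
#include <stdbool.h>

/* Stand-ins for hipError_t and hipFileOpError_t (placeholder values). */
typedef enum {
    demoHipSuccess = 0,
    demoHipErrorUnknown = 999
} demo_hip_error_t;

typedef enum {
    demoFileSuccess = 0,
    demoFileHipDriverError = 5001, /* placeholder value */
    demoFileOtherError = 5002      /* placeholder value */
} demo_op_error_t;

/* Mirrors the shape of the error struct: library code plus GPU code. */
typedef struct {
    demo_op_error_t err;
    demo_hip_error_t hip_drv_err;
} demo_file_error_t;

typedef enum {
    DEMO_OK,
    DEMO_GPU_DRIVER_ERROR,
    DEMO_LIBRARY_ERROR
} demo_error_kind_t;

/* Checks both fields: success requires both to report success; a GPU
 * driver error is signalled via the library field and detailed in the
 * GPU field; anything else is a library error. */
static demo_error_kind_t demo_classify(demo_file_error_t e)
{
    if (e.err == demoFileSuccess && e.hip_drv_err == demoHipSuccess) {
        return DEMO_OK;
    }
    if (e.err == demoFileHipDriverError || e.hip_drv_err != demoHipSuccess) {
        return DEMO_GPU_DRIVER_ERROR;
    }
    return DEMO_LIBRARY_ERROR;
}
```

In real code the helper macros listed below would normally be used instead of hand-written comparisons.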

Several helper macros are included in ``rocfile.h`` that help with error checking:
Several helper macros are included in ``hipfile.h`` that help with error checking:

* ``IS_ROCFILE_ERR()``
* ``ROCFILE_ERRSTR()``
* ``IS_HIP_DRV_ERR()``
* ``HIP_DRV_ERR()``

See the error section of the rocFile reference manual for documentation for
See the error section of the hipFile reference manual for documentation for
each of these macros.

Functions that return an integer value
--------------------------------------
Several read and write functions (e.g., ``rocFileRead()``) return a ``ssize_t`` value.
Several read and write functions (e.g., ``hipFileRead()``) return a ``ssize_t`` value.
Like the POSIX ``read(3)`` call, these API calls return negative values for errors.
Unlike the POSIX call, however, which only returns -1 on errors, the rocFile IO
calls return a value that reflects the ``rocFileOpError_t`` or ``hipError_t``
Unlike the POSIX call, however, which only returns -1 on errors, the hipFile IO
calls return a value that reflects the ``hipFileOpError_t`` or ``hipError_t``
value that would have been returned.

``hipError_t`` and its values are defined in ``hip/hip_runtime_api.h``. The enum
values are assigned integers up to ~1000. ``rocFileOpError_t`` enum values are
values are assigned integers up to ~1000. ``hipFileOpError_t`` enum values are
assigned values of 5000+.

When rocFile IO calls that return a ``ssize_t`` fail, the returned value is the
negative of the ``rocFileOpError_t`` value (rocFile errors) or the negative of
When hipFile IO calls that return a ``ssize_t`` fail, the returned value is the
negative of the ``hipFileOpError_t`` value (hipFile errors) or the negative of
the ``hipError_t`` value (GPU driver errors) that would normally have been returned
via the ``rocFileError_t`` struct.
via the ``hipFileError_t`` struct.
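
A caller can tell the two error families apart from the magnitude of the negated return value. The following is a minimal sketch that assumes only the ranges stated above (GPU driver codes up to roughly 1000, library codes of 5000 and higher); ``DEMO_OP_ERROR_BASE`` and the ``demo_*`` names are hypothetical and not part of hipFile.

```c
#include <sys/types.h> /* ssize_t */

/* Assumption taken from the text: library error codes start at 5000. */
#define DEMO_OP_ERROR_BASE 5000

typedef enum {
    DEMO_IO_OK,
    DEMO_IO_GPU_ERROR,
    DEMO_IO_LIB_ERROR
} demo_io_kind_t;

typedef struct {
    demo_io_kind_t kind;
    long code; /* bytes transferred on success, positive error code otherwise */
} demo_io_result_t;

/* Decodes the return value of a read or write call. */
static demo_io_result_t demo_decode_io(ssize_t ret)
{
    demo_io_result_t r;

    if (ret >= 0) {
        r.kind = DEMO_IO_OK;
        r.code = (long)ret;
        return r;
    }

    /* Negative values carry the negated error code. */
    r.code = -(long)ret;
    r.kind = (r.code >= DEMO_OP_ERROR_BASE) ? DEMO_IO_LIB_ERROR
                                            : DEMO_IO_GPU_ERROR;
    return r;
}
```

The recovered ``code`` can then be cast back to the matching enum type for reporting.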

Other functions
---------------
* ``rocFileOpStatusError()`` returns a string that corresponds to a ``rocFileOpError_t`` value. It cannot fail.
* ``rocFileUseCount()`` returns the reference count of the library. It returns -1 on errors.
* ``hipFileOpStatusError()`` returns a string that corresponds to a ``hipFileOpError_t`` value. It cannot fail.
* ``hipFileUseCount()`` returns the reference count of the library. It returns -1 on errors.
40 changes: 40 additions & 0 deletions hipfile/docs/file.rst
@@ -0,0 +1,40 @@
File Handle API
===============

File Handles
------------
hipFile files are accessed via an opaque ``hipFileHandle_t`` pointer
obtained from ``hipFileHandleRegister()``. This registration API call
takes a ``hipFileDescr_t`` struct, which contains the filesystem file
descriptor to be used for hipFile IO. This hipFile handle must be closed
using ``hipFileHandleDeregister()`` to avoid leaking resources.

Ideally, ``hipFileDriverOpen()`` should be used to initialize the driver
before registering hipFile handles. hipFile will automatically perform
this initialization for the caller, though, if the driver has not been
initialized when ``hipFileHandleRegister()`` is called. See the driver
section of the documentation for a more thorough discussion of this.

Buffers
-------
Memory buffers that will be used with multiple hipFile IO operations
should be registered via ``hipFileBufRegister()``. If this is not
done, a temporary internal buffer will be used for IO, though this
may not be as performant as using registered buffers.

When registering a buffer, no special handle or pointer is returned
to the caller. Instead, the registered buffer will be tracked internally
and used for IO when appropriate. When no longer necessary for IO,
registered buffers should be freed using ``hipFileBufDeregister()``.

As in ``hipFileHandleRegister()``, the driver will automatically be
initialized by ``hipFileBufRegister()`` if it has not already.

IO Operations
-------------
hipFile read and write operations are performed using ``hipFileRead()``
and ``hipFileWrite()``, respectively. These API calls take a hipFile
handle, a buffer (registered or not), the size of the IO operation
in bytes, and the file and buffer offsets. If using a registered buffer,
the buffer pointer should be the one that was registered and not
a pointer inside that buffer.
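
The role of the buffer offset can be illustrated with a chunked read loop. This sketch does not call hipFile: ``demo_read_fn`` is a hypothetical callback whose parameter order (handle, buffer base, size, file offset, buffer offset) follows the description above, and ``demo_mem_read()`` is a mock reader backed by memory so the example is self-contained.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h> /* ssize_t, off_t */

/* Hypothetical read callback: handle, buffer base, size in bytes,
 * file offset, buffer offset. */
typedef ssize_t (*demo_read_fn)(void *handle, void *buf_base, size_t size,
                                off_t file_offset, off_t buf_offset);

/* Reads up to `total` bytes in chunks. The buffer base pointer is passed
 * unchanged on every call; only the buffer offset advances. Returns the
 * number of bytes read, or the first negative error value. */
static ssize_t demo_read_chunked(demo_read_fn read_fn, void *handle,
                                 void *buf_base, size_t total, size_t chunk)
{
    size_t done = 0;

    while (done < total) {
        size_t want = (total - done < chunk) ? total - done : chunk;
        ssize_t got = read_fn(handle, buf_base, want,
                              (off_t)done, (off_t)done);
        if (got < 0) {
            return got; /* propagate the error value */
        }
        if (got == 0) {
            break; /* end of file */
        }
        done += (size_t)got;
    }
    return (ssize_t)done;
}

/* Mock "file" held in memory, used in place of a real handle. */
typedef struct {
    const char *data;
    size_t len;
} demo_mem_file_t;

/* Mock reader: copies from the in-memory file into base + buf_offset. */
static ssize_t demo_mem_read(void *handle, void *buf_base, size_t size,
                             off_t file_offset, off_t buf_offset)
{
    const demo_mem_file_t *f = handle;
    size_t avail, n;

    if ((size_t)file_offset >= f->len) {
        return 0;
    }
    avail = f->len - (size_t)file_offset;
    n = (size < avail) ? size : avail;
    memcpy((char *)buf_base + buf_offset, f->data + file_offset, n);
    return (ssize_t)n;
}
```

Passing the base pointer plus an offset, rather than a pointer into the middle of the buffer, is what lets a library match the call against a registered buffer.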
25 changes: 17 additions & 8 deletions hipfile/docs/introduction.rst
@@ -1,11 +1,20 @@
Introduction
============
In recent years, the demand for high-performance data movement between
storage and GPU memory has grown rapidly, driven by the increasing
scale of AI training, scientific simulations, and data analytics
workloads. In response to this evolving need within heterogeneous
computing environments, AMD introduces hipFile, a cuFile-equivalent
API framework designed to enable direct data paths between NVMe
devices and GPU memory, significantly reducing CPU overhead and
improving IO throughput.

hipFile is an AMD equivalent to NVIDIA's cuFile API. It is intended
to be a drop-in, easily HIPify-able replacement for cuFile. Like other
HIP libraries, it transparently maps hipFile API calls to AMD's rocFile
or NVIDIA's cuFile based on whether `__HIP_PLATFORM_AMD__` or `__HIP_PLATFORM_NVIDIA__`
were set when building the library, respectively.

The documentation for hipFile is somewhat sparse as it's a very thin layer.
For API details, see AMD's rocFile or NVIDIA's cuFile API documentation.
hipFile provides developers with an interface for performing
high-performance IO operations between storage devices and AMD
GPUs. By bypassing traditional CPU memory staging buffers,
hipFile allows applications to achieve lower latency and higher
bandwidth when transferring large datasets into GPU memory. This
direct data path integration complements the broader ROCm stack,
seamlessly interoperating with HIP kernels, HIP streams, and
RDMA-enabled storage systems, to support end-to-end acceleration
for IO-bound workflows.
File renamed without changes.
6 changes: 6 additions & 0 deletions hipfile/docs/rdma.rst
@@ -0,0 +1,6 @@
Userspace RDMA API
==================

RDMA is currently not supported in hipFile.

Setting RDMA options in hipFile API calls will have no effect.
5 changes: 0 additions & 5 deletions rocfile/docs/config_build.rst

This file was deleted.

42 changes: 0 additions & 42 deletions rocfile/docs/core.rst

This file was deleted.

40 changes: 0 additions & 40 deletions rocfile/docs/file.rst

This file was deleted.

20 changes: 0 additions & 20 deletions rocfile/docs/introduction.rst

This file was deleted.

5 changes: 0 additions & 5 deletions rocfile/docs/nvidia_compat.rst

This file was deleted.
