Zero-copy hardware transcoding with Nvidia GPUs on 64-bit Windows is supported with the help of the Nvidia Video Codec SDK. This SDK provides APIs and sample applications to enable developers to easily create applications that can take advantage of the power of the GPU to accelerate video transcoding. The SDK can be used to create applications that use the GPU to transcode video to H.264 or H.265 codecs. The SDK also supports the use of the GPU for video encoding, decoding, and processing. Additionally, the SDK provides support for the use of the GPU for video post-processing, such as color correction and image stabilization.



"Zero-copy" describes computer operations in which the CPU does not perform the task of copying data from one memory area to another, or in which unnecessary data copies are avoided. This is often used to save CPU cycles and memory bandwidth in many time-consuming tasks, such as transmitting a file at high speed over a network, thus improving the performance of programs (processes) executed by a computer.


Zero-copy programming techniques can be used when exchanging data within a user space process (that is, between two or more threads), between two or more processes (see also the producer-consumer problem), and when data needs to be accessed, copied, or moved within kernel space or between a user space process and the kernel space portions of an operating system (OS).
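As a small illustration of avoiding copies inside a single user space process, the sketch below uses Python's built-in `memoryview`, which lets a consumer look at part of a producer's buffer without duplicating the bytes. The names `produce` and `consume` and the buffer size are hypothetical, chosen only for this example.

```python
def produce(size: int) -> bytearray:
    """Hypothetical producer: fill a buffer with a repeating byte pattern."""
    return bytearray(i % 256 for i in range(size))


def consume(view: memoryview) -> int:
    """Hypothetical consumer: sum the bytes it was handed, without copying them."""
    return sum(view)


buffer = produce(1024)

# Slicing a bytearray copies the bytes; slicing a memoryview does not.
copied = buffer[256:512]              # new bytearray, independent of `buffer`
shared = memoryview(buffer)[256:512]  # window onto the same memory

buffer[256] = 0xFF                    # mutate the original buffer
copy_sees_change = copied[0] == 0xFF  # the copy still holds the old byte
view_sees_change = shared[0] == 0xFF  # the view reflects the new byte

total = consume(shared)
```

Because the view shares memory with the original buffer, the producer must not reuse that region until the consumer is done with it; that coordination is the usual price of skipping the copy.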

Typically, when a user space process needs to perform system operations such as reading or writing data from or to a device (e.g. a disk, a network card, etc.) through its high-level software interfaces, or needs to move data from one device to another, it must perform one or more system calls that are executed in kernel space by the operating system.

If data has to be copied or moved from source to destination and both are located within kernel space (e.g. two files, a file and a network card, etc.), unnecessary data copies from kernel space to user space and back can be avoided by using special (zero-copy) system calls, usually available in recent versions of popular operating systems.

Zero-copy versions of operating system elements such as device drivers, file systems, network protocol stacks, etc., greatly enhance the performance of certain application programs (which become processes when run) and use system resources more efficiently. Performance is improved by allowing the CPU to switch to other tasks while copying and processing of data proceeds in parallel elsewhere on the machine. In addition, zero-copy operations reduce the number of time-consuming context switches between user space and kernel space. System resources are used more efficiently because using a sophisticated CPU to perform large data copy operations, which is a relatively simple task, is a waste if other, simpler system components can do the copying.

For example, reading a file and then sending it over a network in the traditional way requires 2 extra data copies (1 to read from kernel to user space + 1 to write from user to kernel space) and 4 context switches per read/write cycle. These extra data copies use the CPU. Sending the file by using mmap of the file data and a cycle of write calls reduces the context switches to 2 per write call and avoids the 2 extra user data copies. Sending the same file via zero copy reduces the context switches to 2 per sendfile call and eliminates all extra CPU data copies (both in user and kernel space).
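The difference between the traditional path and the sendfile path can be sketched in Python, whose `os.sendfile` wraps the system call of the same name on Linux and other Unix-like systems (it is not available on Windows). This is a minimal sketch, not a benchmark: the helper names, the chunk size, and the small payload are assumptions made for the example, and a local socket pair stands in for a real network connection.

```python
import os
import socket
import tempfile

CHUNK = 64 * 1024  # hypothetical chunk size for this sketch


def send_traditional(in_fd: int, sock: socket.socket) -> int:
    """read() into user space, then send() back to the kernel."""
    sent = 0
    while True:
        data = os.read(in_fd, CHUNK)  # copy: kernel -> user space
        if not data:
            return sent
        sock.sendall(data)            # copy: user space -> kernel
        sent += len(data)


def send_zero_copy(in_fd: int, sock: socket.socket, size: int) -> int:
    """Ask the kernel to move the bytes itself via sendfile()."""
    offset = 0
    while offset < size:
        n = os.sendfile(sock.fileno(), in_fd, offset, size - offset)
        if n == 0:
            break
        offset += n
    return offset


payload = b"zero-copy " * 400  # 4000 bytes, small enough for the socket buffer

with tempfile.TemporaryFile() as f:
    f.write(payload)
    f.flush()
    left, right = socket.socketpair()
    with left, right:
        os.lseek(f.fileno(), 0, os.SEEK_SET)
        n_traditional = send_traditional(f.fileno(), left)
        n_zero_copy = send_zero_copy(f.fileno(), left, len(payload))
        received = b""
        while len(received) < 2 * len(payload):
            received += right.recv(CHUNK)
```

Both helpers deliver the same bytes; the zero-copy version simply never brings the file contents into the Python process.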

Zero-copy protocols are especially important for very high-speed networks where the capacity of a network link approaches or exceeds the processing capacity of the CPU. In this case, the CPU can spend almost all of its time copying the transferred data, and therefore it becomes a bottleneck limiting the communication rate below the capacity of the link. A rule of thumb used in the industry is that approximately one CPU clock cycle is required to process one bit of incoming data.
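To make the rule of thumb concrete, the following sketch turns it into arithmetic. The 10 Gbit/s link and the 3 GHz clock are illustrative assumptions, and `cores_needed` is a hypothetical helper, not a standard formula.

```python
def cores_needed(link_bits_per_second: float,
                 core_hz: float,
                 cycles_per_bit: float = 1.0) -> float:
    """Estimate how many cores a link would saturate under the rule of thumb."""
    return link_bits_per_second * cycles_per_bit / core_hz


# Assumed figures: a 10 Gbit/s link and cores clocked at 3 GHz.
ten_gbit = cores_needed(10e9, 3e9)
one_gbit = cores_needed(1e9, 3e9)
```

Under these assumptions a 10 Gbit/s link would keep more than three such cores busy just handling incoming data, while a 1 Gbit/s link would take about a third of one core.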

Hardware implementations

An early implementation was IBM OS/360, where a program could instruct the channel subsystem to read blocks of data from one file or device into a buffer and write them to another from the same buffer without moving the data.

Techniques for creating zero-copy software include using direct memory access (DMA) based copy and memory mapping through a memory management unit (MMU). These features require specific hardware support and often involve specific memory alignment requirements.
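Memory mapping can be tried from user space with Python's standard `mmap` module. The sketch below maps a temporary file and edits it in place through the mapping instead of going through separate read and write calls; the file contents and variable names are assumptions made for the example.

```python
import mmap
import os
import tempfile

# Create a small file to map (4096 bytes, a common page size).
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"header:" + b"x" * 4089)
    path = f.name

with open(path, "r+b") as f:
    # Length 0 maps the whole file; the pages are shared with the page cache.
    with mmap.mmap(f.fileno(), 0) as mapped:
        header = mapped[:7]        # read a few bytes through the mapping
        mapped[0:7] = b"HEADER:"   # modify the file in place, no write() call
        mapped.flush()             # ask the OS to write dirty pages back

with open(path, "rb") as f:
    on_disk = f.read(7)

os.remove(path)
```

The in-place assignment must keep the length unchanged, since a mapping cannot grow or shrink the file through slice assignment.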

A more recent approach, used by the Heterogeneous System Architecture (HSA), facilitates the passing of pointers between the CPU and the GPU as well as other processors. This requires a unified address space for the CPU and the GPU.

Program interfaces

Several operating systems support zero copying of user data and file contents via APIs.

Listed here are just a few well-known system calls/APIs available on the most popular operating systems.

Novell NetWare supports a form of zero copying through Event Control Blocks (ECBs), see NCOPY.

The internal COPY command in some versions of DR-DOS since 1992 does this as well when COMMAND.COM detects that the files to be copied are stored on a NetWare file server; otherwise it falls back to normal file copying. The external MOVE command since DR DOS 6.0 (1991) and MS-DOS 6.0 (1993) performs an internal RENAME (causing only the directory entries to be modified on the file system instead of physically copying the file data) when the source and destination are located on the same logical volume.

The Linux kernel supports zero-copy through various system calls, such as:

  • sendfile, sendfile64;
  • splice;
  • tee;
  • vmsplice;
  • process_vm_readv;
  • process_vm_writev;
  • copy_file_range;
  • raw sockets with packet mmap or AF_XDP.

Some of them are specified in POSIX and are therefore also present in the BSD kernels or IBM AIX; some are unique to the Linux kernel API.
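One of the Linux calls listed above, copy_file_range, is exposed in Python 3.8 and later as `os.copy_file_range`. The following sketch uses it to copy one file to another inside the kernel and falls back to an ordinary user space copy where the call is missing or refused. The helper name `kernel_side_copy` and the fallback policy are assumptions made for this example.

```python
import os
import shutil
import tempfile


def kernel_side_copy(src_path: str, dst_path: str) -> str:
    """Copy src to dst and report which path was taken."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        remaining = os.fstat(src.fileno()).st_size
        if hasattr(os, "copy_file_range"):
            try:
                while remaining > 0:
                    # The kernel moves the bytes; they never enter this process.
                    n = os.copy_file_range(src.fileno(), dst.fileno(), remaining)
                    if n == 0:
                        break
                    remaining -= n
                return "copy_file_range"
            except OSError:
                # Call not supported here: start over with a normal copy.
                src.seek(0)
                dst.seek(0)
                dst.truncate()
        shutil.copyfileobj(src, dst)  # ordinary read/write loop in user space
        return "fallback"


with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "src.bin")
    dst_path = os.path.join(tmp, "dst.bin")
    with open(src_path, "wb") as f:
        f.write(os.urandom(100_000))
    method = kernel_side_copy(src_path, dst_path)
    with open(src_path, "rb") as a, open(dst_path, "rb") as b:
        identical = a.read() == b.read()
```

Keeping a fallback matters in practice, because whether the call succeeds depends on the kernel version and on the file systems involved.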

FreeBSD, NetBSD, OpenBSD, DragonFly BSD, etc. support zero-copy through at least these system calls:

  • sendfile;
  • write, writev + mmap when writing data to a network socket.

macOS should support zero-copy through the FreeBSD portion of the kernel, because it offers the same system calls (and its manual pages are still tagged as BSD), such as:

  • sendfile.

Oracle Solaris supports zero copying through at least these system calls:

  • sendfile;
  • sendfilev;
  • write, writev + mmap.

Microsoft Windows supports zero copying through at least this system call:

  • TransmitFile.

Java input streams can support zero-copy through the transferTo() method of java.nio.channels.FileChannel if the underlying operating system also supports zero-copy.

RDMA (remote direct memory access) protocols rely heavily on zero-copy techniques.

Source: Zero-copy
