Overview ======== The RDPSND protocol is a binary, packet based protocol that maps Microsoft's legacy wave API to a RDP channel. All values are in little endian ordering. Responses are sent with the same opcode as the request, and usually also with the same packet structure. The basic structure of a packet is: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Opcode | Unknown | Payload size | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ... | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Opcode Command or response type. Known values: 0x01 RDPSND_CLOSE 0x02 RDPSND_WRITE 0x03 RDPSND_SET_VOLUME 0x04 Unknown (4 bytes payload, no response) 0x05 RDPSND_COMPLETION 0x06 RDPSND_PING 0x07 RDPSND_NEGOTIATE Unknown Usage not known. Not valid when message comes from server Payload size Number of bytes following the header. Opcodes ======= Following is a list of all known opcodes and the contents in their payloads. RDPSND_CLOSE ------------ Tells the client that all applications on the server have released control of the sound system, allowing the client to free any local resources. No payload and no response. RDPSND_WRITE ------------ Request to play a chunk of data. The client is expected to queue up the data and send a RDPSND_COMPLETION when playback has finished. No response. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Tick | Format index | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Packet index | Pad | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Waveform data | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ... | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Tick Low 16 bits of clock tick count when packet was scheduled for playback. Format index Waveform data format in the form of an index to the previously negotiated format list. Packet index Index number of this packet. Pad Unused padding. Waveform data Binary waveform data in the format specified by the format index. Size defined by the packet boundary. Because of a strange design in Microsoft's server, the length of the packet will be 4 bytes short and bytes 12 to 15 (4 bytes) of the payload should be removed. RDPSND_SET_VOLUME ----------------- Request from the server to the client to change the output volume. No response. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Left channel | Right channel | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Left channel Volume of left channel in the range [0, 65535]. Right channel Volume of right channel in the range [0, 65535]. RDPSND_COMPLETION ----------------- Sent by the client to the server when a packet, queued by RDPSND_WRITE, has finished playing. No response is sent by the server. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Tick | Packet index | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Tick Clock tick count when packet finished playing on device. Packet index Index number of packet played. Reserved Reserved. Always 0. RDPSND_PING ----------- Sent by the server to the client to determine transport latency. The client must respond by echoing the first four bytes of this packet. 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Tick | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Garbage | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ... | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Tick Clock tick count when sent from server. Reserved Reserved. Always 0. Garbage 1016 optional bytes of random data. Purpose unknown. Only sent from server. RDPSND_NEGOTIATE ---------------- Initial packet sent by server when the client [re]connects. Allows the server to determine the capabilities of the client. The client should reply with an identical packet, with the relevant fields filled in, and a filtered list of formats (based on what the client supports). 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Flags | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Left channel | Right channel | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Pitch | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | UDP port | Format count | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Initial idx | Version | Padding | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ... | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Format tag | Channels | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Frames per sec. | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Bytes per sec. | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Block align | Bits per sample | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Extra size | Extra data ... | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ... | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Flags Flags for client capabilities. Data not valid when from server. 0000 0001 Do UDP song and dance (details unknown). Must be set for sound to work. 0000 0002 Volume control support. Indicates that volume fields are valid and that the client understands RDPSND_SET_VOLUME. 0000 0004 Pitch field is valid. Left channel Initial volume for left channel. Data not valid when from server. Right channel Initial volume for right channel. Data not valid when from server. Pitch Initial pitch of the sound device. Data not valid when from server. UDP port Port used for UDP transfers. Sent in network (big endian) order. Data not valid when from server. Format count Number of format structures following the header. Initial index Initial packet index. The first RDPSND_WRITE will have a packet index one larger than this value. Data not valid when from client. Version Software version. Server (XP & 2K3) sets this to 5. Client (XP) also sets this to 5. Padding Unused. Format tag Audio format type as registered at Microsoft. Channels Number of channels per frame. Frames per sec. Frames per second in Hz. Bytes per sec. Number of bytes per second. Should be the product of "Frames per sec." and "Block align". Block align The size of each frame. Note that not all bytes may contain useful data. Bits per sample Number of bits per sample. Commonly 8 or 16. Extra size Number of bytes of extra information following this format description. Extra data Optional extra format data. Contents specific to each format type.