USB Isochronous Transfers

I’ve recently been doing some work on the STM32F1 ARM chip. In particular I modified one of the ST USB Audio Speaker projects to suit a particular need.

Being an ‘audiophile’ I wanted to see if I could get 24bit, 96kHz, stereo out through the USB interface. Many people say that you can’t actually get higher than 24bit/96kHz stereo out of USB 1.1 or Full Speed USB. You can then DMA the data to the 12bit DACs onboard and have yourself a little wee music player / recorder.

Not including USB3.0, there are three USB speed levels: low speed, full speed and high speed (1.5Mbit/s, 12Mbit/s and 480Mbit/s respectively).

The STM32F103 device actually supports USB full speed. Better devices support high speed, but an external PHY is required (they’re so sneaky about that!!!).

USB Audio uses the USB Isochronous transfer mode.

  • Low Speed – Isochronous is not allowed
  • Full Speed – Isochronous, up to 1023 bytes per 1ms frame
  • High Speed – Isochronous, up to 3x 1024 bytes per 125us frame*

*High Speed frames are actually called ‘microframes’
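
To put those per-frame limits on a common footing, here is a rough Python sketch that turns them into bytes per second. The table layout and the function name `max_iso_bytes_per_second` are my own, purely for illustration; the numbers are the ones from the list above, with 8000 microframes per second following from the 125us interval.

```python
# Isochronous limits per speed, taken from the list above:
# (max packet bytes, packets per frame or microframe, frames per second)
ISO_LIMITS = {
    "low": None,               # isochronous is not allowed at Low Speed
    "full": (1023, 1, 1000),   # one packet per 1 ms frame
    "high": (1024, 3, 8000),   # up to 3 packets per 125 us microframe
}


def max_iso_bytes_per_second(speed):
    """Theoretical ceiling for one isochronous endpoint, in bytes per second."""
    limits = ISO_LIMITS[speed]
    if limits is None:
        return 0
    packet_bytes, packets_per_frame, frames_per_second = limits
    return packet_bytes * packets_per_frame * frames_per_second


for speed in ("low", "full", "high"):
    print(speed, max_iso_bytes_per_second(speed))
```

These are best-case ceilings for the payload only; a real bus shared with other devices will offer less.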

I’ll focus on the Full Speed bandwidth for now as the STM32F103 only supports Full Speed.

Doing the Maths,

  • Full Speed bandwidth (bytes / second ) = 1023 * 1000 = 1023000 bytes per second.

So how does this 24bit, 96kHz, stereo limitation come about?

  • 24bit, 96kHz, stereo: 3 bytes * 96000 samples per second * 2 channels = 576000 bytes per second
  • 24bit, 192kHz, stereo: 3 bytes * 192000 samples per second * 2 channels = 1152000 bytes per second <-- too much data for Full Speed, noooo!!!!!!!
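
The same sums as a small Python sketch, in case you want to try other formats. The helper names `audio_bytes_per_second` and `fits_full_speed` are made up for this example, and the check only compares raw PCM payload against the 1023000 bytes per second figure; it ignores any other traffic on the bus.

```python
FULL_SPEED_ISO_BYTES_PER_SEC = 1023 * 1000  # 1023 bytes per 1 ms frame


def audio_bytes_per_second(bits_per_sample, sample_rate_hz, channels):
    """Raw PCM data rate in bytes per second (payload only)."""
    bytes_per_sample = bits_per_sample // 8
    return bytes_per_sample * sample_rate_hz * channels


def fits_full_speed(bits_per_sample, sample_rate_hz, channels):
    """True if the raw stream fits under the Full Speed isochronous ceiling."""
    rate = audio_bytes_per_second(bits_per_sample, sample_rate_hz, channels)
    return rate <= FULL_SPEED_ISO_BYTES_PER_SEC


for fmt in [(24, 96000, 2), (24, 192000, 2)]:
    rate = audio_bytes_per_second(*fmt)
    verdict = "fits" if fits_full_speed(*fmt) else "too much for Full Speed"
    print(fmt, rate, verdict)
```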

Furthermore, I’ve discovered that the STM32F103 only has a 512 byte USB buffer! Ouch!! Thus our isochronous frames can handle at most 512 bytes, not the full 1023 bytes!! Things are just getting worse and worse!!

This means that we can’t even do Stereo, 24bit @ 96kHz as it requires 576 bytes per frame. We could do Stereo 16bit @ 96kHz (384 bytes per frame).
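
Here is a quick sketch of that per-frame check. It assumes, as I did above, that the whole 512 bytes could be given to a single isochronous packet, which is optimistic since the other endpoints have to live in the same buffer. The names `bytes_per_frame` and `fits_in_buffer` are just for illustration.

```python
PACKET_MEMORY_BYTES = 512   # STM32F103 USB buffer size quoted above
FRAMES_PER_SECOND = 1000    # Full Speed: one frame every 1 ms


def bytes_per_frame(bits_per_sample, sample_rate_hz, channels):
    """Payload bytes needed in each 1 ms frame.

    Integer division is exact only for sample rates that are a
    multiple of 1000 Hz, which is all this sketch is meant for.
    """
    bytes_per_second = (bits_per_sample // 8) * sample_rate_hz * channels
    return bytes_per_second // FRAMES_PER_SECOND


def fits_in_buffer(bits_per_sample, sample_rate_hz, channels):
    """True if one frame of audio fits in the (optimistic) 512 byte limit."""
    size = bytes_per_frame(bits_per_sample, sample_rate_hz, channels)
    return size <= PACKET_MEMORY_BYTES


for bits in (16, 24):
    for rate in (48000, 96000):
        size = bytes_per_frame(bits, rate, 2)
        verdict = "ok" if fits_in_buffer(bits, rate, 2) else "too big"
        print(f"{bits}bit @ {rate}Hz stereo: {size} bytes per frame ({verdict})")
```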

And so the journey ends. The STM32F103 is incapable of 24bit/96kHz, what a let down :( Does anyone have any ideas to get around this, or has anyone come across a similar issue? Let me know, I’d love to hear what you’ve done!

I know there is a double buffering feature on this device, and I thought perhaps there was a way to use it to get more data through per frame. But alas, no: it applies only to separate frames, and besides, the entire USB buffer is limited to a stingy 512 bytes!

Bulk USB Audio Transfers anyone??

8 thoughts on “USB Isochronous Transfers”

  1. I just came across this while searching for something else and thought I might give you a few ideas.

    Note: This is for AVR, but you could do something similar for the STM32F103.
    If you do, I bet you could get 1023 Bytes per 1ms frame (1023000 Bytes per second).
    https://www.obdev.at/products/vusb/index.html
    -In other words: Do not use the built-in USB hardware, but write your own in software / “firmware”.
    This will cost a lot of CPU-time, because you’re going to make the CPU bit-bang the data out on the port.
    The “Blue-Pill” STM32F103C8 board has two 16-bit ports available, however PB2 has been ruined by a stupid 100K resistor (bad design; they should have put the resistor on the VCC and GND sides of the pin header).
    Fix: Simply solder a 0R resistor (eg. a piece of wire) over that resistor and you can use the entire Port B for a 16-bit data transfer.
    Note: You can use a multimeter to ‘beep’ the board and find the resistor (if the board is blue, the resistor is likely R4, which is located on the bottom layer).

    You should not attempt to use the DMA to write to the GPIO port, because the DMA is really slow. It manages only about 3.6MHz on a pin when the CPU is clocked at 72MHz, which means the DMA can make 7200000 transitions per second (7.2 million transitions).

    The maximum number of bytes/halfwords the STM32F103 can write to a GPIO port (which sits on the APB2 bus, behind the AHB bridge) is 36000000 per second when clocked at 72MHz.
    The maximum number of bytes/halfwords that can be read is also 36000000 per second.
    Reading a byte or halfword will cost 2 CPU clock cycles (because the first LDR instruction in the pipeline uses 2 clock cycles; the next will use only 1 clock cycle, but if reading the GPIO port, the CPU needs to wait for the bus anyway).
    Writing only costs 1 CPU clock cycle, but we’d still need to wait for the bus.
    That leaves us 2 clock cycles for reading and 1 for writing. Then there’s 1 clock cycle spare, which can be used for data processing such as bit-shifting or the like.

    Conclusion: If you’re not doing anything but copying data from one port to another, you could copy 18000000 halfwords per second, which would be a total of 36000000 Bytes per second (but you’re likely to read bytes from USB and write halfwords to something else, so you’d probably end up with a maximum of 18000000 Bytes per second).
    A real-world implementation would of course come in below those numbers, but I’m quite sure it’s possible to make a hack-ish fake-USB implementation that could handle 1023000 Bytes per second; you could perhaps even ignore parts of the headers and avoid checking data validity to gain some CPU-time…
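
For anyone who wants to play with the cycle budget in the comment above, here is a back-of-the-envelope sketch. It takes the comment's figures at face value (72MHz clock, 2 cycles to read, 1 to write, 1 spare, so 4 cycles per copy); the function names are hypothetical and nothing here has been measured on hardware.

```python
CPU_HZ = 72_000_000
CYCLES_PER_COPY = 4          # 2 read + 1 write + 1 spare, per the comment
FULL_SPEED_ISO_BYTES_PER_SEC = 1023 * 1000


def copies_per_second(cpu_hz, cycles_per_copy):
    """How many read-then-write copies fit in one second of CPU time."""
    return cpu_hz // cycles_per_copy


def cycles_per_byte(cpu_hz, bytes_per_second):
    """Whole CPU cycles available per byte at a given data rate."""
    return cpu_hz // bytes_per_second


copies = copies_per_second(CPU_HZ, CYCLES_PER_COPY)
print("halfword copies per second:", copies)
print("bytes per second if every copy is a halfword:", copies * 2)
print("bytes per second if every copy is a byte:", copies)
print("cycles per byte at the Full Speed ceiling:",
      cycles_per_byte(CPU_HZ, FULL_SPEED_ISO_BYTES_PER_SEC))
```

The last figure is the total budget per payload byte, so everything else the firmware does has to fit inside it too.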

  2. Uhm, you might want to try using bulk transfers before doing all of the stuff I mentioned above.
    Look at this post:
    http://www.microchip.com/forums/FindPost/417685
    -If the PIC can do it, I’m pretty sure the STM32F103 can beat it, since it would be very likely to have much more CPU-time free.
    A few problems with bulk transfers:
    * The transfers are not “real-time friendly”.
    * You should connect only one device to the computer’s USB port (no hub or keyboard in between the device and the computer).
    * You’ll need buffering on the STM32F103, otherwise you’ll get a lot of underruns, which result in clicks and pops. You’ve got around 20K of RAM on the C8 variant, which is good for around 17ms; not much, but it’d definitely get rid of the worst noise.
    * Bulk endpoints might consume extra CPU time on your host computer, but this is probably not noticeable if you have a fairly recent machine.
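
To get a feel for how long a RAM buffer lasts at different data rates, here is a small sketch. It assumes the full 20K could be used for audio, which it could not in practice, and the helper name `buffer_duration_ms` is made up for this example. Which stream format the 17ms figure above refers to is not stated; it happens to be close to the result for 1152000 bytes per second.

```python
def buffer_duration_ms(buffer_bytes, bytes_per_second):
    """Milliseconds of audio a buffer holds at a given data rate."""
    return 1000.0 * buffer_bytes / bytes_per_second


BUFFER_BYTES = 20 * 1024  # "around 20K" on the C8 variant, all of it (optimistic)

for label, rate in [
    ("16bit/96kHz stereo", 384000),
    ("24bit/96kHz stereo", 576000),
    ("24bit/192kHz stereo", 1152000),
]:
    print(f"{label}: {buffer_duration_ms(BUFFER_BYTES, rate):.1f} ms")
```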
