Changing FOC Switching frequency

The large majority of us don't have the background in this stuff, but if you were to go way back to the basics and walk us through it, I for one would appreciate it, especially given what it seems you're saying. Is better programming possible for the VESC?

EDIT: the first part of this post is irrelevant. @Hummie Basically, I know this is hard to believe, but I think Vedder left out a "divide by 2" in the code, which is making the switching frequency half of what it should be. (This is just a guess; I could be very wrong.)

Microcontrollers use a hardware unit called a "timer" that basically takes in a clock and counts up or down. "Center-Aligned Mode" means the timer counts up, then counts down, and repeats. PWM is generated by changing the output state when the timer hits a certain value. The "Period" value is what the timer counts up to and counts down from.

Let's use 500 as an example. In Center-Aligned Mode, the timer will start at 0, count up until it reaches 500, then count down until it reaches 0 again, generating one PWM cycle along the way. This means the timer needs 500 counts to reach the top and 500 counts to reach the bottom, resulting in a total of 1,000 counts for one cycle.

Now let's say we give the timer a 100 kHz clock. This means the timer counts 100,000 times in one second. Since it takes our timer 1,000 counts to complete one cycle, the timer (and resulting PWM) frequency is 100,000 divided by 1,000, which in this case is 100 Hz. In other words, the timer takes two times the period value to complete one cycle. In the code, the period value was determined assuming the timer only requires one times the period value to complete one cycle. The result is that your PWM frequency is half of what you expect it to be.

EDIT: everything below here might still be irrelevant?

The part where I was rambling about sample timing, that isn't an issue, just a resource inefficiency that I am anal-retentive about. That can be optimized by using Timer 8 instead of Timer 1 to generate the PWMs (though this change is not possible on most hardware), or by using a different microcontroller and modifying the code accordingly; the STM32F446RE and STM32L452RE are two possible candidates. Using a different microcontroller would reduce the cost of the VESC; in fact, the latter wouldn't even need a crystal oscillator, further reducing cost. Vedder used the STM32F405RG because there was a discovery board for it, making prototyping and development easy, and the two microcontrollers I mentioned were not available when he developed the first iteration of the VESC. Porting the code would be an arduous task, as Vedder wrote the code to be highly efficient, which makes it difficult to port (a tradeoff common to all embedded software). One possibility is to wrap his FOC algorithm and USB communication scheme into a library and develop a new application around it.

The second part is a little hard to explain. Imagine you are trying to get your house to a certain temperature using your air conditioner and heater, but it takes 30 minutes for either one to turn on or turn off. You can get around this issue by trying to guess what temperature your house will be in 30 minutes and acting on that instead.

@Petertaylor Oh, what did you set the dead-time compensation to be? In firmware, the actual dead-time is 60 timer counts; at 168 MHz, this comes out to ~360 ns (or 0.36 µs), and incorrect values for dead-time compensation only become exacerbated by higher switching frequencies. The default in VESC Tool is 0.06 µs.

6 Likes

Your analysis is somewhat correct, but flawed. The whole point of using center-aligned PWM mode is that it effectively doubles the PWM switching frequency for a given PWM period. Honestly I'm a bit too lazy to draw up all the diagrams etc., but if you look at how center-aligned three-phase switching works at a 10 kHz switching frequency, you get an effective 20 kHz at the windings. Trust me, if your motor were switching at a lower frequency it would be intolerably loud. The whining you hear is lower harmonics: since the system has some nonlinearities, smaller-amplitude lower harmonics present themselves at 1/2 and 1/4 the switching frequency, and their amplitude just depends on system dynamics.

1 Like

I thought the purpose of center-aligned PWM was to generate a symmetric PWM signal to reduce higher-order harmonics. Every reference manual I have read tells me center-aligned mode generates a PWM signal of half the frequency of the equivalent edge-aligned signal with the same period value. So you're saying the LR filter formed by the phase windings actually sees a signal at twice the actual PWM signal frequency?

1 Like

Yeah, the frequency gets set to 10 kHz, for instance, and the windings' LR sees 20 kHz. I've also seen this topic brought up with Vedder before and he had the same response.

2 Likes


Oh hmm, no wonder transistor switching losses aren't as bad as I thought they would be. I wonder if the whining would persist if I tried a switching frequency of 40 kHz or higher (20 kHz PWM). What could be causing the discrepancy @Petertaylor is seeing? Some things I've noticed: the VESC sometimes has issues transitioning from sensored to sensorless, and the observer has some problems locking on at low speeds.

EDIT: I took another look at the firmware. I think the code does not compensate for the fact that updated duty cycles do not take effect until the next update event (update events occur at twice the PWM frequency). I am not exactly sure what effect this has on the current control loop and observer, but I do know that if this is indeed the case, the current control loop will run half a PWM cycle behind the actual measured current, and the applied Valpha and Vbeta vectors used in the observer are slightly off.

The fix for this is current extrapolation, but the trade-off is that the observer will run at one quarter the switching frequency (or one half the switching frequency if phase shunts are used). You can get around this by doubling the PWM frequency and using MOSFETs that aren't from International Rectifier, i.e. ones with decent gate charge characteristics. International Rectifier MOSFETs are reliable and rugged (because the transistor die is HUGE), but aside from Rdson and avalanche energy, they have relatively poor electrical characteristics compared to similar MOSFETs on the market (again, because the die is absolutely huge).

2 Likes

Hi Gamer43! Thanks for the info, and looking seriously at the code. It’s perhaps a little over my head, or at least desire :slight_smile:

Looks like you've retracted some of the prior statements. Anyhow, I really appreciate the look at the code, and the thoughts. Can you keep checking back on this thread as more info shows up?

I had default settings as far as I could, so I presume the graphs above were done with the default (which I see as 0.080 µs, not 0.06 µs). For optimum efficiency, do you recommend 0.36 µs?

In other news, I realised that the sensor voltage I was using has drifted slightly, and I should really be using a dedicated reference voltage, or an ADC with one incorporated. That change will happen soon/tomorrow. It's only a percent or two, but still annoying.

I’ve got a couple other motors on the dyno now, so can do some testing in a different range of motor kvs, pole counts, resistance, etc. I also got another vesc (4.12 hardware) to play with. New motors are Flipsky 6374 190kv and turnigy 6364 213kv.

3 Likes

In the firmware, in file “mc_interface.h” , there is a #define called “HW_DEAD_TIME_VALUE” and it is defined with a value of 60.

If my math is right, at 168 MHz (the frequency of Timer 1), 60 counts comes out to ~357.14 ns, which I rounded to 0.36 µs. To get the most accurate value, use 0.357 µs. The STM32F405RG only supports single-precision floating point, so more precise values will simply be truncated when dead-time compensation is applied.

I don’t think it’ll have too much of an effect, though. Improper (or the lack thereof) dead-time compensation causes minor hiccups with the observer, especially when phase current crosses zero. Detailed description can be found in this application note: https://www.nxp.com/docs/en/application-note/AN4863.pdf

Basically, the rotor angle calculated by the observer would be slightly off, and may jump around when the phase current crosses zero. This can introduce inefficiencies, but it is really only problematic at low speeds in sensorless operation and at high switching frequencies. (Or when you are using 8-inch delta-wound hub motors that have really high magnetizing currents; the observer utterly could not lock on until 3500 ERPM.)

Although I really would like to verify whether the VESC compensates for the fact that the applied phase voltages do not take effect until the next timer update event. On low side shunt hardware, the real applied voltage will be the average of the previous and currently calculated values; on phase shunt hardware, the real applied voltage will actually be the previously calculated value. From what I could tell in the firmware (namely the observer_update and current_control functions), it did not appear to do this. Maybe there was something I missed and it does indeed perform this compensation. I feel that if the VESC firmware did not compensate, this would negatively affect dynamic performance of the FOC algorithm.

I will check back on this thread regularly, I really appreciate all the work you’ve done to perform these experiments.

3 Likes

Sorry for the double post, but I discovered something interesting/funny.

So I was messing around with really high switching frequencies on my FSESC 4.20 running a damaged TB 6374 motor. It did 60 kHz just fine (this is 30 kHz PWM).

80 kHz bricked the MCU.

How did it brick it? The FOC algorithm does not run fast enough, LMAO. The ADC ISR is not fast enough. Since the 4.20 uses low-side shunts, the controllers and estimators cannot run at 40 kHz, and probably cannot run much above 30 kHz.

This means phase-shunt hardware can't run above 30-some kHz switching. EDIT: it will work at 70 kHz switching, which is 35 kHz for the observer, so 35 kHz is most likely the maximum switching frequency for phase-shunt hardware.

Interestingly enough, a very high switching frequency makes faults happen less often on the FSESC 4.20

I also ran into this when I was messing with STM32F4 ADC throughput.

4 Likes

Yeah, I've run into this before. Note that for phase-shunt hardware, if you disable "sample in v0 and v7" then you can run at the same frequency as low-side-shunt hardware. Sorry if this is obvious to you, but you can re-flash with an ST-Link to fix it.

Can you elaborate on the ADC throughput? I'm pretty sure there's a few options you can tune to increase sampling speed at the cost of a small amount of accuracy.

That's what I did; I had to dig out an old connector and hook it up to my Nucleo-H743ZI board. At first I thought I'd busted the DRV8302, but then realized otherwise when USB would enumerate but then block.

Basically, what I was doing on an STM32F446RE was setting a single ADC to its max sample rate on one channel, and experimenting with how many samples I needed the DMA to transfer before the adc_conv_completed callback would overload and stall the CPU.

The answer is 7 xD.

At 1.5 Msps, this means the overhead of the adc_conv_completed ISR only allows it to be called at ~214 kHz (lower on the STM32F405RG, since it runs at 168 MHz instead of 180 MHz).

I was using the HAL library from STMicroelectronics, and oh boy is there a ton of overhead in the ISR callbacks. (lots of branches to select the correct callback and clearing the status registers).

This was right in line with what I observed with my own BLDC controller algorithm (trapezoidal control only). 200 kHz was just about the highest I could push the sampling frequency without the entire algorithm becoming unstable. (It could do up to 400 kHz, since I would blank the sample timer 50% of the time in delayed-commutation mode, but because I didn't handle my state variables properly, if the sample timer were enabled with the motor not spinning, the algorithm would brick itself.)

2 Likes

Very interesting, thanks for sharing. Yeah, I was doing some testing on the Unity to see how long it took between triggering the ADC sampling and all 15 samples being put into memory. I remember it being a small percentage of the total PWM period. I could definitely see running into problems around 200 kHz, but such a switching frequency would probably only be needed for super-low-inductance motors.

2 Likes

Or extremely high-speed motors; I was running one motor at 140k ERPM xD.

3 Likes

We've tested the Unity up to 135k ERPM; it works but definitely gets loud. That's like 2,200 rev/s, so at a 15 kHz update rate (30 kHz PWM with low-side shunts) you only get about 7 updates per revolution, so it's basically back to a trapezoid at those speeds.

Vedder's video shows 80 kHz PWM on his new 75/300 VESC; I've been wondering if he cut some of the bloat in the loop to achieve that. Any idea if the compiler gets rid of multiplies in #defines, for instance? Haven't had a chance to test, but if it doesn't, there's a ton of wasted clock cycles in conversion defines etc.

4 Likes

Wow, that’s really impressive.

Maybe he's using the STM32H743ZI instead? That beast runs at 400 MHz, has a superscalar pipeline, and supports double-precision floats; it is several times more powerful than the STM32F405RG for only about double the cost xD.

The preprocessor substitutes all #defines with their literal expressions prior to compile time (THIS IS WHY PARENTHESIZING YOUR EXPRESSIONS IS SO IMPORTANT), so as long as compiler optimizations are enabled, you shouldn't be losing clock cycles to #defines: any expression that evaluates to a constant should be compiled down to a single constant in the code.

For example, this: #define MACRO 10 * 5 + 3

int val = MACRO * 6;

will appear as this to the compiler:

int val = 10 * 5 + 3 * 6;

Although when I was looking through the code, there were a few things I thought could be optimized; I can't recall all of them off the top of my head. Some of those optimizations would come at the cost of portability (and maybe code safety?). One of them: when updating duty cycles, the timer update event is disabled and then re-enabled afterwards. That's at least two store instructions (so three clock cycles?), but cutting this out would mean the algorithm operates under the assumption that observer and current updates can be calculated before the next update event. If only sampling on the low side, setting the repetition counter to 1 instead of 0 might be necessary.

The observer also calculates the phase angle iteratively; a different observer algorithm that solves for BEMF and phase angle in one go could potentially cut down on clock cycles.

1 Like

I did some googling of my own on the whole #define thing; there's a lot of interesting reading out there on code optimization. I guess anywhere there is a float divide, it wouldn't optimize unless you raised the optimization level (by default the makefile is at -O2; if you set it to -Ofast I guess it would do this), but I'm not sure if it would have adverse effects anywhere else. I'll give it a try tomorrow.

Yeah I’d need to do some profiling to determine where best to focus attention. The iterative observer looks reasonably cheap and runs at a fixed 6 iterations but obviously it all adds up. Going to do some testing on this sometime soon.

That MCU looks pretty amazing, but I think a more expensive MCU is definitely the wrong direction for an eskate motor controller; I guess maybe if you need insane ERPM for other applications.

1 Like

I heard Vedder was testing the 75/300 on a wind turbine, that would definitely warrant an overpowered MCU xD. Or EV applications. Or four independent FOC motors from one controller xD.

Honestly, I would’ve loved to see something like the H743ZI go into the Unity xD, but that’s just my ridiculous opinion.

Moving from -O2 to -Ofast optimization does a few things.

First, the compiler will start sacrificing significant amounts of code size, doing things like unrolling entire loops, inlining functions, and using lookup tables. The F405RG has plenty of spare flash memory, so no problem there.

Second, debugging becomes extremely difficult, as the compiler tries to generate an instruction ordering that is as data-independent as possible, to mitigate stalls from pipeline hazards and pipeline refills from branch mispredicts.

Third, it may introduce bugs if variables were not declared volatile or if certain caveats are not followed, since the compiler may optimize away something that is critical but that, given the context it appears in, did not look critical.

Fourth, as you mentioned, things like floating-point division optimization. Divide is REALLY expensive, taking up to 17 (if I remember correctly) clocks on the ARM Cortex-M4 and causing data dependencies down the line to stall. One well-known optimization for floating-point division is to approximate the reciprocal (some architectures actually have an instruction for this, or it might use a lookup table) and multiply by that instead (a multiply takes one clock). The approximation is slightly less accurate, but much faster. Although I don't remember if the compiler will start sacrificing accuracy for speed automatically at -Ofast; you may need to explicitly tell it to do so with a flag in the makefile or project settings.

2 Likes

Thanks, really useful information. My background is all over the place, so I kind of have to pick things up as I go. I think I will test the timing impact of increasing the optimization level that way vs. just using the fast-math flag with GCC, which is probably less risky. I'll report back soon; I'm expecting very little :slight_smile:

3 Likes

Glad I could be helpful. I appreciate the tests you are doing. :slight_smile:

Also, I just remembered another caveat with floating point #defines and constants (at least with the toolchain I was using, Atollic Truestudio)

All floating-point constants need to be suffixed with an f, I believe (or something else to indicate "TREAT AS SINGLE PRECISION"; I can't remember exactly, but I do know explicitly casting to float works as well), otherwise the compiler will treat them as doubles. A software implementation of doubles is costly both in code size and in speed.

I remember from my own tests that the difference from -O2 to -Ofast on my BLDC algorithm was something like 23% CPU utilization down to 21%. (This was at 200 kHz sampling with sample-timer blanking and BEMF integration.) About 10% relative: significant, but not huge. I think it got most of it from function inlining and lookup tables.

Interesting design point: if you add filter capacitors to the phase-voltage resistor dividers (RC time constant of ~183 µs), you can get BLDC integration mode to work out of the box with no tuning whatsoever (on a different algorithm; the VESC algorithm does not support this).

Ramp the motor up in open-loop or sensored mode until the BEMF signal exceeds a certain hysteresis or until a certain ERPM, let the motor freewheel, and watch one phase. Integrate between two zero crossings and divide that total by 16 (or 14.98 for sinusoidal BEMF) to get the integration limit. Lock onto the rotor position at the next zero crossing using a lookup table. The motor should freewheel at most two electrical revolutions, so there shouldn't be too much of a problem with dynamic performance there.

Added bonus: it works the same regardless of whether you use slow decay or fast decay. You can also sample whenever you want and at whatever speed you want, e.g. PWM at 25 kHz, sampling at 157 kHz. I got the idea for this on-the-fly detection scheme from Allegro Microsystems' A4960 and A4964 datasheets.

The one downside is that the BEMF signal after the filter will lag by 6 degrees at higher speeds (according to LTSpice), but this can be compensated for using phase advance; also, I couldn't see much lag when I probed with my oscilloscope (at no phase advance).

1 Like