Channel: Andys Workshop

An FPGA sprite graphics accelerator with a 180MHz STM32F429 controller and 640 x 360 LCD


A very warm welcome to my most ambitious project to date. In this project I’m going to attempt to design and build a sprite-based graphics accelerator that will function as a co-processor to an MCU. Using cheap off-the-shelf components I’m hoping to achieve a level of gaming performance that compares well to popular commercial hand-held gaming consoles.

I’m hoping that I’ll learn a few new tricks along the way, and, if the ideas currently zinging around inside my head all land the right way up and in the right order then I should be able to write a demo or two, maybe even a small game as a proof of concept. Naturally this project will be entirely open source so if you feel the need to copy, extend or just kick some tires then you’ll be doing so with my blessing.

Interested? I hope you are. So sit back and grab a beverage because this may take some time.

System design

I decided up-front that this would be a sprite-based 2D graphics accelerator. Sprites are graphical objects that the developer can place at arbitrary locations on the screen. They can overlap each other in a predictable Z-order and can have areas of transparency so that they may be non-rectangular.

A frame from a game is assembled from a collection of sprites, some of which will represent the player’s environment, some will represent the player and other game actors and still others will represent transients such as explosions and other effects.

Sprites are the only graphics on the display and each frame is assembled independently by placing each sprite at its configured position in the z-order. This means that additional hardware such as bit-blitters are not required and moving a large sprite around costs the same as moving a small sprite. Radical changes between frames are as cheap as no changes at all.

All the cellphone LCDs that I’ve seen have a default refresh rate of approximately 60 frames per second (fps) so I decided on a target of 30 fps for the main engine. This means that I can spend 1/60s preparing the next frame in a frame buffer and then the next 1/60s sending it to the LCD.

This technique is known as double buffering and, together with careful timing of the display refresh, it is the primary method by which we avoid tearing.

The LCD retrieves data from its internal memory from top to bottom, left to right. If we happen to be updating an area of the screen at the same time as the LCD is retrieving data from it to push to the panel then we’ll see an ugly effect called tearing, where the image on the display consists partly of the previous frame and partly of the next.

This effect can be seen in some PC games where an option is provided to ‘disable vsync’ allowing players to achieve a higher display refresh rate at the expense of image consistency.

Luckily the LCD provides a signal that the manufacturers call ‘Tearing Effect’ (TE). TE goes active during a part of the display cycle known as the ‘vertical blanking period’, a few lines at the top and bottom of the panel that you can’t see.

To achieve a flicker-free display we need to start refreshing data when TE goes active and we must move at least as fast as the display refresh so that it doesn’t catch us up before all the data has been uploaded.

The timing here is a critical part of the design. The LCD controller must offer a write speed that allows a complete frame to be written in less time than one display refresh period, and our graphics accelerator must be able to write data out at that speed.





The screenshot shows the TE signal from the LCD used in this project captured using my Ant18e logic analyser.

High level components

It’s not possible to do all of this using an unassisted MCU with CPU power alone. We need to offload the heavy lifting involved with moving all those graphics around to a co-processor and as you can tell from the title of this article I’ve elected to use an FPGA to do that. Why an FPGA? The core of this graphics accelerator involves interacting with external components at high frequencies and with nanosecond-level timing margins. The amount of combinatorial logic involved is fairly low and so an FPGA is the obvious choice.

The FPGA will not be the only processor on this board. Games need a controller, and it needs to be a pretty decent one if we want to be able to perform game-engine computations in the fixed period available to us between frames. I’ve elected to include the MCU on-board instead of just breaking out the FPGA interface to a pin header because parallel buses and high signal frequencies will not play well with flying interconnect wires.

Games need graphics, lots of graphics. To deal with that I’m going to provide an SD card slot that the controller can use to access graphics and other data authored on a computer. The FPGA isn’t going to be talking to the SD card because SDIO does not offer a predictable, constant sustained transfer rate so I’ll provide a high-capacity flash memory IC that the FPGA can use to read the graphics at high speed.

The FPGA needs a RAM buffer to render its frames into. FPGAs do come with some very fast RAM on board but it’s nowhere near large enough for a frame buffer so we’ll need to add a chunk of our own. Asynchronous Static RAM (SRAM) offers the simplest interface and the possibility of high sequential throughput. The other option, SDRAM, is cheaper and offers densities far in excess of what we need, but its controller is much more complex and delivers no benefit in this design.

Of course we also need an LCD to display the actual image and I’m going to choose the highest resolution device that I can possibly get away with given the space and time constraints imposed by the other resources.

As a final touch I’ll throw in an EEPROM to allow a relatively small amount of data to be persisted while the power’s off. High score tables are an example of such data.

Now let’s look at a block diagram that illustrates what I’ve just talked myself into.

Component selection

Now that the basic system design has been decided, it’s time to choose the actual components that will be used on this board.

The LCD


The 3.2″ Sony Vivaz LCD

Just like in my previous project, the halogen reflow oven, I’ve selected the 640×360 LCD from the Sony Vivaz U5 cellphone. You can read all about my initial reverse engineering effort for this display in this article.

This display ticks all the important boxes for this project. Good quality replacement parts are cheaply available on ebay, I’ve worked with it before and I know it’s reliable, and the timings and resolution fit perfectly.

I’ll be driving the display in 16-bit 5-6-5 RGB mode which means I need 2 bytes per pixel. That means the frame buffer is going to have to be at least a 4 megabit SRAM part. If the resolution were any higher then it would push me into an expensive 8 megabit part and in all likelihood the timing would be too tight to achieve in the selected FPGA, again pushing up the cost and complexity into undesirable territory.

The latch

This is a small part with a critical task. If I’m going to squeeze my design into the limit of 63 FPGA user IOs then I need to take steps to reduce the pin count wherever I can. The 8-bit latch will be used to reduce the pins required by the LCD data bus from 16 to 9 (1 extra pin is required for the ALE signal).

The performance of the latch is critical to the success of the design. My timing constraints are such that the ALE line will be high for only 10ns give or take some skew so I needed to select a part that met the criteria. The Texas Instruments SN74ABT573APW device fits the bill perfectly, requiring only a 3.3ns high pulse. Not only is it fast enough but it has a sensible pinout where the outputs are on opposite sides of the device to the inputs which is perfect for a bus. Quite a lot of multi-bit latches have a crazy pinout where outputs are adjacent to the inputs which guarantees you a mess of vias as you try to reassemble your bus lines on the PCB.

The FPGA

I chose the Xilinx Spartan 3 XC3S50 in the -5 high speed grade as the FPGA that I would use. This is the smallest, cheapest and most hobbyist-friendly Xilinx FPGA in production and it’s still got 100 pins, somewhat validating FPGAs’ reputation as big, formidable devices to work with. At least it’s not a BGA is all I can say.

The top two manufacturers in the FPGA market are Xilinx and Altera. Xilinx was first off the block in the FPGA industry and has the lion’s share of the market. My choice of Xilinx over Altera is based on local availability of the parts. Farnell UK is a Xilinx supplier and so it made sense for me to choose Xilinx. Both manufacturers offer free synthesis and simulation software and a similar product line-up so if Altera parts are easier to find in your locality then I’m sure you’ll get by just fine with Altera FPGAs.

Back to the XC3S50. It’s quoted as having 50,000 logic gates but that’s an aggregate number that isn’t really reflected in the resource usage you see when you synthesise your design. The key figures are that it has 1,536 slice flip-flops, 768 slices and 1,536 4-input LUTs.

As well as the core logic gates there are some additional on-chip resources that are going to be crucial to my design. There are 73,728 bits of dual port block RAM (BRAM), 12,288 bits of distributed RAM, 4 hardware multipliers and 2 digital clock managers (DCM).

It’s important to note that the distributed RAM is not an independent resource like the block RAM. Distributed RAM is implemented using LUTs, reducing the area available to hold your design logic.

I will use the BRAM to hold the sprite records, the distributed RAM to implement a FIFO for incoming commands from the MCU and a DCM to synthesize a high frequency clock from an external oscillator.

Of those 100 pins, 63 are available for user IO. That might seem like a lot but once you start adding up the SRAM address and data buses, the MCU interface and the LCD bus it doesn’t look so generous after all. You’ll see how I squeeze it all in as you read the rest of this article.

Finally, do you see that huge bump at the top right of the package that looks like it ought to be the pin 1 indicator? Well it’s not. Pin 1 is down at the bottom left next to the much smaller bump. I found that one out the hard way. The convention with ICs is that if you hold them with the printed text upright then pin 1 will be at the bottom left.

The oscillator

If you want to do anything significant in an FPGA then you need to supply it with a clock signal from an oscillator. A cheap quartz crystal isn’t sufficient; it must be a complete oscillator module. These cost slightly more but are still very affordable.

The oscillator I chose is the 40MHz Fox FXO-HC73 and it will be fed to one of the global clock pins on the FPGA. FPGAs provide dedicated low-skew routing resources for clock signals to ensure that all the parts of your design that run off the clock are closely synchronised.

The entire FPGA design runs at 100MHz so I use one of the DCM resources inside the FPGA to multiply up the 40MHz signal to 100MHz. There’s no critical reason to choose 40MHz for the oscillator, it’s just one of the cheaply available frequencies that multiplies/divides to 100MHz easily and isn’t so fast that it causes routing problems on the PCB.

The MCU

The MCU is going to be the STM32F429VIT6. It’s a 180MHz 32-bit ARM Cortex-M4 MCU from STMicroelectronics that comes with 2MB of flash memory, 256KB of SRAM and a hardware FPU capable of single-cycle add and multiply operations on single-precision floating point numbers.

It’s a formidable MCU and I chose it because of its high core clock speed and abundant resources. All the game logic has to execute in a fixed time period so it pays to have a high clock speed. It’s almost certainly overkill, and the fact that the F429 contains a ‘Chrom-ART’ accelerator that has considerable functional overlap with the sprite accelerator has not gone un-noticed. However I decided to err on the safe side and fit the fastest STM32 currently available.

Programming the device is easy, all I need to do is expose the SWD pins, connect up my ST-Link/v2 programmer and I can program it using OpenOCD.

The MCU is available in a number of packages with the LQFP-100 being the one with the fewest pins. Not exactly small, but there’s already a 100-pin quad package on this board so what’s another one between friends?

The flash

The flash device is a 128 megabit (16 megabyte) S25FL127S SPI part from Spansion. This device was selected for its low cost, high speed and high capacity. Uncompressed graphics require lots of space and multi-frame sprites only multiply up that space requirement. This device has the capacity for 8 megapixels, or 36 complete frames of data.

If you think SPI flash is not going to be fast enough then you’re going to be pleasantly surprised. The Spansion device can operate in a non-standard 4-bit output mode and can be clocked as high as 108MHz giving a maximum data output rate of 54 megabytes per second. Operating this kind of bus is bread and butter to an FPGA. I’ll clock the flash device at the full internal FPGA clock speed of 100MHz and I’ll use the 4-bit quad output mode to enable me to read out a 16-bit pixel in 4×10 = 40ns. This just happens to be exactly how long I need to write out a pixel to the SRAM frame buffer. Serendipitous indeed.

The SRAM

The SRAM IC that I chose is the ISSI IS61LV5128AL 4Mbit device arranged in a 512K x 8-bit layout with an access time of 10ns (100MHz). The LCD pixels are 16 bits wide so I’ll need to do two SRAM accesses to read or write a full pixel, but I’ll save 8 pins from my FPGA budget.

4 megabits is enough to hold 262,144 16-bit pixels. My LCD has 640×360 = 230,400 pixels so there are 31,744 to spare. I don’t have a use case for those extra pixels so they’re just going to be left unused in this design.

The 10ns access time means that I’ll have no trouble doing a full pixel write in the same time frame that a 16-bit pixel is read out from the flash IC. Conversely, I’ll be able to read out a full pixel in the same time period that it takes to write out a pixel to the LCD. FPGAs are designed to do multiple tasks concurrently with nanosecond precision so everything should line up nicely.

The EEPROM

The EEPROM plays a peripheral, non-core role in the design. It’s just there so that we’ve got some space to store arbitrary data that must survive a power-cycle. Unlike the popular Atmel AVR chips used in the Arduino, the STM32 MCUs do not come with EEPROM built-in. It’s possible to write flash-pages inside the STM32 on-demand so EEPROM can be emulated but with the cost of I2C EEPROMs so low I figured I might as well include one here.

The device I chose is a 32Kbit BR24G32FJ device from Rohm in the SOIC-8 package.

EEPROMs are a rare example where there is cross-manufacturer pinout and operational compatibility. You can pretty much choose any device in the right-sized package and it’ll work over the I2C protocol just the same as a device from a different manufacturer. If you’re building this project yourself then feel free to substitute an alternative part if the Rohm device is not available where you live.

The power supplies

There are no fewer than five different voltage levels on this board, six if you count the output from the LCD backlight boost converter. A 5V external power supply feeds the LDO regulators that supply power to the rest of the system. Nearly all the components are powered off an AMS1117 3.3V regulator except, predictably, the FPGA. It requires 2.5V and 1.2V for its auxiliary and internal operations in addition to the 3.3V level that we use for all the IOs. The last level is the 2.8V required for the LCD panel supply.

When running in sprite mode with the LCD backlight at 90% the system will draw nearly 400mA down the 5V line. For this reason I chose 3.3V, 2.5V and 1.2V regulators that have a big margin in the amount of current that they can supply. I didn’t want to be left with an iffy power supply at the end of the day. The 2.5V and 1.2V regulators are both from the Taiwan Semiconductor TS1117 family and the 2.8V regulator is the ZXCL280H5TA by Diodes Inc.

It’s all about the timing

All of the selected components must work together within the timing constraints imposed by how fast we can get data out of the flash IC, into the SRAM frame buffer and subsequently to drive into the LCD. Here’s a diagram that shows a high level overview of the timing from the point of view of the game developer.

At the start of the first frame the FPGA will drive a busy signal high to indicate that it’s about to start parsing the sprite configuration stored in the internal BRAM. It will use this configuration to fetch graphics from the external flash and write them out to the SRAM frame buffer. During this period it is not safe to write any commands to the FPGA that would cause the sprite state to change.

When the FPGA has finished this task it will drive the busy signal low again. This transition must happen before the start of the next frame or display corruption will be observed. The MCU should use this period to run its game logic and prepare for writing out the new state of the display.

During the second frame the FPGA takes the data in the frame buffer and writes it out to the LCD as a complete frame. During this period it is safe for the MCU to upload the new state of the world to the FPGA. In fact it’s safe to do so as soon as the busy signal goes low.

When frame two is complete the whole cycle starts back again at frame one. Since the display is running at 60fps what we’ve got here is a 30fps sprite engine.

More about the sprites

I’m planning to provide two operating modes for the FPGA. In passthrough mode the FPGA will send data that it receives from the MCU directly through to the LCD bus. This allows the MCU to directly drive the embedded Renesas R61523 controller in the LCD at a decent speed but not as fast as if it were directly connected. This mode is used to initialise the LCD controller, display introduction and high score screens and to send the command sequence that prepares it for entering sprite mode.

In sprite mode the FPGA takes over driving data to the display as described in the above timing diagram. The MCU can only send sprite-related load/move/show/hide commands. The FPGA requires a 127-bit record to hold the full state of a sprite and the BRAM depth must be a power of two, therefore we can store a total of 512 sprites in the FPGA, where one sprite equals one graphic or one animation cel. That should be more than enough for a game and in fact I’ll find that I’m limited by timing more than anything else.

To show a complete frame the sprites must occupy 100% of the pixels on the display. There is no fill background command so the background must be made up of one or more sprites that cover the entire frame. If a solid colour background is required then a solid colour sprite must be provided for that purpose. The FPGA design provides the facility to auto-repeat a sprite in the X and Y directions to help optimise both flash and sprite memory usage.

Sprites are arranged so that the first one is at the back and the last one is at the front. Pixel transparency (but not alpha level) is provided so that sprites can be an irregular shape or have cut-outs within them.

The last feature that I’m providing is a partial display model. This allows me to define which sprite data row and/or column should be the first to be displayed and which should be the last. Rows/columns outside the range are ignored during the display writing phase.

In the above picture, sprite 4 has its ending column set so that it appears to be hanging off the right edge of the display. Sprite 5 has its starting column and ending row set so that it appears to be partially off the bottom left of the display.

In practice this feature is used to allow sprites to ‘walk on’ and ‘walk off’ the edges of the screen, or it can be used to achieve smooth omni-directional scrolling. I plan to put both of these features to the test in my game demonstration.

Limiting factors

The limiting factor that governs how many sprites I can display is the LCD frame timing. The rendering of the sprites into the frame buffer by the FPGA must finish within one frame, or 16.2ms. Let’s see how that timing budget can be spent.

The FPGA will check every one of the 512 sprites to see if it needs to display it or not. It takes 30ns to check each sprite giving us a fixed overhead of 30 x 512 = 0.015ms. So small that it can be considered negligible.

For each visible sprite, there is a constant setup and completion time of 280ns. This applies even if a sprite is being auto-repeated in the X or Y direction by the FPGA. For each pixel there is an overhead of 40ns. So we end up with a formula of ((40 x num_pixels) + 280)/1,000,000 ms per sprite. This is the important calculation.

In a game where the display has a solid colour background then we have a fixed overhead for it of ((40 x 640 x 360) + 280)/1,000,000 = 9.21ms. That leaves us 6.99ms for sprites that represent the game action. If we re-arrange the timing formula to work out how many pixels that leaves us then we come out with around 174,000, or to put it another way about 75% of the display area. That is the limiting factor for any game design and it’s something I’ll need to bear in mind.

The schematic

Now that I know the parts I’m going to use I’m going to create the schematic that links them all together. Click on the thumbnail to view a full-size PDF of this design.





As you can see it’s quite a large one and is predictably dominated by the FPGA and the MCU. It’s much easier to follow if we break it down into modules. Let’s do that now.

The power supply

Input to the main trio of regulators comes in through a jack socket from an external 5V supply. The 2.8V supply is located physically far from the 5V input so it was more convenient to feed it from the 3.3V power line. The 10µF and 22µF smoothing capacitors are all tantalum and are all placed physically very close to the regulator that they are designed to work with.

C26, C18 and C23 are electrolytics that provide bulk low-frequency decoupling for the board. In one of Xilinx’s many design guides they recommend that every decade up to 100µF is covered by decoupling so I’m sticking to that recommendation here.

The 120Ω resistor from the 2.5V line to ground is another Xilinx feature. In XAPP453 Xilinx explain how to configure (program) an FPGA using a 3.3V MCU. One of the steps that must be taken is to include the 120Ω shunt resistor from 2.5V to ground to prevent the regulator from seeing a reverse current on its output pin. The downside of this requirement is that there will be a constant 20.8mA (about 52mW) drain even when nothing is happening.

It pays to study the thermal characteristics part of the voltage regulator datasheet. In my early experiments I was running this design with a 12V input instead of 5V. After running for some time I noticed that the system was spontaneously resetting itself. Odd, I thought, and then I touched the board. It was red hot around the AMS1117. The AMS1117 was going into thermal shutdown to protect itself from burnout and I went back to the datasheet to find out why.

In the thermal considerations section of the datasheet the formula for the power dissipation is given as PD = ( VIN – VOUT )( IOUT ). For my 12V input with a 400mA worst-case output that’s 3.48W of heat that’s going to be generated. Rather a lot. Going on to plug that figure into the formula for the maximum junction temperature gave me a figure of 233°C. The maximum allowed is 125°C. Hardly surprising that I was running into issues. By reducing the input voltage to 5V the power dissipation drops to a mere 0.68W and the maximum junction temperature to 65°C. Much better, and a valuable lesson learned.

The FPGA

The FPGA is pictured here with its decoupling capacitors not shown to save space. If you want to see the decoupling network then please view the full PDF. The decoupling for the FPGA is quite substantial and attempts to follow the guidelines in Xilinx’s Power Distribution System (PDS) Design application note.

I’m using 62 out of the 63 available IOs in this design, only just squeezing in everything that I need. I chose the FPGA pins to be friendly to the ICs such as the SRAM, flash and the latch. The idea is that the components with high frequency signals will be placed very close to the FPGA and should require no board vias. This is meant to be a hobbyist-friendly design so it’ll be a 2 layer board and that means I must take care with the signal routing.

The SRAM is the greediest IC, requiring 28 IOs to cover its address, data and WE control signals. I’m saving 2 pins here by tying CS and OE both to ground as permitted in the SRAM datasheet.

The flash, being an SPI device, is quite frugal in pin usage, requiring only 6 pins in total. I can’t tie CS low with this device because CS is used in the SPI protocol to terminate certain command sequences.

The signal inputs from the MCU are the 10-bit data bus D0..9, the WR strobe and an active-high reset. The outputs to the MCU are the busy signal and a debug output that I used during development to flag when certain states occurred. Debugging an FPGA in-circuit is about as hard as it gets, folks.

The programming signals (PROG_B, INIT_B, CCLK, DIN and DONE) are all present and correct. I will be programming the FPGA using what Xilinx calls slave serial mode, where an MCU clocks data into the FPGA and monitors the output signals to determine the success of the operation. The compiled .bit design is about 55Kbytes in size and takes a few tenths of a second to upload from the MCU.

I compile the .bit file in with the MCU program and load it into the FPGA on startup. For those that don’t know, an FPGA configuration is held internally in volatile SRAM so it’s lost when the power goes off and must be restored on system startup. (Some FPGAs do come with internal configuration storage flash memory but this family does not).

The LCD signals include the 8 bit data bus, the latch control line (LE) and the RS and WR lines. Going the other direction is the vital TE signal that will allow us to synchronise to the LCD frame output.

VCCO[0..7] are the 3.3V inputs, there’s one per FPGA IO bank. VCCINT are the 1.2V inputs and VCCAUX are the 2.5V inputs. This is a lot of power pins and associated decoupling capacitors and it only gets worse as you go up to larger FPGA packages. Another reason to stick with the small devices for hobbyist designs.

The MCU

The MCU is the counterparty to the FPGA in this design and you can easily see the opposing ends of some of the signals. For example I map the whole 10 bit data bus and the WR signal to port E. This will allow me to set the data and the WR strobe in a single port write. A reset button is provided just in case I need to externally reset the board at any time.

The SDIO signals map to the pins connected to the MCU’s SDIO peripheral so I can read and write to the SD card easily. The I2C SCL and SDA lines are connected to the I2C#2 peripheral inside the MCU and I’ve elected to provide two LEDs for status and other general purpose use. 18 of the GPIO pins are broken out to an external pin header so that I can add peripherals such as joysticks and other input devices for testing. The pins that are broken out are not done so at random, they are selected to cover a variety of the onboard peripherals that could be useful during development.

You may notice that there’s no oscillator or crystal and that’s because this MCU doesn’t need one. As long as you can make do with 1% clock accuracy then you can use the internal 16MHz High Speed Internal (HSI) oscillator as the input to the PLL that generates the 180MHz core clock. 1% is fine by me and so I use the HSI.

After the complex requirements of the FPGA, the power supply for the MCU is a breath of fresh air. Decoupling (not shown in the screenshot) follows ST’s guidelines of a ceramic capacitor per-pin and a 4.7µF ‘chemical capacitor’ on the board. Electrolytics are chemical capacitors so that’s what I used. I prefer tantalums for their low ESR but didn’t have any to hand at the time.

Debugging and programming is done using SWD, a two wire protocol designed to replace JTAG as a more efficient design. SWDIO and SWCLK are broken out to a debug header on the board that can be directly connected to the cheap and effective ST-Link/v2 programming dongle.

The flash

The flash IC is the highest speed external peripheral on this board. The clock will run at 100MHz, which is well into the territory where I could have signal integrity issues caused by overshoot, undershoot, reflections or any combination of the above if I’m not careful. For that reason the IO lines and the clock all feature 33Ω series termination resistors designed to damp reflections before they can harm the signal. These lines will also be kept very short on the PCB.

How did I decide on 33Ω? Rule of thumb I’m afraid. I don’t have the kind of equipment required to measure and select an ideal value so I’m starting at 33Ω and if I get problems then I’ll break out my bench oscilloscope and see whether I need to increase the resistance or not.

The SRAM

Memory ICs are not very exciting really. They’re just a pair of buses, a few control strobes and the power supply. To save on pins I’m connecting CS and OE directly to ground as permitted by the datasheet. I just need to control WE when I need to write data.

The address and data lines will change at a maximum of 50MHz in this design but the WE line will toggle at 100MHz. I’m not concerned about the address and data lines: as long as I keep them short and free of vias, 50MHz won’t be a problem. Writing this after the fact, I do think that I should have at least put in a footprint for a 33Ω resistor on the WE line. If and when I produce another revision of the board I’ll do just that.

The oscillator

An oscillator doesn’t need to be kickstarted by an external device and will start ticking from the moment it’s powered up. In my design I’ve elected to include a 33Ω series termination resistor on the clock line even though it’s probably overkill. This clock is so critical to everything else that I thought I’d be better safe than sorry.

The LCD

The schematic for the LCD will be familiar to anyone that’s read either of my reverse engineering or my halogen reflow oven articles.

All of the control signals are connected to the FPGA except LCD_RES which is the reset signal. This one is connected to the MCU. There’s no need for us to bother the FPGA with the burden of the LCD reset sequence, this is best performed by the MCU.

The LCD backlight

The backlight for this LCD consists of six white LEDs in series so we need a boost converter to generate the high voltage required to overcome the combined forward voltage drop of the chain.

The AP5724 from Diodes Inc. is a dedicated current-mode backlight driver that incorporates a boost converter. You only need to add a few external components including a current-setting resistor and the driver will then consistently output the selected current as long as the EN pin is driven high.

The cool thing is that we don’t even need to supply a PWM signal to the EN pin to control the backlight brightness because the LCD has a function to do that for us. All we need to do is tell the R61523 controller the duty cycle that we’d like to use and it’ll do the rest. That saves us a pin and a timer resource on the MCU.

The latch

The latch sits between the FPGA and the LCD, allowing us to use only 8 pins on the FPGA to drive a 16-bit data bus.

When LE is high the latch is transparent, data passes through from the D inputs to the Q outputs. A few nanoseconds after LE goes low the latch goes deaf to its inputs and continues to drive its outputs from the last data that it saw on those inputs.

What we do is write out the first 8 bits of data, lock the latch and then write out the second 8 bits. As you can see from the schematic this results in all 16 bits being driven. What’s really helpful is that the FPGA design can be coded to output any bit to any pin, so I can tailor the design so that the data bus can be laid out in parallel on the PCB without any vias.

The EEPROM

The EEPROM is an I2C device that provides some persistent storage for us.

The Rohm 32 Kbit IC has a simple 2-wire I2C interface. Nothing much to say here, it’s hooked up to the I2C peripheral on the MCU. I2C is a bi-directional two-wire bus that guards against contention between multiple drivers by operating in open-drain mode. That means that both the data and clock lines must have pull-up resistors somewhere. I provide those 4.7kΩ pull-ups close to the MCU, as you can see in the MCU schematic.

The SD connector

An ALPS SCHD3A0100 SD card cage is provided to house a micro SD card. The cage I’ve chosen accepts a slide-in SD card which is then locked into place by sliding it back a millimetre or so underneath a lip. Once in, it’s held securely and is not likely to fall out of its own accord.

I envisage that the graphics data will be much larger than I could program into the core of the MCU so some sort of external interface is required. SD cards are the most convenient way to do that and the MCU has a built-in SDIO peripheral that will allow me to access the card in the high-speed 4-bit mode. SDIO is another bus that requires pull-ups on its data and command lines, presumably because it’s also running in open-drain mode. I provide these 10kΩ resistors in the MCU schematic screenshot.

The pin headers

Two 2.54mm pin headers are provided for GPIO and debugging.

The pinout for the debug header matches the requirements published by ST Microelectronics for the SWD protocol and the ST-Link/v2 programmer/debugger. I’ve been really impressed with the ST-Link/v2. I use it all the time now with OpenOCD as the debug server and it’s never let me down.

The GPIO header breaks out a number of pins from the MCU for general purpose use. I’ve made sure that quite a few of the commonly used peripherals are covered including the I2S peripherals that I may use in the future for prototyping an audio capability.

Bill of materials

Here’s the full bill of materials for this project. There are, as you might expect, rather a lot of components. Nearly all are available at Farnell, my preferred local supplier, and I’ve included links to their site where possible but you will have to venture further afield for a few of the other components.

A few of the components can be substituted for compatible devices from other manufacturers where something is available in the same footprint. Examples are U1, U2, U3, U12, L1, D1. Xilinx and ST both recommend low ESR capacitors for decoupling so choose the electrolytic and tantalum devices carefully.

Designator Value Description Footprint Quantity Farnell
C1, C2, C3, C5, C6 10µF Tantalum capacitor 1206 5 2353045
C4 22µF Tantalum capacitor 1206 1 2333013
C7, C8, C10, C13, C19, C24, C27, C28 100nF Ceramic capacitor 0402 8 1759380
C14, C21, C22, C29, C31, C32, C33, C39, C40, C41, C44, C46, C47, C49, C50, C58, C59, C60 100nF Ceramic capacitor 0603 26 2211177
C9, C12 10µF Ceramic capacitor 0805 2 2320852
C11, C15, C34, C38 1µF Ceramic capacitor 0603 4 1759399
C16 56pF 50V Ceramic capacitor 0603 1 1759063
C17 56pF Ceramic capacitor 0603 1 1759063
C18 100µF Electrolytic capacitor radial 2mm 1 8767122
C20, C25, C36 10nF Ceramic capacitor 0603 3 1759022
C23 47µF Electrolytic capacitor radial 2mm 1 2079293
C26 4.7µF Electrolytic capacitor radial 2mm 1 1236668
C35 1µF 50V Ceramic capacitor 0805 1 1845750
C37, C45, C48 2.2µF Ceramic capacitor 0603 3 1759392
C42 4.7µF Ceramic capacitor 0603 1 2320811
D1 B0530W Any compatible schottky SOD123 1 1863142
DEBUG HDR2X10 Header, 10-Pin, Dual row 2.54mm 1
L1 22µH Inductor 6x6x3mm 1 1864120
LCD AXE534124 Panasonic connector (Digikey US) 17x2x0.4mm 1
LED2 Blue LED LED, <3.3V Vf 1206 1 2322084
LED3 White LED LED,<3.3V Vf 1206 1
P1 HDR2X11 Header, 11-Pin, Dual row 2.54mm 1
P2 SCHD3A0100 ALPS Micro SD connector 2.54mm 1
P3 2.1mm PCB power jack 1 2.1mm
POWER Red LED Power indicator 1206 1 2099256
R1 120Ω Resistor 0603 1 2331714
R2, R20, R21 390Ω Resistor 0805 3 2331790
R3, R23, R24 4.7KΩ Resistor 0603 3 1469807
R4, R7 68Ω Resistor 0805 2 2138823
R5 330Ω Resistor 0603 1 2331721
R6 5.1Ω 1% Feedback resistor 0805 1 2128935
R8, R9, R14, R17, R22, R26, R27, R28, R29 10KΩ Resistor 0603 9 9238603
R10, R11, R12, R13, R16, R18 33Ω Resistor 0603 6 9238301
R25 1KΩ Resistor 0603 1 2073348
RESET PCB Button Make type 1
U1 TS1117BCP 1.2V LDO regulator TO-252 1 1476674
U2 AP1117E33G 3.3V LDO regulator SOT-223 1 1825291
U3 TS1117CW 2.5V LDO regulator SOT-223 1 7208340
U4 SN74ABT573APW Octal latch TSSOP-20 1 1740911
U5 XC3S50 Xilinx Spartan 3 FPGA VQ100 1
U6 IS61LV5128AL ISSI 512K x 8 10ns SRAM TSOP2-44 1 1077676
U7 AP5724 Diodes Inc. LED driver SOT26A-6 1
U8 ZXCL280H5T Diodes Inc. 2.8V LDO regulator SOT353-5N 1 1461559
U10 S25FL127S Spansion 128Mb serial flash SOIC8 (208 mil) 1 2328002
U11 STM32F429VIT6 STM32 F429 MCU LQFP100 1 2393659
U12 BR24G32FJ Rohm 32Kb I2C EEPROM SOP8 1 2373743
X1 FXO-HC375 Fox 40MHz SMD oscillator 1 custom 1641011

PCB design

I decided to target a low-cost 2-layer 10x10cm board of the sort that any hobbyist can afford to have printed at one of the Chinese prototyping houses. Routing it took a while. A long while.

The first component to go down was the LCD connector because it must physically sit in a specific place so that the LCD can be mounted on to the board in a position that allows the other connectors to be placed around it. The LCD connector is actually on the bottom of the board which, when the board is sitting on my desk, faces upwards.

The next component to go down was the FPGA, which I plonked down close to the center while remaining mindful that I’d need to place another 100-pin device not far away.

After the FPGA, the flash and the SRAM were placed as physically close to the FPGA as I dared, and their IO traces were routed carefully.

Next to be routed were the FPGA power and decoupling traces. These traces are wider than most and the ceramic decoupling capacitors are placed as close to the FPGA pins as I could put them, which meant using very small 0402 components for some of the pins. Others are decoupled on the opposite side of the board and use larger 0603 and 0805 packages.

Now that the components with specific requirements are down it was just a matter of placing the MCU in the best position I could find and routing the remaining signals. That part was not hard, just time consuming.

Final touches include a silk-screen logo, M3 mounting holes and some cool looking rounded corners on the PCB as a whole. The mounting holes are actually quite important because this board will be operated component-side down, so I will need stand-offs to provide the necessary clearance.

Let’s take a look at the routed PCB. I’ve hidden the ground pours on this screenshot to better show off the traces and components.




Click for a larger view

I elected to get the board printed at Elecrow, one of the many Chinese online services that’ll print you ten copies for a very reasonable price. About two or three weeks later the boards arrived in the mail and they look great!




Click for larger

Note how all the important traces from the FPGA go directly to their target ICs as a bus with no vias. One of the many beauties of working with an FPGA is that you decide the function of each pin, and if you plan ahead then you can keep your board neat and tidy.




Click for larger

I inspected a PCB under a magnifying glass and could only find one issue, which was entirely my fault. The drill holes for the 5V power supply connector were too small by about 1mm. I’d mis-entered them into the footprint designer and hadn’t spotted it during any of my post-routing visual checks. Thankfully there was an easy solution: all I had to do was shave about a millimetre off the legs of the power connector and I would be OK.

Assembling the PCB

Putting it all together required a bit of forward planning. I wanted to reflow the majority of the components using my halogen reflow oven but the problem was that the 34 pin, 0.4mm pitch LCD connector on the other side would also need reflow.

In the end it wasn’t so hard. I zoned off the area underneath the LCD connector that thankfully only housed a few small ICs for the backlight driver and reflowed the entire remainder of the component side in my reflow oven. For the second stage I turned the board over and reflowed just the LCD connector on my hot plate by holding the PCB with just the aforementioned zoned off area over the plate.

Now I could return to the zoned off area and reflow the remaining SMD components manually with my hot air gun. Finally the easy through-hole components were soldered into place with a regular iron. After a quick bath in white spirit to clean off the flux residues she’s ready for the photoshoot.




Click for larger

That’s the component side with everything in place. The reflow process in the oven, my first major project with the halogen oven, took care of everything very well but I still went around afterwards touching up joints here and there with my iron under the microscope.




Click for larger

And the back side, which will actually be the topmost side when in use, shown here before the LCD is attached. The array of decoupling capacitors that belong to the FPGA is clearly visible.




Click for larger

And the top side again, now with the LCD and 10mm standoffs in place. The LCD itself is held down and lifted clear of the PCB and the exposed capacitors with a set of double-sided sticky pads. The debug cable is shown in place, just missing that capacitor by a few millimetres (phew!).

Still with me? Great. The hardware was the easy bit, now I’m going to tackle the MCU firmware and the FPGA design. This should be fun.

Testing

Obviously the first step in the testing phase is just to apply power, cross my fingers and switch it on. The red power LED lit up. A little victory. Next I could check whether the MCU was alive by attaching the ST-Link/v2 dongle and seeing if I could connect to it with OpenOCD.

$ bin-x64/openocd-x64-0.7.0.exe -f scripts/board/stm32f4discovery.cfg
Open On-Chip Debugger 0.7.0 (2013-05-05-10:44)
Licensed under GNU GPL v2
For bug reports, read

http://openocd.sourceforge.net/doc/doxygen/bugs.html

srst_only separate srst_nogate srst_open_drain connect_deassert_srst
Info : This adapter doesn't support configurable speed
Info : STLINK v2 JTAG v17 API v2 SWIM v4 VID 0x0483 PID 0x3748
Info : Target voltage: 3.193738
Info : stm32f4x.cpu: hardware has 6 breakpoints, 4 watchpoints

As you can see I’m using the pre-canned script that sets the interface and target MCU as if it were an F4 discovery board and we have a complete success, the MCU should now be programmable.

MCU test code

I wrote a number of small test programs to check out the various features on the board and you can see them all on github. They all make use of my stm32plus library to take the heavy lifting out of working with the STM32 peripherals. Although the stm32plus library does not directly support the F42x line it’s quite OK to use the supported F40x build; you just won’t get any support for the additional peripherals on the F429.

      // PD10 and PD11 are configured as the two LED outputs

      GpioD<DefaultDigitalOutputFeature<10,11> > pd;

      // loop forever switching it on and off with a 1 second
      // delay in between each cycle

      for(;;) {

        pd[10].reset();
        pd[11].set();
        MillisecondTimer::delay(1000);

        pd[10].set();
        pd[11].reset();
        MillisecondTimer::delay(1000);
      }

An alternate on-off blinker for the two LEDs

One small hurdle that I had to overcome was the startup code: setting the core clock to 180MHz using only the internal oscillator. After reset and before main() executes, every STM32 program goes through a small routine to configure the device’s clock tree and set the speeds of the various buses.

I couldn’t find any sample ST initialisation code for the F429 using the high-speed internal (HSI) 16MHz clock as the system clock source. They do provide an Excel spreadsheet that supports the F40x devices so I took that as the starting point and adjusted the PLL multiplier accordingly to generate the 180MHz core clock. I couldn’t resist tidying up ST’s code – if you’ve ever seen any ST-authored ‘C’ code then you’ll know what I’m talking about! Click here to view the startup code on github.

/*
 * These are the key constants for setting up the PLL using 16MHz HSI as the source
 */

enum {
  VECT_TAB_OFFSET = 0,      // Vector Table base offset field. This value must be a multiple of 0x200.
  PLL_M           = 16,     // PLL_VCO = (HSE_VALUE or HSI_VALUE / PLL_M) * PLL_N
  PLL_N           = 360,
  PLL_P           = 2,      // SYSCLK = PLL_VCO / PLL_P
  PLL_Q           = 8       // USB OTG FS, SDIO and RNG Clock =  PLL_VCO / PLL_Q (note 45MHz unsuitable for USB)
};

One-by-one all the peripherals checked out fine. GPIO, EEPROM and crucially SDIO all worked without any problems. I’ve got to tell you, I was pleased at this point despite not even having looked at the FPGA yet. Talking of which…

Configuring the FPGA

Configuring an FPGA is analogous to programming an MCU. You take your compiled work and operate a manufacturer-defined protocol to get it on to the device, whereupon the device is able to start doing what you intended it to do. Where FPGAs differ from MCUs is that their configuration is volatile: when you turn off the power it’s gone, so you have to reconfigure the device every time it powers up.

Xilinx FPGAs offer a wide range of configuration methods with excellent documentation. The method that I’ve chosen is called slave serial and it requires an external device (the MCU) to bit-bang the compiled configuration using a serial stream. Here’s an image from the Xilinx documentation that shows a potential configuration circuit, slightly modified by me to remove the JTAG pins because I’m not using those.

The only problem is that the configuration interface on the FPGA is powered by the VCCAUX 2.5V supply and the MCU is powered by the 3.3V supply. Luckily Xilinx have thought of that one and have produced another excellent reference document that tells us what we need to do to configure the FPGA safely from a 3.3V MCU.

The full meaning of the additional parts is very well explained by Xilinx but to quickly summarise: they’ve inserted some current-limiting resistors and a shunt resistor next to the 2.5V regulator to drain away excess current, so the regulator is prevented from seeing a potentially damaging reverse current on its output pin.

Configuration performance considerations

The compiled and uncompressed bit file for the XC3S50 is about 440096 bits give or take. The maximum speed that Xilinx allows me to operate the serial data line is 66MHz which reduces to 20MHz with compression. You can find these limits documented in the main datasheet under the specifications for FCCSER.

My design is going to almost fully utilise the FPGA so the rudimentary run-length compression supported by the FPGA is of no use to me and I’d prefer to take advantage of that 66MHz maximum clock. Theoretically I can program the FPGA in 440096/66000000*1000 = approx 7ms. In practice I have the additional overhead of shifting the bits for output and monitoring the INIT_B and DONE pins for status changes so in the end the debug build of the firmware can program the device in about half a second.

You can see the source to the configuration class here on github. It relies on the bit file being compiled into flash with the MCU program and you can see how I do that here on github.

Testing the FPGA

Now that I have the means to configure the FPGA I need to stretch its legs a bit to check whether my home manufacturing process has been a success. Naturally, that’s going to involve a blink test, the hardware equivalent of Hello World.

Xilinx development

Back when I decided to teach myself a hardware description language (HDL) the choice of available languages came down to two options: VHDL or Verilog. VHDL was touted as looking a bit like Pascal (ask your Dad) or Ada (ask your Grandad). Verilog was touted as looking a bit like ‘C’ (it doesn’t). I rather liked the rigorous and verbose-looking syntax of VHDL and so, since there’s no difference in the capabilities of the two languages, I chose VHDL. Professional FPGA engineers are likely to know both.

Xilinx offers a free development environment that they call the ISE WebPACK, which somehow manages to require 17GB of space on my SSD. The tools offered by the WebPACK are both comprehensive and confusing in equal measure. As a beginner you’ll want to use an IDE to introduce you to the workflow and you get a choice of two from Xilinx.

ISE Design Suite is the first option and the one I recommend for beginners. It appears to be written in a language that compiles to native code so it’s quite efficient and doesn’t consume many resources itself while the synthesis tools are running.

The ISE project navigator clearly shows the synthesis workflow and has an easy to use interface for creating and running simulations. It’s particularly useful for determining the availability and meaning of the many command line options that you can use.

The second option is to create your project using their PlanAhead tool. I don’t recommend this. PlanAhead appears to be written in Java and as such it takes a heavy toll on your resources while synthesis is running. On a low powered laptop I caught it using 100% of a CPU core, presumably just monitoring the tool output files for changes. However, PlanAhead is fine for IO pin planning and is actually very useful for that task because it allows you to visualise the package while choosing pins.

I started out some time ago using ISE Design Suite and once I’d got the hang of the workflow and the command line options I dropped it in favour of a command line build environment using my favourite text editor and the scons build system. This project does not use the ISE GUI.

Xilinx command line builds

The one area where Xilinx seems to have gone completely off the rails is operating the tools from the command line in harmony with a source control system. The tools will spew literally dozens of output, intermediate and report files and subdirectories into your source directory. You can see the rules I had to create in the SConscript and .gitignore files just to hide this garbage.

Furthermore, the coregen.exe utility commits the cardinal sin of modifying its input source file when you run it, making git think it’s always been modified. You have to ask yourself whether the teams that wrote these tools actually use source control themselves.

FPGA blink

Here’s an implementation of blink on the FPGA in VHDL.

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.numeric_std.all;
 
entity blink is
port(
  clk     : in  std_logic;
  led_out : out std_logic
);
end blink;

architecture behavioural of blink is
 constant clock_count : natural := 80000000;      -- 2x clock frequency in Hz
begin

  process(clk)
    variable count : natural range 0 to clock_count;
  begin

    -- led_out = 1 for half the time and 0 for the other half

    if rising_edge(clk) then
      if count < clock_count/2 then
        led_out <='1';
        count := count + 1;
      elsif count < clock_count then
        led_out <='0';
        count := count + 1;
      else
        count := 0;
        led_out <='1';
      end if;
    end if;
  end process; 

end behavioural;

This is a synchronous design with the sole process synchronised to the 40MHz oscillator via the clk signal. Each time the oscillator ticks a counter is incremented, and once a second the led_out signal is toggled.

My FPGA is not connected to an LED so I routed led_out to the DEBUG pin and used a logic analyser to verify the ticking. It worked, which told me that the FPGA was up and running, was being correctly programmed via the MCU and that at least two of the pins worked and the oscillator was ticking. A small design proved out a large part of the board.

Programming the flash

If you review the schematic you’ll see that the flash IC is connected only to the FPGA and not to the MCU, so I needed to create an FPGA design for programming it. A simple approach would have been to configure the FPGA to just pass through all signals from the MCU to the flash. The FPGA would function as little more than a buffer and all the logic would be in the MCU.

The second option would be to operate the flash programming logic from within the FPGA, accepting commands and data to be programmed from the MCU. This was the more complex option but I was up for the challenge and set about doing it this way.

The diagram shows the workflow involved in programming the flash. At a high level the MCU reads pre-formatted graphics from the SD card, writes them one by one to the FPGA and then verifies that each one has been written correctly. The actual steps involved are:

  1. This design runs at a relatively pedestrian 40MHz so I set the non-volatile flash configuration register (CR) to the default speed setting and disable the bit that enables the quad IO mode.
  2. I issue the command to erase the entire flash device and wait for the FPGA to de-assert the BUSY pin that indicates the operation is complete. The FPGA polls the flash status register for this flag. It takes the flash about 30 seconds to complete a full erase operation.
  3. One by one I program each of the files on the SD card into the flash device by issuing block program instructions for each 256 byte page.
  4. I go back and get the FPGA to verify each of the files. I re-supply the file data for each block and the FPGA reads the block from the flash, compares it and asserts the DEBUG pin if there’s a discrepancy.
  5. I set the bits in the configuration register that enable the flash to operate at 100MHz and I enable quad output mode on the assumption that the main sprite accelerator will be the design that runs next.

The source code for all this is available on github. The MCU program is here and the FPGA design is here. I’d simulated this design before writing the MCU code so I was sure that the logic was OK, and it did indeed more or less work first time. The only glitch I had to iron out was sampling the asynchronous WR line in the FPGA. FPGAs really don’t like asynchronous signals and you have to take extra steps to avoid metastability when sampling an asynchronous pin.




Click for larger

I really can’t emphasize enough how hard it is to debug an FPGA in-circuit so it’s imperative that designs are thoroughly simulated up-front. My in-circuit debugger consists of a single output pin that I can set or reset depending on some internal state.

The main design

The main VHDL design is broken down into components linked together by their input and output signals. In FPGA lingo this is called a hierarchical design, and if you’re accustomed to any modern programming language it’ll feel completely natural, as opposed to dumping the whole design into one file (yes, some people do that too).

main is the overall container that declares the I/O port that maps directly to the pins on the VQ100 package. main is responsible for instantiating the rest of the components and linking together their inputs and outputs. Let’s take a brief look at the purpose of each component.

mcu_interface

The FPGA is connected to the MCU via a 10-bit data bus and an asynchronous WR strobe. The MCU writes data on to this bus and then pulses WR low and then high again. The FPGA reacts to the rising edge of WR by writing the 10-bit value into an internal 64-entry FIFO implemented in distributed RAM.

At the same time, mcu_interface is also reading data off the other end of the FIFO and when it’s read enough parameters to execute the desired command it will spend a few 10ns cycles executing that command before returning to reading more data off the FIFO.

It’s up to the MCU to ensure that it doesn’t write data to the FIFO faster than the FPGA can read it off. In practice this is unlikely because the MCU will typically spend some time executing logic between commands, during which the FPGA will be draining the FIFO and executing them.

The actual commands that I’ve implemented allow the MCU to write raw data to the LCD in passthrough mode, put the FPGA into sprite mode and execute commands to load, move, hide and show sprites. You can see the documentation for the commands here on github.

sprite_writer

This is the big one. sprite_writer is responsible for reading sprite definitions from the internal block RAM (BRAM), reading the graphics from the flash IC and writing them to the correct location in the SRAM frame buffer.

The outer loop of this component iterates through each of the 512 sprite definitions, acting on each one that has the visible bit set to true. For each visible sprite there are then two inner loops that handle the X and Y repetition counters that allow a sprite to be output into a grid pattern on the display.

Inside the X/Y repetition counters there is the main loop that reads data from the flash and writes it to the SRAM. This is the timing-critical part as there are only 4 cycles (40ns) available to completely process each pixel. The loop operates a sort of pipeline where each iteration writes out the previously read pixel to the SRAM while simultaneously reading the next pixel from the flash. Pixel transparency is handled here, and so is the internal logic that allows sprites to be partially visible.

sprite_writer instantiates a couple of internal components for its own use. Those with a conventional programming background may find it surprising that the things they take for granted such as addition, subtraction and even counting do not necessarily come for free with an FPGA. If you want to add two numbers then you’ll need to implement an adder. Want to multiply? That’ll cost you a multiplier. In these days of virtual machines and interpreted languages it’s refreshing to know that little has changed at the fundamental level since I started a long time ago.

This has not gone un-noticed by the FPGA manufacturers who in many models offer hard implementations of adders and multipliers (sometimes touted as DSP primitives) distributed throughout the chip fabric. The adders that I instantiate are pipelined implementations that take less of a chunk out of my timing budget than those that xst infers if I just use the VHDL + operator.

The memorably named OFDDRSSE component is one of a family of OFDDR primitives that allow you to output a clock to an IOB (a package pin). You might think that you can just hook up an internal clock signal to an output, or maybe gate a clock with some internal logic and output that signal to an IOB. That would be naive because it would create a high level of skew between the output clock and your design. Clocks in an FPGA are treated like royalty and there’s always a correct way to do the common clock operations. Using an OFDDR primitive is the correct way to output a clock signal to a pin and I use it to create the 100MHz flash clock, with the CE clock-enable input used to switch the clock on and off.

frame_counter

In the initial design I explained how I was going to use even frames to write data from the SRAM to the LCD and odd frames to load up the SRAM from the flash. frame_counter monitors the LCD TE signal and each time it spots a rising edge it flips the bit that indicates odd or even frames.

TE is an asynchronous signal so a simple shift register is used to synchronise it: the current state is sampled on each clock and the two previous samples are compared to detect a rising edge reliably.

lcd_sender

lcd_sender is a utility component that outputs a 16-bit value to the LCD data bus, taking care of the interaction with the latch and the correct timing for the LCD WR strobe. I call it from mcu_interface when the design is in passthrough mode and I need to write out a value from the MCU to the LCD. It takes exactly 70ns to execute and has a ‘busy’ output signal and a ‘go’ input signal to allow synchronisation with its operation.

sprite_memory

sprite_memory is an instantiation of the Xilinx BRAM IP core. Block RAM on this FPGA is a true dual-port RAM with configurable data and address bus widths. I use it to store sprite definitions. Here’s the definition of a sprite record:

  -- subtypes for the various bit vectors
  
  subtype bram_data_t       is std_logic_vector(126 downto 0);
  subtype sprite_number_t   is std_logic_vector(8 downto 0);
  subtype flash_addr_t      is std_logic_vector(23 downto 0);
  subtype sram_pixel_addr_t is std_logic_vector(17 downto 0);
  subtype sram_byte_addr_t  is std_logic_vector(18 downto 0);
  subtype sram_data_t       is std_logic_vector(7 downto 0);
  subtype sprite_size_t     is std_logic_vector(17 downto 0);
  subtype sprite_width_t    is std_logic_vector(8 downto 0);
  subtype byte_width_t      is std_logic_vector(9 downto 0);
  subtype sprite_height_t   is std_logic_vector(9 downto 0);
  subtype pixel_t           is std_logic_vector(15 downto 0);
  subtype flash_io_bus_t    is std_logic_vector(3 downto 0);

  -- structure of a sprite in BRAM
  -- total size is 127 bits
  
  type sprite_record_t is record
    
    -- physical address in flash where the sprite starts (24 bits)
    flash_addr : flash_addr_t;
    
    -- pixel address in SRAM where we start writing out the sprite (18 bits)
    sram_addr : sram_pixel_addr_t;
    
    -- size in pixels of this sprite (18 bits)
    size : sprite_size_t;

    -- width of this sprite (9 bits)
    width : sprite_width_t;
    
    -- number of times to repeat in the X-direction (9 bits)
    repeat_x : sprite_width_t;

    -- number of times to repeat in the Y-direction (10 bits)
    repeat_y : sprite_height_t;

    -- visible (enabled) flag (1 bit)
    visible : std_logic;

    -- firstx is the offset of the first pixel to be displayed if the sprite is partially off the left
    firstx : sprite_width_t;

    -- lastx is the offset of the last pixel to be displayed if the sprite is partially off the right
    lastx : sprite_width_t;

    -- firsty is the offset of the first pixel to be displayed if the sprite is partially off the top
    firsty : sprite_height_t;

    -- lasty is the offset of the last pixel to be displayed if the sprite is partially off the bottom
    lasty : sprite_height_t;

  end record;

Since my record is 127 bits long I configure the BRAM to have a 127-bit data width. The number of addressable entries must of course be a power of two, and that means I can fit 512 sprite definitions into the BRAM on this FPGA.

frame_writer

frame_writer is the component responsible for doing all the work during the even frames when the FPGA is in sprite mode. It reads the rendered frame from SRAM and writes it out to the LCD. It operates a pipeline, reading out a pixel from SRAM and simultaneously writing the previously read pixel to the LCD during a core 70ns loop. There are 640×360 = 230,400 pixels on the display, which means that this whole operation takes exactly 16.128ms. The LCD is reading from its internal GRAM and writing to the physical display at a rate of one frame every 16.2ms so we come in just within the required timing.

frame_writer does impose a few small requirements on the MCU before sprite mode is engaged. The display window must have been set to the full screen, the write mode must have been set to auto-reset to the start of the display window and the last LCD command to have been sent must be the ‘write data’ command. With this prep done by the MCU the FPGA can just let rip with the continual flow of graphics data. My AseAccessMode class takes care of all this.

lcd_arbiter

My decision to support passthrough and sprite modes means that there are potentially two different parts of the design that want to write data to the LCD bus. mcu_interface will write data via the lcd_sender class in passthrough mode and frame_writer will want to write data when we’re in sprite mode.

It makes no sense to have multiple drivers attempting to connect to the same signal and the synthesis tool will flag it up as an error if you try. The answer is to have an arbitration process that inspects a state variable and connects up the output according to that state.

architecture behavioral of lcd_arbiter is

begin

  process(clk100) is
  begin
    
    if rising_edge(clk100) then
    
      if mode = mode_passthrough then
        lcd_db <= lcd_sender_db;
        lcd_wr <= lcd_sender_wr;
        lcd_ale <= lcd_sender_ale;
        lcd_rs <= lcd_sender_rs;
      else
        lcd_db <= frame_writer_db;
        lcd_wr <= frame_writer_wr;
        lcd_ale <= frame_writer_ale;
        lcd_rs <= '1';
      end if;

    end if;

  end process;

end behavioral;

As you can see it’s a really simple job to do the arbitration.

reset_conditioner

Resets, like clocks, have a special place in the heart of the FPGA designer and everyone’s got an opinion on how best to implement a reset. The current thinking, which I tend to agree with, is that reset should be a synchronous signal and that it should only be an input to components that actually need it. Don’t waste space and unnecessarily increase the signal’s fanout by hooking it into a component that doesn’t need to be reset.

Reset is a drastic operation that you don’t want to happen by accident so reset_conditioner implements a slightly longer and more rigorous shift register to ensure that the asynchronous signal from the MCU has been correctly asserted before supplying its own synchronous conditioned output that gets routed to all the components that have something to do upon reset.
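As a behavioural illustration only (this is a C++ model of the idea, not the VHDL itself, and the 4-sample depth is my own choice), the conditioner can be thought of as a shift register that asserts its synchronous output only after several consecutive high samples of the asynchronous input:

```cpp
#include <cassert>
#include <cstdint>

// Behavioural model of a reset conditioner: the asynchronous reset
// input is sampled on each clock edge and shifted into a register; the
// synchronous output asserts only once several consecutive samples
// have been high, so a single glitch cannot trigger a reset.
class ResetConditioner {
public:
  // Call once per rising clock edge with the sampled async input.
  // Returns the conditioned synchronous reset.
  bool clock(bool asyncReset) {
    shift_ = static_cast<uint8_t>((shift_ << 1) | (asyncReset ? 1 : 0));
    return (shift_ & 0x0F) == 0x0F;  // 4 consecutive high samples
  }

private:
  uint8_t shift_ = 0;
};
```

A one-cycle glitch on the input shifts straight through without ever asserting the output; only a deliberately held reset makes it out the other side.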

clock_generator

Earlier FPGAs from Xilinx always had a PLL on board that you could use to multiply up a clock input to give you a higher frequency for operating the synchronous parts of your design. Xilinx have significantly improved that facility and now they provide multiple Digital Clock Manager (DCM) primitives. The DCMs are highly flexible clock conditioning and synthesis primitives. You can perform all kinds of phase adjustment, clock doubling, multiply/divide synthesis all with guaranteed low skew synchronised outputs.

The above diagram is taken from the Xilinx datasheet and shows the structure of a DCM. My design runs at an internal frequency of 100MHz so I use the CLKFX clock synthesis facility to multiply and divide the 40MHz input to get that 100MHz target.

  inst_clock_generator : clock_generator port map(
    clkin_in        => clk40,
    clkfx_out       => clk100,
    clkfx180_out    => clk100_inv,
    clkin_ibufg_out => open,
    clk0_out        => open
  );

Not so obvious is that I need to use the CLKFX180 output to receive a 100MHz signal phase-shifted by 180°. This signal is required as an input to the OFDDRSSE component that reconstructs the 100MHz clock for output to the flash IC. I’m guessing that it’s used so that the internal logic can just trigger on rising clock edges.

FPGA resource utilisation

A successful FPGA design must meet its area and timing constraints. Meeting the area constraint simply means that all your logic has to fit in your chosen device. If it doesn’t then there are tricks and optimisations that you can apply but if they don’t work then your only option might be to step up to the next larger device in the range and that can be expensive. Here’s my area utilisation results:

Device utilization summary:
---------------------------

Selected Device : 3s50vq100-5 

 Number of Slices:               795  out of    768   103% (*) 
 Number of Slice Flip Flops:     875  out of   1536    56%  
 Number of 4 input LUTs:         1406  out of   1536    91%  
    Number used as logic:       1326
    Number used as RAMs:          80
 Number of IOs:                   61
 Number of bonded IOBs:           61  out of     63    96%  
 Number of BRAMs:                  4  out of      4   100%  
 Number of GCLKs:                  3  out of      8    37%  
 Number of DCMs:                   1  out of      2    50%  

I like to get value for money out of my kit so a healthy 103% usage is a good result. But wait, didn’t I say that you couldn’t over-utilise? Yes I did but these stats are just an estimate from xst, the synthesis tool. The important tool, map, is the one that fits the compiled design to the device and tries to optimise it. I use map with the ‘try really hard please’ flag set and get these results:

Design Summary
--------------

Design Summary:
Number of errors:      0
Number of warnings:   14
Logic Utilization:
  Number of Slice Flip Flops:           913 out of   1,536   59%
  Number of 4 input LUTs:             1,375 out of   1,536   89%
Logic Distribution:
  Number of occupied Slices:            760 out of     768   98%
    Number of Slices containing only related logic:     760 out of     760 100%
    Number of Slices containing unrelated logic:          0 out of     760   0%
      *See NOTES below for an explanation of the effects of unrelated logic.
  Total Number of 4 input LUTs:       1,449 out of   1,536   94%
    Number used as logic:             1,286
    Number used as a route-thru:         74
    Number used for Dual Port RAMs:      80
      (Two LUTs used per Dual Port RAM)
    Number used as Shift registers:       9

  The Slice Logic Distribution report is not meaningful if the design is
  over-mapped for a non-slice resource or if Placement fails.

  Number of bonded IOBs:                 61 out of      63   96%
    IOB Flip Flops:                       2
  Number of RAMB16s:                      4 out of       4  100%
  Number of BUFGMUXs:                     3 out of       8   37%
  Number of DCMs:                         1 out of       2   50%

Average Fanout of Non-Clock Nets:                3.23

Now that’s much better and gives a far clearer insight into the actual device resource utilisation.

Meeting timing generally means that your worst case signal delay must be shorter than the interval between your clock edges. Signal delays are made up of the time taken to execute your combinatorial logic plus the routing delays involved in pushing electrons around the die. Meeting timing can be a black art, with seemingly irrelevant changes taking whole megahertz out of your timing results. Once timing is met, though, there is zero point in doing any more work on it because your design will not function any differently.

The Xilinx tools report your worst-case timing results in the post-place and route static timing results. My target is 100MHz and here’s the results:

Design statistics:
   Minimum period:   9.516ns{1}   (Maximum frequency: 105.086MHz)

That’s a healthy margin and like I said before it’s pointless trying to improve it because the design will execute exactly the same.

Sample applications

The first sample application is a test that ensures we can use the LCD in passthrough mode. To do this I’ll use the stm32plus graphics library to display some test colours. The stm32plus graphics subsystem is built using a tiered approach that separates the responsibility for the high-level drawing algorithms from the LCD driver which is itself separate from the method used to access the driver.

Up until now I’ve provided access modes that work either by using the STM32’s FSMC peripheral or by using GPIO pins to drive the LCD. To make this custom board work with all the existing stm32plus infrastructure all I had to do was write an access mode class that handles the work of writing to the 10-bit bus that I designed. I called it AseAccessMode where Ase stands for Andy’s Sprite Engine.

Predictable timings are essential for the access mode to function reliably, particularly the setup and hold times for the WR signal. The FPGA requires 4 cycles, or 40ns, from the rising edge of WR before it is ready to receive the next rising edge. The following assembly language is used by AseAccessMode to write a command to the FPGA.

inline void AseAccessMode::writeFpgaCommand(uint16_t value) const {

  // 20ns low, 20ns high = 25MHz max toggle rate

  __asm volatile(
    " str  %[value_low],  [%[data]]   \n\t"     // port <= value (WR = 0)
    " dsb                             \n\t"     // synchronise data
    " str  %[value_low],  [%[data]]   \n\t"     // port <= value (WR = 0)
    " dsb                             \n\t"     // synchronise data
    " str  %[value_low],  [%[data]]   \n\t"     // port <= value (WR = 0)
    " dsb                             \n\t"     // synchronise data
    " str  %[value_high],  [%[data]]  \n\t"     // port <= value (WR = 1)
    " dsb                             \n\t"     // synchronise data
    " str  %[value_high],  [%[data]]  \n\t"     // port <= value (WR = 1)
    " dsb                             \n\t"     // synchronise data
    " str  %[value_high],  [%[data]]  \n\t"     // port <= value (WR = 1)
    " dsb                             \n\t"     // synchronise data

    :: [value_low]  "l" (value),                // input value (WR = 0)
       [value_high] "l" (value | 0x400),        // input value (WR = 1)
       [data]       "l" (_busOutputRegister)    // the bus
  );
}

The dsb (Data Synchronisation Barrier) instructions are important to get predictable timings. Without them the powerful F4 MCU core will optimise its execution pipeline and give you results that don’t tally with the raw instruction timings published in the ARM reference manual.

I’ve designed passthrough mode to require just two transfers to send either a complete 16-bit data or command value to the LCD or to ‘escape’ into sprite mode.

The first transfer sends either the first 8 bits of the 16-bit LCD data value or, if the high bit is set it will immediately escape into sprite mode and the second transfer never happens.

The second transfer sends the top 8 bits of the 16-bit LCD data value and, in the high bit, the value of the LCD RS (register select) line.
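The split can be sketched in C++ like this. Note that the exact bit positions here are my assumption for illustration (8 data bits plus a flag in bit 8); the authoritative layout is in the AseAccessMode source on github:

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the two-transfer passthrough protocol described above.
// Assumed layout: each transfer carries 8 data bits plus a flag in
// bit 8 (the 'high bit').
constexpr unsigned kHighBit = 1u << 8;

// First transfer: the low byte of the 16-bit LCD value. If instead the
// high bit is set, the FPGA escapes into sprite mode and no second
// transfer follows.
constexpr unsigned firstTransfer(uint16_t value) {
  return value & 0xFFu;
}
constexpr unsigned spriteModeEscape() {
  return kHighBit;
}

// Second transfer: the high byte of the value, with the LCD RS
// (register select) line carried in the high bit.
constexpr unsigned secondTransfer(uint16_t value, bool rs) {
  return ((value >> 8) & 0xFFu) | (rs ? kHighBit : 0u);
}
```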

You can see the source code to the passthrough test here on github. I must say I was very pleased when this test worked because it was the first time that I’d seen the LCD fire up and display data whilst under the control of the FPGA, even though that control is heavily marshalled by the MCU in this passthrough mode.

Manic Knights

Right back at the beginning of this article I did promise you a game demo and I’m here now to make good on that promise. I’m going to put together a demo with some commercial-quality graphics that shows how a platform game could be implemented using this system. The game will feature animated sprites that follow their paths in a non-linear fashion using easing functions that make use of the hardware FPU on the F4 to accelerate and decelerate. The game will be able to scroll the visible window in all four directions to allow the player to explore a world that’s considerably larger than the display.

Tile-based map

The game world is divided into an array of 20×30 tiles. Each tile is 64 pixels square. I used a free program called Tiled to create the map using a set of graphics that I bought from cartoonsmart.com. Free graphics are available but the quality isn’t so great so I thought I’d spend a few dollars on something of commercial quality.

The Tiled program allows you to quickly draw your world and then it’ll save out an XML representation that you can parse into whatever format you want. My main issue is that this is a game that operates in landscape mode but the sprite engine runs in portrait mode — it must do that to stay in sync with the panel refresh ‘beam’ which always runs vertically regardless of the logical display orientation.

To solve this I wrote a small C# utility to export the tiles to PNG format and rotate them 90° counter-clockwise on the fly. That, combined with some Perl glue, got the Tiled output into a form that I could easily upload to the flash IC.
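The coordinate mapping for that rotation is simple enough to state (this is my own formulation of the standard transform, not the C# utility’s code): rotating a W×H image 90° counter-clockwise produces an H×W image in which source pixel (x, y) lands at (y, W−1−x).

```cpp
#include <cassert>

// 90-degree counter-clockwise rotation of a W x H image: the result is
// H x W, and source pixel (x, y) lands at (y, W-1-x). For example the
// top-right corner of the source becomes the top-left of the result.
struct Point { int x, y; };

constexpr Point rotateCCW(Point p, int srcWidth) {
  return Point{ p.y, srcWidth - 1 - p.x };
}
```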

[Image: the full world design — click for larger (much larger)]

The above image shows the full world design, rotated back to landscape format for easy viewing here. This world will form the background to the game. In the game implementation I set aside a block of sprite ‘slots’ at the start of the array that are reserved for the background. As the player moves around the world these reserved slots get updated so that they always hold the correct grid for the background at that point. Because these sprites are at the start of the array they will always be behind sprites that are subsequently drawn into the world. Speaking of which…
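The background slot update boils down to working out which 64-pixel tile columns and rows the visible window currently covers. A sketch of that index arithmetic (the names and viewport handling here are my own, not the demo’s actual code):

```cpp
#include <cassert>

// Given a scroll offset in pixels, compute the inclusive range of
// 64-pixel tile indices that cover a viewport along one axis. The
// reserved background sprite slots are then refreshed from this range
// as the player moves. Sketch only; names are illustrative.
constexpr int kTileSize = 64;

struct TileRange { int first, last; };  // inclusive tile indices

constexpr TileRange visibleTiles(int scrollPx, int viewportPx) {
  return TileRange{
    scrollPx / kTileSize,
    (scrollPx + viewportPx - 1) / kTileSize
  };
}
```

As soon as the scroll offset is not tile-aligned the range covers one extra tile, which is why the reserved slot block has to be sized for the worst case rather than the aligned case.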

Baddies

All games need some baddies for our hero to avoid as he navigates throughout the world. In true platform tradition I’ve implemented enemies that walk back and forth along the platforms, the idea being that the hero times his jumps so he avoids them.

The image shows the first six frames of a twelve frame animation sequence for one of the enemy characters. The (255,0,255) pink background is hardcoded into the FPGA to be interpreted as a transparent colour — whatever pixels were previously drawn at this position will show through.

In my game I use an Actor class to manage the transition of a character along a series of paths. The character is ‘eased’ along the path using an easing function. I can choose from functions that appear to accelerate and decelerate at various rates or perhaps do a bouncing effect.

These mathematical functions are all supported in the stm32plus fx namespace. The key to getting them to work in reasonable time — the 16.2ms that I’ve got between frames — is the hardware FPU built into the F4 MCU. Multiplication and addition are single cycle operations on the built-in floating point registers, as are conversions back and forth between the FP and integer registers.
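For flavour, here is a generic quadratic ease-in-out in the same spirit as those functions (a standard formulation, not the stm32plus fx API itself): t is the elapsed time, d the duration, and the result sweeps 0 to 1 with acceleration then deceleration.

```cpp
#include <cassert>
#include <cmath>

// Quadratic ease-in-out: accelerate over the first half of the
// duration, decelerate over the second. On the F4 the multiplies and
// adds here are single-cycle FPU operations.
float easeInOutQuad(float t, float d) {
  t /= d / 2.0f;
  if (t < 1.0f)
    return 0.5f * t * t;                      // accelerating half
  t -= 1.0f;
  return -0.5f * (t * (t - 2.0f) - 1.0f);     // decelerating half
}
```

An Actor then multiplies the eased 0..1 value by the path length each frame to get its current position.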

Of course animation isn’t just for the bad guys. Any platform game worth its salt has an array of features such as lifts and static but animated decorations such as lights. I’ve implemented some of these to show how it could be done.

[Image: animated lifts and decorations — click for larger (much larger)]

In my demo implementation I completely animate the world but I don’t provide a hero for you to guide, because the logic to implement the basic physics and collision detection would take longer than the time I have available. Instead I allow you to browse the world using up/down/left/right controls connected to buttons or a joystick.

The demo logic that includes updating the world position, animating all the sprites and uploading everything to the FPGA has a hard limit of 32ms in which to execute, of which a variable portion is a safe window for uploading new sprites to the FPGA. In debug mode all my demo logic (which is not optimised) takes only 1ms which is very quick and would leave ample time to add the additional logic required to implement a main character in the game. The MCU resource usage is shown below (-Os optimisation):

   text    data     bss     dec     hex filename
  86500    2128    1116   89744   15e90 manic_knights.elf

Video or it didn’t happen…

I’ve uploaded a short demo to youtube that shows the game demo in action. Click below to see it.

Power consumption

The bench power supply that I use shows me the current that it’s supplying on its front panel. Let’s take a look at some figures taken at different points during the game demo.

The current consumption is an overall figure for the board including the FPGA, MCU, flash, LCD and SRAM.

Passthrough mode, backlight off: 210mA
Passthrough mode, backlight 90%: 300mA
Sprite mode, no activity: 310mA
Sprite mode, game running: 360mA


Signal integrity

Signal integrity was always going to be an issue with a 2-layer board featuring not just the complex and demanding FPGA but a high-end Cortex M4 MCU as well. I did some signal sampling with my bench oscilloscope to peek under the hood and see just how ragged things really are. Firstly, here’s the output from the 40MHz oscillator with the FPGA programmed and the game demo running.

Not too bad a signal from the oscillator, there’s some bounce at the top and the bottom of the edges but not enough to cause a problem and there’s no sign of any glitches. Now let’s take a look at the WR signal from the MCU to the FPGA because it’s running at a speed that my oscilloscope can hit with enough samples to reconstruct a decent picture of the signal.

A different picture emerges here. There’s a spike at the bottom which I assume must be ground bounce and there’s evidence of ringing after the rising edge. All together though it’s not enough to cause a false edge to be detected.

The design overall is completely reliable for all the time that I’ve had it running but I do believe that I’m getting away with it due to the large safety margin between the high and low 3.3V LVCMOS signalling thresholds. If the design had to run at the increasingly common 2.5V or 1.8V level then that safety margin would be eroded possibly to the point where I’d see glitching.

A very promising feature implemented in the more modern Xilinx FPGAs is Digitally Controlled Impedance (DCI). DCI allows the FPGA to automatically apply a termination resistance that matches the impedance of the trace that it’s connected to. I would certainly enable this feature if it were available in the device I was using.

Lessons learned

In a large project like this there are always areas that could be improved, even though in my testing I found the board to be completely reliable when powered with an external power supply in the 4.6V to 5.0V range. Here’s what I think could be done to improve the overall system.

  • The heat sinking around the AMS1117 3.3V SOT-223 package isn’t good enough. I should increase the size of the pad that the thermal tab is soldered to and mirror the pad on the opposite side of the board, connecting them together with a grid of vias.
  • There’s no audio. This was intentional for the first phase of the design. Now I know that the design works I could add a few DACs and a headphone amplifier to provide audio capability.
  • Signal integrity. This was always going to be a challenge with a 2-layer board and I plan to publish a separate article that shows my findings regarding the shape and quality of the signals at various points of the board. There are definitely changes that can be made to the board layout that would optimise the return current paths and improve the signal integrity.
  • The STM32F429 turned out to be just the overkill that I thought it would be and it’s the most expensive part on this board. My guess is that the sweet spot would be the new 84MHz F401 device that retains the important SDIO peripheral and FPU core while running plenty fast enough to execute game logic and costing half the price of the F429.

Final words

It’s taken a few months-worth of evening and weekend hacking to pull all of this together into a coherent and working design but it’s been a success so I certainly think it was worth it.

I’m always happy to hear your comments and suggestions. You can leave a comment at the bottom of this article or if you want to start a discussion then please feel free to drop by the forum and let me know your thoughts.

If you’re looking for the source code and you missed all the links in the article body then click here to go to github for the MCU/FPGA source or click here to go to my downloads page to get the PCB gerbers for the board itself.


ST-Link v2. One programmer for all STM32 devices


Over the last few years I’ve amassed quite a collection of STM32 development boards. Third party boards dominate my collection for the F1 series whilst I have official ST discovery boards for the F0, F4 and F1 Value Line. We’ve been lucky with the official ST discovery boards because they all come with an ST-Link included on the PCB so you don’t need to buy anything else at all to get a complete C++ development and visual debugging environment up and running.

The embedded ST-Link debugger on the discovery boards is implemented inside ST’s own STM32F103C8T6 MCU in a 48-pin QFP package with an external 8MHz clock. I suppose that when you are the manufacturer of these MCUs it’s cheaper to do it this way than to manufacture a custom ST-Link IC just for this purpose.

The situation with the commonly available third party F1 boards was always less clear because up until a year or so ago the ST-Link interface was not fully operational in the popular and free OpenOCD debugger. Because of the lack of support in OpenOCD for ST-Link v2 I was forced to go down the third party route and use the Olimex ARM-USB-TINY-H for all my F1 programming and debugging.

This is a JTAG-based programmer that is compatible with ARM devices from many manufacturers. It’s fast, reliable and it costs double what you should be paying for an ST-Link v2.

Times have changed since those early days and now since the release of version 0.7.0 of OpenOCD the support for ST-Link is completely stable and there’s no reason why you can’t use ST-Link v2 for all your STM32 programming and debugging needs.

Not only is it the most compatible of all the programmers and debuggers, it’s also probably the cheapest. At the time of writing it’s only £18.68 plus VAT at Farnell. If you’re buying elsewhere then make sure that you’re getting the ‘v2’ device. There are still some places offering the older ‘v1’ version.

In the rest of this article I’ll take each board that I’ve got and explain how to connect and use it with OpenOCD using the ST-Link v2 programmer. Once you’ve got a live OpenOCD connection you can flash your .hex binaries and do interactive debugging using Eclipse.

The version of OpenOCD that I’ll be using is 0.8.0 and my test system will be Windows 7 x64 using Cygwin. The OpenOCD binaries were downloaded from Freddie Chopin’s site.

STM32F103VET6 ‘Mini’ board

This was the first F1 development board that I bought some years ago and it’s still available in various forms on ebay. Connectivity with the ST-Link device is via a direct connection to the 20-pin JTAG/SWD header using the supplied cable. Since the ST-Link connection is not designed to supply power to the target board you must also connect up the USB A-B cable.

Here’s the command sequence for connecting with OpenOCD:

$ pwd
/cygdrive/p/docs/cyghome/andy/openocd-0.8.0
$ bin-x64/openocd-x64-0.8.0.exe -f scripts/interface/stlink-v2.cfg -f scripts/target/stm32f1x_stlink.cfg 
Open On-Chip Debugger 0.8.0 (2014-04-28-08:42)
Licensed under GNU GPL v2
For bug reports, read

http://openocd.sourceforge.net/doc/doxygen/bugs.html

Info : This adapter doesn't support configurable speed
Info : STLINK v2 JTAG v17 API v2 SWIM v4 VID 0x0483 PID 0x3748
Info : using stlink api v2
Info : Target voltage: 3.225049
Info : stm32f1x.cpu: hardware has 6 breakpoints, 4 watchpoints

‘Redbull’ STM32F103ZET6 board

This is my favourite F1 development board. It’s based around the full fat STM32F103ZET6 144-pin MCU and comes with additional SRAM and flash resources as well as the usual buttons and LEDs. On my board the additional SRAM and flash ICs are the ISSI IS61LV256160AL-10TL, the SST 39VF1601 and the Samsung K9F1G08U0C. These are correctly mapped to the MCU’s FSMC peripheral as you’d expect. My only minor gripe with the board is that there aren’t enough exposed GND and 3.3V pins for hassle-free connection of external peripherals.

Connecting the board with ST-Link is identical to the ‘Mini’ board described above. Simply connect it up to the 20-pin JTAG header and run the same command sequence:

$ pwd
/cygdrive/p/docs/cyghome/andy/openocd-0.8.0
$ bin-x64/openocd-x64-0.8.0.exe -f scripts/interface/stlink-v2.cfg -f scripts/target/stm32f1x_stlink.cfg 
Open On-Chip Debugger 0.8.0 (2014-04-28-08:42)
Licensed under GNU GPL v2
For bug reports, read

http://openocd.sourceforge.net/doc/doxygen/bugs.html

Info : This adapter doesn't support configurable speed
Info : STLINK v2 JTAG v17 API v2 SWIM v4 VID 0x0483 PID 0x3748
Info : using stlink api v2
Info : Target voltage: 3.272160
Info : stm32f1x.cpu: hardware has 6 breakpoints, 4 watchpoints

PowerMCU STM32F407ZGT6 board

Making an entry into the F4 development board business can’t be easy when ST sell the discovery board for such a low price. Therefore any competitor is going to have to offer significant extras to have any hope of selling their board.

This development board offers several nice upgrades to ST’s discovery offering. Firstly the MCU is the 144-pin device which means that all banks of the FSMC peripheral are available and this board builds on that by including a Samsung K9F1G08U0C NAND flash chip on the front side of the board.

Note that the external clock is 25MHz which means that a straight recompile of firmware that targets the discovery board will not be enough. You will need to go into the startup code and set the appropriate PLL multipliers to get the clock tree set up correctly. ST provide an Excel spreadsheet with macros that will do this for you – search for AN3988 to get it.
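The PLL arithmetic itself is straightforward. As an illustration (these particular divider values are a common choice for a 25MHz HSE on the F407, not necessarily what the AN3988 spreadsheet emits): the VCO input is HSE/PLLM and must land in the 1–2MHz range, the VCO output is input×PLLN, and SYSCLK is VCO/PLLP.

```cpp
#include <cassert>

// PLL sums for an STM32F4 with a 25MHz external crystal instead of the
// discovery board's 8MHz. Illustrative divider values:
//   VCO input = HSE / PLLM   (must be 1-2MHz)
//   VCO       = input * PLLN
//   SYSCLK    = VCO / PLLP
constexpr int kHseMHz = 25;
constexpr int kPllM = 25, kPllN = 336, kPllP = 2;

constexpr int sysclkMHz() {
  return (kHseMHz / kPllM) * kPllN / kPllP;  // 25/25=1, *336=336, /2=168
}

static_assert(sysclkMHz() == 168, "target SYSCLK for the F407");
```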

I can’t finish up with the front side of this board without having a moan about the JTAG/SWD header. It’s a reduced-size 2mm pitch socket that requires an adaptor to connect to the standard 2.54mm JTAG header. I can’t seriously believe that saving a few millimetres of board space was cheaper than shipping a cable adaptor with every board, which is what they do. Barmy decision.

I need to show the back of the board because there are a few significant components down there. The large IC is a Cypress CY62157EV30L SRAM device, correctly connected to the MCU’s FSMC peripheral. The empty footprint looks like it was originally designed to hold a NOR flash IC but is unpopulated on my board.

I’m pleased to see the linear regulator is an AMS1117 3.3V device. This is a much more heavy-duty regulator than the one on the discovery board and will allow you to connect more demanding peripherals than you can attach to the discovery board.

And so on to the OpenOCD connectivity. Hook up your standard JTAG cable to the board via the adaptor supplied with the board and here’s how to attach to it:

$ pwd
/cygdrive/p/docs/cyghome/andy/openocd-0.8.0
$ bin-x64/openocd-x64-0.8.0.exe -f scripts/interface/stlink-v2.cfg -f scripts/target/stm32f4x_stlink.cfg 
Open On-Chip Debugger 0.8.0 (2014-04-28-08:42)
Licensed under GNU GPL v2
For bug reports, read

http://openocd.sourceforge.net/doc/doxygen/bugs.html

Info : This adapter doesn't support configurable speed
Info : STLINK v2 JTAG v17 API v2 SWIM v4 VID 0x0483 PID 0x3748
Info : using stlink api v2
Info : Target voltage: 3.217755
Info : stm32f4x.cpu: hardware has 6 breakpoints, 4 watchpoints

PowerMCU STM32F207ZGT6 board

This entry into the F2 development board market is from PowerMCU.com and it’s pretty much identical to the F4 board that I described above, including the external 25MHz crystal that will require some attention in your code if you’re already working with something that assumes an 8MHz crystal.

All the additional features on this board are identical to those on their F4 offering so I won’t repeat them here.

The back side of the board yields no surprises having already seen the F4 board. Let’s move quickly on to the OpenOCD commands to connect to it:

/cygdrive/p/docs/cyghome/andy/openocd-0.8.0
$ bin-x64/openocd-x64-0.8.0.exe -f scripts/interface/stlink-v2.cfg -f scripts/target/stm32f2x_stlink.cfg 
Open On-Chip Debugger 0.8.0 (2014-04-28-08:42)
Licensed under GNU GPL v2
For bug reports, read

http://openocd.sourceforge.net/doc/doxygen/bugs.html

Info : This adapter doesn't support configurable speed
Info : STLINK v2 JTAG v17 API v2 SWIM v4 VID 0x0483 PID 0x3748
Info : using stlink api v2
Info : Target voltage: 3.238120
Info : stm32f2x.cpu: hardware has 6 breakpoints, 4 watchpoints

Cheap ST-Link clones

Recently there have been a number of bare boards appearing on ebay for as little as £5.00 that claim to function as ST-Link v2 devices. They’re so cheap that I thought I’d pick one up and see what the story is.

My first observation is that I can’t see how these things are legal at all. Apart from the obvious unauthorised use of the USB VID and PID that belong to ST Microelectronics there is the question of the firmware implementation itself. If you look closely at an official ST discovery board then you’ll see that ST-Link is implemented in firmware inside an STM32F103 device. The exact same model of STM32F103 appears on this clone. I can only surmise that the manufacturer has somehow managed to circumvent ST’s code readout protection (assuming that ST remembered to enable that protection) and cloned the firmware byte for byte.

Does it work though? Let’s try it out and see. I’m going to refer to it as the FakeLink from here on just so you know that’s the one I’m using. I hooked it up to the ‘RedBull’ F1 board using jumper wires connected from the FakeLink to the JTAG socket using the following pinout. GND -> GND(4), 3V3 -> board 3V3, CLK -> SWCLK(9) and IO -> SWDIO(7).

For the tests I connected just the FakeLink USB connector to the computer. It seems that the FakeLink can power the dev board from its 3.3V output. I also tested it with the dev board receiving power from its USB connector and the results were the same.

Now let’s pretend it’s a real ST-Link and connect to it via OpenOCD.

$ bin-x64/openocd-x64-0.8.0.exe -f scripts/interface/stlink-v2.cfg -f scripts/target/stm32f1x_stlink.cfg 
Open On-Chip Debugger 0.8.0 (2014-04-28-08:42)
Licensed under GNU GPL v2
For bug reports, read

http://openocd.sourceforge.net/doc/doxygen/bugs.html

Info : This adapter doesn't support configurable speed
Info : STLINK v2 JTAG v17 API v2 SWIM v4 VID 0x0483 PID 0x3748
Info : using stlink api v2
Info : Target voltage: 3.560727
Info : stm32f1x.cpu: hardware has 6 breakpoints, 4 watchpoints

So far so good though the target voltage is higher than I would have expected. Next I’ll try the basic functionality and flash a ‘blink’ example to the MCU using telnet to control the connected OpenOCD server.

$ telnet localhost 4444
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Open On-Chip Debugger
> reset init
target state: halted
target halted due to debug-request, current mode: Thread 
xPSR: 0x01000000 pc: 0x08000238 msp: 0x2000fffc
> flash write_image erase p:/tmp/blink.hex
auto erase enabled
target state: halted
target halted due to breakpoint, current mode: Thread 
xPSR: 0x61000000 pc: 0x2000003a msp: 0x2000fffc
wrote 8192 bytes from file p:/tmp/blink.hex in 0.771045s (10.376 KiB/s)
> reset

The first command, reset init resets the MCU and brings it under the control of OpenOCD. The next command, flash write_image erase p:/tmp/blink.hex writes the compiled blink program to the MCU and finally we reset it to run the program with reset.

It worked as expected. So far so good for the FakeLink. Now for the only test that really matters, can I debug the program from Eclipse just like I can with the real ST-Link? People often write to me asking for the Eclipse debug settings that I use, so here they are:



Executing the debug configuration resulted in everything working just as I would expect it to. Eclipse was able to auto-flash the compiled executable and then set breakpoints and single step through the code. Basically it just worked.

Verdict on the fake ST-Link

Functionally the device passed every test that I executed so in that respect I have to give it a plus mark. However, I just can’t condone the blatant illegality of the thing. It is an unashamed rip-off of ST’s intellectual property, right down to the USB VID/PID owned by ST. In my view, especially as the official ST-Link is not expensive, the right thing to do is to buy the ST device and not support these clones.

Hacking the HP Z800 Xeon motherboard into a standard case


About four years ago now the company I work for were investing in some new servers for a project that we were working on and what turned up were quad-socket LGA1366 Xeon machines with support for up to 192Gb of memory. In most cases two sockets were populated with Intel Xeon X5670 CPUs, hex-core devices with 12Mb of cache memory. We ran Redhat Enterprise Linux on them and they were, and still are, extremely fast linux servers that could operate as physical boxes in our production environment or as virtuals in development.

I wanted one. I still want one. I looked around and noticed that HP were doing a very similar board with two sockets and, crucially, it was packaged up into what looked like a normal PC tower case. And it was very expensive, much too expensive to justify forking out for one.

Fast forward four years and times have changed. You can now pick up a brand new replacement motherboard for an HP Z800 on ebay for £100. So that’s what I did, and here it is.




click for larger

Excitement quickly turned into a daunting realisation that I may have bitten off more than I could chew. The board is massive. It will not fit into any ‘normal’ PC case, not even an EATX tower case. The mounting screws will not mate with any of the ATX holes in a motherboard tray. The large ATX power connector is non-standard. The CPU fan headers are non-standard. There’s a separate power connector for the memory banks with a proprietary connector. The list goes on…

Clearly this is a server motherboard adapted only slightly to fit into HP’s proprietary case with HP’s proprietary power supply and cooling system. Buyers of the Z800 certainly received their money’s worth compared to an anonymous box filled with generic parts.

I’m not one to give up in the face of a technical challenge and, besides, I’d just forked out a hundred notes on the board. So the rest of this article will go through, in detail, all the steps that you would have to take to get one of these beasts up and running yourself. There’s no cheaper way to get 12 cores of Xeon power under your desk.

The BIOS and CPU compatibility

The Z800 board comes in three different revisions, indicated by the AS# number printed on the white sticker located directly below the big black chipset heatsink.

The revisions are indicated by 001, 002 or 003. As you can see from the image this board is an 002 revision. The executive summary of what I’m about to explain is that if you have revision 001 or 002 then you can officially use only Xeon X55xx CPUs. If you have the later 003 revision then you can use either X55xx or X56xx CPUs.

The issue is the BIOS bootblock. It’s physically write-protected within the BIOS and does not get upgraded when you flash the BIOS. So if you have one of the earlier revisions and you flash your BIOS then the X56xx CPUs will be recognised but the bootblock may fail to get that far and you could be presented with a black screen and a POST failure when you power up.

The key is the bootblock date shown in the BIOS System Information screen. A date of 11/10/09 is required to support X56xx CPUs.

Note how I’ve used the word officially a few times back there because the fact is that it might work. Word on the HP forums is that the behaviour of an earlier revision flashed up to the latest BIOS and used with X56xx CPUs ranges from ‘works for me’ to ‘sometimes won’t boot’ to ‘total failure to POST’. The worst issues seem to be with dual-CPU configurations. Later in this article I’ll go through my own setup and experiences.

The case

We need to start with a case for this thing and like I said, even the largest ‘normal’ tower case will be too small. You need an HPTX format case. There aren’t many of these and the one I chose was the Nanoxia Deep Silence 6.

I got it from Quiet PC in the UK and managed to grab a B-grade bargain at £139, that’s a £50 reduction on the full price and I couldn’t tell what made it B-grade because it looks perfect to me. Maybe I have lower standards than most.

I can confirm what the online reviews say when they describe this case as being massive. It is indeed, truly huge. I expect that if it were hollowed out then I could fit my current Fractal Design tower case inside it. It swallows the Z800 motherboard as if it were designed for it. Result.

Modifying the case to hold the motherboard

I mentioned before that the motherboard mounting holes do not match up with the corresponding holes in the case’s motherboard tray. I had to drill and tap new holes for enough screws to hold the board safely with its heavy load of up to two CPUs with large heatsink/fans attached.

This is a fiddly process. Before starting I fitted a small random PCI card into the motherboard and used it to work out exactly where the board needed to be so that the cards lined up with their fixing holes on the side of the case. I used a 2.8mm drill and then tapped the holes out to the imperial 6-32 size used by motherboard mounting posts.

The hardest part is offering up the motherboard to the tray and accurately marking where to drill. You have to be very accurate or the posts won’t line up with the holes and you only get one chance. A slow, controllable drill that won’t skid is essential and it needs to get into some confined spaces.

Eventually I got all the holes drilled that I thought I could get away with and the board is held securely clear of the base of the case.

This is by far the hardest part of the job and even with all my careful measurement and drilling my expansion cards are slightly bent in their slots due to a couple of millimetres of misalignment. Oh well, at least they’re held tight! If I were to do this again I would probably drill wider diameter holes in the motherboard tray to allow a small amount of adjustment. I’d then use wide washers on either side of the tray, a standoff above the upper washer, a screw through everything and a locking nut at the back of the tray.

The power supply

The original Z800 comes with a power supply engineered by HP to fit the genuine HP case. It is, of course, totally incompatible with a standard PC case so I needed a standard PC power supply that met the requirements of the board.

The HP power supply distributes the main 12V output across 8 different rails, each with a maximum delivery of 18A but with a combined output ceiling of 70A for the 850W unit and 92.5A for the 1110W option. I have no way of knowing how much will actually be drawn by each rail so it’s safest for me to buy a single rail unit with a nice high overall amperage.

I plugged my prospective peripherals into an online PSU amperage calculator and it came out with a recommended 750W supply for dual 130W TDP processors, four 15K SCSI drives, an SSD and an ATI 7970 graphics card. The power supply is so important to the overall stability of the system that I don’t skimp on it but, as usual, I’m determined to get the best deal I can. After much research I bought a SuperFlower Leadex 1000W (80+ gold) supply from Overclockers UK on one of their ‘this week only’ deals for £110. It’s very highly thought of and should be more than enough for this system.

Custom cabling

I mentioned before that some of the board power connectors are non-standard, in fact only the 8-pin EPS connector has a standard pinout and fitting. The main ATX power connector and the memory power connector are custom HP designs. Thankfully the Z800 service manual gives the pinout of these connectors so it’s not hard to make up some custom cables to do the job.

The main ATX cable

The above image is taken from HP’s service manual for the Z800. It shows the pinout of the power cable, taken as you hold the cable and look at the connector. The first issue is the physical cable itself. Each of the pins in the connector is physically keyed with either a square or a slightly rounded socket and there are only 18 pins.

Luckily the order and shape of the pins is identical to a standard 24-pin ATX power cable leaving 6 pins unused at one of the ends. To solve the physical cable issue I bought an ATX power cable extender on ebay for a few pounds and simply sliced off the unwanted pins with a dremel and sanded it to leave a nice finish. The power supply’s standard connector will plug into the unmodified end of the extender and the modified end will go into the motherboard’s socket.

The second issue is the pinout. It’s not the same as the standard ATX pinout at all. To solve this issue I cut the wires of the extension cable around the center and simply remapped them to match the standard by soldering the ends together.

Most of the names in HP’s pinout have an obvious mapping to the ATX standard but there are some that need an explanation. The +12V and V12 lines are all the same and need connecting to the +12V ATX line. The PSU_ID line is an unknown. My guess is that it was designed to allow HP to detect which of the PSU models were fitted and maybe display it somewhere. I’ve no idea whether this should float, or be pulled low or high so I started with the easiest option which was to let it float by simply tying it off with a piece of masking tape.




click for larger

The memory cable

The Z800 has a whopping 12 banks of DDR3 memory available and the designers have, in common with many server motherboards where stability is paramount, opted to give it its own power supply. Again the HP service manual comes to the rescue with the pinout of the connector as you look at the cable.

The connector type is the same as the main ATX power cable and again we are very lucky that the keying of the connector shape matches up to one end of the standard 24 pin ATX cable. I purchased another ATX extender cable and sliced off the part of the connector that was not required.




click for larger

To hook it up to the main power supply I took a Molex ‘Y’ splitter of the type that you often get for free when you buy case accessories and cut off one of the plug ends. Molex cables have GND, 12V and 5V lines, which is all I needed to wire up my hacked cable. It doesn’t matter that several of the 12V wires join into the single 12V wire from the Molex connector because these lines don’t carry a high current. The unused wires were tied off with insulating tape.

Since the publication of this article a reader sent me a link to a seller on the Chinese TaoBao shopping site who offers a custom made cable set that does everything all my hacking does. If you have access to TaoBao then I’d recommend that you buy one instead of going through the hassle of making one like I did.

The memory

Because this PC is so closely related to a server board it requires registered PC3-10600P 1333MHz DIMMs. A wide range of configuration options is available up to a massive 192Gb when in dual-CPU configuration, explaining the presence of a separate power socket on the board for the memory banks.


After much scouring of the internet I scored 24Gb of original HP memory for about £60.




click for larger

This first lot of 24Gb will do fine for my initial tests. When I come to install the full complement of two Xeon processors then this board’s design means that I can either keep the 24Gb configuration (12Gb will be installed per processor) or I can double up to 48Gb by populating every slot with a 4Gb DIMM.

The original HP design features a large heatsink/fan unit dedicated to just the memory banks. Obviously HP have to design a system that will cope with the full load of 192Gb of memory potentially sharing a case with dual graphics cards and dual 130W TDP processors and a whole raft of hard disks. My more modest target of 24Gb of RAM and a single graphics card means that I’m not going to need a dedicated memory cooling system.

The CPU

At this point it would be a fairly big risk for me to go out and buy the 3.33GHz X5680 hex-core Xeon that I actually want because there’s still a possibility that this whole system will not work and I’ll be left with an expensive CPU to sell on. So for testing purposes I did a bit of bottom-feeding on ebay and scored a brand new quad-core 2.13GHz E5506 for the princely sum of £1.43, less than the cost of a power cable extender. Crazy.


Unloved, the quad-core E5506

The heatsink & fan

HP’s designers must have fallen asleep here because the original HSF is actually rather close to being a standard fit. The only real problem is that it comes with an odd 5-pin fan connector instead of the usual 4-pin. The extra pin is labelled TACH2 and from doing a bit of research it seems that on many boards HP have just grounded the extra pin. The other issue is the price. Even used units are very expensive for what they are so I’m going to go with a standard unit.

The HSF that I bought is the Zalman CNPS10X Optima, costing just £16.50 on Amazon. What’s really great is that no modifications are required to fit it to the board. The board already comes with a backplate ready to accept the screws that go into the fan assembly. All I have to do is screw it down on top of the CPU.

Or so I thought. The screws supplied with the Zalman are not the correct gauge for the thread on the backplate. It’s very close but if you pull on them then they’ll slip out. Not good and potentially fatal to the CPU if it were to let go in operation. I don’t know who to blame here because the thread on the backplate is a standard gauge – normal computer case screws are a perfect fit. Anyway I found some long-ish replacement screws and just fastened it down reasonably tight and as even as I could make it.

I connected the 4-pin Zalman fan connector to the first 4 pins on the CPU fan header, leaving the mysterious ‘TACH2’ pin floating. Hopefully the BIOS doesn’t care about this pin.

Edit: After upgrading to the X5680 Xeon I started to get a POST warning about having a high-powered CPU and a low-powered CPU fan, so maybe the fifth line is actively driven to a level by HP’s official high-speed fan to indicate that it’s compatible with the 130W Xeons. Anyway, there’s nothing low spec about the Zalman so that warning is duly ignored.

The front panel connectors

Every motherboard has a bank of pins for connecting up the power switch, hard disk LED, power LED etc. and the Z800 is no different. On this board it’s the bank of pins labelled ‘P5’. Now here we have a problem because the service manual does not list the pinout for this bank.

I scanned the internet for clues, even searching google images to try to find an internal shot of the workstation where I could see the wiring. No luck. The nearest I came was this post on one of HP’s forums that related to a similar, but not identical, workstation board. Just in case that post disappears off the internet, here’s the pinout:

Pin 1 = Hard drive clear plastic lens bottom LED  (which shows HD activity)… white wire
Pin 2 = Frosted lens top LED  (solid or blinking green LED)  (one wire of front panel dual-color LED)… red wire
Pin 3 = Hard drive clear plastic lens bottom LED (may be ground end)… green wire
Pin 4 = Frosted lens top LED Red (solid orange color LED)  (other wire of front panel dual-color LED)… black wire
Pin 5 = No wire attached in xw6400.  May be ground.
Pin 6 = Front panel switch for power on/off… thick white wire attached..  May be the positive.
Pin 7 = No wire attached in xw6400
Pin 8 = Front panel switch for power on/off… thick white wire attached.  May be the ground.
Pin 9 = No wire attached in my xw6400.  May be +5V
Pin 10 = Key (no pin on motherboard header; plastic filling the #10 hole in the receptacle)
Pin 11 = No wire attached in xw6400.  Has gray wire of ambient air temperature thermistor attached in xw6600 cable
Pin 12 = No wire attached in xw6400.  Has brown wire of ambient air temperature thermistor attached in xw6600 cable
Pin 13 = Orange wire, but not used in my xw6400.  For hood sensor connector.  May be ground.
Pin 14 = Orange wire, but not used in my xw6400.  For hood sensor connector.
Pin 15 = Key (no pin on the motherboard; plastic filling the #15 hole in the receptacle)
Pin 16 = Blue wire, but not used in my xw6400.  For hood sensor connector.
Pin 17 = Internal speaker +… yellow wire
Pin 18 = Internal speaker - … yellow wire

In the absence of anything better to go on I decided to give that pinout a try. At the very least I need the power switch to work, everything else is a ‘nice to have’.

That’s what it looks like with the power switch, power LED and hard disk LED connectors in place. I’ve heard from someone else who’s working on one of these boards that the pinout for the internal speaker connector is also correct.

Testing

Now all my modifications are done, it’s time to put it all together in the case and do a quick test. You only need memory, a CPU and a graphics card to do a POST test on the board so I retrieved an old PCIe graphics card from storage in the garage and hooked it up to an old monitor for testing.




click for larger

The board actually looks quite normal inside that case but I can assure you that both the case and the board are very large indeed.

The moment of truth. Switch it on and see what happens.




click for larger

It works!

Everything seems to be OK. The power button and power LED connections are correct, all the memory is detected, the CPU is detected and the CPU fan is spinning. I went straight into the BIOS settings screen and had a look around. I noticed that the BIOS revision was behind the latest so I upgraded it to the latest version via a USB stick.




click for larger

The BIOS is able to detect the CPU fan speed so I guess it doesn’t need that mystery TACH2 pin. There doesn’t seem to be anything that shows the model of PSU that it thinks is connected so I’ve no idea what, if anything, that PSU_ID pin was for.

Now we’ve got a good system it’s time to get some storage in there.

Storage

The Z800 board comes with a ton of SATA connectors, at least some of which are managed by an onboard SAS RAID controller. The thing is, they are all SATA-2 3Gb/s and I’m planning on connecting a Samsung EVO 840 240Gb SSD as my primary OS and programs disk.

The Samsung SSD is a SATA-3 6Gb/s device and although in practice nobody can honestly tell the difference between 3Gb/s and 6Gb/s in real-life usage it would be nice to have the primary SSD on a full-speed bus. The answer is a cheap PCIe card with a couple of SATA-3 connectors on it.

I picked this ‘Syba 2 Port SATA 6Gbps PCI-Express x1 2.0 Card’ up on Amazon for less than 20 quid. I picked it because in the reviews there was a Z800 user who’s using it successfully as a boot device which is exactly what I want to do.

They’re using the popular Asmedia ASM1061 chip in a QFN package as the SATA controller. There’s a bit of wonky soldering on a few of the caps there but generally it looks fine. They’re using the same board layout for their SATA and eSATA products which explains the presence of the 0R bridging resistors being used as option selectors.

Working Specification

Now it’s all working I can move in all the rest of my peripheral ‘furniture’ and upgrade the CPU to the final specs. I picked up a Xeon X5680 on ebay for around £200 and thankfully it worked in my 002 revision board without any problems. I actually bought the X5680 before I found out about the issue with the BIOS bootblock and so I do consider myself one of the lucky ones. I think that if I hadn’t used the E5506 to upgrade the BIOS to the latest revision then I would have had a POST failure with the X5680 due to missing microcode for that newer CPU.




click for larger

I’ve got an 003 revision board on the way to me courtesy of ebay so when I do eventually decide to upgrade to dual Xeons I’ll have a board that’s officially supported.

The rather ugly RAID configuration was ported over from my previous system. If anyone were building a new array today with this motherboard then it makes much more sense to use the onboard SAS RAID controller than the Ultra320 expansion card that I’m using.

Under Windows 8.1 with an ambient 20°C temperature all cores are idling at around 29°C. Stress testing with the prime95 application causes the cores to go up to about 58°C after which I got bored watching something that I’m never going to do in real usage and stopped it.

Notes, issues etc.

I did encounter some issues while assembling the full system with all my expansion cards. Here’s what I discovered and how I worked around each issue.

Attempting to install my XFX 7970 graphics card caused the biggest headache. It started OK, but as soon as the Windows 8 start screen appeared the monitor signal was lost and I had to power off the computer. Worse, it would not POST afterwards. Worse than worse, restoring the previous graphics card also would not POST. I honestly thought I’d fried the entire motherboard.

However, some minutes later it all started working again with the previous card. I’m guessing that the XFX card somehow triggered a resettable overcurrent or heat-related fuse and a few minutes later the fuse reset. I’m now using a cheap Nvidia 210 single-slot card because I don’t play games so a power-hungry hot-running gaming card would do nothing except block useful expansion slots.

Installing the PERC 4e/DC RAID card caused the BIOS to complain on startup about being out of memory for option ROMs. This is another well documented complaint. The solution was to disable some of the unwanted onboard peripherals, something that also speeded up the POST process.

Samsung’s so-called RAPID mode file system filter driver, which is actually just a write-back cache, causes random blue screens as of version 4.4. Searching the internet shows that this is a common issue and, since the SSD is plenty fast enough without dodgy drivers upsetting the system’s stability, I simply disabled RAPID mode.

Windows 8.1’s fast boot mode would sometimes cause the computer to do an immediate power-off as soon as the boot started. Fast boot works a bit like hibernate mode in that the previous state is restored from a disk file on boot. I couldn’t narrow this one down to any particular cause so I just disabled fast boot.

The PERC 4e/DC RAID BIOS screen uses the F10 key in some places to operate its menu system. It’s a well-documented problem that if F10 is used by your computer to enter its BIOS then F10 will not work inside option-ROM BIOS screens. The result was that I could not configure my RAID array. The solution was to use the megarc utility from LSI to configure the RAID options. For example, from an Administrator command prompt the megarc -newcfg -a0 -R5[1:1,1:2,1:3,1:4] -strpsz32 DIO WB RAA command would create a RAID-5 array using the SCSI disks on adaptor 0, channel 1, with SCSI ids of [1,2,3,4] using DirectIO, write-back and read-ahead-adaptive options.

You do get some warnings from the BIOS at the final stage of the peripheral initialisation.

These warnings are just for informational purposes and do not affect the operation of the system. They can be ignored.

That’s all folks

I’m done for now, at least for a few months until I can justify adding another X5680 and swapping out the 002 motherboard for an 003 to support it. I can report that the X5680 is very much faster at compiling my projects than the QX9650 that it replaced. I’m happy with the outcome.

Feel free to leave a comment below, or maybe you’re building a Z800-based system yourself and would like to stop by the forum to share your experiences.

Exploring the KSZ8091RNA RMII ethernet PHY


In my previous two articles (here, here) I’ve provided schematics and Gerbers for a breakout board that supports the Micrel KSZ8051MLL ethernet PHY. The KSZ8051MLL is an MII PHY manufactured in a reasonably easy to work with 48 pin quad-flat package.

One of the burdens of MII is that it requires rather a lot of pins to implement. The TX/RX data buses are 4 bits wide and are clocked at 25MHz, allowing the PHY to transfer data at 100Mb/s.

Enter RMII. RMII is a reduced pin-count interface that multiplexes some of the control and clock signals and halves the bus width to 2-bits at the expense of doubling the clock speed to 50MHz.

The advantage to us is that we can connect an RMII PHY to an MCU without using up so many of our GPIO pins. The main issue we will have is the data and clock rate of 50MHz. We will have to be careful with our board-to-board wiring to ensure that the signals arrive at each end intact.

The Micrel KSZ8091RNA

Micrel’s offering in the low-cost RMII PHY market is the KSZ8091RNA. At only 63 pence in units of 1 from Farnell, it’s very affordable.

The packaging is a 5x5mm QFN24 with 0.5mm pitch. These leadless packages are a major pain in the neck for the hand-solderers out there. The edge pads do have a very small exposure on the sides, which means you could potentially hand solder those, but the issue that makes reflow the only viable option is the completely inaccessible center ground pad.


Tiny side pads might make hand soldering possible


Unfortunately the ground pad cannot be hand soldered

I’ve seen videos where people have attempted to reflow solder through the vias in the PCB pad to make a connection with the ground pad on the QFN. That might work, but because you can’t see the joint you won’t know whether contact has been made.

Looking through the datasheet I can see that Micrel have re-used much of the KSZ8051MLL design in this device which makes it quite simple for me to take my previous design and adapt it to the minor changes. One benefit is that the reduced pin count and size of the QFN means that I have space on the 50mm square board to add some jumpers to select options that the PHY will read when it powers up.

Schematic




Click for full size PDF

The schematic is a straightforward breakout of the KSZ8091 incorporating an onboard 25MHz crystal oscillator and following the design guidelines for decoupling set out in the datasheet. The P1 header is where the PHY and RMII pins are broken out. The P3 header is where the bootstrap options are set, let’s take a look at those:

The KSZ8091 has a number of customisable options that can be set at startup by pulling some of the pins low or high. The PHY contains weak internal pullups/pulldowns on these pins to set default options if you’re not interested in changing them. I’ve opted to break out the option pins to a header that can be used to choose the value of each one.

Jumper Function
AD0 PHY is at address 0 (default)
AD3 PHY is at address 3
WoL- Wake-on-LAN via PME is disabled (default)
WoL+ Wake-on-LAN via PME is enabled
100 Enable auto-negotiation and set 100Mb/s speed (default)
10 Disable auto-negotiation and set 10Mb/s speed

Note that the default address of zero is also the broadcast address for PHYs so if you have multiple PHYs attached to your controller then they’d all respond to this address.

The wake-on-LAN options are a special feature of the KSZ8091. If enabled, the host MAC can program its address into a PHY register and the PHY will then drive its PME_N2 pin low when it detects the WoL magic packet.

I’ve opted to use the popular and cost effective Hanrun HR911105A ethernet jack in this design.

The Hanrun RJ45 connector is easy to get hold of, features onboard LEDs and full magnetics, and tends to be about half the price of competing connectors from other manufacturers.

Board design

I decided to target a 50x50mm square PCB with this design, the idea being that if any of you would like to try building one of these yourself then you can use the lowest cost service from one of the online manufacturing agents such as Elecrow, Seeed or ITead.




Click for full size PDF

I exported the Gerbers and sent them off to Elecrow for printing. At the time of writing Elecrow offer the coloured soldermask option for free with their $10 for 10 copies service which I think is excellent value. After the usual 3 week wait the boards arrived, and as usual they’re perfectly printed.


The front of the PCB

As you can see I’ve had plenty of space to include M3 mounting holes. These will be required because there are a few components on the bottom of the board and it’s best to lift these clear of the work surface to avoid the possibility of accidental short circuit.


Close up of the QFN footprint

The in-pad vias make the connection with the ground plane on the bottom as well as helping to wick away any heat generated by the IC. You can also clearly see the probe marks in the pads that are made by the ‘e-test’ machine used to test for manufacturing defects. The tolerances offered by the prototyping service don’t permit soldermask between the fine-pitch pins so we do have to be extra-careful when soldering the components to the board.


The back of the PCB

The ground plane is split, with the ground for the ethernet jack kept separate from the PHY’s ground. The jack’s ground connects back to the main ground plane via R18, an 0R ‘resistor’, placed very close to the GND pin that goes back to the MCU board.

Assembling the board

My assembly process can be broken down into these steps:

  1. Apply flux to the top layer pads.
  2. Tin the fluxed pads with a soldering iron.
  3. Apply more flux to the tinned pads.
  4. Place all top-layer SMD components on to the tinned pads with tweezers and the assistance of a microscope for the fine-pitch ICs.
  5. Reflow in my halogen reflow oven.
  6. Touch up any dodgy connections under the microscope with a soldering iron.
  7. Solder all the through-hole components with a soldering iron.
  8. Flip the board over, tin the bottom pads and apply the components with a hot air-gun.
  9. Wash in warm soapy water, dry overnight and test.

Testing

To test the board I hooked it up to an STM32F107VCT6 development board from Waveshare.




Click for larger

If you’ve ever programmed the STM32 before then you’ll know that the pins you can use for the peripherals are not fixed. You can usually choose from a predefined set of pins to avoid clashes and to simplify board layout. The RMII interface is fixed on the STM32F107 but can vary slightly on the F4 as shown in the table below.

AF11 Function PHY board label Normal Remap (F4)
REFCLK 50M PA1
CRSDV CRS PA7
RXD0 RXD0 PC4
RXD1 RXD1 PC5
TXEN TXEN PB11 PG11
TXD0 TXD0 PB12 PG13
TXD1 TXD1 PB13 PG14
MDC MDC PC1
MDIO MDIO PA2


As you can see ST haven’t exactly pushed the boat out with the remap options. You can only move the TX pins up from port B to port G, and then only if you’re using at least the 144 pin package. Nonetheless, it’s a welcome option because port B is a crowded place and there’s comparatively little up there in port G. Incidentally, normal and remap are terms that I use based on what they used to be called in the F1 series.

The RXER, LED0, LED1 and INTRP pins are not required in order to get the board working and so they are left unconnected. I have connected the RES (PHY reset) pin to GPIO pin PB14 on the development board.

Naturally I want this PHY to work with the C++ TCP/IP stack included with my stm32plus library. I already support the KSZ8051MLL and so it was a trivial matter to add support for the KSZ8091RNA because they are extremely similar in operation. The device driver code is here on github.

I decided to use my net_udp_send example for the testing. It’ll perform a DHCP transaction to get configuration for itself and then send UDP datagrams to my PC where I can watch for their arrival with Wireshark.

A few modifications need to be made to the network stack configuration before we start. The physical and datalink layers need to be modified to include support for the PHY hard reset on PB14 and the PHY instance itself needs to be changed to the KSZ8091RNA.

template<class TPhy> using MyPhyHardReset=PhyHardReset<TPhy,gpio::PB14>;

typedef PhysicalLayer<KSZ8091RNA,MyPhyHardReset> MyPhysicalLayer;
typedef DatalinkLayer<MyPhysicalLayer,DefaultRmiiInterface,Mac> MyDatalinkLayer;

Secondly, since I’ve configured this PHY to be station zero I need to change the PHY address in the configuration structure:

params.dhcp_hostname="stm32plus";
params.phy_address=0;

That’s it for the essential changes. This example outputs status information to a USART and the default in the example code is Usart3_Remap2 which has a clash with the RMII TXEN pin on PB11 so I change it to Usart1:

typedef Usart1<> MyUsart;

Time to fire it up for testing and thankfully it all worked first time, which is a relief given the difficulty of reflowing the QFN package. DHCP is a broadcast protocol and so Wireshark on my PC was able to capture my network stack performing the DHCP transaction:




Click for larger

Note the default MAC address of 02:00:00:00:00:00. The network stack selects this address as the default if you don’t modify the mac_address member of the Parameters structure.

After the DHCP transaction has completed the example goes into a loop sending a batch of UDP datagrams to port 12345 at a configurable IP address every 5 seconds. Wireshark was also able to capture that traffic:




Click for larger

From this I can safely surmise that it’s all working and I’m happy with that. But before I sign off let’s take a look at one more essential topic.

Signal Integrity

No project that features on-board frequencies above a few tens of megahertz is complete without an analysis of signal integrity. This project features cross-board 50MHz signals connected with flying 20cm wires so it’ll be interesting to see just what those signals look like under the oscilloscope.

Here’s the REFCLK (50MHz) signal measured at the STM32F107 board pin using my 20 year old 500MHz 1Gs/s HP scope. Not too bad at all actually, and a lot better than I thought it would be. There’s some undershoot and overshoot which is a reminder to me that I probably need to recalibrate and adjust my 10:1 probe rather than any issue with the signal itself.

Download the Gerbers

Fancy building your own board? It’s not that difficult as long as you have the tools to deal with the QFN package. Click here to go to my downloads page where you can find a zip file containing all the Gerber files that an online PCB printing service would require from you. The board size is 50x50mm.

Some boards for sale

I’ve built up a few additional completed boards that are offered for sale here. They’re exactly as pictured in this article and they’re all fully tested using my Port107V board.



PCB soldermask colours: which one should you choose?


I’m sure that all electronics hobbyists have, by now, noticed that anyone can get their PCBs printed at the online Chinese services such as Seeed Studio, ITead Studio and Elecrow. Many of you will have even used those services. I have, and I’ve used them often.

Introduction

When selecting the parameters for your printed boards one of the options that you get is a choice of colours for the soldermask.

Along with the traditional green there are others such as red, blue, yellow, white and black available and sometimes at no extra cost. But which one should you choose? Are there any advantages or drawbacks to choosing something other than the standard green? That’s the question I hope to answer in this short summary of my experience.

Since I’ve now had multiple boards printed using every soldermask colour available I’ll go through each one with a short overview of the good and the bad.

Green

Click on any of the three images for a very large version

Bog-standard green with white silkscreen. Boring eh? Maybe so but green is probably the best of all the available colours in practical terms. The contrast between traces, planes and vacant space is high so you can inspect your boards easily with the naked eye to check for manufacturing defects. The white silkscreen contrasts well with the soldermask and flux residue cleans up well leaving a professional looking board.

Don’t be so quick to dismiss green as your choice of colour because if your board routing is a work of art then green might just be the best colour to show it off.

Red

Click on either of the two images for a very large version

I like red, it’s bold and it looks professional. The contrast between traces, planes and empty space is good, but noticeably lower than on a green board. Inspection of fine traces on the board for defects is best done with some form of magnification. Silkscreen stands out well against the red background and flux residues clean up well, just as well as on a green board.

Red can look bold and eye-catching, though not exactly unique since everyone’s doing it these days. If you want your beautiful routing to be the star of the show then green is still the best bet.

Blue

Click on any of the three images for a very large version

It’s a dark blue, the same as the one used on the Arduino. The contrast between traces, planes and empty space is quite low now and magnification is mandatory to inspect for manufacturing defects. On the plus side contrast between silkscreen and soldermask is very high so if your board is label-heavy then blue might be for you.

I also find that the combination of black integrated circuits, black pin headers and silver coloured connectors is aesthetically pleasing and looks very professional. This is also a good colour for mounting an LCD against as your eye is not drawn away from the screen by bright background colours and sharply contrasting edges.

Blue doesn’t clean up as well as red and green. As a dark colour it is prone to showing off dirt and if you’re not careful then flux stains can be stubborn to remove.

Blue can be a good choice if you’re not bothered about showing off the traces on your board or you need to match colours with your Arduino shield design against an Arduino host.

Black

Go on, click on them…

I don’t know what it is about gloss black that makes me keep coming back and choosing it. The contrast between traces, planes and empty space is virtually non-existent. Inspection of the board not only requires powerful magnification but you also have to angle a light just so that it casts a shadow where the traces rise slightly above the board. A total nightmare to inspect.

At least the silkscreen contrasts well; in fact the silkscreen and the pads are pretty much all you can see on the board without optics and lighting to help you. At the time of writing only gloss black is available. The extremely cool matte black is yet to become available at the prototyping services.

Another peril with black is the way that it absorbs heat during reflow. You have to either scale down your profile or make sure that your temperature sensor is actually placed on the board itself. The silkscreen is also prone to turning light brown during reflow, presumably because the board under it takes on so much heat.

Cleaning is very hard indeed. It’s not that the flux stains are harder to remove; it’s that if you tilt the board to the light to see your routing then you also see where all the stains were!

On the positive side, and probably the reason why I keep coming back to black is that it really is the best colour for mounting an LCD against. Nobody’s going to be looking past your screen at distracting details when the backing colour is black.

Black’s a tough one to love and you really have to want it to ignore all the drawbacks. I’d like to say that I’ve scratched my black soldermask itch but it does look so good as a screen background.

White

You can see ‘em bigger, but they don’t get any clearer

OK I’ll get right out and say it; you’ve got to be crazy to want the gloss white soldermask option. There’s really nothing at all good about it. My excuse is that I knew this board was never going to be the final version so I thought I’d select white just so I could see it, knowing full well that I’d be complaining about it later.

If you think that the black soldermask makes your routing hard to inspect then you haven’t seen white. Contrast is the lowest of all the soldermask colours and even tilting to the light fails to properly show the traces. My microscope has a 45° stage light and even under that the traces are almost invisible.

Cleaning is just as tough as you’d expect. If you can get all the residue off then the board can look quite nice but it’s white so you really have to get every spot off for it to look good.

I find that the overall look of a built board is quite strange. Because the traces are invisible your components look like they’re floating on a sea of white. It’s just not right. Perhaps if your board is so dense that the component area outweighs the white area then you might pull it off. I remain to be convinced.

On the plus side the black silkscreen contrasts just as well as you’d expect black on white to do so any artwork you’ve got in the silkscreen layer will stand out very well.

The BeagleBone is probably the best known professional product with a white PCB out there and while I don’t think it looks terrible I do think it would have looked better in a different colour; maybe blue would have looked best with all that black and silver on there to set it off nicely.

Yellow

Click on either of the two images for a very large version

Why does no-one choose yellow? It’s not bad at all really. OK yes, it’s a bit of a murky shade of yellow and I’ve tried to adjust the photograph colours to show it correctly, but unless your monitor is calibrated like mine it’s not going to display accurately, I’m afraid. Take my word for it: it’s sort of murky and desaturated.

The contrast between planes, traces and spaces is very high. In fact it’s just as high as green so if you’ve got some beautiful routing to show off (and your PCB should be a work of art) then murky yellow may be the right choice for you.

About the only downside is that the white silkscreen does not contrast well with the board. Not so bad if you’re not producing something with lots of user instructions in the labels or you don’t go for fancy artwork in the silkscreen layer. I’d like to see an option for black silkscreen with this soldermask, I think it would work better for many designs.

I’ve found that yellow is a breeze to clean, no more difficult than green or red and any light residues that remain don’t stand out.

In summary I’m surprised that yellow isn’t an option chosen by more people. I may be unfairly maligning it by referring to it as ‘murky’. Perhaps ‘dark’ yellow is more appropriate.

Others I haven’t tried

Some services, e.g. OSH Park, offer a nice-looking purple soldermask that matches their gold ENIG pad finish very well. I’ve never tried this but others speak well of it.


Image from Kickstarter project page

Other, more exotic finishes such as matte-black and matte-green aren’t available through the cheapest sources just yet but can be found at medium to high-cost houses such as PCBCart and PCBUniverse.


How the pros do it: the XFX 7970 graphics card (click for larger)

My XFX 7970 graphics card (bottom layer pictured above) is matte black and it looks very cool indeed, despite having all the trace visibility issues that gloss black has. Do you think the C3129 silkscreen label means that there really are 3000+ caps on that card, or are they encoding more information in the label than just a sequential counter?

The end

There aren’t many resources out there to help you select a soldermask colour so I hope that this somewhat opinionated view of the available options goes some way towards helping you make a decision.

Do you agree, disagree and have you got some photos of some of your boards in these soldermask colours that you want to share? Please feel free to drop in at the forum or leave a comment below.

GPS Disciplined Oscillator review and teardown


You may recall that about a year ago I built a frequency counter based on an FPGA and an Android user interface. I called it Nanocounter and you can read about it here if you haven’t already done so.

One of the basic requirements for building a frequency counter is the ability to calibrate it and to do that you need to have a reference clock source that you can rely on to be more accurate than the device you’re trying to calibrate. For my purposes that meant either a rubidium clock or a GPS Disciplined Oscillator (GPSDO).

Rubidium clocks are very precise, having an Allan deviation (ADEV) of 10⁻¹², but they do require periodic calibration which makes it difficult to trust the used devices that you can get on ebay.

By contrast, a GPSDO is a hybrid device that relies on a relatively cheap oscillator that is very accurate in the short term and can be gently nudged (disciplined) into line in the long term by a time signal received from GPS satellites. These devices never need calibrating but they do have a finite lifespan which is something I’ll explain when we get into the detail.

The lack of need for calibration is the reason why I decided to take a punt on an amateur device being sold on ebay. It was described as ‘Symmetricom Inside’ with no further explanation as to what exactly they’d sourced from Symmetricom and what they’d done themselves. I took the risk and bought it anyway.

GPSDO basics

Many common and cheap GPS modules emit a one pulse-per-second (1pps) signal, generated by their internal clocks but locked to the signal from the GPS satellites, which is itself locked to the caesium atomic clocks onboard the satellites. This is about as accurate a signal as you can get and it never needs to be calibrated.

The issue with the 1pps signal is that it has, in the short term, a random jitter component introduced by such things as the prevailing atmospheric conditions between yourself and the satellites. That, and the fact that 1Hz is an undesirable frequency for a standard, gives rise to the need for a hybrid design.

Short term stability is provided by the inclusion of an oscillator running at the desired frequency for your standard, e.g. 10MHz. The oscillator is chosen to have excellent short term stability and must also have the ability to be fine-tuned by a voltage input. A good choice is a voltage-controlled, ovenised crystal oscillator (VC-OCXO). This part is likely to be the most expensive in the GPSDO.

The basic idea is that a microcontroller receives the 1pps signal and also the reference oscillator signal. It uses the long-term stability of the 1pps GPS signal to figure out how much the reference oscillator is drifting over the long term and uses its control voltage to make appropriate adjustments. Here’s a block diagram of the system.

The algorithm inside the microcontroller is similar to a PLL. Edge triggers on both the signals are used to calculate the phase difference between the GPS and the oscillator and the control voltage is used to attempt to phase-align them. Once they are aligned in phase, the oscillator signal can be said to have been ‘disciplined’ by the GPS signal.
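To make the idea concrete, here’s a minimal sketch of such a disciplining loop as a PI controller. This is my illustration only, not the firmware in any real unit: the structure, gains and names (GpsdoLoop, update) are all mine, and a production design would add filtering, holdover handling and much slower time constants.

```cpp
#include <algorithm>

// Hypothetical PI disciplining loop. Once per second, phaseError is the
// measured offset in seconds between the GPS 1pps edge and the OCXO-derived
// 1Hz edge. update() returns a normalised control-voltage value in [0,1]
// for the OCXO's electrical tuning input (0.5 = mid-range).
struct GpsdoLoop {
  double kp;          // proportional gain (illustrative value)
  double ki;          // integral gain (illustrative value)
  double integrator;  // accumulated error: the long-term frequency correction
  double control;     // last control value sent to the DAC

  double update(double phaseError) {
    integrator += phaseError;
    control = 0.5 + kp * phaseError + ki * integrator;
    // clamp to the DAC's physical range
    control = std::min(1.0, std::max(0.0, control));
    return control;
  }
};
```

With zero phase error the control voltage sits at mid-range; a persistent error accumulates in the integrator and steadily steers the oscillator back into phase, which is the ‘disciplining’ described above.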

Now that we know the basic theory of how a GPSDO works I can elaborate upon my earlier statement that a GPSDO has a finite lifespan. The reason is the long-term drift of the reference oscillator as it ages. All crystal oscillators drift from their initial frequency as they get older, even the best ones. The manufacturer’s datasheet will tell you how much drift per year you can expect from the device.

That’s OK though isn’t it? Surely we can correct for the drift using the control voltage input. Well yes, but only up to a point. You need to check the datasheet for the oscillator that you’re using to make sure that it can be adjusted to spec over the number of years that you intend to use it.

Be careful with ebay, there are a lot of used VCOCXO units for sale on there which I suspect have been pulled from decommissioned cell-tower equipment. These older units may have been decommissioned due to the oscillator reaching end-of-life. You need to do your research.

Modern designs tend to be very good with adjustability. To give an example, the Connor Winfield DOC100V-010.0M has annual aging of 0.3ppm and an adjustment range of +/- 10ppm giving it many decades of potential use.
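A quick back-of-the-envelope calculation shows where those decades come from, assuming worst-case aging always in the same direction:

```cpp
// Rough usable lifespan of a disciplined oscillator: the number of years
// until cumulative aging exhausts the electrical tuning range. This is a
// simplification -- real aging is fastest early in life and slows down.
double usableYears(double pullRangePpm, double agingPpmPerYear) {
  return pullRangePpm / agingPpmPerYear;
}
// For the DOC100V figures quoted above: 10ppm range / 0.3ppm per year
// gives roughly 33 years before the control voltage runs out of reach.
```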

Powering it up

This ebay unit comes with an external 6V DC ‘wall-wart’ power supply and a GPS disc antenna on the end of a very long lead that must be more than 5 metres. That’s good because the nearest window to my desk location that has a good view of the sky is several metres away. The antenna is an active type with about 5.3V DC measured across the antenna terminal.

There are four LEDs on the front of the device. Two red ones indicate alarm conditions and the other two green ones are labelled ON and LOCK. The manual, which you can download here, tells you a bit about what to expect. When you first power up the ON and ALM1 LEDs light for a few seconds before going out. The LOCK LED will then light solid for at least an hour before the output signals become useable.

Why so long? It’s due in part to the need to wait for the OCXO to reach stable operating temperature but also because of that short-term jitter on the 1pps GPS signal that I told you about earlier. Because the jitter is random it can be averaged out over time but because you only get a new sample every second you need a lot of seconds to fully average out the jitter. My device takes about 90 minutes for the LOCK led to start flashing, indicating a useable signal but in the manual they recommend leaving it on for 24 hours for best stability. This delay is the main drawback of the GPSDO. You can’t just power up and start using it immediately.

There’s also a DB9 serial port on the front that supposedly allows us to talk to the GPS unit inside, presumably it’s just a passthrough that will allow us to send NMEA commands directly to the GPS chipset. We’ll definitely have a play with that.

The reference signals

Once the unit has decided it’s plucked enough samples from the sky to start operating, the front green LED starts flashing and a stable 10MHz signal appears on the front-panel BNC connector. Here it is on the oscilloscope.

You can see that it’s a sine wave with a peak-to-peak amplitude of 3.3V. Let’s take a look at the 1pps signal as well whilst we’re here.

This time it’s a square pulse, 50µs wide with an amplitude of 3.3V.

Using the 10MHz signal to calibrate a frequency counter

My Nanocounter project allows me to use an external high-accuracy 10MHz clock as an external reference source in place of the onboard Connor Winfield TCXO. That’s very useful but I don’t always want to wait 90 minutes before I can start using the frequency counter so I also built in the ability to calibrate the TCXO using the GPSDO. In this mode the GPSDO is connected to the sample input and, because the sample input is precisely 10MHz, the frequency displayed by the counter shows the error of the onboard reference oscillator.

My Android firmware then stores this error offset as a calibration value and uses it to correct the readings that it gives when using the onboard oscillator as a reference. That, in a nutshell, is the reason why I had to buy a GPSDO.
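The correction itself is simple arithmetic: if the counter reads the known-good 10MHz source as, say, 10000001.2Hz then the onboard reference is running fast and every subsequent reading can be scaled down by the same ratio. This sketch is my illustration of the idea, not the actual Nanocounter firmware:

```cpp
// Calibrate against a reference assumed to be exactly 10MHz, then use the
// resulting ratio to correct readings taken with the onboard TCXO.
// (Names and structure are mine, for illustration only.)
struct Calibration {
  double ratio;  // trueFrequency / displayedFrequency

  explicit Calibration(double displayedReferenceHz)
    : ratio(10e6 / displayedReferenceHz) {
  }

  // scale a displayed reading to compensate for the reference error
  double correct(double displayedHz) const {
    return displayedHz * ratio;
  }
};
```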

The serial port

At first it didn’t work using a ‘normal’ serial cable, exactly the same one that I use to connect an MCU UART to the PC via one of those little RS232 adapter boards you can get on ebay. Then I had a thought: maybe it needs a crossover, or null-modem, cable as they used to be called. I don’t have one of those cables any more so I hacked one together using my existing cable and some jumper wires.

Success! It works. I was able to connect to the port using 57600/8/N/1 settings. This is what the ‘help’ command gets you:

UCCM-P > help
*IDN?
ALARm:HARDware?
ALARm:OPERation?
DIAGnostic:OUTPut ON|OFF
OUTPut:ACTive:ENABle
OUTPut:ACTive:DISable
OUTPut:ACTive:HOLDover:DURation:THReshold <seconds>
OUTPut:ACTive:HOLDover:DURation:THReshold?
OUTPut:INACTive
OUTPut:INACTive?
OUTPut:STATe?
SYNChronization:HOLDover:DURation:STATus:THReshold <seconds>
SYSTem:PRESet
SYNChronization:TFOMerit?
LED:GPSLock?
SYNChronization:FFOMerit?
GPS:POSition N or S,<deg>,<min>,<sec>,E or W,<deg>,<min>,<sec>,<height>
GPS:POSition?
GPS:POSition:HOLD:LAST?
GPS:REFerence:ADELay <numeric value>
GPS:REFerence:ADELay?
GPS:SATellite:TRACking:COUNt?
GPS:SATellite:TRACking?
DIAGnostic:ROSCillator:EFControl:RELative?
SYNChronization:TINTerval?
DIAGnostic:LOG:READ:ALL?
DIAGnostic:LOG:CLEar
SYSTem:PON
OUTPut:MODE?
SYSTem:STATus?
SYSTem:COMMunication:SERial1:BAUD 9600|19200|38400|57600
SYSTem:COMMunication:SERial1:BAUD?
SYSTem:COMMunication:SERial1:PRESet
SYSTem:COMMunication:SERial2:BAUD 9600|19200|38400|57600
SYSTem:COMMunication:SERial2:BAUD?
SYSTem:COMMunication:SERial2:PRESet
OUTPut:STANby:THReshold <seconds>
changeSN
SYNChronization:REFerence:ENABLE LINK|GPS
SYNChronization:REFerence:DISABLE LINK|GPS
SYNChronization:REFerence:ENABLE?
STATus
POSSTATus
TOD EN|DI
TIME:STRing?
REFerence:TYPE GPS|LINK
REFerence:TYPE?
PULLINRANGE 0|1|2|...|254|255
PULLINRNAGE?
DIAGnostic:LOOP?
DIAGnostic:ROSCillator:EFControl:DATA GPS|<value>
DIAGnostic:ROSCillator:EFControl:DATA?
OUTPut:TP:SELection PP1S|PP2S
OUTPut:TP:SELection?
GPSystem:SATellite:TRACking:EMANgle <degrees>
GPSystem:SATellite:TRACking:EMANgle?
DIAGnostic:TCODe:STATus:AMASk
DIAGnostic:TCODe:STATus:OMASk
DIAGnostic:TCODe:ERRor:AMASk
DIAGnostic:TCODe:ERRor:OMASk
DIAGnostic:HOLDover:DELay
DIAGnostic:HOLDover:DELay?
GPS:SATellite:TRACking:IGNore <PRN>, ...,<PRN>
GPS:SATellite:TRACking:IGNore?
GPS:SATellite:TRACking:INCLude <PRN>, ...,<PRN>
GPS:SATellite:TRACking:INCLude?
GPS:SATellite:TRACking:<select>:ALL
Command Complete

That’s quite an impressive command set and we’re definitely not dealing with a simple passthrough to the GPS module. Let’s take a look at the output of the main status command:


UCCM-P > STATus

                 - UCCM Slot STATE -



1-1. #Now ACTIVE STATUS ---------------- [Master]
1-2. #Before ACTIVE STATUS ------------- [OCXO Warm]
2-1. #Reference Clock Operation -------- [Not Used]
2-2. #Current Reference Type ----------- [LINK]
2-3. #Current Select Reference --------- [GPS 1PPS]
2-4. #Current Reference Status --------- [Good Accuracy & Stable]
     #GPS STATUS ----------------------- [Available]
     #Priority Level ------------------- [LINK > GPS]
     #ALARM STATUS
     #H/W FAIL ---------- [ LINK ]
3-1. PLL STATUS ------------------------ [Enable]
3-2. Current: PLL MODE ----------------- [NORMAL 2 MODE]
Command Complete

The output from the status command shows that everything is in order. The unit is working with what it says is ‘Good Accuracy’.

A command is available to show the status of the GPS so let’s take a look at that.

UCCM-P > POSSTATus
-----------------------------------------------------------------------------
11/6/2016 18:20:00
-----------------------------------------------------------------------------
 Position : LAT(N 51:XX:XX.XXX) LON(E 0:XX:XX.XXX) H(27.50 m MSL)
-----------------------------------------------------------------------------
 Geometry : PDOP(0.0) HDOP(38.8) VDOP(38.8)
-----------------------------------------------------------------------------
 Channel Status
   num of visible sats > 10
   num of sats tracked > 4
   ------ Receiver Channel State ------
     CH 0 >  SateID(7) TrackMode(pos avail) SigValue(40)
     CH 1 >  SateID(9) TrackMode(pos avail) SigValue(36)
     CH 2 >  SateID(13) TrackMode(pos avail) SigValue(31)
     CH 3 >  SateID(30) TrackMode(pos avail) SigValue(33)
     CH 4 >  SateID(0) TrackMode(code search) SigValue(0)
     CH 5 >  SateID(0) TrackMode(code search) SigValue(0)
     CH 6 >  SateID(0) TrackMode(code search) SigValue(0)
     CH 7 >  SateID(0) TrackMode(code search) SigValue(0)
     CH 8 >  SateID(0) TrackMode(code search) SigValue(0)
     CH 9 >  SateID(0) TrackMode(code search) SigValue(0)
     CH 10 >  SateID(0) TrackMode(code search) SigValue(0)
     CH 11 >  SateID(0) TrackMode(code search) SigValue(0)
-----------------------------------------------------------------------------
  Rcvr Status(1) :
-----------------------------------------------------------------------------
  Antenna Voltage: 5725 mV,  Antenna Current: 22 mA
-----------------------------------------------------------------------------
Command Complete

I blanked out the GPS location

Apparently I am tracking four satellites out of a possible ten. I’ve run this command many times in a row and it does dynamically acquire and drop satellites as the signal strength changes.

Lastly, the all-important status of the timing loops:

UCCM-P > DIAGnostic:LOOP?
-------------------------------------------------------------------------------
11/6/2016 18:54:56
-------------------------------------------------------------------------------

   LOS      MEAS       NCO STATUS WEIGHT PBUC FBUC DBUC LBUC IBUC G  M TC
-------------------------------------------------------------------------------
 0:  1  0.000e+00  0.000e+00 0x07AC   0   700  700    0    0    0  2  4 29
 1:  1  0.000e+00  0.000e+00 0x068C   0   700  700    0    0    0  2 14 226
GPS: 0 -7.717e-08 -7.255e-08 0x0000   1  ---- 1000 1000 ---- ----  3  6 31
-------------------------------------------------------------------------------
freq cor  = -7.254663e-08
phase cor = -3.276800e-10
gps phase = 3.066667e-08
temp cor  = -5.801979e-11
Command Complete

I’d guess that the ‘cor’ values indicate corrections but unfortunately I can’t find a manual to confirm that. Now that I’ve had a play with the unit it’s time to take a look inside to see how it’s been put together.

Teardown

Taking the unit apart is very straightforward. There are four Phillips-type screws in the back that need to be removed first.

Secondly the nut holding the antenna connector to the back panel has to be removed. An 8mm spanner is required for this, or a pair of pliers will probably do it.

Finally the four hex screws that secure the front panel have to be removed.

With all the screws removed you simply pull on the front panel and the whole assembly slides out the front of the extruded aluminium case. With the case removed we can see the whole unit.

The design did actually surprise me. I was expecting to see an OCXO pulled from something else together with a modern GPS unit and an MCU to co-ordinate the synchronisation, all mounted on a custom PCB. What we’ve actually got is an entire Symmetricom GPSDO pulled from some other device and mounted on a minimal carrier PCB that does nothing more than break out connections to the front panel.

Let’s take a look at some of the parts on this.

The main GPS unit is a Furuno GT-8031F that still exists as a product to this day. It’s mounted under an RF shielding can as you’d expect from a sensitive unit like this.

The OCXO is made by Symmetricom themselves and the part number is STP2695. The date code is visible as the 12th week of 2005 which at the time of writing is eleven years ago. STP probably stands for “Symmetricom Time Provider” and UCCM-P is a Cisco product which gives us a clue as to the original use for this board. I can’t find any information online for the STP2695 but information on a similarly numbered device, the TP2700, is available and does give us an insight into what these boards are capable of. With no information available on the aging characteristics of the OCXO I cannot know how far it might have drifted in eleven years of use.

Here’s how the designers have bodged in a wire to link the 10MHz output to the front panel BNC connector. That component footprint is immediately recognisable to me as that of a mini SMA connector. The designers have desoldered it and bodged in a wire connected to the center signal pin and stripped back the insulation far enough to be able to solder down the ground shield to one of the corner pads. Ugly, very ugly. If there are any time-nuts out there reading this I’d be interested to know what impact this kind of bodge has on the integrity of the signal. If it’s detrimental to the performance of the unit then I may well get in there and replace the surface mount SMA connector and provide a quality link to the front panel BNC connectors.

And here’s a similar bodge for the 1pps output only this time for good measure there’s a second bodge wire that leads to the shield connections for the front panel BNC connectors. No attempt to clean up the mess made by the rework has been made. I couldn’t identify the IC on the right from its ID number but the only traces coming from it lead to the serial connector indicating that it’s probably an RS232 driver. The black smudge on the surface is heat damage from whatever tool they used to remove the SMA connector, probably a hot air gun.

Video

I made a video that shows the general operation of this GPSDO as well as the complete teardown. The video has a more complete view of the teardown and shows even more bodges hiding underneath the main board. You can watch it here in the web page but for the best full HD view you should watch it directly on the YouTube website.

Final words

I hope you’ve enjoyed this review and teardown as much as I enjoyed putting it together. I’m happy that I’ve got a unit that’s based on something of very high quality even if the ‘aftermarket’ modifications made to it are of a low quality.

Feel free to leave a comment below, or if you’d like to start a discussion then click here to visit the forum thread.

A high current power supply built around a server voltage regulator


Regular readers of this blog will have already seen the article that I published about 4 months ago where I attempted to reverse engineer a voltage regulator module originally designed to fit into a Dell server. The theory was that these would be high quality, stable and robust designs that could prove useful if I could figure out how they worked. They’re certainly worth far more than the few pounds that you can get them for on ebay today.

I was able to determine the function of the key pins on the module myself by experimentation and then with some help from eagle-eyed readers out there on the internet we were able to identify the module as an Artesyn NXA66 and subsequently a summary datasheet was located that provided the full pinout. To summarise, the main features of the module are:

  • Selectable voltages of 3.3V and a ‘secondary’ level selectable by the VSP pin.
  • 12V input level.
  • 20A (66W) continuous current delivery at the 3.3V level.
  • Output enable pin.
  • ‘Power Good’ output signal.
  • Differential remote sense.
  • Ability to chain the modules to share current delivery.

It’s worth expanding a little on that ‘secondary’ voltage capability. I now have a few of these modules and some of them have a secondary level of 5V and others have 2.5V. All of these levels are useful but I suspect that if you’re planning to follow the design outlined here then you’ll want the 3.3V/5V module.

I don’t know of a foolproof method of differentiating the two modules from their ebay listings. What I can say is that the model with the black heatsink reviewed in the original article is a 5V module and all those that I received with a silver heatsink are the 2.5V module.

Designing a power supply controller

I decided that the best way to exploit the results of the reverse engineering effort was to design a controller board that would host the NXA66 and expose its functionality via a front panel. I’d throw in a few simple extras such as current monitoring and data logging, and finally I’d make it a through-hole design so that it could be built by people of all skill and equipment levels.

The end result will be a bench-power supply that’s cheap to build and has a current supply level greater than that of most supplies priced at hobbyist levels.

Schematic




Click on the thumbnail for a PDF schematic

It’s a relatively simple and modular schematic. Let’s take a look at each of the modules in turn and describe the functionality in more detail.

Power input and control

The input to this design is 12V which I envisage to be supplied by a commonly available power brick. Care needs to be taken to choose a supply that can deliver the current required by your output load. At the high end 66W at 3.3V equates to 5.5A at 12V plus losses, plus consumption by the controller itself. If you plan on providing 66W to your load then you’d want at least a 7A 12V supply on the input.

I’ve included a relay between the 12V input and the NXA66 because I don’t want the module powering up by itself without being co-ordinated by my controller. I discovered during experimentation that the module goes into an undefined state if you attempt to switch between the two available voltage levels while the power is on and for that reason I want to be able to set the control pins to the desired state and then power up the module. If the user decides to switch voltages while power is on then I’ll programmatically cut the power, set the VSP pin accordingly and then power up the module. A power MOSFET could be used equally well for this switching purpose; I tossed a virtual coin and it came down on the side of the relay.

All the functionality of the module is exposed to the controller. The slot itself is a 2×25 card edge connector with a 2.54mm pitch and an inter-row spacing of 5.08mm. The VSP and OUTEN pins are switched by MOSFETs and linked directly to LEDs that show their current state. Artesyn hint at a requirement for an output capacitor in their datasheet so I include a 150µF electrolytic at the output terminal. The output and return terminals themselves are doubled up to provide a higher current carrying capacity.

PGOOD is an open-drain (or collector) output. This means that the module can drive it low but it floats when high so it must be pulled up to a high level by the controller. Open-drain outputs are used when the I/O levels of the controller are not known by the designer. It would be no use specifying this output as a push-pull pin at 12V when the MCU on the board is 3.3V, for example. I connect PGOOD to the MCU with a pull-up resistor and use a separate MCU pin to light the indicator LED.

Current monitoring is provided by the Texas Instruments INA226 in a surface mount MSOP-10 package with a 0.5mm pin pitch. Sorry about that. Try as I might I just couldn’t find a suitable current monitor in a DIP package that came close to the capabilities of this little chip. This is the only SMD package on the board.

The INA226 continually senses the voltage across a very low value (2mΩ) resistor placed in the path of the output current. An internal ADC converts this to a digital value that can be queried by an MCU using the I2C protocol. You can set an internal calibration register for fine tuning the current reading to compensate for the inaccuracy inherent in the sense resistor value. As well as the current you can also query the sensed voltage and the power. It can also alert you via an output pin if a voltage or power threshold that you program is exceeded. This is all great stuff and is ideal for this project.

The Microcontroller

Really, it had to be an Atmega didn’t it? The requirement for a through-hole design rules out the STM32 F0, my favourite general-purpose ‘do almost anything’ MCU. This design will use the same Atmega328P made famous the world over by thousands of Arduino users. That same level of success means that you can pick up this MCU for just a few dollars from your favourite components store.

There are a few points to note about how the MCU is configured in this design. Firstly I’m using the internal 8MHz oscillator as the clock source which, given the documented 10% tolerance, does pose a risk to the ability to run a reliable UART for data logging. If I do revise this design then I will try to squeeze in an external crystal. Secondly, since I’ll be using the ADC there is an LC filter on AVCC which will also be used to set the AREF level.

5V voltage regulator

The MCU and the external peripherals on this board all run at 5V so I need to drop the input level of 12V down to 5V to generate that supply. A 7V difference means that a linear regulator would be burning off a lot of power as heat so for efficiency reasons a switching buck regulator is the preferred option. I opted to use the Texas Instruments LM2574 Simple Switcher that can provide up to 500mA which should be more than enough for this design. TI’s Simple Switcher range are very reliable and easy to use. I’ve used them many times in the past and never had any issues with stability.

Display and user interface

The readout for a power-supply, or any bench instrument for that matter, needs to be LED, OLED or VFD. In my opinion LCD lacks the eye-catching ‘at a glance’ contrast of LED or VFD. VFD is beautiful to look at but expensive to buy in module form. Small OLED modules are available cheaply and were a contender for this design but I decided to go for two 4-digit 7-segment LED modules.

The controller is the MAX7221 from Maxim. This controller can multiplex 8 LED digits, it’s available in a DIP package and information about its usage is readily available on the internet. Basically you just send it a digit number and the state of the segments for that digit and it will hold the digit in that state without intervention until you come back and tell it a new state. Easy.

During normal operation the voltage and current readouts from the INA226 will be continuously displayed on each of the modules. I will reuse the displays as a rudimentary UI to set configurable parameters for this controller. These modules are a common-cathode design and are readily available on ebay. I’ll go into more detail in the bill of materials section.

Temperature sensing and fan control

There’s a heatsink built in to the NXA66 for a good reason. At the higher current levels the module will need to dissipate a significant amount of heat. To help with that I’ve included a temperature sensing module and fan controller. The MCP9700-E/TO is a linear active temperature sensor in a TO-92 package that outputs a voltage proportional to the temperature that it senses.

I plan to tape this sensor to the heatsink of the NXA66 right above the power MOSFET. The sense voltage will be fed to the ADC on the Atmega328p where it will be converted into Celsius. If the temperature exceeds a preset value then a standard 12V 40mm DC fan will be switched on until the temperature drops back below another threshold.

As with anything ADC-related the sensed reading as well as the ADC supply and reference level need to be carefully filtered to avoid erroneous readings due to noise and glitches on the lines.

UART

Not a lot going on here. It’s a simple breakout of the pins on the Atmega328p. I will attach these to one of those small external UART driver boards that you can get on ebay. The MCU will continually output the voltage and current readings so that you can collate them on a PC for data logging.

USBASP

This is the programming header. The pinout exactly matches that of the USBASP programmer so it can be directly plugged on to program the MCU.

Switchgear

The front panel of this PSU will have SPST toggle switches for output-enable and voltage-select functionality. There will also be a rotary encoder with a built-in push-button function that I will use for adjusting the configurable controller parameters.

Both switches and the button are normally-open and will connect to ground when closed. Each one is attached to an input pin on the MCU that has its internal pull-up enabled meaning that it will read high when the switch is open and low when it’s closed. Wiring is simplified because after connecting one pin of each switch/button to the correct MCU input pin all the remaining pins are then common’d together and connected to a ground terminal.

Bill of Materials

Here’s a complete bill of materials for this project. Where a component is available from Farnell I’ve included the order code.

Designator | Value | Quantity | Footprint | Description | Farnell Order Code | Notes
C1, C2, C6, C7, C8, C9, C11, C13, C14 | 100n | 9 | 2.54mm lead spacing | Ceramic capacitor | 2309020 |
C3 | 22u 16v | 1 | 2mm lead spacing | Polarized Capacitor (Radial) | 1870976 |
C4 | 220u | 1 | 2.5mm lead spacing | Polarized Capacitor (Radial) | 1902883 |
C5 | 220u | 1 | 2.5mm lead spacing | Polarized Capacitor (Radial) | 1902883 |
C10 | 10u | 1 | 2mm lead spacing | Polarized Capacitor (Radial) | 1902913 |
C12 | 1u | 1 | 5.08mm lead spacing | Capacitor | 2112910 | 1
D1, D7 | 1N4007 | 2 | DO-41 | 1 Amp General Purpose Rectifier | 2317417 | 2
D2 | | 1 | LED-3MM | 3mm LED | | 3
D3 | | 1 | LED-3MM | 3mm LED | | 3
D4 | | 1 | LED-3MM | 3mm LED | | 3
D5 | | 1 | LED-3MM | 3mm LED | | 3
D6 | SR150 | 1 | DO-204AL | Schottky Rectifier | 1861420 |
FB1 | BLM18PG221SN1D | 1 | AXIAL-0.3 | Inductor | 2292304 |
K1 | OJE-SS-112HMF,F000 | 1 | Relay TE OJ/OJE | Single-Pole Single-Throw Relay | 1891668 |
L1 | RLB0914-331KL | 1 | RADIAL 5x9x9.5 | Inductor | 2309243 |
L2 | RLB0712-100KL | 1 | RADIAL 10x7.2x3 | Inductor | 2434811 |
P1 | Edge connector | 1 | 2x25 2.54mm pitch, 5.08mm row spacing | | | 4
P2 | PM5.08/2/90 | 1 | PCB terminal block - 2 pin | WEIDMULLER PM5.08/2/90 | 1131855 | 5
P3 | PM5.08/2/90 | 1 | PCB terminal block - 2 pin | WEIDMULLER PM5.08/2/90 | 1131855 | 5
P4 | PM5.08/2/90 | 2 | PCB terminal block - 2x2 pin | WEIDMULLER PM5.08/2/90 | 1131855 | 5
P5 | Fan header | 1 | HDR1X3 | Header, 3-Pin | 588581 |
P6 | | 1 | 2.54mm | Header, 4-Pin | | 6
P7 | | 1 | 2.54mm x 2 | Header, 5-Pin, Dual row | | 6
Q1, Q2, Q3, Q4 | BS170 | 4 | TO-92 | N-Channel MOSFET | 1017687 |
R1, R9 | 680 | 2 | AXIAL-0.3 | Resistor | 2329545 |
R2, R3 | 470 | 2 | AXIAL-0.3 | Resistor | 2329531 |
R4, R7, R10, R12, R13, R14 | 10k | 6 | AXIAL-0.3 | Resistor | 2329474 |
R5, R6 | 2.2k | 2 | AXIAL-0.3 | Resistor | 2329584 |
R8 | 2m 1% | 1 | 2512 | Welwyn ULR2-R002FT | 2129249 | 1
R11 | 68k | 1 | AXIAL-0.3 | Resistor | 2329546 |
SW1 | PM5.08/2/90 | 3 | PCB terminal block - 3x2 pin | WEIDMULLER PM5.08/2/90 | 1131855 | 5
SW2 | | 1 | PCB terminal block (3 pin) | Header, 3-Pin | | 7
U1 | LM2574N-5.0/NOPB | 1 | DIP-8 | 0.5A Step-Down Voltage Regulator | 1469169 | 8
U2 | INA226AIDGST | 1 | MSOP-10 | Current Sense Amplifier | 1924807 |
U3 | MCP9700-E/TO | 1 | HDR1X3 | MCP9700 plus fan-style header | 1332166, 588581 | 9
U4 | MAX7221CNG | 1 | DIP-24 | 8-Digit LED Display Driver | | 10, 8
U5, U7 | 03641A | 2 | 0.36" 12 pin | 4 digit 7 segment LED common cathode | 410561X | 11
U6 | ATmega328P-PU | 1 | DIP-28 | 8-bit AVR Microcontroller | 1715487 | 12, 8

Notes

Some of the components have note numbers against them. The following numbered paragraphs correspond to a numbered note in the bill of materials table.

  1. 2.54mm parts can also be used if you carefully bend the leads outwards to fit the wider 5.08mm pitch.
  2. Any of the 1N400x series will be fine. They all cost about the same so I tend to keep a stock of the biggest one, the 1N4007, around.
  3. Any colour of 3mm LED will work. I’ve used amber for Power/PGOOD, white for EN and blue for VSP for no other reason than I felt like it.
  4. This can be a tricky one. I got the edge connector on ebay, item number 140888533934. Part 2668415 at Farnell looks like it could be persuaded to fit – the row pitch is only 0.2mm off the required 5.08mm.
  5. PCB terminal blocks with the 5.08mm pitch are plentiful on ebay and they slot together to form longer blocks. There’s no reason not to use the cheaper ebay blocks for the switchgear but I would stick to a quality item for the power in/out blocks (P2, P3 and P4).
  6. The male 2.54mm pin headers are at their cheapest on ebay.
  7. The 3-way 5.08mm PCB terminal blocks are available on ebay.
  8. I mount all my ICs in sockets. You don’t have to but if you need to replace one…
  9. I’ve included a 3×2.54mm header footprint for the TO-92 temperature sensor and I chose to use a 3-pin fan header as the connector. You could just as easily use male and female pin headers if you have them, or even solder the sensor wires directly to the board as the most economical method.
  10. The MAX7221CNG is available cheaply on ebay in lots of 5. As usual with ebay there’s a good chance of them being clones but that’s where I bought mine from and they seem to work.
  11. The 7-segment 0.36″ LED display is available from many ebay sources. Make sure it’s a common cathode configuration. These displays all seem to share a common pinout, but just in case click here to see the datasheet for the one that I bought.
  12. You can probably buy an Arduino clone on ebay from China with an Atmega328P on board for less than Farnell will charge you for one piece of the IC alone.

PCB layout

The PCB was laid out to fit the 10x10cm shape that’s so economical to produce at the Chinese prototype houses. The NXA66 sits on the left with a direct, wide path along top and bottom ground fills from the current return terminal to the ground entry point. Further optimisation could be made here by ignoring the return ground terminals and wiring from the front panel ground directly back to the power entry ground at the back.

Indicator LEDs and switchgear terminals are placed at the front and power entry is at the back. I will place a 2.1mm panel-mount connector on the back of the case as well as an on-off switch.

The 3-pin fan and temperature connectors surround a rectangular cutout in the PCB. The cutout is designed for routing the wires from the sensor and fan below the board for aesthetic reasons.

The regulator output runs through top and bottom polygons to the sense resistor where it unavoidably shrinks to pass through the component whereafter it expands to a pair of polygons connected to dual output terminals. The INA226 is placed as close as possible to the sense resistor.

I mentioned earlier that the TI Simple Switcher regulators are an easy to use design, and they are, but good practice is still in order when laying out the components that take part in the switching loop.

TI’s application note AN-1229 covers it all but the basics are that the switching components should be placed as close to the regulator as possible (tricky with through hole), they should be arranged in the order shown in the schematic and the ground returns for each of them should be tied together and terminated to a ground plane at the ground pin of the IC. Hopefully you can see how I’ve done that in the above image.

PCB manufacturing

It’s the usual procedure to get these boards built. Generate the gerbers, upload to your favourite Chinese prototype manufacturer and wait however long you’re willing to pay for delivery.

For two layer boards such as these I think the board houses such as Seeed, ITead, Elecrow and PCBWay are all much the same so I went with PCBWay as I’ve been using them recently and have found their quality to be great. Their standard delivery also seems to be a little quicker than some of the others, although it’s still at least two weeks.

I went with a green PCB to match the colour of the NXA66 module that will be plugged into it. For through hole designs I’m rather partial to white but in this case I think it would clash too much with the green of the NXA66 so green it is.

The designs were uploaded, I went on holiday and by the time I got back they were in my letterbox waiting for me. Let’s see how they look.

I think they’re looking great. No manufacturing flaws were expected because I wasn’t pushing any of the published limits and a quick once-over under a magnifying glass didn’t show up any problems. Time to assemble a board and do some testing.

Assembly

Assembling this board starts with the two surface mount parts, the INA226 and the current sense resistor. I tinned the pads with an iron and then reflowed the two parts in my reflow oven. It certainly felt like overkill for two parts but I have the oven so I use it whenever I can.

If you don’t have a reflow setup then it’s not difficult to do these two parts by hand with an iron if you have plenty of flux, good light and some hands-free magnification. There are many tutorials and videos on YouTube about how to hand-solder SMD parts.

After the two SMD parts are done it’s on to the simple but time-consuming task of bending, inserting, soldering and trimming the through-hole parts. It’s best to do these starting with the lowest profile resistors first and working up to the tallest parts. That way when you turn the board upside down to solder a part it will be held in place whilst you work by its own contact with your work surface.

Finally after a fairly long but strangely therapeutic soldering session it’s all done.

I know I’ve said it before but I do like the look of a project built with through-hole parts. It’s certainly a marvel to inspect a densely packed SMD board but the look of a through-hole project with all those chunky bits on it appeals more to me. Next I’ll insert the NXA66 into the edge connector and build up the temperature sensor cable.

I made a cable for the temperature sensor out of a cheap 3-pin computer fan cable that I got on ebay. The connector on the other end was snipped off and the resulting wire ends were soldered to the MCP9700 TO-92 leads. The fan’s not fitted in these photographs. I’m still waiting for that to be delivered. Hopefully it’ll be ready by the time I shoot the video that’ll accompany this article.

I used some heat-shrink tube to insulate the legs of the TO-92 and then kapton taped it to the back of the heatsink where I can see that it makes contact on the other side with the power MOSFET. This should be the hottest part of the heatsink and therefore the best place to take measurements.

The firmware

I connected up the 12V input to my bench PSU and switched on. The power LED came on which meant that the 5V regulator was working. Of course there was no output from the NXA66 because the relay that controls power to it was switched off. What I did next was to check that the AtMega328P would talk to me through the ISP header. It did, so now I’m good to go with writing the firmware.

The basic aims of the firmware are:

When the device is powered up it should restore its most recent settings from the onboard EEPROM. If the output enable switch is in the ‘on’ position then the output should be immediately enabled. This enables the device to continue where it left off in the event of an unexpected reboot or power outage.

The upper LED should show the voltage level reported by the INA226. The lower LED should show the current reported by the INA226.

There are two switches and a rotary encoder with an integrated button. Their functionality is as follows:

  • The VSP switch should switch between 3.3V and whatever alternative voltage level is provided by the NXA66, either 5V or 2.5V. Since the NXA66 does not like switching levels while the power is on, we will use the relay to cut the power before setting the new level and switching the relay back on.

  • The output enable switch electronically disables the output.

The rotary encoder and its integrated button will be used to navigate through a single-level menu of configuration options displayed on the upper display. Pressing the button will bring up the first menu item and turning the knob will ‘scroll’ through the options. Pressing the button again will enter that menu option, the knob can then be used to adjust the configured value, and pressing the button once more will confirm it and save the new value to EEPROM. Doing nothing for an idle time of 10 seconds will abort the menu process and go back to the main voltage/current display.

Rudimentary but intelligible letters can be displayed on the 4-digit LED display and that will be enough for me to provide the menu navigation. The configurable options will be:

  • Calibration. The INA226 is configured with a calibration value for a perfect 2mΩ current sense resistor. In the real world the actual resistor value will be off by a small amount and this option will allow me to compensate for that. By applying a load and monitoring the current flow with an accurate instrument I will be able to change the calibration so my displayed current matches the instrument.
  • Over-current protection. I will be able to program an upper-limit to the output current. If this limit is exceeded then the output will be automatically disabled. I’ll be doing this by monitoring the polled output from the INA226 so there will be a delayed response of a few hundred milliseconds — roughly the same as a slow-blow fuse.
  • Data logging. I will periodically transmit the measured current and voltage values to the UART port. This option allows me to configure how often that happens, or disable it.
  • LED brightness. A convenience feature to allow me to adjust the brightness of the LED displays.
  • Fan activation levels. The fan can be configured to switch on when a certain temperature is exceeded and then switch off when the temperature falls back below a lower level. I’ll be able to customise those temperature levels with this option.
  • Temperature display. There’s nowhere to continuously display the temperature so this option will allow me to check its current level.

It didn’t take long to write the firmware and I’m quite pleased with the result. It’s all open-source of course and you can view it here on github. In the bin directory you can find hex files that correspond to each release. These can be uploaded directly to the AtMega328p using the USBASP programmer.

avr-size nxa66.elf | tee nxa66.siz
   text    data     bss     dec     hex filename
   7486     180     191    7857    1eb1 nxa66.elf

Looking at the compiled size it appears that I could have fitted it into an AtMega8 but there’s no price advantage for me to do that since they both cost about the same in single units. That’s one thing I really like about the AVR 8-bit instruction set – the code density is so high that you get a lot of functionality into a small amount of flash. This firmware would have been at least double the size if implemented in the ARM 16-bit thumb instruction set.

The MCU operates on its internal 8MHz oscillator. The fuse values required for that are programmed using avrdude:

$ avrdude -c usbasp -p m328p -e -U lfuse:w:0xe2:m -U hfuse:w:0xde:m

The main loop of the firmware polls the INA226 every 200ms and updates the display. The configuration menu, if active, also runs in the main polling loop. Everything else runs asynchronously using interrupts:

  • A 1Hz timer periodically triggers an asynchronous ADC conversion on the MCP9700 analog input. The ADC completion interrupt is used to read and convert the temperature reading. Click here to see how that’s done. Note that conversion to celsius is done with purely integer arithmetic – we don’t want to pull in bloated and slow floating point libraries just for this.
  • The simple 8-bit Timer0 is used as a general purpose millisecond ticker. Click here to see how that’s done.
  • The power good signal from the NXA66 is connected to the INT0 pin. When the pin changes state we get an interrupt and set the onboard LED accordingly. Click here to see the code for that.
  • The UART transmitter uses an interrupt to tell it when the transmit register is ready for data. We use this to send out our data logging strings without having to do a blocking poll on the register that holds the ‘ready’ flag. Click here to see the code.

Programming using interrupts can greatly increase the efficiency of your firmware; indeed many of my STM32 firmware implementations are entirely interrupt driven. That is, the main loop does absolutely nothing at all. Programming using interrupts does require additional care and attention though. Some of the most important points that spring to mind are:

The volatile qualifier must be used on data that is to be shared with code executed in a different context. This will stop the compiler re-ordering instructions or caching writes that could mess things up. Don’t use volatile unless you need it though because it restricts the optimiser’s attempts to make you look good.

You cannot read or write a variable in the main loop that is accessed in an interrupt context if that variable is wider than the MCU can write in an atomic instruction. For the AVR that means you cannot share an int because it’s 16 bits wide and requires two instructions to write. An interrupt can occur between the two instructions and leave the variable in an incorrect state. To get around this on the AVR, either disable interrupts while a wide variable is accessed or use an 8-bit flag to indicate a ‘locked’ state.

Don’t spend ages in an interrupt service routine if other things could be held up by you. In an environment without hierarchical interrupts everything else is suspended and important interrupts such as those that keep time will not be running.

You will not be able to use functionality that depends on other interrupts being serviced. For example, polling a millisecond timer counter in an interrupt routine isn’t going to work because the code that updates the counter cannot run.

Testing

Writing the firmware steadily opened up each feature of the board and I was quickly able to sign off everything as working as designed. An advantage of using the MCU embedded in the Arduino is that there’s a lot of open source code out there to drive ICs and MCU peripherals. Not all of it is great, but even the low-quality code can give you a head start with the fundamentals of how to begin.

I was able to re-use a character font library for the MAX7221 and the popular ‘Wire’ library for driving the I2C peripheral that I needed to talk to the INA226. Here’s a picture of the device up and running, supplying 95mA to a test load.

I’m going to design a case for this power supply with switchgear and banana sockets on the front but until that comes I needed a quick hack to access the switches and the rotary encoder. I did this with a dremelled and drilled piece of stripboard that I could push the switches through and solder them to some access wires on the copper side.

It’s a hack, and it’s had to be patched up once already but it has allowed me to get through this testing phase without a real case.

The temperature monitoring and dynamic fan control also seem to be working well. Here’s a snapshot of the configuration menu item that displays the current temperature.

Now let’s take a look at the data logging. I hooked up the UART pins to one of those little adapter boards that you can get on ebay. The protocol is 19200-8-N-1.

This allowed me to connect a serial cable to the back of my PC and receive the logged data. Each line of data contains the millisecond timestamp of the sample, the voltage and current readings and the 8-bit CCITT CRC of all characters preceding the CRC number itself.

The default configuration transmits a new line once per second but this can be modified in the configuration menu.

Room for improvement

There’s always room to make improvements to a project. Here are a couple of things that I noticed that could be improved.

  • The board cutout is not large enough to allow a standard 3-pin fan plug to be threaded through it. I had to cut my fan lead and resolder it back together after plugging it in to the board.
  • I forgot to add silkscreen pinout information for the UART header so you have to have the design open for reference when connecting up the UART.

Both of the above issues are fixed in the Gerber files that you can download from my site.

Video

If you’d like to see me try to make something as mundane as a power supply appear interesting in a video then you can do so by clicking below. Better resolution can be had by viewing the video on the main YouTube site.

Build your own

If you’d like to have a go at building one of these yourself then I’d certainly recommend it. If you’re confident you can solder the surface-mount INA226 then everything else is a walk in the park.

Click here to go to my downloads page where you will find a link to download the Gerber files in a form that you can directly upload to a site like PCBWay, Elecrow, ITead, Seeed etc. The board is a two layer 10x10cm design.

Click here to go to the Github repository for the firmware. In the bin directory you will find a compiled .hex file for each release. If you have avrdude installed then the firmware can be flashed and the fuses set with the following commands:

$ avrdude -c usbasp -p m328p -e -U flash:w:nxa66.hex
$ avrdude -c usbasp -p m328p -e -U lfuse:w:0xe2:m -U hfuse:w:0xde:m

Replace nxa66.hex with the name of the hex file that you downloaded.

Blank boards for sale

I’ve got some spare boards remaining from the batch of ten in my original order. If you’d prefer to buy one rather than have your own set manufactured then you can use the PayPal form below to make an order.

Final words

It’s nice when something works first time and I’m pleased that this project was one of those. I now need to finish it off with a nice transparent acrylic laser cut case which means spending some hours in front of Inkscape. I’ll do that and be sure to write up my experience in another article here.

If you’d like to comment on anything in this article then please feel free to use the comments section below. If you have more detailed comments or questions then please use the forum and I’ll get back to you as soon as I can.

A laser-cut acrylic case for my server power supply controller


In my last article I built a controller board around the Artesyn NXA66 server power supply module that I picked up very cheaply on ebay. This board gave me the ability to control the key functions of the module and formed the basis for a rather nice, high current desktop PSU for voltages of 3.3 or 5V with the ability to switch out the module for a 3.3V/2.5V module if I so desired.

To finish off the project I decided to make a case for the controller board with conventional switchgear and sockets for the front and rear panels. This is the writeup for that project and there’s an associated YouTube video that you can watch as well.

Specifications

Before I can design anything I need to know what it is that I’m going to design so these are the basic specifications for the case that I’ll be building.

  • 4mm front panel jack sockets for power output.
  • Front switches for output-enable and level selection.
  • Front rotary encoder for options selection.
  • Rear 2.1mm input for 12V supply.
  • Rear on/off switch.
  • Rear DB9 connector for RS232 data logging output.
  • Rear 4mm jack sockets for load voltage sense inputs.
  • Side mount 40mm cooling fan adjacent to NXA66 module.

With that in mind it’s time to decide upon the material that I’ll use.

Materials

The obvious choice for a hobbyist such as myself who does not (yet) own his own CNC machinery is laser-cut acrylic because of the ease with which it can be cut for me by an online service. I also need at least some part of my case to be transparent so that the large LED displays mounted on the PCB are visible from the outside.

So acrylic it is. It’s a material I’ve used before when designing the case for my reflow controller so I have at least a bare minimum of familiarity with the material.

This time I’ve decided to go for a transparent neutral grey colour because I recall that during the YouTube video for the controller I used a small piece of transparent grey perspex to filter the LED displays so that they showed up on the camera. It looked really nice and clear so I’ve decided to go ahead and use it for the whole case.

The last decision to make is the thickness. The online services offer 3mm and 5mm. I chose to use 5mm for the reflow oven case because it was quite large and I wanted to reduce the chance of it flexing. You have to be careful when choosing thicker panel materials because the shafts of many of the switches and particularly cable glands (if you’re using them) may not have a long enough thread to go through the panel and leave enough thread protruding to attach the locking nut.

This time my case is smaller in size than I made before so I’ve selected the 3mm thickness. Let’s hope it works out favourably.

Cutting service

The service that I use to get the materials cut must be selected before I start designing because there will be some constraints I have to abide by, for example:

  • The template size. They will be buying in sheets of raw acrylic and I’ll need to fit my design within the size that they offer, probably even on a template file I can download.
  • The laser kerf size. They’ll tell me the diameter of the laser beam that does the cutting and I’ll need to use this to calculate offsets from the ideal positions of parts of the case that are intended to fit snugly together. This is very important.
  • The colours and widths of lines in the design that I have to use to indicate areas that should be cut and areas that should be engraved.

Last time around I used RazorLab in the UK for the cutting service. It worked well and so I’ll be using them again. I downloaded their ‘P2’ template and got started with the design in Inkscape.

Inkscape design

Inkscape is a free, open-source drawing program that I used last time. It has a somewhat clunky UI (it’s certainly no Illustrator or Corel Draw) but I need only a small subset of the functionality to create a case design and it’s not hard to learn. It doesn’t hold me up or cause me to curse the screen so I’ve no problem recommending it for this purpose.

Designing the tabs used for the case closure is a laborious and repetitive task so to get off to a head start I used the MakerCase website to produce the initial outline. I used the T-slot option to create a design that includes captive screws to hold the case together.

The finger joints hold the panels in the correct position relative to each other and the captive screws in the T-slots pull them together so they don’t fall apart.

When you save a design from the website it asks you for the laser kerf width. It will use that width to shrink back the cut lines with the expectation that the width of the laser will spill over from the center line sufficiently to make a snug fit.

Last time I used a setting of 0.2mm and, while I could get my case together, it was very difficult and I had to apply silicone oil to get the finger joints to come together. This time I’ve reduced the kerf to 0.1mm to allow more beam spillover and hopefully an easier fit that’s not too loose. This is the kind of thing that you can only learn from experience.
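The arithmetic behind that kerf setting is simple enough to sketch. Assuming the beam removes half the kerf width from each side of the drawn line (the usual model), a slot comes out wider than drawn and a tab narrower. The function names here are mine, purely illustrative:

```c
// Kerf compensation model: the laser eats kerf/2 from each side of a
// drawn line, so internal cutouts grow and external features shrink.
double cut_slot_width(double drawn_mm, double kerf_mm) {
    return drawn_mm + kerf_mm;   // slots come out wider than drawn
}

double cut_tab_width(double drawn_mm, double kerf_mm) {
    return drawn_mm - kerf_mm;   // tabs come out narrower than drawn
}
```

With a 0.1mm kerf a 5mm slot therefore finishes at roughly 5.1mm, which is why entering a smaller kerf value into MakerCase gives a tighter fit: the software shrinks the cut lines back by less, leaving less clearance after the beam spillover.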

Now with the basic case panels created I need to add the cutouts and engravings.

Switchgear

All the bits and pieces I need have been ordered from ebay so it’s just a matter of getting the calipers out and measuring the sizes of the holes that I need and then positioning those holes evenly at the best place on each panel.

I don’t necessarily follow a scientific method of doing this. I take a best guess at where I think the holes are going to have to be and then I’ll print out the design 1:1 on a piece of paper and use scissors to cut out the panels. Then by offering up the controller board to the mock panels I can accurately measure how far off I am from where I thought I’d be and make the necessary corrections before doing another round of printouts.




Click to download original SVG. There’s a second design for another, unrelated case at the top of this graphic.

Finally I’m ready to upload. The price charged is based on a flat rate for the piece of acrylic and then a variable charge for the amount of laser time that will be used to do the cutting. You can upload and test your design at Razorlab’s site as often as you like to see how much it’s going to cost before you submit it for making. Bear in mind that the cost quoted is before delivery and 20% VAT.

The flat rate for the acrylic is fairly low and I’ll use up most of my cost on the cutting time so it pays to optimise that as much as I can. Finger joints and T-slots require quite a lot of cutting so the MakerCase site does try to optimise by placing panels back-to-back where identical cuts can be merged into one.

I can optimise a little myself by reducing the number of finger joints to the minimum required and, since I have another PCB for a different design that needs a case, I will include it in the unused space. It’s a balancing act between the fixed cost of the acrylic and the additional cost of spending laser time on using up unused space.

It’s arrived

Razorlab were very quick this time and I got my bits back within a week at a total cost of about £40 including VAT and delivery. It’s quite expensive relative to the cost of having ten PCBs made in China but still much less than anywhere else I’ve seen.

All of the pieces are looking really nice and a quick check reveals that my choice of the grey transparent acrylic is a good one because LEDs show through easily.

Some of the parts had bits of the protective film missing but I’m happy to say that there were no scratches, just a few areas with some glue residue that was easily removed with some white spirit.

Assembly

The first thing I checked was that the finger joints would slot together. Bearing in mind the problems I had last time, I was really hoping the reduced 0.1mm kerf setting would fix it. I’m happy to say that it did. The panels slot together nice and easy with no lateral play at all. 0.1mm is the correct setting for kerf at Razorlab.

Next up is to make sure that the switchgear fit in the holes and this is where I hit two snags, both of my own making. Firstly the power switch doesn’t quite fit along the long side.

It’s really very, very close; much less than a millimeter and I’ve no idea how it happened. I measured it with calipers and I’m sure I added a little for safety but I guess not. The laser cut hole matches my design so it’s all my fault for sure.

The other problem is the anti-twist notch on the rotary encoder. It’s supposed to mate with a small hole in the case to prevent the encoder from rotating when the knob is turned.

The case hole for the notch is too small. Again it’s a tiny difference and this time I’m scratching my head as to how it happened because I did a copy-and-paste from my reflow oven design on to this one and somewhere along the way the hole got smaller. The only thing I can think of is that I must have had the hole accidentally selected in Inkscape when I applied a transformation or pressed a key combination that did a resize instead of a move. Again, user error.

Both of these errors could have been spotted with a more rigorous QA check before uploading and that’s something I’ll be doing in the future. Anyway, for now I can fix these problems quite easily.

I fixed the switch problem by using a dremel on high speed to very gently shave the case hole a little larger. In the video that accompanies this article I suggest that I might shave the switch instead but I tried that and the plastic seemed to be the type that would rather melt than shave so I gave up and gingerly tackled the case instead and it worked out fine.

The encoder notch was easily fixed by using a dremel to shave the notch so that it’s a little slimmer. It’s made of some cheap and soft base metal that easily gave way to the dremel.

Finally I can now assemble it.

It took the best part of an afternoon which was a little longer than I thought it would. It would have been easier if I removed the PCB terminal blocks and soldered directly to the board but I opted to leave them in case I ever wanted to lift the board back out from the case.

Fat 2.5mm wires are used to connect up the incoming power source. Solid core hookup wires are used to connect the load sense inputs. The UART connections go to a cheap UART-to-RS232 board that I got on ebay. This came as a kit of parts that I had to assemble.

The front panel has a nice balanced appearance with the encoder knob in the center, the power outputs on the left and the switches to the right. I already had clear acrylic screws available so I used those for the case fastenings. Initial misgivings about the contrasting colour of the screws were misplaced because I think it looks really nice like that.

The female DB9 connector for the RS232 output screws into the case leaving enough space for the shroud of the cable to slot in and fully mate with the pins. I had some concerns about the proximity of the screw holes to the connector opening but it does seem to be OK.

The fan opening is designed to be right by the power MOSFET on the module, which ought to be the main source of heat. It’s a Gelid Silent 4 that looks nice in white. For consistency I used the same transparent acrylic screws as everywhere else and ran the connector under the board and back over to the left to keep it tidy.

A set of four 3M Bump-ons complete the case. If you’ve never used these things before then I highly recommend them. The rubber compound that they use is extremely tacky, easily preventing the case from sliding over your workspace when you operate the controls.

Testing

I kept testing it throughout the assembly. There’s little point in fully assembling it only to realise that something doesn’t work that would require it to be all taken to bits again. When I finally came to close the lid I was confident that it would work.

Here’s a photograph of it running. The effect of the transparent grey case is to attenuate the glare from the LEDs so that the digits become very clear to read and you can’t see the inter-segment material at all. If you’ve ever used these seven-segment LEDs before then you’ll know what I mean about how distractingly visible the material separating the segments usually is. I highly recommend this material as a filter for seven-segment LEDs.

The rear power switch has a built-in 12V green LED that looks nice glowing at the back of the case.

Final words

Despite a couple of completely avoidable operator errors I did enjoy spending an afternoon putting this case together and the result is that I have a completed project that I can and will use in future.

I shot a fairly short video of the assembly process. If you’d like to watch it then click below, or better still open it on the YouTube site to view it in much better quality.

Please feel free to leave a comment below or if you’d like to ask a more detailed question and start a discussion then do please join the forum thread.


Process automation: building a process controller


Not a lot of people know this, but I brew my own beer as a hobby; and I’m not talking about the murky, astringent, dodgy brews of yesteryear. The beer I brew is probably best described as craft ale. I do the whole process, much like a brewery would: I design recipes, crush my own grain, culture yeast, adjust water chemistry and, most important of all, I control the temperature of the process at every stage from beginning to end. When I get it right (and it took about a year to nail the whole process) the result is a crystal-clear ale that you’d be happy to be served down the pub.

Many parts of the brewing process require you to set and hold the temperature of the soon-to-be-beer at a particular level for a set amount of time. At the beginning of the process there are stages known as the ‘mash’ and the ‘boil’. Both of these stages, the mash in particular, require careful setting and holding of the temperature of the ‘wort’ for a relatively short period of time. ‘Wort’ is the term given to the sugary solution derived from steeped and crushed malt that the yeast ferments to produce beer. There are many odd-sounding terms in brewing, and that’s one of them.

When I’m done boiling, cooling and the yeast has been added then the wort needs to be held within a particular temperature range while the yeast does the good work at the temperature where it works the best. This can take a few weeks and the ideal temperature can vary during this time.

For example, during the first few days the yeast is very busy indeed and produces its own heat while it’s working. This is the time to keep the temperature at the bottom of the working range if you want a clean tasting result, or perhaps slightly higher if you don’t mind the production of the fruity esters that are characteristic of a yeast working at a higher temperature. Later on when the reaction has died down it can help to raise the temperature a few degrees towards the top of the range to encourage the yeast to finish the job and start to drop to the bottom.

Later still, I like to activate just the fridge for about a week and chill the beer down as low as it will go. This sends the yeast into dormancy, clarifying the beer ready for bottling and the secondary fermentation stage.

It’s these parts that I’ve already partially automated and would like to improve upon.

In common with many hobby brewers I’ve got hold of a larder fridge in which to perform my fermentation, and subsequent chilling. The basic idea is that a very simple and cheap on-off controller known as the STC-1000 (in the image above) monitors the temperature via a sensor on a long wire and when it’s too cold it switches on a heater that you supply and conversely when it’s too hot it switches the fridge on.

It’s a very simple device and it has problems, one of which is overshoot and undershoot. Because it’s just relay-based and not a phase-angle controller it cannot vary the power delivered to the heater to perform accurate PID operation. It does understand that fridges cannot be rapidly switched and does have a blackout period to protect the compressor but it doesn’t learn the extent of the overshoot and undershoot and compensate for it.

There are other issues. The STC-1000 is cheap and originates from China. The relays, upon which you are totally reliant, are an unknown brand supposedly rated at 10A. When a Chinese component says it’s rated at 10A then you’d better test and de-rate accordingly. The temperature probe is inaccurate. I have a calibrated digital thermometer that showed my STC-1000 to be out by 1.5°C. It has a very dumb single-point calibration offset that you can apply so I can adjust for that but really, I need to improve on this box.
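The STC-1000’s behaviour boils down to simple hysteresis (bang-bang) control: switch the heater on below a dead band, the fridge on above it, and nothing in between. Here’s a minimal sketch of that technique in C. It’s a model of the general approach, not the STC-1000’s actual firmware, and all names are mine:

```c
#include <stdbool.h>

// Minimal hysteresis (bang-bang) temperature controller. No PID:
// just on/off switching with a dead band around the setpoint,
// which is exactly what produces the overshoot and undershoot.
typedef struct {
    double setpoint_c;   // target temperature, °C
    double deadband_c;   // hysteresis either side of the setpoint
    bool   heating;
    bool   cooling;
} BangBang;

void bangbang_update(BangBang *bb, double measured_c) {
    if (measured_c < bb->setpoint_c - bb->deadband_c) {
        bb->heating = true;  bb->cooling = false;   // too cold
    } else if (measured_c > bb->setpoint_c + bb->deadband_c) {
        bb->heating = false; bb->cooling = true;    // too hot
    } else {
        bb->heating = false; bb->cooling = false;   // inside the band
    }
}
```

A real implementation would also enforce the compressor blackout period before allowing the cooling relay to re-engage.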

Improving the brew fridge controller

A few people have already tackled this. Notably there’s a device called the BrewPi that uses a Raspberry Pi to create an entirely embedded solution. I tried it some time ago by following the DIY instructions that you can find here. It worked, but was a little messy and I thought the Web UI was a bit too ‘home made’. Their new completely embedded Spark V2 edition does look very nice though. Not convinced by the DIY Arduino solution and not wanting to fork out for the Spark V2, I went back to my STC-1000.

I started to have ideas of my own. A relative of mine is a professional brewer and the founder of the Robin Hood Brewery in Nottingham, and my brother-in-law, a retired industrial electrician, set up his semi-automated brewery. We got to talking about how it’s done in the real world, where a failed batch means significant financial loss. We talked about the PLC controllers he’d fitted and the network of platinum RTD temperature sensors that fed into them, and I thought that the techniques and at least some of the technology could be adapted by the home brewer.

My plan then, is to repurpose a PC that I have just sitting around here doing nothing into a process automation controller that I can control over the internet using a secure Android app.

The PC will host add-in boards that I create and connect to it internally by the USB bus. The add-in boards will be detected and managed by the linux operating system. I will create the layers of firmware and software that enable me to control it remotely using an app. I like Android apps for this purpose because I can write them in a language that I know and use daily in my professional life — sorry but I don’t know Javascript very well so there will be no ‘web’ UI.

Physically, the add-in boards will be designed to fit the format of an internal hard disk so that they can be screwed into a hard drive bay caddy.

The screw holes are the standard spacing for a 2.5″ drive, as shown in the CAD diagram. I chose this footprint over the larger one for 3.5″ drives because it fits nicely within the 100mm square PCB size that is so economical to have made in China.

The mounted boards can then be slid in and out of the bays in my PC case. There are so many bays in this case that I’ll have no problems fitting the boards inside.

That’s an Asus P5Q-Pro in there supporting my old QX9650 CPU. That was quite the CPU back in the day and can consume a lot of power if pushed hard. I’ve measured this PC as drawing around 65W of power when idling under Linux. Not too bad I guess. An Intel PC is never really going to fall into the ‘low power’ bracket.

I’ve bought a cheap PCIe expansion card that has internal USB connectors that I will use to hook up the boards. There are plenty of those to choose from, for example this one by CSL.

This is a USB 3.0 card based on the VLI VL806 chipset that I’ll be using in USB 2.0 mode. I’m hoping that Linux will be able to detect it as I’ve heard horror stories about USB 3.0 support on Linux that I hope are a thing of the past now. In any case, whichever card I choose it must be a root controller capable of supplying the full 500mA current to connected devices. Bus-powered USB 2.0 hubs will not work with the system I’m planning because of the 100mA per-device limit.

Boards to build

I know for sure that I’ll need a board to perform the switching and PID tasks. It’ll need to have relays for direct switching and a triac for phase-angle control. I’ll need to modify my case to allow mains cabling to enter safely and connect to the board.
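The PID task on that board will come down to the textbook update loop below. This is a minimal sketch of the standard algorithm rather than the final firmware; the output is clamped to a 0..1 drive level on the assumption that it feeds the triac’s phase-angle setting, and all the names are illustrative:

```c
typedef struct {
    double kp, ki, kd;     // proportional, integral, derivative gains
    double integral;       // accumulated error
    double prev_error;     // previous error, for the derivative term
} Pid;

// One PID step over a time slice of dt seconds.
// Returns a drive level in 0..1 (a heater cannot cool, so we clamp at 0).
double pid_step(Pid *pid, double setpoint, double measured, double dt) {
    double error = setpoint - measured;
    pid->integral += error * dt;
    double derivative = (error - pid->prev_error) / dt;
    pid->prev_error = error;

    double out = pid->kp * error
               + pid->ki * pid->integral
               + pid->kd * derivative;

    if (out < 0.0) out = 0.0;
    if (out > 1.0) out = 1.0;
    return out;
}
```

Tuning the three gains against the thermal mass of a fermenter full of wort is where the real work lies; a production loop would also add integral anti-windup.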

I’ll need a temperature sensor board and, referring back to the professional system, I want it to be based on platinum RTD sensors. RTDs are the industry standard for this application because of their high precision, low noise and high linearity. Good RTD sensors are expensive and so is the circuitry required to carry through the accuracy of the RTD all the way to the digital conversion. I think it’s worth it though as everything else you do depends entirely upon being able to believe the temperature readings that you’re getting.

Those two boards should be enough to get going with others to come later. I could add a board to display status readings on a display mounted in a PC front drive bay. I could do a cheaper temperature sensor board using the DS18B20 sensors to sense non-critical parts such as the local ambient temperature. There are many possibilities.

That’s all for now. I’m off to go and design some boards and install the latest Ubuntu on an SSD in my old PC to bring it back to life. More articles will be posted here as I make progress with this project.

Process automation: relays and triacs


In my previous article I discussed how I intended to convert an old PC into a controller that I could use to automate the temperature control required to ferment and condition beer. If you haven’t already read that introduction then I’d encourage you to do that so you know what it is that I’m trying to achieve.

This article will give full details of the first board that I’ve built, designed to fit inside the PC and control relays and triacs.

Design

The heaters, fridge and fans that control the temperature in my brew fridge need to be switched on and off and that’s what this board is designed to achieve.

As you can see from the diagram the main features of the board are:

  • Three relays for basic on/off switching. To solve one of the issues with the STC-1000 controller I will use 16A, name brand relays for maximum reliability.
  • A triac. This will give me the ability to do phase-angle ‘dimming’ control and will be used for proportional heater or fan control.
  • USB connectivity to the host PC via a USB-to-UART IC.

I expect that many of my readers will want to build this board so I’ll design it to be all through-hole using easy to find components so that you can build it with just a basic soldering iron. Fancy reflow ovens (much as I love mine) and hot air guns will not be required for this build!

With the design criteria in mind I set about producing a schematic for the board.




Click on the thumbnail to download a PDF

Let’s do a quick overview of the various functional blocks within the schematic.

Relays

The relay block is repeated three times in the schematic, labelled Heat, Chill and AUX1. There’s no reason why they should actually be controlling heat and chill but I preferred those labels to the more boring 1, 2, 3 etc.

The relays are the RZ03-1A4-D005 by TE Connectivity. They’re 16A rated with a 5V coil and a rated contact voltage of 250VAC. These will be more than enough for my requirements and should last for many years.

The microcontroller activates the relay by driving the gate of the MOSFET Q1 and as long as the gate is held high then the relay will remain switched on. This is known as a non-latching relay and it’s important because we want the system to fail safe, i.e. off, if there’s a system power failure. R5 pulls the MOSFET gate down to ground so that the relay remains off during system power-up when the driving signal is in an indeterminate state.

The signal that drives Q1 also drives Q4 to light an indicator LED that will give a visual indication that the relay is on. With the board inside the case I won’t be able to see this but if you’re building it and have one of those cases with a window on the side then you will see it.

D4 is the protection diode and I’m using the 1N4007 because I’ve got lots of them. In practice any of the 1N400x series will work fine. P1 is a two-terminal connection block that will have an appropriate current rating for this board.

I’ve created a net class for the high-voltage parts of the circuit so that when it comes to the PCB layout I can apply design rules such as minimum trace width and clearance to just those parts of the circuit.

The triac and its controller

The triac gives me the ability to do phase-angle ‘dimming’ control of the mains AC sine wave. If you’re coming to this from a microcontroller background where the supply is both DC and low voltage then you’re probably familiar with the common Pulse Width Modulation (PWM) technique that’s used to give the appearance of dimming, often with LEDs, by rapidly switching the DC supply on and off and relying on our human persistence of vision to give the perception of a constant light level.

This works well for DC supplies but can the same technique be applied to the mains AC supply where the signal is a dangerously high voltage and a sine wave? The straight answer is that no you can’t, but you can achieve the same result with what we call Phase Angle Control.

Here’s what the mains supply sine wave looks like in the UK where I live. We have a nominal supply voltage of 240V RMS at a frequency of 50Hz although the actual RMS voltage delivered to the house will vary depending on all sorts of things such as your distance from the local substation and the demand in the area at the time. As I write this on a Sunday lunchtime I’m seeing 238V and I’ve seen it up to the full 240V at other times.

To give the appearance of dimming a load we rely on the same basic rapid on/off switching of the supply as we do with the PWM technique but we must observe an additional constraint. We cannot choose any frequency we like, instead we must use the same frequency as the mains supply and we must stay in sync with it. In the UK that means we’ll be operating our dimmer at 100Hz (once for the positive half of the 50Hz signal and again for the negative half).

So, we need a technique for detecting when the sine wave passes through the zero point on the X-axis so that we can stay in sync with it and we also need a device that can rapidly switch a load on and off regardless of the polarity of the signal.

Let’s deal with the switching device first. The most commonly used component is the triac because it’s small, cheap and can be made to handle large currents and voltages. A triac is a type of bi-directional thyristor that only conducts when triggered by a current at its gate.

Triacs have three terminals. Two of them are intuitively labelled MT1 and MT2 for ‘Main Terminal’ one and two. Your load current flows between these two terminals, but only when the triac has been triggered by a current at its gate. In traditional dimming circuits where the mains is the only supply available a component called a diac is often used to drive the gate of a triac.

These days with low voltage DC logic circuits so popular ‘logic level’ triacs have become available that are easily triggered by a low gate current and they are often accompanied by a dedicated triac optocoupler to provide isolation between the low and high voltage sides. The Fairchild MOC30x0 series is very popular and that’s the one I’ll be using in my circuit.

Once activated by a pulse of sufficient duration and magnitude at its gate, a triac will latch ‘on’ until the next zero current crossing point, when it will attempt to turn itself off.

Triggering in a triac is explained by dividing up the relative polarities of the three terminals into quadrants. You can read about these in the Wiki article so I won’t go into them in depth here. One important takeaway from the Wiki article is the note about three-quadrant, or snubberless, triacs.

When driving loads with a significant reactive component the triac is particularly vulnerable to false triggering in quadrant four. By disabling triggering in this quadrant this vulnerability is reduced so much that the typical RC snubber circuit that you often see may not be required. I usually drive loads that are almost purely resistive such as heating and lighting but I still exclusively use 3Q triacs because they cost the same as 4Q triacs and I don’t need the RC snubber components.

Now we know how we’re going to turn the mains on and off we need to examine when we’re going to do it.

Zero crossing detection

There are countless ways to do this. Sometimes you don’t even need to design a specific sub-circuit for it because it will be naturally available at a particular node in your design, as was the case in my reflow oven power-supply circuit. Some designs are potentially unsafe because they don’t isolate the mains voltage from the low voltage detection circuit.

The design I’ve chosen is both safe and cheap but does come at the cost of about 0.5W of power consumption. It’s safe because it optically isolates the mains from the low voltage side and it’s cheap because it uses just three resistors, an AC optocoupler and a MOSFET.

This circuit will provide a falling edge pulse at the zero current crossing point. R8 limits the current flowing through the optocoupler to the smallest value that we can get away with so that the power dissipated inside R8 is minimised. The optocoupler provides safe isolation between the mains and the low voltage side of the circuit.

When current is available from the mains to light the LEDs inside the optocoupler then the opposing phototransistor will conduct current from 5V through R2 and so to ground. A microcontroller input pin attached to the ZSENSE point will sense 5V through the resistor R21 and read a high level. It doesn’t matter whether the load has inductive or capacitive components because the LEDs in the optocoupler are lit by current, not voltage.

At or very near the point of the zero crossing the phototransistor will stop conducting. The 5V supply will then pull the gate of the MOSFET high, causing current to flow through R21 and across the MOSFET to ground. Our microcontroller will now sense a low level and we have a falling edge of a logic signal that we can use to trigger an interrupt and start our work.

The values for R21 and R13 are important to the accuracy of the zero crossing detection and were derived empirically with the aid of a test circuit and an oscilloscope.

Now we’re in full possession of all the pre-requisites for controlling the mains supply we can illustrate what we’re going to do with a simple flow chart.

If we put that into action then our ‘dimmed’ mains signal might look like this.

The orange lines indicate where we activate the triac and switch on the load. The triac automatically switches off at the next zero crossing and we start our timing cycle again. The chopped waveform is sufficient to give the appearance of dimming in loads that can support it.
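In firmware the flow chart reduces to computing a firing delay from the requested level after each zero-crossing interrupt. Here’s a sketch for a 50Hz supply, assuming a simple linear mapping from level to delay. Note that delivered power is not actually linear in the delay because of the sine shape of the waveform, but this is the usual starting point:

```c
// Firing delay after a zero crossing for phase-angle control on a
// 50Hz supply. One half-cycle = 1 / (2 * 50Hz) = 10,000µs.
// level_percent: 0 = fully off, 100 = fully on.
unsigned long firing_delay_us(unsigned level_percent) {
    const unsigned long half_cycle_us = 10000UL;
    if (level_percent > 100) level_percent = 100;
    // Longer delay = later firing angle = less of the half-cycle
    // delivered to the load. A delay of a full half-cycle never fires.
    return half_cycle_us - (half_cycle_us * level_percent) / 100UL;
}
```

On the ATmega328P the zero-crossing interrupt would load this delay into a timer, and the timer’s compare interrupt would then pulse the opto-triac’s LED to fire the triac.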

To complete the design R12 is a MOV to protect the triac from voltage spikes and C7 is for interference suppression.

The microcontroller

Yes, it’s the venerable old ATmega328P as popularised by the Arduino and its clone army, chosen for this design because of its wide availability, easy programming model, through-hole packaging and 5V voltage compatibility. I’ve opted to include an 8MHz crystal on the board in case I decide that the firmware should have the ability to run a timed program over the course of several days even if the main host computer has crashed. External crystals are just about accurate enough to do that whereas the internal RC oscillator is not.

The USB controller

There are many USB-to-UART bridge chips around but when my design criteria for through-hole devices is taken into account that list drops down to one, the MCP2221 from Microchip. It’s trivial to use in that you just hook one side up to your MCU, the other side to the USB port and you can immediately start talking to it at 9600 baud. Rumour has it that it’s actually a hard-wired PIC inside that package.
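On the host side that 9600 baud link is just an ordinary serial port once Linux has enumerated the MCP2221. Here’s a sketch of the termios setup for 9600 8N1; the device node and whatever command protocol runs over the link are up to the firmware design, so none of that is shown here:

```c
#include <termios.h>
#include <string.h>

// Fill a termios struct for a raw 9600 baud, 8 data bits, no parity,
// one stop bit link, matching the MCP2221's default UART settings.
// Apply it to an opened port with tcsetattr(fd, TCSANOW, &tio).
struct termios uart_9600_8n1(void) {
    struct termios tio;
    memset(&tio, 0, sizeof tio);
    tio.c_cflag = CLOCAL | CREAD | CS8;  // 8N1, ignore modem lines
    tio.c_iflag = IGNPAR;                // raw input, ignore parity
    cfsetispeed(&tio, B9600);
    cfsetospeed(&tio, B9600);
    tio.c_cc[VMIN]  = 1;                 // block until 1 byte arrives
    tio.c_cc[VTIME] = 0;
    return tio;
}
```

From there the controlling software just reads and writes bytes, exactly as it would to any other UART.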

I’ve connected up the I2C bus even though I probably won’t use it and I’ve also added the three LEDs that indicate activity. These will be normally-on, flashing briefly off when there’s activity.

Now we’ve seen the design, it’s time to move on to the PCB layout.

PCB Design

There are two main constraints that I need to work with when laying out the board. Firstly, the mounting holes must fit the footprint of a 2.5″ hard drive so that the board can be mounted in a PC drive bay. Secondly the overall size of the board should fit within the 10x10cm format that’s so cheap to have manufactured in China.

With the board outline defined and the screw holes placed I then positioned the relays and the triac so that their associated terminal blocks would all face in one direction off the board. The USB ‘B’ connector was placed at the other end of the board so that we have data coming in at one side and the high voltage section all at the other side.

Now it was simply a matter of placing all the other components in as much of a logical place as possible and routing them all up. The board layout is finalised with top and bottom ground pours in the low-voltage areas and teardrop connections to all the pads for better connection integrity.



I added a silkscreen border and some text to make it clear where the high voltage areas are located on this board.

Another useful step in the verification process is to view the board in 3D mode where I can check the heights of all the components that I have models for and easily see common errors such as silkscreen overlapping pads.

All of the parts are available at Farnell so I put in an order for those and also for 10 copies of the PCB from Seeed Studio in China since they were having a crazy offer of US$4.90 for 10 before postage.

Bill of Materials

Here’s the bill of materials for this design showing component values for a 220-240V mains supply. Refer to the schematic for suggested values for a 120V supply.

| Designator | Value | Quantity | Description | Footprint | Farnell code | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| C1, C2, C3, C5, C8, C9 | 100n | 6 | Ceramic capacitor | 2.54mm | 2309020 | |
| C4 | 10n | 1 | Ceramic capacitor | 2.54mm | 2309024 | |
| C6, C11 | 10µ | 2 | Electrolytic capacitor | 5x11mm | 1902913 | |
| C7 | 100n | 1 | Panasonic ECQU2A104KLA X2 Film Capacitor | | 1673308 | |
| C10 | | 1 | Ceramic capacitor | 5.08mm | 2112910 | [1] |
| C12, C13 | 22p | 2 | Ceramic capacitor | 2.54mm | 1100369 | |
| D1, D2, D3 | White | 3 | LED | 3mm | | [3] |
| D4, D5, D6 | 1N4007 | 3 | Diode | DO-41 | 2317417 | [2] |
| D7 | Green | 1 | LED | 3mm | | [3] |
| D8, D9, D10 | Amber | 3 | LED | 3mm | | [3] |
| FB1 | BLM18PG221SN1D | 1 | Ferrite bead | AXIAL-0.3 | 2292304 | |
| K1, K2, K3 | RZ03-1A4-D005 | 3 | Single-Pole Single-Throw Relay | | 2325624 | |
| P1, P2, P3, P5, P6 | PM5.08/2/90 | 3 | WEIDMULLER PM5.08/2/90 | PCB terminal block - 5.08 | 1131855 | |
| P4 | USB B | 1 | USB B RECEPTACLE | | 1177885 | |
| P7 | 2x5 header | 1 | ISP connector | 2.54mm | | [4] |
| Q1, Q2, Q3, Q4, Q5, Q6, Q9 | BS170 | 7 | N-Channel MOSFET | TO-92 | 1017687 | |
| Q7 | 2N5551 | 1 | NPN Bipolar Transistor | TO-92 | 9846751 | |
| Q8 | BTA16-600BW | 1 | Triac | TO-220 | 1175636 | |
| R1, R2, R3 | 680 | 3 | Resistor | AXIAL-0.3 | 2329545 | |
| R4, R5, R6, R7, R17 | 10k | 5 | Resistor | AXIAL-0.3 | 2329474 | |
| R8 | 100k 2W 500V | 1 | Resistor | AXIAL-0.9 | 2329558 | |
| R9, R18, R19, R20 | 470 | 4 | Resistor | AXIAL-0.3 | 2329531 | |
| R10 | 360 | 1 | Resistor | AXIAL-0.5 | 2329779 | |
| R11 | 1k | 1 | Resistor | AXIAL-0.3 | 2329486 | |
| R12 | 275VRMS | 1 | MOV | Varistor 20D series | 1856919 | |
| R13, R21 | 6.8k | 2 | Resistor | AXIAL-0.3 | 1700247 | |
| R14 | 100 | 1 | Resistor | AXIAL-0.3 | 2329473 | |
| R15, R16 | 2.2k | 2 | Resistor | AXIAL-0.3 | 2329584 | |
| U1 | MCP2221-I/P | 1 | Microchip USB-Serial | DIP-14 | 2434892 | |
| U2 | FOD814A | 1 | Optocoupler | DIP-4 | 2322510 | |
| U3 | MOC3020M | 1 | Opto-Triac | DIP-6 | 1471017 | |
| U4 | ATmega328P | 1 | 8-bit AVR Microcontroller | DIP-28 | 1715487 | |
| Y1 | 8MHz | 1 | Crystal Oscillator | HC49 thru hole | 2063945 | |
| | Ohmite FA-T220-38E | 1 | Triac heatsink | | 2097690 | |

Notes

Some of the components have note numbers against them. The following numbered paragraphs correspond to a numbered note in the bill of materials table.

  1. 2.54mm parts can also be used if you carefully bend the leads outwards to fit the wider 5.08mm pitch.
  2. Any of the 1N400x series will be fine. They all cost about the same so I tend to keep a stock of the biggest one, the 1N4007, around.
  3. Any colour of 3mm LED will work and they’re cheapest on ebay.
  4. These 2.54mm headers are cheapest on ebay.

Assembly

The boards arrived in about 3 weeks and they look very good. It’s just ridiculous how cheaply these can be produced these days.



Assembling a purely through-hole board is as simple as sitting down with my soldering iron, a pair of snips and just getting on with it. You can make life easier for yourself by doing the components in ascending height order. That means starting with the resistors and moving up through the diodes, capacitors etc.

I choose to use sockets for all my DIP ICs though you don’t have to if you’re confident that your design is going to work first time.

An important point: solder the triac so that, when the heatsink is attached, there is at least a millimetre or two of air clearance between the heatsink and the board. High voltage traces run under the heatsink and you don’t want to be depending on the soldermask and the heatsink’s own coating for insulation.

Looking good with all those chunky through-hole parts on board. Here’s another shot showing it mounted on a hard disk caddy.

As you can see it fits nicely into the footprint of the hard disk caddy where the screw holes in the board mate up with the footprint for a 2.5″ hard disk.

Testing

Firstly I need to plug it in and see what happens. The MCP2221 should be completely autonomous and enumerate as a USB device without any firmware even being flashed on to the ATmega328p. Since the target operating system for the computer hosting this device is Linux I’ll be doing all my testing on an Ubuntu VM running under the free VMware Player. My main PC runs Windows 10 as the host OS because nearly all the applications that I use on a daily basis are for Windows, but since I have 48GB of RAM it’s no problem at all to keep at least one Linux server VM running for those processes that just run better on Linux.

I plugged it in. There was a chime from the PC, VMware found the device and I told it to send it to the guest OS. Time to see if it’s there.

$ ls -l /dev/usb* /dev/ttyACM*
crw-rw-rw- 1 andy dialout 166, 0 May 27 14:28 /dev/ttyACM0

/dev/usb:
total 0
crw------- 1 root root 180, 0 May 27 14:28 hiddev0

Cool, looks like I have new CDC and HID devices. Let’s get some info on it.

$ lsusb
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 004: ID 04d8:00dd Microchip Technology, Inc.
Bus 002 Device 003: ID 0e0f:0002 VMware, Inc. Virtual USB Hub
Bus 002 Device 002: ID 0e0f:0003 VMware, Inc. Virtual Mouse
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub

There it is on the second row. All the descriptors retrieved during device enumeration are retrievable by passing the -v flag to lsusb:

$ lsusb -vd 04d8:00dd

Bus 002 Device 004: ID 04d8:00dd Microchip Technology, Inc.
Couldn't open device, some information will be missing
Device Descriptor:
  bLength                18
  bDescriptorType         1
  bcdUSB               2.00
  bDeviceClass          239 Miscellaneous Device
  bDeviceSubClass         2 ?
  bDeviceProtocol         1 Interface Association
  bMaxPacketSize0         8
  idVendor           0x04d8 Microchip Technology, Inc.
  idProduct          0x00dd
  bcdDevice            1.00
  iManufacturer           1
  iProduct                2
  iSerial                 0
  bNumConfigurations      1
  Configuration Descriptor:
    bLength                 9
    bDescriptorType         2
    wTotalLength          107
    bNumInterfaces          3
    bConfigurationValue     1
    iConfiguration          0
    bmAttributes         0x80
      (Bus Powered)
    MaxPower              100mA
    Interface Association:
      bLength                 8
      bDescriptorType        11
      bFirstInterface         0
      bInterfaceCount         2
      bFunctionClass          2 Communications
      bFunctionSubClass       2 Abstract (modem)
      bFunctionProtocol       1 AT-commands (v.25ter)
      iFunction               0
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        0
      bAlternateSetting       0
      bNumEndpoints           1
      bInterfaceClass         2 Communications
      bInterfaceSubClass      2 Abstract (modem)
      bInterfaceProtocol      1 AT-commands (v.25ter)
      iInterface              0
      CDC Header:
        bcdCDC               1.10
      CDC ACM:
        bmCapabilities       0x02
          line coding and serial state
      CDC Union:
        bMasterInterface        0
        bSlaveInterface         1
      CDC Call Management:
        bmCapabilities       0x00
        bDataInterface          1
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x81  EP 1 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0008  1x 8 bytes
        bInterval               2
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        1
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass        10 CDC Data
      bInterfaceSubClass      0 Unused
      bInterfaceProtocol      0
      iInterface              0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x02  EP 2 OUT
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0010  1x 16 bytes
        bInterval               0
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x82  EP 2 IN
        bmAttributes            2
          Transfer Type            Bulk
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0010  1x 16 bytes
        bInterval               0
    Interface Descriptor:
      bLength                 9
      bDescriptorType         4
      bInterfaceNumber        2
      bAlternateSetting       0
      bNumEndpoints           2
      bInterfaceClass         3 Human Interface Device
      bInterfaceSubClass      0 No Subclass
      bInterfaceProtocol      0 None
      iInterface              0
        HID Device Descriptor:
          bLength                 9
          bDescriptorType        33
          bcdHID               1.11
          bCountryCode            0 Not supported
          bNumDescriptors         1
          bDescriptorType        34 Report
          wDescriptorLength      28
         Report Descriptors:
           ** UNAVAILABLE **
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x83  EP 3 IN
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               1
      Endpoint Descriptor:
        bLength                 7
        bDescriptorType         5
        bEndpointAddress     0x03  EP 3 OUT
        bmAttributes            3
          Transfer Type            Interrupt
          Synch Type               None
          Usage Type               Data
        wMaxPacketSize     0x0040  1x 64 bytes
        bInterval               1

Endpoints are visible for the CDC and HID devices in that list. I don’t care about the HID device and will just be addressing the ATMega328p’s UART through the CDC interface at the default speed of 9600 baud.

In its default state Linux will only allow root to talk to a /dev/ttyACM device. I need to change that so my ordinary unprivileged user can use it and thankfully Linux provides a way. Adding the following udev rules file did the trick.

$ cat /etc/udev/rules.d/20-brewery-controllers.rules
SUBSYSTEM=="tty", ATTRS{manufacturer}=="Microchip Technology Inc.", SYMLINK+="Andy%n", MODE="0666", OWNER="andy"

With this in place I’ll get user-accessible symlinks automatically created and removed that point to the real devices.

$ ls -l /dev/Andy*
lrwxrwxrwx 1 root root 7 May 27 14:28 /dev/Andy0 -> ttyACM0

Next I needed to test that the MCU is alive so I connected up my USBASP programmer and read out the fuses using avrdude. I had to do this from Windows because, for reasons I never got to the bottom of, the USBASP programmer will not work with avrdude when the Linux host is running in a virtual machine. This means that I’m developing in a split-personality system where the MCP2221 is connected to the Linux guest and the USBASP is connected to the Windows host, with the two systems sharing source code via a VM mount on to the host filesystem. Fun times.

There were no problems. The MCU was up and running so now it’s time for me to create some firmware.

The firmware operates in a simple command/response mode over the UART. You can see the source code here on github. Each command must be terminated by a CRLF pair and the single line response will also be terminated the same way. The accepted command set is:

Commands | Expected response | Description
--- | --- | ---
HEAT ON / HEAT OFF | OK | Turns the HEAT relay on or off
CHILL ON / CHILL OFF | OK | Turns the CHILL relay on or off
AUX1 ON / AUX1 OFF | OK | Turns the AUX1 relay on or off
AUX2 <percent> | OK | Sets the AUX2 triac 'dimmer' level to the specified percentage
ID | JSON text | Returns the board identifier string as JSON
COPY | plain text | Returns a copyright statement valid for this board
VER | JSON text | Returns the hardware and firmware versions of the board
CAPS | JSON text | Returns the capabilities (number of relays etc) of this board
UPTIME | OK:<ms> | Returns the uptime in milliseconds
VALID heat chill aux1 aux2 | OK | Sets the 'valid mask' of permitted switch combinations

Most of the commands are obvious in what they do. Some of them return JSON so that their responses can be easily parsed by the Linux server process that I will create to manage the communication between the user interface and the hardware.
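Scripting the exchange is straightforward. Here's a minimal helper of my own (not part of the repo) that frames a command with CRLF and reads the single-line reply; it works with any file-like object, including a PySerial `serial.Serial` opened on the `/dev/Andy0` symlink created earlier:

```python
def send_command(port, command):
    """Send one CRLF-terminated command and return the single-line reply.

    'port' is any object with write()/readline() methods, for example
    PySerial's serial.Serial("/dev/Andy0", 9600, timeout=2).
    """
    port.write(command.encode("ascii") + b"\r\n")
    # Replies are CRLF-terminated; strip any trailing NUL as well.
    return port.readline().rstrip(b"\x00\r\n").decode("ascii")
```

A reply of `OK` confirms the command; the JSON replies from ID, VER and CAPS can be fed straight into `json.loads`.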

The VALID command is a safety feature that sets which switches are permitted to be on at the same time, so for example if the user asks for the heater and chiller to be switched on at the same time the firmware will not permit it. The default setting in the firmware is for all the controls to be isolated from each other. That is, no relay or triac may be on at the same time as any of the others. The documentation for how to specify the four parameters can be found in the source code here on github.
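Conceptually the check reduces to something like the following Python illustration. The mask representation here is my own invention for clarity; the real firmware is C++ and its parameter encoding is documented in the repo:

```python
# Each switch maps to the set of other switches it may be on alongside.
# The firmware default isolates everything, i.e. all sets are empty.
ALLOWED = {
    "HEAT":  set(),
    "CHILL": set(),
    "AUX1":  set(),
    "AUX2":  set(),
}

def may_switch_on(candidate, currently_on):
    """True only if the candidate and every already-on switch permit each other."""
    return all(candidate in ALLOWED[s] for s in currently_on) and \
           all(s in ALLOWED[candidate] for s in currently_on)
```

With the default mask, asking for HEAT while CHILL is energised is refused, exactly the heater-plus-chiller case described above.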

If you don’t want to build the firmware from source then you can find a pre-built hex file in the bin directory of the github repo. Building from source requires a local installation of avr-gcc that supports C++11 (I used 4.9.2) and an installation of the scons build system.

To build from source and simultaneously upload using the USBASP programmer the command is:

$ scons mains=50 upload
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
avrdude -c usbasp -p m328p -e -U flash:w:brewery-relays.hex

avrdude.exe: warning: cannot set sck period. please check for usbasp firmware update.
avrdude.exe: AVR device initialized and ready to accept instructions

Reading | ################################################## | 100% 0.00s

avrdude.exe: Device signature = 0x1e950f
avrdude.exe: erasing chip
avrdude.exe: warning: cannot set sck period. please check for usbasp firmware update.
avrdude.exe: reading input file "brewery-relays.hex"
avrdude.exe: input file brewery-relays.hex auto detected as Intel Hex
avrdude.exe: writing flash (4196 bytes):

Writing | ################################################## | 100% 2.85s

avrdude.exe: 4196 bytes of flash written
avrdude.exe: verifying flash memory against brewery-relays.hex:
avrdude.exe: load data flash data from input file brewery-relays.hex:
avrdude.exe: input file brewery-relays.hex auto detected as Intel Hex
avrdude.exe: input file brewery-relays.hex contains 4196 bytes
avrdude.exe: reading on-chip flash data:

Reading | ################################################## | 100% 2.16s

avrdude.exe: verifying ...
avrdude.exe: 4196 bytes of flash verified

avrdude.exe done.  Thank you.

If you live in a country where the mains supply is 60Hz then simply change mains=50 to mains=60.

The first time you build, you must also set the fuses so that the MCU uses the external 8MHz crystal. The command for that is:

$ scons mains=50 fuse

Linux offers a range of serial comms programs that I can use for testing. I briefly tried screen, cu and minicom and couldn’t get them to work well in reasonable time because their defaults are set up for real terminals on the ‘other end’. Python came to the rescue with the miniterm utility built in to the PySerial package. Here’s an example of me using it.

$ python -m serial.tools.miniterm /dev/Andy0
--- Miniterm on /dev/Andy0  9600,8,N,1 ---
--- Quit: Ctrl+] | Menu: Ctrl+T | Help: Ctrl+T followed by Ctrl+H ---
"Andy's Workshop Brewery switching controller"␀
{"type":"switching","relays":["HEAT","CHILL","AUX1"],"triacs":["AUX2"]}␀
{"hardware":1,"firmware":1}␀

My firmware does not echo back the characters that you type because it would be pointless in an automated system. You have to take it on trust that the device is actually receiving what you type, or if you stare intently at the middle of the three indicator LEDs then you’ll see it briefly flash off and on again each time you press a key. In the above session I typed id, caps and ver.

Now let’s switch a relay on and see the result. The command to enter is heat on

--- Miniterm on /dev/Andy0  9600,8,N,1 ---
--- Quit: Ctrl+] | Menu: Ctrl+T | Help: Ctrl+T followed by Ctrl+H ---
OK␀

The firmware responded with OK, there was a solid sounding clunk from the relay and the white indicator LED switched on.

The relays have an advertised maximum switching frequency of 360 times per hour when loaded, or once every 10 seconds. The application software that I write will not allow anything like this rate but one should never rely on software to always do the right thing so the firmware is coded to prevent any one relay being switched on twice in a 10s period. If you try, this happens.

--- Miniterm on /dev/Andy0  9600,8,N,1 ---
--- Quit: Ctrl+] | Menu: Ctrl+T | Help: Ctrl+T followed by Ctrl+H ---
ERROR:05:Relay 10s blackout active␀
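The guard itself is simple bookkeeping per relay. Sketched in Python (the firmware is C++; the names here are mine):

```python
BLACKOUT_MS = 10_000  # relays are rated for one switching operation per 10 s under load

class RelayGuard:
    """Refuse to energise any one relay twice within the blackout window."""

    def __init__(self):
        self.last_on_ms = {}  # relay name -> timestamp of last successful switch-on

    def try_switch_on(self, relay, now_ms):
        last = self.last_on_ms.get(relay)
        if last is not None and now_ms - last < BLACKOUT_MS:
            return "ERROR:05:Relay 10s blackout active"
        self.last_on_ms[relay] = now_ms
        return "OK"
```

Note that a refused attempt doesn't restart the window; only a successful switch-on does.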

The other two relays both checked out fine so now I need to test the triac before I can call it a day. This isn’t as simple as testing the relays because I’ll need to hook up the mains input and a test load to the AUX2 output. I’ll use a portable worklight for the load. Let’s give it a go.

My ‘Quick Test’ box by Cliff Electronics gives me a quick way to hook up the mains supply safely to the board. I tested the triac using the AUX2 <percentage> command with percentage values from 0 to 100 in steps of 10. The light responded correctly which indicated that the mains dimming logic and all the timers associated with that were working as designed.
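For reference, phase-angle dimming works by delaying the triac's gate pulse after each mains zero-crossing. Assuming a simple linear mapping from the percentage to the phase delay (the firmware's actual mapping may differ), the timing works out like this:

```python
def firing_delay_us(percent, mains_hz=50):
    """Delay from zero-crossing to triac gate pulse for a given 'dimmer' level.

    At 50 Hz a half-cycle lasts 10 ms, so 100 % fires immediately after the
    zero-crossing (full power) and 0 % delays for the whole half-cycle.
    This assumes a linear phase-delay mapping, which is an approximation:
    delivered power actually varies non-linearly with phase angle.
    """
    half_cycle_us = 1_000_000 / (2 * mains_hz)
    return half_cycle_us * (100 - percent) / 100
```

This also shows why the build takes a `mains=50` or `mains=60` flag: the half-cycle timing constants differ between the two supplies.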

That’s enough testing for today. I now have a working board and can move on to the next phase of the project which I think will be the RTD temperature sensors board.

Video

I made a YouTube video showing the board in operation. You can watch it here using the embedded video but much better quality can be had by visiting YouTube and watching it there.

Build your own

Although I have a specific use for this board the general concept of an internal PC board that can be used to switch and dim mains loads may have wider appeal. Visit my downloads page to get the Gerber files for this project. You can upload those to one of the cheap online services such as Seeed, ITead, PCBWay and get some copies manufactured for yourself.

You build and use this board at your own risk. Please be extremely careful when handling mains wiring. Never touch any part of the board while the supply is connected and always secure cabling inside a housing in such a way that it cannot move. If in doubt, get a qualified electrician to do it or at least have one review your work.

All the firmware and application source code is here on github.

Blank boards for sale

I’ve got some spare boards remaining from the batch of ten in my original order. If you’d prefer to buy one rather than have your own set manufactured then you can use the PayPal form below to make an order.


Location




Final words

This is the first concrete implementation of a board for my process controller and I’m very happy with the results so far, especially since I have no previous experience with the MCP2221 USB controller and yet it worked first time. I’ll be moving on now to the next stage of the project which will be to build a temperature sensor board based on RTD probes.

If you’d like to leave a comment then you can do so down below in the comments section or if you’d like to add to the discussion over in the forum then please also feel free.

Process automation: temperature sensing


My previous article documented how I designed and built a PCB that hosted three relays and a triac that could be mounted inside a PC case and connected up via the USB bus for host control using simple commands.


The relays and triacs board

That board is of course the output part of the system, responsible for executing the decisions made as a result of reading the inputs and executing control algorithms. Today’s article will document the development of the temperature sensors board used to sense the environment and provide the inputs to the system.

Temperature sensor technology

The first decision that I need to make is which technology to use for sensing temperature. I need a working range of 0 to 100°C and an accuracy of better than 1°C within the ranges of 60°C to 70°C and 18°C to 22°C. The first of those ranges covers where brewers mash their grains and the second range is where fermentation is performed. Different characteristics in the finished beer are obtained by accurately controlling the temperature of those two processes.

Many technologies are available that differ in operating range, linearity, accuracy and stability. I had a look around at the availability and type of probes available for the different technologies and came up with the following summary.

Thermocouples

Thermocouples are subdivided into types differentiated by the type of metal junction at the hot end (J, K, N, E, R, S, T, and B). They can measure temperatures from as low as –265°C to over 1800°C. Thermocouples generate a voltage as a function of the temperature difference between the tip of the probe and the electrical connection on the PCB (the cold junction temperature).


Long stainless thermocouple probes are available

The K-type is the most commonly available thermocouple with a working range of –200°C to 1372°C. It’s easy to implement on a PCB thanks to all-in-one ICs such as the Maxim MAX31855. Unfortunately the standard accuracy of ±2.2°C isn’t good enough for me. The T-type would be much better for my application with a range of –250°C to 400°C and a standard accuracy of ±1.0°C.

If I were to select a thermocouple as the best-fit technology then it would be a T-type.

NTC and PTC thermistors

As the name suggests these types of sensor are simply a resistor that changes value with temperature. An excitation current is applied to the sensor and a ratiometric measurement is made. Once the resistance is known, an equation or lookup table can be used to convert the resistance to a temperature.
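The usual conversion equation for NTC thermistors is the Beta equation. A minimal sketch follows; the R25 and Beta figures are typical placeholder values for a common 10k NTC, not from any specific part, so check the datasheet of whatever you buy:

```python
import math

def ntc_temperature_c(resistance_ohm, r25=10_000.0, beta=3950.0):
    """Convert an NTC thermistor resistance to temperature in Celsius.

    Beta equation: 1/T = 1/T25 + (1/beta) * ln(R / R25), with T in kelvin.
    r25 is the nominal resistance at 25 degC and beta the material constant.
    """
    t25_k = 298.15  # 25 degC in kelvin
    inv_t = 1.0 / t25_k + math.log(resistance_ohm / r25) / beta
    return 1.0 / inv_t - 273.15
```

For tighter accuracy than this, thermistor work usually moves up to the three-coefficient Steinhart-Hart equation plus calibration.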

The accuracy of a thermistor can be very good. If carefully designed and calibrated then an accuracy of ±0.2°C is possible. The issue that rules this technology out for me is that the operating range is typically -40°C to 85°C. I need to measure up to the boiling point of water, around 100°C depending on your altitude amongst other things.

RTD

The resistance temperature detector (RTD) is a type of thermistor where the element is a length of wire wrapped around a core that’s often made of glass. The most common type is the platinum RTD and of the platinum types the PT100 is the most commonly used. PT100 means that it has a nominal resistance of 100Ω at 0°C.


Long stainless PT100 RTD probes are common

Platinum RTDs have excellent accuracy, linearity and long-term stability. In the range that I care about the cheapest ‘Class B’ probes offer accuracy of ±0.5°C. Class A probes reduce this error to just ±0.25°C.
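That linearity is good enough that the standard IEC 60751 (Callendar-Van Dusen) coefficients convert resistance to temperature for any compliant platinum RTD; at or above 0°C the quadratic inverts directly. A Python sketch (mine, not code from this project):

```python
import math

# IEC 60751 Callendar-Van Dusen coefficients for platinum RTDs (T >= 0 degC):
#   R(T) = R0 * (1 + A*T + B*T^2)
A = 3.9083e-3
B = -5.775e-7

def pt100_temperature_c(resistance_ohm, r0=100.0):
    """Invert the quadratic to recover temperature from a PT100 resistance.

    Valid for the 0 to 850 degC portion of the curve; below 0 degC a third
    (C) coefficient appears and the equation is solved differently.
    """
    return (-A + math.sqrt(A * A - 4 * B * (1 - resistance_ohm / r0))) / (2 * B)
```

So a PT100 reading of about 138.5Ω corresponds to 100°C, the top of my required range.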

The drawback to RTDs is the cost. Both the measurement circuitry and probes obtained from a quality source, i.e. not ebay (see later) are priced comfortably in excess of thermocouple or NTC thermistor probes. Nevertheless, this is a cost I can justify and it places RTDs at the forefront of the technology choice for my application.

Digital sensors

In recent times sensors such as the Maxim DS18B20 have emerged that combine the sensor and the conversion circuitry into one small integrated package.


Maxim bought Dallas…

These sensors are small enough to be completely integrated into a probe. All the user has to do is connect up the probe wires to a digital MCU and read out the temperature values. It really could not be easier.


This probe style would work if placed in a thermowell

This is a compelling proposition since Maxim claim sub-1°C accuracy and the ease of integration is unmatched. What put me off this option is that good quality (not ebay) long stainless steel probes are not available. With all the technology embedded into the DS18B20 it shouldn’t matter too much if the probe is sourced from an unknown shop in China but I would still be left open to the possibility of the DS18B20 inside being a fake or the probe material not actually being food-grade stainless steel. For these reasons I decided not to use this new technology.

The winning technology

It’s the PT100 RTD. The best combination of accuracy, linearity and long-term stability won the day. It also helps that I know this is the technology used in commercial breweries because an electrician relative of mine fitted out a small craft brewer’s setup. It’s going to cost me more than any of the other technologies but I don’t mind since this is the part of the system that I need to get right. If you can’t trust your inputs then the rest of your system is making decisions based on faulty data.

Selecting a converter

To get the most out of my RTD probe the conversion circuitry needs to be carefully designed. A very small excitation current is applied to the sensor and will ultimately be converted to a digital value using an ADC. All of the components in the signal path need to be of the highest quality and the supply voltages need to be noise-free and accurate.

All-in-one measurement and digital conversion ICs are available and I decided to use one of these rather than try to roll my own from discrete components and run the risk of introducing sources of error that could take me ages to isolate and eliminate.

Maxim offer the MAX31865 in an annoying QFN package and it’s available quite cheaply from Farnell. I’d need two of them for the design that I’m planning but this is a good option.

Linear Technology offer the LTC2986 in an easy to work with quad flat pack package. In fact this IC can handle multiple types of sensor technology and lots of probes attached simultaneously. The only drawback is that it’s quite expensive at around £25 plus VAT from Farnell.

After careful consideration and much reading of datasheets I chose the LTC2986. Despite the higher price it was the excellent datasheet, the customisation options and Linear’s reputation for producing the highest quality analog components that won me over eventually. At the price I’d better make sure I get this design right the first time or I’ll be spending more time than I’d like with my desoldering braid!

Design parameters

Now I know the conversion technology and the IC I’m going to use to do the conversion I can come up with the features that I’d like to have on my board.

  • Up to two attached 3-wire PT100 RTD probes.
  • On-board continuous temperature display using 7-segment LED.
  • USB attachment to the host PC in the same way as my relays and triacs board.

Schematic

I translated my design parameters into a proposed schematic, and here it is.



Click on the thumbnail to see a full-size PDF. Let’s take a look at the different sections of the schematic in more detail.

The power supplies

This one presented a few challenges due to the variety of acceptable supply voltage ranges of the different ICs on the board. Power comes in on the USB bus at anywhere between 4.6V and 5.25V. Although every IC on the board could run directly from the USB supply, I wanted the LTC2986 to have its own dedicated ultra-low noise regulator, and that regulator’s dropout voltage would shift the high/low levels of the digital lines connected to the MCU relative to the rest of the board. To keep the logic levels consistent I compromised and run the whole board at a regulated 4.0V, which satisfies the MCU, the USB-to-serial converter, the LTC2986 and the LED driver.

The USB power supply is notoriously noisy and I wrote an article on that a while back. This simple LC filter is designed to remove some of that noise.

The first LDO regulator is the fixed 4.0V version of the ST Micro LDK220. It can provide up to 200mA and will be used to supply everything apart from the LTC2986.

The second regulator is a rather special one. The LT3042 from Linear Technology is an ultra-low noise LDO designed to supply sensitive components such as an ADC. In my design it’s used as a dedicated regulator for the LTC2986. The 40.2kΩ resistor sets the output voltage to a nominal 4.02V.
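The arithmetic behind that resistor choice is simple: the LT3042 sources a precision 100µA from its SET pin, so the output voltage is just that current multiplied by the SET resistor:

```python
I_SET_A = 100e-6  # the LT3042's precision SET-pin current source, per its datasheet

def lt3042_vout(r_set_ohm):
    """Nominal LT3042 output voltage for a given SET resistor."""
    return I_SET_A * r_set_ohm

# 40.2 kOhm is a standard E96 value, giving the nominal 4.02 V used here.
```

This is also why the resistor's tolerance matters: any error in R19 appears directly as an error in the supply voltage.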

The package is a nice and easy MSOP with the only pain being the pad on the bottom that needs to be soldered to the board. It’s quite common to see these on high-end regulators because they provide a good large ground connection as well as a useful heatsinking capability.

The USB-to-serial IC

Just as with my relays and triacs board I’ll be using the Microchip MCP2221 USB-to-serial IC in a nice and easy DIP package. It’s responsible for presenting a USB CDC device to the host PC and translating to a 9600 baud UART that gets connected to the MCU.

The MCU

The venerable ATMega328P makes a familiar appearance again because it was a success in the relays and triacs board and so much of the firmware code can simply be a copy-and-paste job from there. It’s set up here to run from an external 8MHz crystal.

The GPIO lines are configured for SPI communication to the ISP header, the LTC2986 and the LED driver. Separate CS lines are used to ensure that only the correct device is listening at any one time.

I connected up a red LED to a pin and called it Alarm. The idea here is that I can control it as a visual indicator of a problem, for example a temperature threshold being exceeded.

The LED driver

It’s the familiar MAX7221 in DIP format. I’ve used them before, I’ve got library code that I copy and paste and I bought a pack of ten on Ali Express for a very cheap price so it’s a no-brainer to include it here. It’ll be driving a pair of 3-digit common cathode LED displays.

The LTC2986

The LTC2986 provides ten input channels that can be configured according to the sensors that you plan to attach. With my two 3-wire PT100 RTDs I will need all ten channels. The remainder of the components are really just high quality X5R ceramic decoupling capacitors distributed according to the recommendations in the datasheet.

The probes

The probes are set up according to the ‘3-Wire RTD Kelvin Current Mode’ example in Figure 39 of the datasheet. This allows the use of ordinary input protection resistors that don’t need to be exactly matched.

3-wire RTD probes are probably the most commonly available type. One end of the sensor has a single wire attached and the other end has two. The end with two attached is used to sense the resistance of the leads and, as long as they are closely length matched, then that resistance can be cancelled out from the reading.
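The cancellation itself is plain arithmetic. A simplified illustration (the LTC2986 performs the equivalent measurement internally, so this is only to show the principle):

```python
def three_wire_rtd_ohm(r_loop_ohm, r_one_lead_ohm):
    """Recover the sensor resistance from a 3-wire measurement.

    r_loop_ohm is the resistance of the full loop: sensor plus both leads.
    r_one_lead_ohm is one lead's resistance, measured on its own via the
    paired wires at the two-wire end of the probe.  With closely matched
    leads, subtracting that value twice removes the lead error.
    """
    return r_loop_ohm - 2 * r_one_lead_ohm
```

With, say, 0.5Ω per lead, an uncorrected 2-wire reading would be 1Ω high, which on a PT100 is an error of roughly 2.5°C, so this correction is far from academic.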

R10 is the sense resistor and this one really must be accurate. I selected a high quality 0.01% resistor that, at more than £2 each, is probably the most expensive 1kΩ resistor that I’ll ever buy!

A great thing about the LTC2986 is that Linear Tech have provided free design assistance software that you can use to configure the probes that you’re going to use. Not only does it show you the wiring you need, it also generates C code to set up the internal registers according to your configuration.

I love it when manufacturers do this. The LTC2986 has an excellent datasheet and I was pretty sure I knew what the probe wiring was going to be but to have the software confirm it and provide driver code was just a fantastic confidence boost and time saver.

Bill of materials

Here’s a complete bill of materials for this project.

Designator | Value | Quantity | Description | Footprint | Farnell code | Note
--- | --- | --- | --- | --- | --- | ---
C1, C2, C3, C4, C5, C11, C15 | 100n | 7 | Ceramic capacitor | 2.54mm | 2309020 |
C6 | | 1 | Ceramic capacitor | 5.08mm | 2112910 | [1]
C7, C12 | 10µ | 2 | Electrolytic capacitor | 5x11mm | 1902913 |
C8, C9, C26, C31 | 4.7µ | 4 | Capacitor | 0805 | 1759420 |
C10 | 10n | 1 | Capacitor | 2.54mm | 2309024 |
C13, C14 | 22p | 2 | Capacitor | 2.54mm | 1100369 |
C16, C17, C20, C21, C24, C29, C32, C33, C34, C41 | 100p | 10 | Capacitor | 0603 | 1759066 |
C18, C22, C23, C27, C28, C30, C35, C36 | 10µ | 8 | Capacitor | 0805 | 2320851 |
C19, C25, C37, C38, C39, C40 | 100n | 6 | Capacitor | 0603 | 1759037 |
D1, D2, D3 | Amber | 3 | LED | 3mm | | [2]
D4 | Green | 1 | LED | 3mm | | [2]
D5 | Red | 1 | LED | 3mm | | [2]
FB1 | BLM18PG221SN1D | 1 | Ferrite bead | AXIAL-0.3 | 2292304 |
P1 | USB B | 1 | USB B RECEPTACLE | | 1177885 |
P2 | 2x5 header | 1 | ISP connector | 2.54mm | | [3]
R1, R2, R3, R7 | 10k | 4 | Resistor | AXIAL-0.3 | 2329609 |
R4, R5, R6, R23 | 390 | 4 | Resistor | AXIAL-0.3 | 2329519 |
R8 | 68k | 1 | Resistor | AXIAL-0.3 | 2329546 |
R9 | 470 | 1 | Resistor | AXIAL-0.3 | 2329531 |
R10 | 1k 0.01% | 1 | Resistor | 0805 | 2112790 |
R11, R12, R13, R14, R17, R18, R20, R21, R22 | 1k | 9 | Resistor | 0805 | 2447587 |
R15 | 1 | 1 | Resistor | 0805 | 2447598 |
R16 | 100k | 1 | Resistor | 0805 | 2447551 |
R19 | 40.2k 1% | 1 | Resistor | 0805 | 2447658 |
U1 | ATMega328P | 1 | 8-bit AVR Microcontroller | DIP-28 | 1715487 |
U2 | MCP2221-I/P | 1 | Microchip USB-Serial | DIP-14 | 2434892 |
U3 | LDK220M40R | 1 | ST Micro LDO regulator | SOT23-5 | 2435558 |
U4 | MAX7221CNG | 1 | LED Display Driver | DIP-24 | | [4]
U5 | | 1 | red 3 digit 7 segment 0.36" LED | custom | | [5]
U6 | | 1 | blue 3 digit 7 segment 0.36" LED | custom | | [5]
U7 | LTC2986ILX#PBF | 1 | Linear Tech temperature to digital converter | LQFP48 | 2629645 |
U8 | LT3042EMSE#PBF | 1 | Linear Technology LDO regulator | LT-MSE-10-EDP | 2475652 |
Y1 | | 1 | Crystal Oscillator - ABLS-25.000MHZ-B2F-T | HC49 thru hole | 2063945 |

Notes

Some of the components have note numbers against them. The following numbered paragraphs correspond to a numbered note in the bill of materials table.

  1. 2.54mm parts can also be used if you carefully bend the leads outwards to fit the wider 5.08mm pitch.
  2. Any colour of 3mm LED will work and they’re cheapest on ebay.
  3. These 2.54mm headers are cheapest on ebay.
  4. The MAX7221 seems to be cheapest on Ali Express.
  5. Make sure you get the 0.36″ common-cathode variety. The red ones are easy enough to find but the blue ones are more elusive. I got mine from Ali Express. Search for item #32789229519.

PCB Design

The PCB design was given a jump-start by the success of my previous relays and triacs board. I set the extent of the board to a 10x10cm square and placed the mounting screw holes where they needed to be to match up with the footprint of a 2.5″ hard drive.

Next I knew I’d need a large area for the probes to attach. Probes designed to fit a specific hand-held reader will come with their own connector — often you’ll see some form of DIN or XLR connector used for this purpose.

General-purpose probes will either come with bare-wire termination or often you get spade-type connectors designed for screwing to a board.

I decided that spade or ring-type screw connectors would be the ones that I’d use. All I need to do on the board is provide screw holes of about 3.5mm with exposed copper around them to make a good contact with the screw connector.

The top-right of this two-layer board is dedicated to the probes, the LTC2986 and the LT3042. This area is all surface-mount. In an attempt to stop digital noise from getting into this area and interfering with the readings there is a split in the ground plane that forces return currents from the digital and analog sections of the board to stay separated from each other.

There’s a liberal sprinkling of vias to the bottom of the board, particularly around the decoupling capacitors. The bottom ground layer is unbroken by traces or components along the path to the ground pin of the USB connector.

The larger bottom-left section of the board is the digital stuff and indicator LEDs. There’s nothing really sensitive here so component placement is made for convenience and to match up with physical constraints. For example, the USB connector should be at the same side as it is on the relays and triacs board, the ISP header should be in an accessible position and the 7-segment LEDs should be at the edge of the board.

Instead of naming the probes something boring like #1 and #2 I decided to name them as ‘red’ and ‘blue’ and use red and blue 7 segment LEDs to display the readings. If the board gets mounted inside a PC as planned then I won’t be able to see the board readings unless I cut out a window or run the displays to the outside with wires but I can live with that.


3D view is good for catching placement and overlap issues

Once I was happy that the design looked OK I sent it off to Seeed Studio for manufacturing because at the time they were the only ones offering the $4.90 deal for a pack of 10. I see that everyone’s got in on that price now which has to be a good thing for all of us.

Assembly

Before assembling the board I thought I’d better check out the precision 0.01% resistor with the best multimeter on my bench because I remember watching an EEVblog video where Dave got a precision resistor for his µCurrent project that turned out to be not so precise after all. After nulling out the test leads I got a measurement.


This is what 0.01% tolerance buys you

There’s certainly nothing wrong with that resistor. Hopefully it’ll age slowly enough over the years to come to not cause any measurement issues.

Here’s a picture of the front and back of the blank boards. No manufacturing boundaries are being pushed here so it was no surprise to find that the boards all looked to be perfectly made.

First I need to get the surface mount stuff out of the way. I tinned the pads with leaded solder so each one had a little bump of solder on it, applied more flux and placed the SMD components on to the little bumps.

Next I reflowed the board in my homemade halogen reflow oven. This was uneventful and worked perfectly. The solder bumps reflowed and all the components sat down on to the board. No post-reflow touch up was necessary. I was happy.


Reflowed solder fillets

With the surface-mount parts all in place I sat down with my soldering iron and did all the through-hole parts. I use sockets for my ICs just in case I mess up a prototype board design and need to recycle the parts for the next iteration.

It’s looking nice but does it work? I needed to spend some time writing the firmware.

Testing

I was off to a flying start thanks to the existing firmware for the relays and triacs board, the sample code generated by the LTC2986 application and my existing driver code for the MAX7221. The firmware is designed to poll the two sensors and display their readings at 1Hz intervals. I also implemented a suite of commands to be executed over the serial bus that provide the following functionality:

  • Retrieve sensor readings.
  • Set or retrieve calibration offsets for each sensor.
  • Set or retrieve calibration dates for each sensor.
  • Turn on, off or flash the red alarm LED.
  • Enable the red, blue or both on-board displays.

Now all I need in order to do some testing is a PT100 RTD probe. At this point I wasn’t keen on the idea of paying the relatively high price for a probe from a reputable source. I just wanted to know whether my board worked, so I bought a couple of cheap probes from ebay. This is the first one that I bought. It claims to be stainless steel.


Stainless steel 3-wire probe from ebay

Before attaching it to the board I used my Keysight U3402A 5½ digit bench meter to take some resistance measurements. After nulling out the test leads I measured the resistance across the two blue terminals. This would give me the lead resistance.

The reading was jumping around a bit due to the contact I was making between the Keysight probes and the spade terminals. I’ll use 658mΩ for this test but I could be off by a couple of hundred milliohms. Now I took a reading across the PT100 element itself.

Subtracting the lead resistance gives me a value of 113.427Ω. A resistance-to-temperature lookup table for PT100 probes is available online and I used that to get the temperature.

Hmmm. This probe is off by a mile. The ambient temperature is a comfortable 22°C in this room and the probe is reading a positively scorching 35°C. OK, fine, you pay peanuts and you get peanuts. Fixed offsets can be calibrated out but my concern is whether this probe is actually platinum at all and whether it would change resistance on the correct PT100 scale.
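That ~35°C figure is easy to sanity-check without a lookup table: over room-temperature ranges the nominal PT100 curve is well approximated by the standard IEC 60751 linear coefficient α = 0.00385. A quick sketch (the function name is mine, not from any library):

```python
# Rough PT100 sanity check: linear approximation using the standard
# IEC 60751 alpha coefficient (good to a fraction of a degree near 0-100C).
ALPHA = 0.00385  # ohm/ohm/degC for a standard PT100
R0 = 100.0       # PT100 element resistance at 0 degC

def pt100_temp_linear(resistance: float) -> float:
    """Approximate temperature in Celsius from a PT100 resistance reading."""
    return (resistance / R0 - 1.0) / ALPHA

# The 113.427 ohm element reading measured above:
print(round(pt100_temp_linear(113.427), 1))  # 34.9 degC
```

Which lands right on the scorching reading from the lookup table, so the probe's offset is real and not an arithmetic slip.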

Anyway, I bought it to test the board and I can certainly do that so I hooked it up and switched on. Unfortunately I did grab the probe briefly by its business end while attaching it to the board so it might have heated up slightly while I set up the test.


LED readouts are too intense to photograph well

That was a relief. My board reads a value close enough to the lookup table figure that the remaining difference was probably down to my own measurement error. I sat there watching it for a few minutes, as you do, just to make sure that it was stable. It was. I swapped over to the blue channel and it read the same. Both channels were working.

The unused channel displays ‘Err’ as an indication that it can’t read a value, in this case because there’s no probe attached. More detailed information about the type of error is provided by the firmware serial commands.

I also bought a pair of even cheaper probes from ebay at the same time; just £2.89 gets you one of these.

Is it really possible to get an accurate PT100 probe for under three quid? Well no, actually. I won’t bore you with the photos and measurements again but suffice to say that the two probes I bought didn’t even agree with each other. One was off by 4.7°C and the other was off by a much more respectable 0.3°C.

At this point you might be wondering how I know what the true temperature is. The answer is that I have one of these handheld type-T thermocouple probes.

This was calibrated by the manufacturer less than a year ago so it should still be close to the true value. This is the probe that I currently use for all my home brewing measurements and it’s what I’m going to use to set the calibration offsets for these dodgy ebay PT100 probes.

The firmware

The firmware source code can be found here on github. If you don’t want to compile it yourself then you can just download the .hex file from the bin directory and flash it directly to your ATMega328P.

Self-build

If you do want to compile it yourself then the relevant source is in the firmware/rtd directory. Clone the repo, change to the source directory and execute the scons command to compile it. I’ve tested this with the old avr-gcc 4.9.2 and the very recent 7.2.0 release and both work fine.

Here’s the example output when building with 7.2.0:

$ scons
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
avr-g++ -o AlarmFlasher.o -c -mmcu=atmega328p -Os -g -DF_CPU=8000000 -DBOARD_SERIAL=2429286624 -std=c++1y -Wall -Werror -Wextra -pedantic-errors -fno-rtti -mcall-prologues -ffunction-sections -fdata-sections -fno-exceptions AlarmFlasher.cpp
avr-g++ -o Max7221.o -c -mmcu=atmega328p -Os -g -DF_CPU=8000000 -DBOARD_SERIAL=2429286624 -std=c++1y -Wall -Werror -Wextra -pedantic-errors -fno-rtti -mcall-prologues -ffunction-sections -fdata-sections -fno-exceptions Max7221.cpp
avr-g++ -o MillisecondTimer.o -c -mmcu=atmega328p -Os -g -DF_CPU=8000000 -DBOARD_SERIAL=2429286624 -std=c++1y -Wall -Werror -Wextra -pedantic-errors -fno-rtti -mcall-prologues -ffunction-sections -fdata-sections -fno-exceptions MillisecondTimer.cpp
avr-g++ -o ProgStrings.o -c -mmcu=atmega328p -Os -g -DF_CPU=8000000 -DBOARD_SERIAL=2429286624 -std=c++1y -Wall -Werror -Wextra -pedantic-errors -fno-rtti -mcall-prologues -ffunction-sections -fdata-sections -fno-exceptions ProgStrings.cpp
avr-g++ -o Program.o -c -mmcu=atmega328p -Os -g -DF_CPU=8000000 -DBOARD_SERIAL=2429286624 -std=c++1y -Wall -Werror -Wextra -pedantic-errors -fno-rtti -mcall-prologues -ffunction-sections -fdata-sections -fno-exceptions Program.cpp
avr-g++ -o Uart.o -c -mmcu=atmega328p -Os -g -DF_CPU=8000000 -DBOARD_SERIAL=2429286624 -std=c++1y -Wall -Werror -Wextra -pedantic-errors -fno-rtti -mcall-prologues -ffunction-sections -fdata-sections -fno-exceptions Uart.cpp
avr-g++ -o brewery-rtd-v1.elf -Wl,-Map,brewery-rtd-v1.map -mrelax -Wl,-u,vfprintf -lprintf_flt -lm -Wl,--gc-sections -mmcu=atmega328p AlarmFlasher.o Max7221.o MillisecondTimer.o ProgStrings.o Program.o Uart.o
avr-objcopy -j .text -j .data -O ihex brewery-rtd-v1.elf brewery-rtd-v1.hex
Install file: "brewery-rtd-v1.hex" as "bin/brewery-rtd-v1.hex"
avr-objdump -S brewery-rtd-v1.elf > brewery-rtd-v1.lst
avr-size brewery-rtd-v1.elf | tee brewery-rtd-v1.siz
   text    data     bss     dec     hex filename
   8850     226     126    9202    23f2 brewery-rtd-v1.elf
scons: done building targets.

If you’d like to flash the .hex file directly to the board using a USBASP programmer then the command is scons upload.

Serial commands

Commands are entered over the USB virtual serial port at 9600 baud. Each command is a single line with optional parameters. An affirmative response is also a single line and is always a valid JSON document to make for easy parsing by the host PC controller.

At the time of writing the following commands are implemented.

| Command | Parameters | Description |
|---|---|---|
| ID | | Return the board identifier string. |
| CAPS | | Return the board capabilities. |
| VER | | Return the version numbers. |
| COPY | | Return a copyright statement. |
| UPTIME | | Return the uptime in milliseconds. |
| READINGS | | Return the last temperature readings. |
| RCAL/BCAL | | Return the red/blue calibration offset. |
| RCAL/BCAL | decimal number | Set the red/blue calibration value. |
| RCALDATE/BCALDATE | | Return the red/blue calibration date. |
| RCALDATE/BCALDATE | 32-bit positive integer | Set the red/blue calibration date as a Unix time_t value. |
| SERIAL | | Return the unique serial number generated for this board instance. |
| ALARM | ON/OFF/FLASH | Change the state of the red alarm LED. |
| DISPLAYS | RED/BLUE/BOTH/NONE | Change which of the 7-segment LED displays to show. Temperature readings are unaffected. |

The serial number returned by the SERIAL command is generated when you first run the scons command and is stored in the serialnumber.txt file. The purpose of this number is to facilitate multiple boards of the same type being used in the same PC. The serial number differentiates them.

I don’t know whether I’ll ever do that but the facility is there if I do. All I need to remember to do is regenerate the serialnumber.txt file when I program a second board of the same type.
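Because every affirmative response is a single line of valid JSON, consuming it on the host is a one-liner with the standard library. Here's a minimal parsing sketch for a READINGS reply (the function name is mine; the serial I/O itself, 9600 baud over the virtual COM port, is left to whichever library you prefer, e.g. pyserial):

```python
import json

def parse_readings(line: str) -> dict:
    """Parse a single-line READINGS JSON reply into {channel: (temperature, ok)}.

    In this firmware a 'code' of "1" indicates a valid reading; anything
    else is an error code from the LTC2986 fault table.
    """
    doc = json.loads(line)
    return {
        channel: (float(fields["value"]), fields["code"] == "1")
        for channel, fields in doc.items()
    }

# Example response as produced by the firmware:
reply = '{"red":{"value":"21.597851","code":"1"},"blue":{"value":"21.554688","code":"1"}}'
print(parse_readings(reply))
```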

Here’s an example interaction with the board using the sendcommand utility that you can find in the bin subdirectory.

$ ./sendcommand /dev/Andy0 READINGS
{"red":{"value":"21.597851","code":"1"},"blue":{"value":"21.554688","code":"1"}}

If the readings are valid then code will be 1. Any other value indicates an error. The error code can be decoded by referring to Table 35. RTD Fault Reporting in the LTC2986 datasheet.

Where the datasheet gives a bit position, shift right by 24: D24 in the table corresponds to bit 0 of the returned code.
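A small helper makes that decoding less error-prone. The bit names below are my paraphrase of the LTC2986 RTD fault table after the shift by 24, so verify them against Table 35 in the datasheet before relying on them:

```python
# Candidate fault-bit names after shifting the datasheet's D24..D31 down to
# bits 0..7. Paraphrased from the LTC2986 datasheet; verify against Table 35.
FAULT_BITS = {
    0: "valid reading",
    1: "ADC out of range (soft)",
    2: "sensor under range (soft)",
    3: "sensor over range (soft)",
    4: "CJ soft fault",
    5: "CJ hard fault",
    6: "hard ADC out of range",
    7: "sensor hard fault",
}

def decode_rtd_code(code: int) -> list:
    """Return the names of all bits set in a READINGS 'code' value."""
    return [name for bit, name in FAULT_BITS.items() if code & (1 << bit)]

print(decode_rtd_code(1))  # a healthy reading: ['valid reading']
```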

A good quality probe

Now I’m happy with my board it’s time to stop fooling around with random ebay probes of suspect quality and get hold of a decent one. Here’s the one that I bought.

It’s from a company called Thermosense in the UK. It’s stainless steel, 6mm in diameter and 250mm long with a 2 metre lead. I had to crimp on my own loop connectors at the end because it came with a bare-wire termination.

I knew this one was going to be good because it’s used by my professional brewer relative in his automated setup and best of all it was only £22 delivered. I connected it up and sure enough without any calibration at all it was just a tenth or two off my Therma-1T, near enough for me.

I only need one probe at the moment but what I plan to do is run the cheap ebay stainless one, calibrated, side-by-side with the good one from Thermosense. I’ll watch over time to see if the ebay probe responds throughout the range the same as the Thermosense probe, and if it does then there’s no reason not to use the ebay probes if you can calibrate them yourself.

Video

I made a YouTube video showing the board in operation. You can watch it here using the embedded video but much better quality can be had by visiting YouTube and watching it there.

Build your own

If you’d like to build your own board then all the gerbers and firmware are freely available.

Get the gerbers here.

Get the firmware from github here.

Blank boards for sale

I’ve got some spare boards remaining from the batch of ten in my original order. If you’d prefer to buy one rather than have your own set manufactured then you can use the PayPal form below to make an order.


Location




Next time…

Another successful project comes to a conclusion and I’m one big step forward in my goal of producing an automated, PC-based process controller. In fact, except for some physical PC case modifications all the hardware work is done.


The sensors and the switching boards

The next thing I need to do is move ‘up the stack’ and create the PC controller software that interacts with the hardware. That’s going to be a spring-boot java application distributed in a docker image for easy installation. More on that one in the next article in this series, coming to this blog soon!

If you’d like to leave a comment then you can do so down below in the comments section or if you’d like to add to the discussion over in the forum then please also feel free.

Process automation: another RTD sensor board


In a previous article I described the design and build of a temperature sensor board based around a high precision LTC2986 part from Linear Technology. The project was successful so you may be wondering why I’m bothering to design another board when the LTC2986 probably cannot be bettered by any other fully integrated part on the market.

Well, I have no clear answer except that with a pile of left over parts from the LTC2986 board BOM and seeing that the Maxim MAX31865 RTD-to-digital converter is quite cheap compared to the LTC2986 then why not? I could always justify it to myself by calling it a backup unit in case something goes awry with the board I’ve already built.

So without further ado and before I talk myself out of it, let’s get on with the design.

The MAX31865

The MAX31865 is a single-sensor, fully integrated resistance-to-digital converter requiring very few external parts to operate. This is the block diagram taken from the datasheet.

Once the usual supply decoupling capacitors are accounted for the only other part that’s required is a precision reference resistor which, for PT100 sensors is recommended to be 400Ω. More good news for hobbyists is that the communication interface is SPI and the final pin count is low enough for Maxim to be able to offer it in an easy-to-handle SSOP package.
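The reference resistor matters because, per the datasheet, the MAX31865's 15-bit ADC code is simply the ratio of the RTD resistance to the reference: R_RTD = code × R_REF / 2¹⁵. A quick sketch of that conversion:

```python
R_REF = 400.0  # reference resistor value recommended for PT100 probes

def rtd_resistance(adc_code: int, r_ref: float = R_REF) -> float:
    """Convert a MAX31865 15-bit RTD ADC code to resistance in ohms."""
    return (adc_code * r_ref) / 32768.0

# A code of 8192 (one quarter of full scale) corresponds to 100 ohms,
# i.e. a PT100 element sitting at exactly 0 degC:
print(rtd_resistance(8192))  # 100.0
```

With a 400Ω reference the full-scale code therefore tops out at 400Ω, comfortably covering the whole PT100 range.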

Now I’ll move on to building a schematic around this part.

Schematic




Click on the thumbnail for a much larger version

In order to make this design operationally as close as possible to my LTC2986 design I will include two MAX31865 ICs on this board (the LTC2986 can control two three-wire sensors on a single chip). Let’s take a look at the details.

Power supplies

It all starts with the USB input from the computer. I add an ESD diode to the 5V line and then filter it through a combination of capacitors and a ferrite bead. I wrote an article on this approach to filtering the USB supply a while ago, click here to read it.

The MAX31865 is a 3V3 part with separate power inputs for the digital and analogue parts. A cheap design could just tie these two inputs together and live with the possibility of digital switching noise interfering with the ADC but it doesn’t cost much to do this properly and I have a few Texas Instruments LP5907 ultra-low noise regulators in stock so I’ve used a pair of them here, one each for the digital and analogue supplies.

The sensor controller

Nothing radical here, the MAX31865 doesn’t need much in the way of supporting parts. The 400Ω reference resistor is a 0.1% part and the MAX31865 datasheet tells you how to wire up the various RTDIN and FORCE pins for a three-wire probe.

I’ve attached optional ESD protection diodes on the RTD pins as these are circuits that are likely to be touched by humans with their annoying tendency to harbour a static charge. I say optional because all the IC pins come with integrated ±2kV ESD protection anyway and these ESD diodes tend to come in horrible tiny packages that disintegrate at the slightest provocation.

The MCU

It’s the venerable ATmega328p, simply because that’s the one I used on the LTC2986 design, I have a few left and much of the firmware will be reusable saving me considerable time and effort. I’ll run the MCU at 8MHz which is plenty fast enough for this design.

The onboard 7-segment LEDs

The trusty MAX7221 makes another appearance here. It’s so easy to control and requires only one external resistor to set the LED current. The only annoyance with it is that it’s a 5V device and the logic inputs have a VIH minimum level of 3.5V. That means I need a level translator to hook it up to the MCU.

Step-up translators are not as common as step-down, but the Texas Instruments TXB0104 does the job for up to four signals and is easily integrated between the MCU and the MAX7221.

The USB-to-serial interface

The usual Microchip MCP2221 that I’ve been using in the last couple of designs makes an appearance here. It’s a plug-and-play chip that’s trivial to use and comes in a nice convenient DIP package. The rumour on the internet is that this is actually a hardwired PIC.

The ISP interface

I program these ATmega devices using the popular USBASP programmer that you can get on ebay for just a few pounds and I’ve got one that has a jumper on it for selecting 3.3V or 5V.

For years I’d just assumed that this jumper switched the supply to the onboard ATmega8A so that the whole system would run at 5V or 3.3V.

It doesn’t.

It only switches ISP output pin 2 (Vcc). All the SPI pins and the RESET pin remain at 5V whatever the jumper is set to and for 3.3V circuits where multiple devices are connected to the SPI bus this may cause a problem because the other devices may not be tolerant to the 5V levels that they’re going to be hit with during MCU programming.

To be safe I’ve opted to use cheap zener diodes and resistors to cap the levels on the ISP bus to near-enough 3.3V. This works for the slow speed of the programming bus but wouldn’t work at high speeds so it’s not a cookie-cutter approach that you can take and apply everywhere. I’ve also disconnected pin 2 because the board will be powered from the USB bus and not from the USBASP programmer.

Bill of materials

Here’s a complete bill of materials for this design. Where possible I’ve included a sample Farnell order code to make it easy to search for parts. In my case I actually put the parts together for this BOM from Digikey UK because they have the MAX31865 in SSOP format and the 400Ω precision resistor.

| Designator | Value | Quantity | Description | Footprint | Farnell code | Note |
|---|---|---|---|---|---|---|
| C1, C2, C3, C4, C10, C14 | 100n | 6 | Ceramic capacitor | 2.54mm | 2309020 | |
| C5 | | 1 | Ceramic capacitor | 5.08mm | 2112910 | [1] |
| C6, C11 | 10µ | 2 | Electrolytic capacitor | 5x11mm | 1902913 | |
| C7, C8, C17, C18 | | 4 | Ceramic capacitor | 0603 | 9227776 | |
| C9 | 10n | 1 | Capacitor | 2.54mm | 2309024 | |
| C12, C13 | 47p | 2 | Capacitor | 2.54mm | 2395776 | |
| C15, C16, C19, C20, C21, C22, C23, C24 | 100n | 8 | Capacitor | 0603 | 1759037 | |
| D1, D2, D3 | Amber | 3 | LED | 3mm | | [2] |
| D4 | Red | 1 | LED | 3mm | | [2] |
| D5, D11, D12, D13, D14, D15, D16 | D5V0P1B2LP-7B | 7 | Bi-directional TVS diode | 0402 | | [6] |
| D6, D7, D8, D9 | BZX79-C3V3 | 4 | Zener diode | AXIAL-0.3 | 1097229 | |
| D10 | Green | 1 | LED | 3mm | | [2] |
| FB1 | BLM18PG221SN1D | 1 | Ferrite bead | AXIAL-0.3 | 2292304 | |
| P1 | USB B | 1 | USB B receptacle | | 1177885 | |
| P2 | 2x5 header | 1 | ISP connector | 2.54mm | | [3] |
| R1, R3, R16, R17, R18 | 10k | 5 | Resistor | AXIAL-0.3 | 2329609 | |
| R2, R12 | 68k | 2 | Resistor | AXIAL-0.3 | 2329546 | |
| R4, R5, R6, R8, R9, R10, R11 | 330 | 7 | Resistor | AXIAL-0.3 | 2329514 | |
| R7 | 390 | 1 | Resistor | AXIAL-0.3 | 2329519 | |
| R13 | 470 | 1 | Resistor | AXIAL-0.3 | 2329531 | |
| R14, R15 | 400 0.1% | 2 | Resistor | 0603 | | |
| U1 | ATMega328P | 1 | 8-bit AVR microcontroller | DIP-28 | 1715487 | |
| U2 | MCP2221-I/P | 1 | Microchip USB-serial | DIP-14 | 2434892 | |
| U3, U7 | LP5907-3.3 | 2 | TI voltage regulator | SOT23-5AM | 2492304 | |
| U4 | MAX7221CNG | 1 | LED display driver | DIP-24 | | [4] |
| U5 | TXB0104PWR | 1 | Level converter | TSSOP14 | 1607891 | |
| U6 | | 1 | Red 3-digit 7-segment 0.36" LED | custom | | [5] |
| U8 | | 1 | Blue 3-digit 7-segment 0.36" LED | custom | | [5] |
| U9, U10 | MAX31865AAP+ | 2 | Maxim resistance-to-digital converter | SSOP20 | | [7] |
| Y1 | 8MHz 30pF | 1 | Crystal oscillator | HC49 thru hole | 2063945 | |
  1. 2.54mm parts can also be used if you carefully bend the leads outwards to fit the wider 5.08mm pitch.
  2. Any colour of 3mm LED will work and they’re cheapest on ebay.
  3. These 2.54mm headers are cheapest on ebay.
  4. The MAX7221 seems to be cheapest on Ali Express.
  5. Make sure you get the 0.36″ common-cathode variety. The red ones are easy enough to find but the blue ones are more elusive. I got mine from Ali Express. Search for item #32789229519.
  6. The TVS diodes are optional and the design will work safely without them due to the built-in ESD protection on the MAX31865. These 0402 parts are hard to work with due to the fragile package and the small pads that are completely underneath the package body.
  7. The SSOP package is available from Digikey.

PCB layout

This layout is totally copied from the previous LTC2986 design. The digital side is more or less identical and required only a few changes. The major change is, of course, on the analogue side where the MAX31865s are located and the good news for me is that it’s considerably simpler even though there are two ICs instead of one.

The mounting holes are of course in the same place as before as this 10x10cm PCB is designed to mount on to a 3.5″ hard disk bay.

With the design laid out I previewed it in 3D to make sure that there were no silly errors such as silkscreen overlapping pads, components too close to the edge or to each other and other such gotchas. It all looked good so I sent it off to be manufactured. The Gerbers for this project are freely available if you’d like to get your own copies printed at one of the cheap fabrication houses.

The manufactured PCBs

This time, and for no particular reason, I used Seeed Studio for the manufacturing as these 2-layer boards don’t have anything on them that taxes the manufacturing tolerances. It cost about US$5 for 10 copies before shipping. Crazy prices and I’m looking forward to the day that they start discounting 4-layer boards.

I sent off my order and waited. A healthy dose of patience is a requirement when using China Post for shipping. They always quote 2 or 3 weeks for shipping and I’ve heard anecdotally that the way this works is that there is a shipping container at the major Chinese ports destined for each foreign port. Over time it steadily fills up and when it’s full off it goes on the next ship. If you’re dead lucky yours will be the last parcel on board and your parcel will arrive in a week. At the other end of the scale if your parcel is first into an empty container then you may be waiting for some time. About two weeks later my boards turned up which is neither fast nor slow, just average.


They’re all looking good, which is entirely unsurprising because this is not a difficult design to manufacture. It’s time to get assembling.

Assembly

Assembly is a two stage process because this board has both SMD and through-hole parts. The first step for me is to tin the SMD pads, apply a tacky flux and then use it to hold the parts in place on top of the little solder bumps. Then I take the board and reflow it in my android-controlled reflow oven.

After the reflow I place the board under my microscope for inspection and touch up any parts that look like they didn’t reflow properly. On this board everything that reflowed did so correctly; the only issue was that two of the 0603 capacitors got blown across the board by the fan in the reflow oven before they could reflow. These parts were easily put back in their place with my hot air gun.

After washing the board to get rid of flux residue I sat down and soldered in all the through-hole parts with my soldering iron. I opted to use sockets for my ICs as I always do because removing a through-hole IC that you suspect to be damaged is no fun whatsoever if it’s been soldered directly to the board.

There it is, ready for testing, but first just for fun let’s see it next to the LTC2986 design.

Now you can see how physically similar these boards are. It’ll be easy for me to switch them in and out as needed.

The firmware

Writing the firmware took less than two days of effort because I could lift-and-shift the LTC2986 code almost in its entirety. All I had to do was take out all the LT interface code and replace it with an equivalent that controlled the MAX31865.

You can see where some of the cost savings have been achieved with the MAX31865 vs. the LTC2986 when you come to write the interface code. The LTC2986 is hugely configurable and also directly outputs a temperature in Celsius which hints at quite a powerful core inside.

By contrast the MAX31865 has very limited configurability and outputs raw data from the ADC so you need to do the conversion to Celsius yourself. I decided to use the tried-and-trusted open source conversion implementation from Adafruit. You can see it here on Github.

The Adafruit implementation is targeted at the Arduino so I had to make a few tweaks to get it to work in this standalone firmware but nothing too serious and it was up and running in a matter of hours. My firmware is available on Github and you can view it here.
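For the above-zero part of the PT100 range that conversion boils down to solving the Callendar-Van Dusen equation R(T) = R0(1 + AT + BT²) for T. Here's a sketch using the standard IEC 60751 coefficients; note the Adafruit code additionally handles the below-0°C region with a polynomial fit, which this sketch doesn't attempt:

```python
import math

# Standard IEC 60751 Callendar-Van Dusen coefficients for T >= 0 degC
CVD_A = 3.9083e-3
CVD_B = -5.775e-7
R0 = 100.0  # PT100 element resistance at 0 degC

def pt100_temperature(resistance: float) -> float:
    """Solve R = R0*(1 + A*T + B*T^2) for T (valid for T >= 0 degC)."""
    disc = CVD_A * CVD_A - 4.0 * CVD_B * (1.0 - resistance / R0)
    return (-CVD_A + math.sqrt(disc)) / (2.0 * CVD_B)

# The 113.427 ohm ebay-probe reading from the earlier article:
print(round(pt100_temperature(113.427), 1))  # roughly 34.5 degC
```

The quadratic solution differs from the simple linear approximation by a few tenths of a degree even at these modest temperatures, which is exactly the sort of error the full conversion exists to remove.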

Building the firmware

As usual I use the scons system to build my firmware. It’s as simple as this:

$ scons
scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
avr-g++ -o AlarmFlasher.o -c -mmcu=atmega328p -Os -g -DF_CPU=8000000 -DBOARD_SERIAL=1027957644 -std=c++1y -Wall -Werror -Wextra -pedantic-errors -fno-rtti -mcall-prologues -ffunction-sections -fdata-sections -fno-exceptions AlarmFlasher.cpp
avr-g++ -o Max7221.o -c -mmcu=atmega328p -Os -g -DF_CPU=8000000 -DBOARD_SERIAL=1027957644 -std=c++1y -Wall -Werror -Wextra -pedantic-errors -fno-rtti -mcall-prologues -ffunction-sections -fdata-sections -fno-exceptions Max7221.cpp
avr-g++ -o MillisecondTimer.o -c -mmcu=atmega328p -Os -g -DF_CPU=8000000 -DBOARD_SERIAL=1027957644 -std=c++1y -Wall -Werror -Wextra -pedantic-errors -fno-rtti -mcall-prologues -ffunction-sections -fdata-sections -fno-exceptions MillisecondTimer.cpp
avr-g++ -o ProgStrings.o -c -mmcu=atmega328p -Os -g -DF_CPU=8000000 -DBOARD_SERIAL=1027957644 -std=c++1y -Wall -Werror -Wextra -pedantic-errors -fno-rtti -mcall-prologues -ffunction-sections -fdata-sections -fno-exceptions ProgStrings.cpp
avr-g++ -o Program.o -c -mmcu=atmega328p -Os -g -DF_CPU=8000000 -DBOARD_SERIAL=1027957644 -std=c++1y -Wall -Werror -Wextra -pedantic-errors -fno-rtti -mcall-prologues -ffunction-sections -fdata-sections -fno-exceptions Program.cpp
avr-g++ -o Uart.o -c -mmcu=atmega328p -Os -g -DF_CPU=8000000 -DBOARD_SERIAL=1027957644 -std=c++1y -Wall -Werror -Wextra -pedantic-errors -fno-rtti -mcall-prologues -ffunction-sections -fdata-sections -fno-exceptions Uart.cpp
avr-g++ -o brewery-max31865-rtd-v1.elf -Wl,-Map,brewery-max31865-rtd-v1.map -mrelax -Wl,-u,vfprintf -lprintf_flt -lm -Wl,--gc-sections -mmcu=atmega328p AlarmFlasher.o Max7221.o MillisecondTimer.o ProgStrings.o Program.o Uart.o
avr-objcopy -j .text -j .data -O ihex brewery-max31865-rtd-v1.elf brewery-max31865-rtd-v1.hex
Install file: "brewery-max31865-rtd-v1.hex" as "bin/brewery-max31865-rtd-v1.hex"
avr-objdump -S brewery-max31865-rtd-v1.elf > brewery-max31865-rtd-v1.lst
avr-size brewery-max31865-rtd-v1.elf | tee brewery-max31865-rtd-v1.siz
   text    data     bss     dec     hex filename
   9958     230     126   10314    284a brewery-max31865-rtd-v1.elf
scons: done building targets.

I’m currently using avr-gcc 7.2.0 but I’ve also tested it works with versions as early as 4.9.2. The newer versions do seem to be a bit more efficient on code generation, not that it matters when I’m only using 10KB of the available 32KB on this MCU.

When I’ve got the USBASP connected I can use scons upload to build and upload in one step, and when I first connect up a fresh ATmega328p I must use scons fuse to set the fuses to run the board at 8MHz using an external crystal.

Power supply noise rejection

The MAX31865 has built-in noise rejection filters tuned to a selectable 50 or 60Hz, the idea being that you choose one based on the mains frequency of your location. Since this is a set-and-forget option I chose to compile it in to the firmware instead of having it selectable as a command.

Here in the United Kingdom our mains frequency is 50Hz so if you’re building this firmware and you live in a country that has a 60Hz mains supply then you will need to edit Max31865.h and change the FILT50HZ value shown below to FILT60HZ.

template<class TGpioCS, class TGpioDRDY>
inline void Max31865<TGpioCS, TGpioDRDY>::setup() {

  writeByte(Register::CONFIG, ConfigBits::WIRE3 | ConfigBits::FILT50HZ);
  clearFault();
}

Testing the firmware

To test the firmware I need a serial terminal emulator. When on Windows I use the free Realterm program and when on Linux I’ll use the miniterm utility that comes with the PySerial Python package. I find Linux more convenient for working with serial peripherals and command-line based code in general so I fired up one of my Ubuntu Server virtual machines and got started. If you’re still working in a world where you only have a single operating system on your computer then I seriously recommend you try installing a VM or two. It’s liberating.

I’ve used the Linux udev system to create an alias to the /dev/ttyACM0 USB-to-serial device so that non-root users can access it. You can read a bit more about how I set that up in this article.

$ python -m serial.tools.miniterm /dev/Andy0
--- Miniterm on /dev/Andy0  9600,8,N,1 ---
--- Quit: Ctrl+] | Menu: Ctrl+T | Help: Ctrl+T followed by Ctrl+H ---
32565:"Andy's Workshop Brewery MAX31865 RTD temperature sensors"

That’s the ID command that shows it’s alive and well. The integer prefix to the command response is a CRC16 checksum of all the characters following the colon and not including the CRLF at the end. Now I know that the board is healthy I’ll switch to using the custom sendcommand utility that you can find with the firmware on Github.


Installed into a hard drive caddy and powered up with two probes

I used two probes for testing. One was bought from Thermosense in the UK and the other is a no-name cheapo probe from ebay. As you can see from the readings, they perform very differently.

$ ./sendcommand /dev/Andy0 READINGS
50156:{"red":{"value":"20.017590","code":"0"},"blue":{"value":"24.796766","code":"0"}}

The Thermosense probe is showing an accurate reading of around 20°C and the ebay probe is way off the mark. If the ebay probe is consistently off by a constant value then that’s something I can correct for using my firmware’s RCAL and BCAL commands. Here’s how I corrected the ebay probe:

$ ./sendcommand /dev/Andy0 "BCAL -4.79"
18061:"OK"

Now if I re-run the READINGS command I get a much better result.

$ ./sendcommand /dev/Andy0 READINGS
18088:{"red":{"value":"20.143177","code":"0"},"blue":{"value":"20.132557","code":"0"}}

The caveat to this method of correction is that it only works when the offset from the true value is constant. If the error is non-linear then a full and time-consuming characterisation of the probe’s response would be required, and for a cheap ebay probe that’s just not worth the effort.
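Applying the correction is trivial: the stored calibration value is simply added to every raw reading. Here’s a minimal sketch of that idea (the function name is mine, not taken from the firmware):

```cpp
// A constant-offset calibration: the value set with RCAL/BCAL is added
// to every raw reading. Illustrative only; not the firmware's own code.
double applyCalibration(double rawCelsius, double offsetCelsius) {
  return rawCelsius + offsetCelsius;
}
```

Applying the -4.79 offset to the raw 24.796766 reading from the uncalibrated probe gives roughly 20.007°C, in line with the reference probe.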

For completeness, here’s a list of all the serial commands accepted by this firmware.

Command             Parameters                Description
-------             ----------                -----------
ID                  (none)                    Return the board identifier string.
CAPS                (none)                    Return the board capabilities.
VER                 (none)                    Return the version numbers.
COPY                (none)                    Return a copyright statement.
UPTIME              (none)                    Return the uptime in milliseconds.
READINGS            (none)                    Return the last temperature readings.
RCAL/BCAL           (none)                    Return the red/blue calibration offset.
RCAL/BCAL           decimal number            Set the red/blue calibration value.
RCALDATE/BCALDATE   (none)                    Return the red/blue calibration date.
RCALDATE/BCALDATE   32-bit positive integer   Set the red/blue calibration date as a Unix time_t value.
SERIAL              (none)                    Return the unique serial number generated for this board instance.
ALARM               ON/OFF/FLASH              Change the state of the red alarm LED.
DISPLAYS            RED/BLUE/BOTH/NONE        Change which of the 7-segment LED displays to show. Temperature readings are unaffected.

Limitations

The MAX31865 is known to be prone to self-heating in certain modes of operation. The issue is a side-effect of the way in which RTDs are sensed. To measure the resistance of the probe a small excitation current is transmitted through the probe tip and the voltage drop is measured. In the MAX31865 implementation they also pass the excitation current through the 400Ω reference resistor and measure the drop across that.

The problem is that passing a current through any resistor causes heating, which is obviously a very bad thing for a temperature sensor; this is called self-heating. To get around it, devices such as the LTC2986 use an extremely small current; in my LTC2986 design I selected 500µA from the configurable list. The MAX31865 generates a much higher current of around 4mA for a PT100, so we must take care to keep the excitation current switched off between measurements and to limit the measurement frequency. I poll the sensor at 1Hz.
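To put numbers on the difference, the power dissipated in the sensing element is I²R. A quick sketch of that arithmetic, assuming a PT100 at 0°C (100Ω):

```cpp
// Power dissipated in the RTD element by the excitation current: P = I^2 * R.
double selfHeatingPower(double currentAmps, double resistanceOhms) {
  return currentAmps * currentAmps * resistanceOhms;
}
```

At 4mA the PT100 dissipates 1.6mW; at 500µA it’s only 25µW, a factor of 64 less, which is why keeping the MAX31865’s higher current switched off between readings matters.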

Video

I put together a short YouTube video showing this board in operation. You can view it using the preview link below.

Spare blank boards for sale

I’ve got a few spare boards left over from the build that I’ll sell off for approximately what they cost me. I’d feel guilty though if I didn’t remind you that you can download the Gerber files yourself from my site and get 10 copies for less than US$5 plus delivery from China.


Final words

I didn’t really need to build this; but then I don’t need to build anything. This was a case of an itch that just needed to be scratched. I needed to know if the MAX31865 was any good and so I had to build this to find out. The answer I found is that it’s not bad at all. It’s cheap and it works as advertised. It’s certainly no LTC2986 but it’s a decent alternative if your needs are not at the high-end.

Please feel free to leave a comment down below or if you’d like to start a conversation then do head on over to the forum.

Resources

Firmware on Github
Gerbers for the blank PCB

Directly driving a 7-segment LED display with the STM32


Seven segment LEDs are an extremely cost effective way to add a large, bright and very readable numeric display to your project.

Displays similar to the one pictured above can be had for as little as 50 cents each on ebay in the common heights of 0.56″, 0.36″ and 0.28″. You can choose anywhere between one and four digits in the same package. They’re referred to as seven-segment displays but really there are eight segments, because each digit also has a little decimal point down at the bottom right.

Configuration

The multiple digit packages utilise a wiring configuration designed to minimise the number of pins required to drive it without having to embed any logic at all within the package.

If you count the number of segments on, for example, a three digit display you’d quickly realise that a simple configuration that exposed each LED on its own dedicated pin would require (8 * 3) + 1 = 25 pins on the package, of which you would need to attach 24 to your MCU to drive it. That’s far too many and is the reason why they come in common cathode or common anode configurations.

Common cathode configuration

Let’s look at common cathode first.

In this configuration there are dedicated power pins for each of the 7 segments but the same segment on each digit are all connected together. On the other side of the LED you can see that all eight cathodes for a digit are tied together and presented at a single pin.

If you take a moment to digest this you can see how we can light up a segment of our choosing on any digit. For example, to light up segment A on digit two we would apply a current to pin 11 while grounding pin 9. Pins 8 and 12 must be disconnected or otherwise prevented from allowing current flow.

To light segment A on digit 1 we would disconnect pin 9 and ground pin 12, and finally for digit 3 we would disconnect pin 12 and ground pin 8.

Multiplexing

Now you should be getting an idea of how these displays are intended to be driven. Let’s look at a fully worked example of how we would display the number “123”.

Firstly we would light segments B, C on digit 1 by enabling current flow through pins 7, 4 and 12. Then we would light segments A, B, D, E, G on digit 2 by enabling pins 11, 7, 2, 1, 5 and 9. Finally we would light segments A, B, C, D, G on digit 3 by enabling only pins 11, 7, 4, 2, 5 and 8.

If we repeat the above actions fast enough then the human eye will perceive all three digits to be constantly lit even though we are switching them on and off very quickly.
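The scanning logic can be modelled in a few lines. This is an illustrative sketch of the idea, not driver code: on each tick we select the next digit in rotation and output only that digit’s segment pattern.

```cpp
#include <cstdint>

// Illustrative sketch only (not the driver firmware): given the segment
// patterns for a three-digit display, return the pattern that should be
// driven on this timer tick. The digit whose common cathode is grounded
// on this tick is tick % 3.
uint8_t activeSegments(uint8_t digit1, uint8_t digit2, uint8_t digit3,
                       unsigned tick) {
  switch(tick % 3) {
    case 0:  return digit1;   // ground digit 1's cathode, drive its segments
    case 1:  return digit2;
    default: return digit3;
  }
}
```

Run fast enough the rotation is invisible; the firmware later in this article uses 180 ticks per second across three digits.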

Common anode configuration

This article is going to focus on the common cathode type of display but for completeness I’ll show you the other configuration, just so you know that two incompatible types are available.

In the common anode configuration we again have separate pins for each segment and again all equal segments on all digits are wired together but this time the cathode ends of the segment LEDs are individually exposed and it’s the anodes that are all connected together on each digit.

The multiplexed driving technique is exactly the same for common anode displays, but that doesn’t mean the two types are interchangeable: if a design calls for common cathode then you can’t substitute common anode without changing the design.

Driving with an MCU

There are a few options available if you have an MCU and you want to drive one of these displays. If you have an MCU with a limited number of IO pins, such as an Arduino Uno then your best option is to use a dedicated driver IC that will do the work for you.

The Maxim MAX7221 will drive common cathode displays of up to a whopping eight digits while requiring just a three wire SPI interface to the host MCU. Using just one of these ICs you could have two of the biggest four digit displays in your project at a cost of just three MCU pins. I’ve used this IC many times before in projects that you can read about on this site. The main drawback of this IC is that it requires a 5V supply and 5V levels at the SPI pins. This is no problem for the Arduino Uno but it means it can’t be used with an STM32 without a level shifter.

If you’re using an MCU with a large number of GPIOs, such as most of the STM32 packages, then you have the option of driving these displays directly for the cost of just eight resistors and three n-channel MOSFETs, and that’s the method that we’re going to explore here today.

Direct drive circuit

Here’s the circuit diagram that I use to drive a three digit display that has blue LEDs. It’s a snapshot from a much larger circuit that I’m working on.

The choice of resistor is important because it limits the amount of current flow and sets the overall brightness of the display. I’ll be using the STM32 F0 discovery board that hosts an STM32F051 MCU to implement this circuit.

The first thing that I need to do is find the MCU datasheet and determine the maximum current that the device can source and sink.

Those limits make reference to another table earlier in the datasheet that tells us the total current source and sink for all pins.

So we have a per-pin absolute limit of 20mA and an overall device limit of 120mA. To avoid heat buildup and allow the device to actually do other work as well we will stay far away from those limits.

Are there any other limits? Yes there are. It pays to read the entire datasheet because hidden away in a footnote there is a very important limitation regarding GPIOs PC13 to PC15.

We will not use these pins.

Resistor calculation

To calculate the resistor values we need to know the forward voltage of the LEDs in the display. This is easily tested by using your multimeter in its diode testing mode.

The meter shows a forward voltage of about 2.6V which is average for a blue LED. Now I’ll take a wild guess that because modern LEDs are very bright at low currents then 2mA will be sufficient current to get a nice, readable brightness. To match the STM32 F0 Discovery board I’ll test this with a 3.0V supply. That means a resistor of (3.0 – 2.6) / 0.002 = 200Ω is required.
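That Ohm’s law step, expressed as a helper function (a sketch; the name is my own):

```cpp
// Series resistor for an LED: R = (Vsupply - Vforward) / Iled.
double ledResistor(double vSupply, double vForward, double currentAmps) {
  return (vSupply - vForward) / currentAmps;
}
```

ledResistor(3.0, 2.6, 0.002) gives the 200Ω worked out above.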

LEDs don’t photograph well so please take my word for it that this is nice and bright. Can the STM32 handle it? 2mA falls well below the per-pin limit and the worst case scenario is going to be all eight segments lit at the same time giving a total current source of 8 * 2 = 16mA. No problem at all. The package shouldn’t even get warm.

The problem with using the 200Ω resistor that we calculated is that each digit is only lit for a third of the time, which will make it appear a third as bright as we intended. Therefore we need to lower the resistor value by a factor of three, giving a nearest standard value of 68Ω.

This will raise the peak current seen by the LED to 6mA but the average current will still be 2mA. In the worst-case scenario where your MCU hangs or crashes while driving all eight segments of a digit then it will be sourcing 8 * 6mA = 48mA. This is still within safe levels and will not burn up the package.

This figure of 48mA is the reason for each digit pin being switched on or off using a MOSFET. If we were to directly connect these pins to the MCU then we would be in danger of sinking 48mA into a single pin which would probably permanently damage it.

The resistors, MOSFETs and jumper wires are all in place and we are ready to develop the firmware. My project circuit specifies the Vishay SI2374DS MOSFET, which is a surface-mount device; for this test I am using the through-hole BS170 instead. The exact choice of n-channel MOSFET is not critical, but for efficiency’s sake you should choose one with a low on-state drain-to-source resistance. Less than 1Ω is easily found.

Firmware

I chose to implement the firmware as an example project within my stm32plus library. The concepts are simple so you should have no issues porting it to whatever framework suits your project. The firmware is implemented in a single file that you can view here on Github.

The design works by using Timer 1 to generate interrupts at a frequency of 180Hz. Each time the interrupt fires we turn off the digit that we were last displaying and set the GPIOs necessary to light the next digit. Each digit is therefore refreshed at 180/3 = 60Hz, a figure I selected to match the refresh rate most commonly used by PC monitors. This gives a display that appears stable to the human eye.
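The numbers are easy to sanity-check: a centre-aligned timer raises one Update interrupt per journey between zero and the reload value, so the interrupt rate is simply the tick frequency divided by the reload value. A sketch:

```cpp
// Update-interrupt rate of a centre-aligned timer: one event is raised
// per journey between zero and the reload value.
double updateInterruptHz(double tickHz, unsigned reloadValue) {
  return tickHz / reloadValue;
}
```

With the 80kHz tick and reload value of 444 used below this gives about 180.2Hz, or roughly 60Hz per digit on a three-digit display.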

Here’s a breakdown of the important parts of the firmware.

static const uint8_t AsciiTable[]= {
  0,  // SPACE
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,  // skip
  0b11111100,  // 0
  0b01100000,  // 1
  0b11011010,  // 2
  0b11110010,  // 3
  0b01100110,  // 4
  0b10110110,  // 5
  0b10111110,  // 6
  0b11100000,  // 7
  0b11111110,  // 8
  0b11110110   // 9
};

We want to allow the controller to display ASCII text strings so we need a table to convert ASCII to a bitmap of which segments should light up for that character. Printable ASCII starts at 32 (space) so we start our table there.

Each entry in the table is a single byte with one bit per lit-up segment in the format ABCDEFG0. Unused ASCII codes are set to zero. In this example I only need the digits 0-9 so that’s all there is in there. You can easily see how to extend this.
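Indexing the table is then a one-liner: subtract the space character from the ASCII code. A self-contained sketch using the same table (the function name is mine):

```cpp
#include <cstdint>

// The lookup table from above: one ABCDEFG0 segment bitmap per printable
// character, starting at ASCII 32 (space).
static const uint8_t SegmentTable[] = {
  0,                                   // SPACE
  0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,      // '!' through '/' unused
  0b11111100, 0b01100000, 0b11011010,  // 0 1 2
  0b11110010, 0b01100110, 0b10110110,  // 3 4 5
  0b10111110, 0b11100000, 0b11111110,  // 6 7 8
  0b11110110                           // 9
};

// Translate a character to its segment bitmap by subtracting the space
// character to form the table index.
uint8_t segmentsFor(char c) {
  return SegmentTable[c - ' '];
}
```

For example, segmentsFor('2') returns 0b11011010, the '2' entry in the table above.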

enum {
  SEGA = 0,   // PA0
  SEGB = 3,   // PA3
  SEGC = 8,   // PB8
  SEGD = 4,   // PB4
  SEGE = 3,   // PB3
  SEGF = 1,   // PA1
  SEGG = 2,   // PA2
  SEGP = 5,   // PB5
  DIG1 = 9,   // PB9
  DIG2 = 2,   // PB2
  DIG3 = 10   // PB10
};

The pins used for each GPIO are stored in an enum for easy reference. The seemingly random assignment matches a project I’m currently working on and also shows that the pin placement is completely flexible.

GpioA<DefaultDigitalOutputFeature<SEGA,SEGB,SEGF,SEGG>> pa;
GpioB<DefaultDigitalOutputFeature<DIG1,DIG2,DIG3,SEGC,SEGD,SEGE,SEGP>> pb;

All pins are initialised as outputs. To light a segment I will set the segment output and the corresponding digit MOSFET gate output HIGH. Current will flow from the segment output GPIO, through the LED and the MOSFET and the LED will light. To switch a digit off I simply switch off its MOSFET.

/*
 * Initialise timer1 running from the high speed internal APB2 (APB on the F0)
 * clock with an interrupt feature
 */

Timer1<
  Timer1InternalClockFeature,       // the timer clock source is APB2 (APB on the F0)
  Timer1InterruptFeature            // gain access to interrupt functionality
> timer;

/*
 * Set ourselves up as a subscriber for interrupts raised by the timer class.
 */

timer.TimerInterruptEventSender.insertSubscriber(
    TimerInterruptEventSourceSlot::bind(this,&Timer7SegmentTest::onInterrupt)
  );


/*
 * Set an up-down-timer up to tick at 80kHz with an auto-reload value of 444
 * The timer will count from 0 to 444 inclusive, raise an Update interrupt and
 * then go backwards back down to 0 where it'll raise another Update interrupt
 * and start again. Each journey from one end to the other takes 1/180 second.
 */

timer.setTimeBaseByFrequency(80000,444,TIM_CounterMode_CenterAligned3);

/*
 * Enable just the Update interrupt, clearing any spurious pending flag first
 */

timer.clearPendingInterruptsFlag(TIM_IT_Update);
timer.enableInterrupts(TIM_IT_Update);

/*
 * Start the timer
 */

timer.enablePeripheral();

Setting up the timer in stm32plus is a simple task of declaring it with the clock and interrupt feature, inserting ourselves as a subscriber to the interrupts, setting the desired frequency and then enabling the peripheral.

int value = -1;

for(;;) {

  value++;

  if(value>999)
    value = 0;

  // translate value to ascii, left justified

  _display[0]=_display[1]=_display[2]=0;
  StringUtil::itoa(value, const_cast<char *>(_display), 10);

  // wait for 100ms

  MillisecondTimer::delay(100);
}

The example code then goes into an infinite loop counting up from zero to 999 and then wrapping around and starting again.

/*
 * Subscriber callback function. This is called when the update interrupt that we've
 * enabled is fired.
 */

void onInterrupt(TimerEventType tet,uint8_t /* timerNumber */) {

  // verify our expectation

  if(tet!=TimerEventType::EVENT_UPDATE)
    return;

  // turn off the last digit we displayed. This needs to be done first to avoid
  // switched off segments becoming faintly visible during multiplexing

  _digits[_currentDigit].reset();

  // advance to the next digit

  if(_currentDigit>=2) {
    _currentDigit=0;
    _currentDigitPtr=_display;
  }
  else
    _currentDigit++;

  // get the character to display at this position

  uint8_t c=*_currentDigitPtr++;

  // clamp characters outside the table's range to a space

  if(c<' ' || c>'9')
    c=' ';

  // get the segment state bitmap from the table

  uint8_t bits=AsciiTable[c-' '];

  // for each bit in the map, set/reset the correct state in the segments

  for(uint8_t j=0;j<7;j++) {
    bool state=(bits & 0x80)!=0;
    _segments[j].setState(state);
    bits <<= 1;
  }

  // process the decimal point if there is one

  if(*_currentDigitPtr=='.') {
    _segments[7].set();
    _currentDigitPtr++;
  }
  else
    _segments[7].reset();

  // switch on the digit we have set up

  _digits[_currentDigit].set();

  // we'll be back in 1/180s which means we are displaying each digit at 60Hz
}

This is the interrupt handler where the real work happens. We switch off the previous digit before setting up the seven segments that form the next digit. We then explicitly check to see if we need to turn on the decimal point before finally lighting up the next digit.

You’ll need to view the entire file to see the types of the member variables that are used but you should be able to understand the logic flow from this snippet.

Here’s a photograph of the display in action. It works as expected and the display is a comfortable and even brightness with no artifacts or flickering observed.

Adapting this technique for your project

If you want to use this technique in your own project then you should follow the same procedure that I did. To summarise:

  1. Count up the pins you’ll need and verify you have enough available on your MCU.
  2. Measure the forward voltage of your LED segments and experiment to find a low current level that gives a usable brightness.
  3. Calculate a resistor that limits the LED current to your selected value and then divide it by the number of digits on your display.
  4. Verify that your MCU can source the current you will draw, taking into account the worst case scenario where the MCU hangs and a digit is permanently on with all segments lit.
  5. Select an n-channel MOSFET with a low drain-source resistance (less than 1Ω is easily achievable) and check that the on-state power dissipation is well below the maximum the package can support.
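Steps 2 to 4 can be rolled into a single calculation. Here’s a sketch with struct and function names of my own choosing:

```cpp
// A sketch of steps 2-4 above as one calculation. avgCurrentAmps is the
// average LED current you chose by experiment; nDigits is the number of
// multiplexed digits.
struct DriveBudget {
  double resistorOhms;     // duty-cycle-corrected series resistor
  double peakCurrentAmps;  // per-segment current while its digit is on
  double worstCaseAmps;    // MCU hang: all 8 segments of one digit stuck on
};

DriveBudget planDrive(double vSupply, double vForward,
                      double avgCurrentAmps, int nDigits) {
  double r = (vSupply - vForward) / avgCurrentAmps / nDigits;
  double peak = (vSupply - vForward) / r;
  return { r, peak, peak * 8.0 };
}
```

For the circuit in this article, planDrive(3.0, 2.6, 0.002, 3) gives about 66.7Ω (68Ω nearest standard value), a 6mA peak and a 48mA worst case, matching the figures worked out earlier.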

If you need any help with driving these displays then please feel free to contact me or leave a message down below in the comments.

How to use a 4k TV as a computer monitor


I like big, high resolution monitors. The bigger the better. I can’t understand how so many young developers and engineers seem to be content to peer into the tiny screens on their Macbooks that offer only a few visible lines of code in perhaps two simultaneous columns, with any web-based reference material in a window hidden behind the main IDE. I never use a laptop unless I’m forced to by circumstance and much prefer a desktop with the biggest monitor I can lay my hands on.

For the last few years I’ve been quite content with my 27″ AH-IPS display, a Crossover 27qw which is basically a reject LG panel repackaged and sold off cheap by Korean entrepreneurs.


My 27″ Korean monitor

It was and still is very nice but for the last year or so I’ve had my eye on an upgrade and last week I finally did it and am now the proud owner of a 43″ Sony Bravia TV doing duty as the highest quality monitor with the most perfect picture I’ve ever seen.


Stock image of the Sony 4k TV

The rest of this blog is a guide to walk you through the minefield that awaits should you decide that you really want to use a 4k TV as a computer monitor. I don’t play games or watch videos on my computer so I will focus entirely on its use as a developer and engineering workstation display.

Do you really want to do this?

Large, real computer monitors exist. Just do a Google search for 43″ monitor and you’ll see what’s out there. So why bother with a TV? Price is one reason. 4k TVs can generally be had cheaper than monitors and, being consumer electronics, tend to show up regularly in special offers and sales.

Another reason is the panel type. My Crossover 27qw has a glossy panel popularised a decade or so ago by the monitors that came with the Mac Pro. After I got over the reflections it produced I now can’t do without the vibrant colours and crisp detail that you get. Monitors typically, but not always, come with a matte panel that reduces reflections, softens detail, and in the strongest cases produces a ‘sparkly’ image. You may want this. If you do then you have to buy a monitor because there are no matte TVs.

Why not just add another monitor so you’ve got two? This is the configuration I run with at work. A pair of 2560×1440 monitors delivers almost as many pixels as a single 4k display, but you have to contend with the obvious split in the middle and also the inconsistency in colour, however slight, that there will inevitably be between any two monitors. It will always feel like two displays rather than one big one.

Decide on the panel technology

The two major LCD panel types are VA and IPS. Samsung popularised VA, LG and Philips did the same for IPS. These days though the lines are blurred with manufacturers offering both panel types. For the purposes of this article I’m going to ignore OLED because they’re so far out of my price range that I never even considered them when choosing a TV to use as a monitor. I’m also going to ignore TN panels because they’re cheap rubbish that look awful from any angle. Never buy a TN panel fitted to anything.

IPS has the edge when it comes to accurate colour reproduction and consistent colours when viewed at an angle. VA wins when it comes to contrast and deep, solid blacks. If your main use is to play games and consume online content like YouTube then VA is probably for you. If you process photographs or create professional content then IPS is likely to be the panel for you. A developer or engineer using desktop applications is likely to be happy with either. For my use I decided I wanted to stick with the IPS technology that I’ve been using for years.

I actually went to several local stores where they have banks of TVs on display and looked at every one of them from the kind of close-up angles that I might use as a monitor and to be honest it was impossible for me to point at one and say: “that’s IPS” or “that’s VA”. They all looked great running their canned store demo modes, but to me the Sony Bravia panels looked the best of all and that went a long way in influencing my decision.

Decide on the screen size

There are two parts to this. Firstly and most obviously the TV is going to be sitting on your desk so get your tape measure out and work out if it’s going to physically fit. Take into account that it won’t have a VESA mount so you’re going to have to use the supplied stand. Work out where the feet of the stand are going to land and how far forward towards you it’s going to place the display. You don’t want a 43″ panel a few inches from your face.

If this is your first glossy panel then take into account the position of any light sources such as house windows. A window directly behind you will appear as a bright reflection, potentially ruining your ability to see the screen.

Secondly, work out the number of pixels per inch. There’s a calculator here to help you. My old 27″ monitor has a resolution of 2560×1440 giving a pixel pitch of 108ppi. I’m completely happy with the way that looks on-screen. With no Windows scaling everything is the right size for my seating position and eyesight.

I entered the 3840×2160 resolution and 43″ screen size for a 4k TV and found that the pixel pitch was 102ppi, more or less the same as my 27″. This was looking good. Just to make sure I entered the next step up: 49″ and it came up with 90ppi which is the same as the 24″ 1920×1080 monitors that I’m obliged to use at work and frankly these look terrible to me because I can easily make out the individual pixels and fonts look blocky and low resolution. For my requirements it had to be 43″. You may differ so take care with this decision. There’s little point in buying a monitor with 4k of screen resolution and then having to magnify everything and ending up with the same amount of information on display as you had with your existing monitor.
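If you want to check these pixel-pitch numbers without the online calculator, ppi is just the diagonal resolution in pixels divided by the diagonal size in inches:

```cpp
#include <cmath>

// Pixels per inch: diagonal resolution in pixels over diagonal size in inches.
double pixelsPerInch(double widthPx, double heightPx, double diagonalInches) {
  return std::sqrt(widthPx * widthPx + heightPx * heightPx) / diagonalInches;
}
```

This gives about 102ppi for 3840×2160 at 43″ and about 90ppi at 49″, the figures quoted above.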

Pick a model

Do your Googling and find a model that looks right and is within your price range. Read the reviews and try to identify whether it’s VA or IPS. This information may be hard to find because the consumer TV market isn’t interested in technical terms that the marketing department haven’t found a way to dumb-down into a way to sell more sets. Don’t assume that because you’ve found out that the 51″ model is the right technology then the 43″ model will also be. Some manufacturers use different panel types within the same range.

Avoid anything by LG because they’ve invented their own “RGBW” pixel substructure that is totally unsuitable for monitor use. Some say it’s not exactly great for TV use either. A bit of a foot-shot from LG there.

Remember that you’re buying a monitor not a TV. Buying “up the range” often means the same panel with more firmware features like picture processing, HDR, local dimming, embedded “smart” functions etc. You will disable all these features and you will never use smart functions in a monitor so don’t be tempted to pay extra for them. You just want the panel.

Will it connect to a computer?

Time to get technical and read the manual. It’ll be online at the manufacturer’s website as a PDF. Find it and download it.

The TV market has standardised on HDMI connectors. That’s fine, but the HDMI specification has been revised several times as display resolutions have increased. Your selected TV must support HDMI 2.0 (or higher). It must explicitly state this in the manual. If it doesn’t then move on to the next model. Often HDMI 2.0 support is limited to a subset of the available connectors.


The highlighted specs, here in the Sony manual, are what you need to look for

Next, the HDMI 2.0 connectors must support 3840×2160 at 60Hz with 4:4:4 chroma. All of those three terms are critically important. If the manual fails to explicitly state support for them then don’t risk buying it. Lower models might only support 4:2:0 or 4:2:2 chroma or 30Hz which all look terrible on a monitor.

Now look at your graphics card specifications. If you’re lucky and it’s a recent card then it’ll support HDMI 2.0 directly and if it’s recent enough to have that support then it’ll almost certainly support 3840×2160 at 60Hz as well. Don’t take it for granted that it’ll do 3840×2160 at 60Hz. You must find positive confirmation somewhere that it will.

I wasn’t lucky. My seven year old ATI 7970 will happily support two 4k monitors but only through the mini-DisplayPort connectors on the back. If, like me, you have a DisplayPort connector then it must support at least version 1.2 and you will need an ‘active’ converter to turn it into HDMI.

The converter I bought from Amazon is this one by a company called Pluggable. They’re available in full-size and mini DisplayPort sizes. It functions as a dongle and doesn’t require its own power supply.

In these times where reviews are bought and paid for by sellers I feel obliged to say I have nothing to do with the Pluggable company. I found this gadget by searching Amazon and paid for it out of my own pocket.

Buy a new HDMI cable

Yes, you’re going to need a new cable. All those guides on the internet that tell you how all HDMI cables are the same were written in the days of low, undemanding data transfer rates.

3840×2160 at 60Hz with 4:4:4 chroma requires a bandwidth of 18Gb/s. I did a quick (too quick) search on Amazon, found a cheap one that stated it would support 4k 60Hz and even mentioned 18Gb/s support. Great, I bought it. Oops. Whilst it was able to show a steady display it also produced ‘sparklies’ – a handful of bright random pixels scattered about the screen.
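That 18Gb/s figure is easy to verify. Assuming the standard CTA-861 timing for 3840×2160 at 60Hz (a total raster of 4400×2250, i.e. a 594MHz pixel clock), 24 bits per pixel and HDMI’s 10/8 TMDS encoding overhead, the required link rate comes out just under 18Gb/s:

```cpp
// Required TMDS link rate in Gb/s for a given total raster, refresh rate
// and bit depth, assuming HDMI's 8b/10b TMDS encoding overhead.
double tmdsLinkGbps(double totalWidth, double totalHeight,
                    double refreshHz, double bitsPerPixel) {
  double pixelClockHz = totalWidth * totalHeight * refreshHz;
  return pixelClockHz * bitsPerPixel * 10.0 / 8.0 / 1e9;
}
```

tmdsLinkGbps(4400, 2250, 60, 24) evaluates to about 17.8Gb/s, which is why an 18Gb/s-rated cable is the minimum for this mode.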

This was when I discovered the HDMI premium certified program. This is where manufacturers get their product tested for compliance at 18Gb/s by the HDMI organisation labs and if it passes then they get to display a unique hologram sticker on their packages to prove it. A mobile app is used to scan the QR code and read the hologram to verify it. Cables with this certificate are not really any more expensive than lesser cables so I bought this one.

I swapped out the sparkly cable with the new one and bingo, all my sparklies are gone and I had a pixel-perfect display. That was a lesson learned the hard way.

Download the 4:4:4 chroma test

The rtings website have produced a test image that proves whether you’re running 4:4:4 chroma or not. I’ve reproduced it here in case the above link goes dead.


Click the image to download it

Download the image and open it full size, unscaled in an image viewer. Using your current monitor look closely at the red and blue bands on the bottom. Remember how sharp and legible the text appears because you’ll use this image later to satisfy yourself that you’re running 4:4:4 chroma on your TV.

Buy it and set it up

The model I decided on was the Sony KD43XF7073SU. I lurked for a while watching the price fluctuate and then pounced when I saw it for £419 with an additional sale coupon of £30 off and I got it for £389 with free delivery.

It arrived, I hooked it up to the correct HDMI port and switched it on and… it looked terrible! The mouse was so laggy it appeared to be inebriated, text looked ragged, the colours were a mile off. Clearly I had more work to do. Here’s what I had to do to make it work properly.

Firstly, although the screen resolution was being detected as 3840×2160 the mouse lag clearly pointed to a 30Hz refresh rate. I went into the AMD Radeon Settings application and found where you could create a custom resolution. No matter how much I tried it would always complain that 3840×2160, 60Hz was not supported by my monitor. Not good.


The AMD settings – it displayed 30Hz at first

Googling around I discovered something called Custom Resolution Utility. This seemed to offer a way forward so I ran it, created a 60Hz resolution and moved it to first in the list. Nothing changed except my stress levels increased a notch.

Eventually I solved the problem from Windows by right-clicking on the desktop and selecting Display Settings -> Advanced display settings -> Monitor and then selecting 60Hz from the Screen refresh rate drop-down.

Instantly the display was much better. The mouse lag was gone and the picture was just generally ‘better’. It was now time to move on to the TV settings. These will vary depending on the model that you’ve bought; I’ll show what I did with my Sony and you can apply the principle to your own model. I’ll try to photograph the on-screen displays but please excuse the poor quality because cameras are really bad at photographing monitors.

The Sony has a Scene Select option that controls how the picture is filtered and processed before being displayed. It needs to be turned off. In the Sony they call ‘off’ Graphics mode.

This tells this part of the TV firmware to not interfere with the input signal. Next up are the general picture settings. On the Sony this is called Picture.

Sharpness must be set to whichever value means ‘off’. On the Sony this is the middle setting of 50: any lower softens the image and any higher sharpens it.

Select a colour temperature that looks most neutral to you. On the Sony this seemed to be Expert1. Reality creation is some kind of Sony feature and should be switched off along with any other such feature that you find.

Contrast and brightness should be set to your taste. Brightness probably refers to the intensity of the LEDs that light the display. On my Sony the IPS panel is edge-lit and increasing the brightness of a dark background can result in internal reflections from the extreme corners of the panel when viewed at a sharp angle. Thankfully the display brightness is so good that I generally operate at between zero and 5 on the brightness scale and never see reflections at the extreme corners.

With the scene and picture settings set up I actually thought I was done but I had a nagging feeling that something wasn’t right because the colours seemed a little ‘off’ in a way that I can best describe as the blues appearing to have too much cyan in them. The 4:4:4 test image looked good but not great. The text just wasn’t as legible as it could be.

Flicking around the menu system I found the crucial setting buried under the Home -> Settings -> System Settings -> Setup -> AV Setup -> HDMI Signal Format option. This was set to Standard by default but must be set to Enhanced to receive the 60Hz 4:4:4 chroma signal. Presumably this is for compatibility with most consumer electronics out there.

Changing it to Enhanced was like night and day. Suddenly all the iffy colours were perfect and I was genuinely looking at a monitor-quality display (this was where the sparklies showed up as mentioned above – fixed by the Omars brand cable). The 4:4:4 chroma test came up perfect.

There is zero ghosting on the mouse pointer or as windows are dragged around. There is zero image retention if you leave a window in the same place for a while. I cannot perceive any input lag compared to a monitor, but I’ll leave the gamers to pronounce judgement on that one.

A quick check in the Windows Advanced display settings showed that everything was correct. Don’t get excited about the 10-bit claim. Windows is sending 10-bit to the graphics card but the link to the monitor is 8-bit.

Pixel substructure

Each pixel is made of 3 vertical stripes: Red, Green and Blue. Monitors and some TVs arrange these in the order Red, Green, Blue (RGB). Many TVs reverse the order to Blue, Green, Red (BGR). To find out which one you’ve got you need to display something that’s completely white like most web page backgrounds and then use a magnifying glass or loupe to look very closely at the display. You should be able to see which substructure you have.


A macro shot of my TV’s BGR subpixel layout

I have BGR, arranged in a rather curious chevron pattern (if anyone knows the reasoning behind that pattern I’d love to hear about it). So why does this matter? Windows and web browsers like Chrome and Firefox display fonts with an anti-aliasing system that smooths out rough edges. Windows calls it ClearType. It works by placing coloured pixels alongside the actual text colour to fool your eye into seeing a smooth transition to the background colour.

On an RGB display the colour chosen will be a shade of red if the coloured pixel is to be placed on the right of the shape being smoothed. On a BGR display it must be a shade of blue. Get it wrong and the text will look blurred in a way that’s difficult to describe. It’s sort of like watching one of those red/green 3D movies of old without the glasses. There’s a misalignment of colour.

Windows cannot detect whether you have a BGR or RGB display so you have to tell it in software. Press the Windows key and type ‘Clear’. The Adjust ClearType Text option should be displayed. Click on it and walk through the five stages of tuning ClearType. I think the first stage is the BGR/RGB test and the others are about getting the weightings right.

Unfortunately ClearType is not universal. I still see the misaligned colours in some old Windows dialog boxes. You have the bizarre state where the window frame is correct but the content isn’t. I tried to photograph the effect for you but unfortunately failed miserably, succeeding only in photographing really bad moire interactions between the camera and the screen. I have not found a way to fix this and to be honest I haven’t put any effort into it because I see these forms very rarely.

Other operating systems, some applications and web browsers will need custom tuning. I’m a heavy user of Ubuntu in a VMWare Player virtual machine and I had to make a change there so it looked right. See this answer for how to do it.

Firefox now looks fine in Ubuntu as do the pages rendered by Chrome. The Chrome address bar and tabs are showing incorrect anti-aliasing but I’m not putting any effort into fixing this because Firefox is my browser of choice in Ubuntu.

The Sublime Text editor in Ubuntu wasn’t picking up the system settings so I had to add the following to the settings to fix it:

"font_options":
[
  "gray_antialias"
],

Verdict

Was it worth it?

Oh yes. My goodness yes. The Bravia IPS panel quality is simply sublime. Photo editing is a joy. Laying out PCBs is so much easier when you can see more of the board and have the schematic adjacent to it. Text and icons are the same size as they were on my 27″ monitor; I simply have so much more available space.


One possible layout for coding

I never thought I’d look at my 27″ monitor and think of it as small.

Drawbacks? Well, I still have to put up with the occasional program that doesn’t understand the BGR pixel substructure. There is no auto-sleep synced to the computer. If I walk away from the computer for lunch or something then I have to use the power button on the remote to turn off the monitor. I find using the remote control to adjust brightness no problem compared to most monitor OSD control buttons.

I hope this article helps anyone that’s considering doing what I’ve done. If you’d like to discuss it some more then feel free to leave a comment below or visit the forum where I’ve started a thread on this subject.

A development board for an STM32G081 MCU

I’ve been an avid user of ST’s F0 series ever since it was launched. The 48MHz Cortex M0 is almost always the perfect MCU for every project that I tend to build and it’s so easy to program and debug that, for me, it’s the default answer to ‘which MCU should I use for this project?’ So when I noticed that ST had launched a ‘G0’ range I just had to have a closer look.

What’s the difference?

In short, there’s a Cortex M0+ core at the heart of the G0 series instead of the M0 that’s in the F0. To find out the difference between the M0 and the plus we have to visit ARM’s website.


Cortex M0+ block diagram

There’s a slight increase in performance and an optional Memory Protection Unit (MPU) that the RTOS guys may get excited about but really there’s not much else in the way of additional features.

The headline claim made by ARM is a further decrease in the already class-leading power consumption and a scarcely concealed attack on the remaining market for 8-bit MCUs. ARM really want you to choose this core for projects that might previously have used an 8-bit device.

A shorter pipeline

One interesting way that they’ve managed to decrease power consumption is by reducing the number of clocks required for the instruction pipeline from three to two: the instruction decode stage is squashed across the two remaining clocks, reducing the need for power-expensive accesses to flash memory.


Pipeline clocks and stages

I have to say that I’m not convinced that this is a good thing for performance. The marketing blurb correctly states that a shorter pipeline increases responsiveness but we engineers will immediately point out that faster response means lower throughput. High-end devices such as the Intel i7 have more than 20 pipeline stages and even ARM’s own Cortex-M7 has 6 stages.

However, ST have given us more features that I think will make the G0 series the default choice for my future designs.

F0 vs G0

Clock speed. You can’t fail to notice that the maximum clock speed for the G0 is 64MHz compared to 48MHz for the F0. The cynic in me says that this is compensation for the performance hit incurred by the shorter instruction pipeline.

Flash memory. The F0 goes up to 256k. ST has plans for the G0 to go up to 512k but so far they’ve only released the G071 and G081, which top out at 128k.

SRAM. The F0 goes up to 32k. The G0 improves slightly on that with 36k. More SRAM is always good so that’s a welcome improvement.

Price. They’re about the same. If I compare the STM32F091RBT6 to the similarly spec’d STM32G081RBT6 then they come out at about the same price, with the G0 edging it thanks to the 36k SRAM and 64MHz maximum clock.

ST’s development boards

At the time of writing ST are offering two development boards. There’s the low cost G071 Nucleo that features a degree of Arduino compatibility though goodness knows why they bother with that.


The G071 nucleo board

At just $10 the Nucleo is really cheap and a great way to kick the tyres of the G0 series if you want to get started really quickly.

The other board on offer is a full-featured eval board featuring the STM32G081RBT6 MCU and costing an eye-watering $400.


The G081 eval board

To be fair, for that $400 you do get a ton of peripherals to play with. If your employer is paying then this might be the board you want to buy.

I almost bought a Nucleo board but then reconsidered and decided that the best way for me to learn about the ins and outs of the G0 would be to build my own development board. So that’s what I did.

My development board design

I decided that I’d go for the following features on my board.

  • STM32G081RBT6 MCU with 128k/36k flash/SRAM in the LQFP-64 package.
  • All GPIO pins to be available on top pin headers, grouped by port and ordered by pin.
  • All pins also to be available in breadboard-friendly 100mil spaced headers on the bottom of the board.
  • Three selectable power supplies: 3.3v, 2.5v and 1.8v.
  • Three user LEDs routed to ports that have timer outputs available so flash/fade effects can be prototyped.
  • A third-party USB-to-serial IC will connect an MCU USART to the outside world through a mini-USB port.
  • External 8MHz HSE oscillator so that the USB-to-serial IC can be used at high speeds.
  • External LSE 32.768kHz oscillator for RTC use with backup coin-cell battery.

The STM32G081RBT6 was selected because it’s currently the biggest one that ST have released and it always helps to have the most resources available on your dev board.


The 64-pin G081 IC

When it comes to transferring your design from the dev board to a custom PCB then that’s when you can downsize to a device optimised for your resource usage.

I decided right away that the pin headers would expose every GPIO port and pin group and the headers would be grouped by port and ordered by pin. I’m seriously bored with hunting for pins on development boards that have been ordered to facilitate easy PCB routing!

Pin headers will be provided on the top and bottom of the board, just like the ST Discovery boards except mine will have pins on top that are long enough to actually accept a jumper wire without them falling off because they’re too loose.

A selectable power supply has always been something I’ve wanted from a development board. If my project is standalone then I’ll probably run it at 3.3v but there are times when I need something else. For example interfacing with an FPGA might require 2.5v and a battery-powered design might be spec’d to run at 1.8v to optimise power consumption.

A standalone IOT device might be required to keep accurate time so I decided to include a 32.768kHz LSE oscillator and the associated backup battery. This will be a 3v CR1220 coin-cell.

It’s quite common for an MCU to have to connect back to something more powerful such as a PC or Raspberry PI so I’ve included a USB-to-serial device – the cheapo CH340E – on my board. To enable driving the CH340E at high data rates I’ve included an 8MHz HSE oscillator so that the MCU peripheral clocks will be as accurate as they can be.

The CH340 saga

I’ll start with a confession. This is my second attempt at building this board. The first attempt was largely the same as the second except that first time I used the CH340G instead of the CH340E. And the CH340G didn’t work.

The CH340/341 series are a family of USB to serial converters that fulfil the same role as the much better known FTDI chip that you can find on official Arduino boards. The CH340 is made by a Chinese company, WCH, and is much cheaper than the FTDI chip. I bought my CH340s for about £3 for 10 on Ali Express. The internet says good things about the CH340 so I felt confident.

The CH340 is actually a series of ICs that come in different packages that provide different levels of functionality. The datasheet is all in Chinese but has been translated to English and can be found here.


All the CH340s (so far)

The 340G is the first one that I bought, and it arrived in packaging that I would not describe as confidence-inspiring.


Loose packed ICs

Nevertheless I made a board with it and selected a good quality Abracon surface mount 12MHz oscillator to support it. Unfortunately it would not enumerate as a USB device. I think it was having trouble getting the oscillator to start as I couldn’t detect a signal coming from it and I’m fairly sure my probes weren’t loading the crystal down as I could probe another, similar oscillator on the board. On power up I could see short spurious waveforms come from the oscillator but they never stabilised and the output always went flat.

I tried higher load-capacitors, then lower. Then a new CH340G. Nothing worked and I was out of options so I decided to cut my losses, and my time, and replace the CH340G with the CH340E that has a built-in oscillator and requires only decoupling capacitors to use it. I could of course have switched to an FTDI but that would be giving up and I wasn’t ready to do that. Unfortunately the IC package was different so I also had to redesign that part of the board.

Three weeks later the new boards arrived, and so did the new ICs, again from Ali Express. This time though the packaging was much more like they’d actually come from a production reel and not some trader’s back pocket.


Proper IC packaging

We’ll see how I get on later in this article, but now let’s have a look at the schematic for the development board.

Schematic


Schematic image

Click on the full image to see a PDF of the schematic. It’s quite a modular design so let’s break it down into its parts.

The power supplies and the USB interface

I wanted to have a choice here and linear regulators are cheap so I designed in three of them to give power supply options of 3.3v, 2.5v and 1.8v. A jumper is provided to select the desired level. Since the power input comes from the potentially very noisy 5V USB line I’ve included my usual filter network consisting of a ferrite bead and some capacitors.

The USB interface is provided by a mini-B connector. I only ever use the mini-B or the full size B connector on my boards because they’re the most reliable. The micro-B connector used by most smartphones before they started switching to type-C is too delicate and easy to tear off the board. The on-board components are protected from static by the USBLC6 IC that I include as a matter of course with every USB design.

The CH340E can only run at 3.3v and can be enabled with the P2 jumper. D1 ensures that when 2.5v or 1.8v are selected that level doesn’t end up being routed to the VCC pin of the CH340. The voltage drop of that same diode also has the effect of making VDD for the MCU actually about 3.0 to 3.1v instead of 3.3v. Something to be aware of.

The CH340E only requires a pair of decoupling capacitors to run at 3.3v, one on the VCC pin and the other on the V3 pin. The use of the diode D1 actually means that the CH340E will be powered at 3.3v and the signal levels will be about 3.0v but that’s within spec so I don’t expect any issues.

The STM32G081RBT6 MCU

Here’s the heart of the system, the MCU. All the GPIOs are broken out to pin headers except those that have crystals attached, though I don’t think those are any great loss to a designer.

External 8MHz and 32.768kHz surface mount crystal oscillators are provided as the HSE and LSE clock sources. There’s a reset button and the SWD programming interface is broken out to the ungainly 20 pin ARM connector that connects directly to my ST-Link programmer via the ribbon cable.

Three LEDs are provided that source current from the 3.3v supply regardless of the selected core voltage so that the brightness won’t vary — and in the case of the blue LED so it will actually light at all — when the supply is 1.8v.

ST have optimised the pins on the G0 series so that you get more GPIOs. In their older chips you’d find a pair of VDD/GND pins on each side of the package, which I think tells you a bit about the floor planning of the device and how things are arranged on the die. This G0, however, has a single VDD/VDDA pin and a single GND pin. VREF is there but they’ve done the decent thing and placed it right next to VDD/VDDA so you can tie them together, and right next to those is VBAT, which obviously has to be separate.

I’ve designed in a coin-cell holder for a CR1220 3v backup battery, protected from reverse current by a Schottky diode with a low forward voltage. According to ST’s datasheet the worst-case current consumption for VBAT when VDD = 3.0v is 470nA. A quick look at Energizer’s datasheet shows that you can expect to get about 30mAh from it before the voltage starts to drop off. 0.00047mA into 30mAh gives 63829 hours or 7 years. I’m sure there are other losses and inefficiencies to consider but still, that’s a long time.

IO connectors

They’re all there with the exception of those used by oscillators. One of the great things about ST’s package redesign is that there are now more GPIO pins available. GPIO ports A, B and C are fully available although in my design PC14 and PC15 are taken by the 32.768kHz LSE. This is great for designs that might need to bit-bang a parallel bus. GPIO port D has pins 0 to 9 available.

Bill of materials

Here’s the full bill of materials for this project.

Identifier | Value | Quantity | Description | Footprint | Comment
B1 | | 1 | CR1220 holder | Custom | available on ebay
C1 | 22µ | 1 | Polarized Capacitor (Radial) | CAPPR2-5x11 |
C2 | 100µ | 1 | Polarized Capacitor (Radial) | CAPPR2-5x11 |
C3 | 10n | 1 | Capacitor | 0603 |
C4, C12, C13, C17, C20, C23 | 100n | 6 | Capacitor | 0603 |
C5, C22 | 4.7µ | 2 | Capacitor | 0805 |
C6, C7, C8, C9, C10, C11, C21 | | 7 | Capacitor | 0603 |
C14 | 4.7n 250v | 1 | Capacitor | 0805 |
C15, C16 | 22p | 2 | Capacitor | 0603 |
C18, C19 | 10p | 2 | Capacitor | 0603 |
D1, D5 | STPS0520Z | 2 | Schottky Rectifier | SOD123 |
D2 | | 1 | LED | 2012 | RED
D3 | | 1 | LED | 2012 | YELLOW
D4 | | 1 | LED | 2012 | BLUE
FB1 | BLM18PG221SN1D | 1 | Ferrite bead | 0603 |
P1 | | 1 | Header, 3-Pin, Dual row | HDR2X3 | POWER
P2 | | 1 | Header, 2-Pin | HDR1X2 | CH340 EN
P3 | | 1 | USB Mini B | USB Connector |
P4, P5 | | 2 | Header, 16-Pin | HDR1X16 | GPIOA
P6, P7 | | 2 | Header, 14-Pin | HDR1X14 | GPIOC
P8, P11 | | 2 | Header, 10-Pin | HDR1X10 | POWER
P9, P12 | | 2 | Header, 10-Pin | HDR1X10 | GPIOD
P10 | | 1 | Header, 2-Pin, Dual row | HDR2X2 | BOOT
P13, P14 | | 2 | Header, 16-Pin | HDR1X16 | GPIOB
P15 | | 1 | Header, 10-Pin, Dual row | HDR2X10 | SWD
Q1, Q2, Q3 | SI2374DS | 3 | MOSFET-N | SOT23-3N |
R1, R3 | 330 | 2 | Resistor | 0805 |
R2, R4 | 10k | 2 | Resistor | 0805 |
R5 | 150 | 1 | Resistor | 0805 |
R6, R7, R8 | 100k | 3 | Resistor | 0805 |
R9 | 1M | 1 | Resistor | 0805 |
SW1 | | 1 | Button | PCB Button | RESET button
U1 | MCP1700T-3302E/MB | 1 | LDO, 3-Pin SOT-89 | SOT-89-MB3_N |
U2 | MCP1700T-2502E/MB | 1 | LDO, 3-Pin SOT-89 | SOT-89-MB3_N |
U3 | MCP1700T-1802E/MB | 1 | LDO, 3-Pin SOT-89 | SOT-89-MB3_N |
U4 | USBLC6-2SC6 | 1 | | SOT23-6_L |
U5 | CH340E | 1 | | MSOP-UN10_N |
U6 | STM32G081RBT6 | 1 | ARM Cortex-M0+ | STM-LQFP64_L |
Y1 | ABRACON ABM3B | 1 | 8MHz crystal | Custom |
Y2 | Epson FC-135 | 1 | 32.768kHz crystal | 2012 |

PCB layout


3D view of the board

The physical layout of the board is optimised for ease of use. I’m tired of boards that lazily route the GPIOs to the headers based on where they are on the IC package making me hunt up and down the tiny and often faded silkscreen legend for the right pin. On this board the GPIOs are grouped by port and ordered by pin number. It only takes a few extra seconds of routing time, even with ST’s knack of placing a port’s pins at seemingly random locations around the package.

I’ve provided rows of downward and upward facing pins for each port. The downward facing pins are designed to be the standard 100 mil apart all down the board and the rows on the opposite side of the board are a multiple of 100 mil away so that the board will mate with a breadboard.

The upward-facing pins are designed to accept the usual jumper wire-interconnects that we use so often.

Overall the board was easily routed on two layers with a ground plane top and bottom and is a little larger than ST’s discovery boards. I even had space for four M3 screw holes if the board ever needs to be permanently mounted into a chassis.

I uploaded the gerbers to JLPCB in China and paid the ridiculously low fee of $2 for a pack of 5 boards. The lowest cost shipping option added about $7 to the price. Delivery took about 3 weeks.

Building the board



Top and bottom view of the bare board

The boards arrived and are looking great. I was particularly pleased to see that JLPCB have ignored the solder mask expansion around the pads. Solder mask expansion is the gap between the edge of a pad and where solder mask starts and is used to counter inaccurate placement of the solder mask in the manufacturing process.


Accurate solder mask placement

I set a rule of 4 mil which would leave a tiny solder mask sliver of about 1 mil between the MCU pins. It appears that JLPCB have enough confidence in their process to ignore the expansion rules and place the mask right up to the pins. The benefit of this is that I’ve got, as you can see above, a complete mask between the 0.5mm pitch pins of the MCU and that will greatly reduce the chance of accidental solder bridges.

My process for building a board is quite slow but I find it reliable. I first tin the pads with solder, then I reapply flux to the pads, then I place the surface mount components on the tinned pads and then I reflow the board in my halogen reflow oven. When the reflow is complete I touch up any problems manually and then solder in the through-hole components by hand. Finally I wash the board using hot soapy water and a toothbrush, rinse it off with cold water and leave it out to dry for at least a day.


All built and ready for testing

The reflow went OK this time. The only issue was that one of the 0603 capacitors got blown off its pads by the oven fan before it could reflow. I fixed that manually with my hot air gun.

One interesting thing I noticed while inspecting the board under my microscope was the design of the 32.768kHz crystal package.


It took a 2x macro and many megapixels to shoot this

It appears that the designers wanted to show off the internals so they fitted a clear plastic window into the top of the 1206 package so you can see how it works. I have no idea how crystals are constructed but I do find it fascinating and it looks a bit like a tuning fork. Does anyone out there know what’s going on in there?

Testing the board


ST-Link hooked up to the board

With the first revision of this board giving me so many CH340-related problems I was naturally a little apprehensive when connecting this for the first time.


USB View looking good

I need not have worried, the CH340E enumerated first time, showing up immediately there in the Windows USB View utility. I could proceed with firmware testing.

I decided that the quickest way to test out the board would be to use ST’s STM32CubeMX software to auto-generate some code that would set up USART2 to talk to the CH340E and also the three pins that are connected to the LEDs.


CubeMX pin configuration

I decided to enable RTS and CTS for USART operation even though I was going to operate in asynchronous mode for this test. I clicked through the CubeMX screens to enable the core clock at 64MHz using the external HSE as a clock source and then got it to generate a project for the free System Workbench for STM32 IDE. I’m pleased to see that this is yet another Eclipse-based IDE and I’ve been an Eclipse user for more than 15 years so I immediately feel at home and ready to code.

/* USER CODE BEGIN Header */
/**
  ******************************************************************************
  * @file           : main.c
  * @brief          : Main program body
  ******************************************************************************
  * @attention
  *
  * <h2><center>&copy; Copyright (c) 2019 STMicroelectronics.
  * All rights reserved.</center></h2>
  *
  * This software component is licensed by ST under BSD 3-Clause license,
  * the "License"; You may not use this file except in compliance with the
  * License. You may obtain a copy of the License at:
  *                        opensource.org/licenses/BSD-3-Clause
  *
  ******************************************************************************
  */
/* USER CODE END Header */

/* Includes ------------------------------------------------------------------*/
#include "main.h"

/* Private includes ----------------------------------------------------------*/
/* USER CODE BEGIN Includes */

/* USER CODE END Includes */

/* Private typedef -----------------------------------------------------------*/
/* USER CODE BEGIN PTD */

/* USER CODE END PTD */

Auto-generated code

ST’s auto-generated code isn’t too bad at all. I’ve seen all types of auto-generated code that range in quality from ‘that compiles?’ to some that really look like a human wrote it. ST’s code contains comments that mark out where you can place your code without fear of it being overwritten by re-runs of CubeMX when you need to change your design. That’s great, but my advice would be to keep your modifications of auto-generated files to a minimum and always check in or stage your changes before you re-run CubeMX.

  /* USER CODE BEGIN WHILE */
  while (1)
  {
    /* USER CODE END WHILE */

    /* USER CODE BEGIN 3 */

    HAL_GPIO_TogglePin(GPIOB, GPIO_PIN_13);
    HAL_Delay(200);
    HAL_GPIO_TogglePin(GPIOB, GPIO_PIN_14);
    HAL_Delay(200);
    HAL_GPIO_TogglePin(GPIOB, GPIO_PIN_15);
    HAL_Delay(200);

    HAL_UART_Transmit(&huart2, (uint8_t *) "hello world\r\n", 13, 1000);
    HAL_Delay(400);
  }
  /* USER CODE END 3 */

My small additions to test the firmware

I added a few lines of test code to write out Hello World every second to the USART at 115200bps and also cycle the LEDs. Compilation was immediately successful and the firmware was uploaded through the IDE’s connection to the ST-Link via OpenOCD.

That CH340 again…

The firmware started running and the LEDs were cycling as expected but Windows was now having problems detecting the CH340E. What the? It was working before. Had I broken it? I fired up USB View and found it there with a different vendor ID than it had before.


USB View showing the unexpected VID

When I first plugged in the board it had a VID of 1A86 which matches the VID in the Windows driver package that I downloaded. Now it’s got a VID of 9986 that doesn’t match anything in the driver. Weird.

My first reaction was to modify the driver package to include 9986 in the .INF file and re-install it. This worked so I was half pleased and half uneasy about what had actually happened. I decided to do some more research and it turns out that if the RTS pin ‘has too much load resistance’ at boot time then the CH340E will enumerate with a VID of 9986. This behaviour is undocumented (thanks for that) and my guess is that it’s a bootstrap configuration feature designed for a specific customer.

One solution is to not configure RTS if you’re not going to use it, or if you are then wait until USB enumeration is complete before you configure the pin and keep your fingers crossed that the CH340E doesn’t reset independently of the MCU. Another solution is to configure your drivers to accept both possible VIDs. That’s what I’ve done with Windows.
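As a sketch of the driver-side approach, the undocumented VID can be listed in the .INF alongside the stock one. Note that this fragment is hypothetical: the section layout and the PID shown are assumptions, so check what USB View actually reports before editing the real driver package:

```ini
; Hypothetical .INF fragment: accept both the documented and the
; undocumented VID (PID assumed - verify against USB View first)
%DeviceDesc%=DriverInstall, USB\VID_1A86&PID_7523
%DeviceDesc%=DriverInstall, USB\VID_9986&PID_7523
```

Remember that Windows may refuse a modified, unsigned driver package unless driver signature enforcement is relaxed.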

With my VID driver hack still in place I fired up Realterm, my favoured serial port test utility, and configured it with the correct virtual COM port and settings. It immediately started ticking away with the expected Hello World message every second.


Realterm receiving data from the board

That’s a relief. Now I’m pleased that it all seems to work and I can go on to develop actual projects using the STM32 G0 series confident in the knowledge that they’ll work as expected.

Free gerber files

If you’d like to build this board yourself then you can download the Gerber files from here. They should be accepted by your favourite online PCB manufacturer.

Video and conclusion

If you’d like to watch a video of me faffing about with the PCB, explaining badly how it works and then stumbling through the firmware testing then it’s here on YouTube.

Embedding a video in the webpage looks cool but really you want to go to YouTube to watch it for the full high definition experience.

If you’d like to comment on this article then please leave a comment below or click here to visit the forum thread that I’ve started if you’d like to have a more lengthy conversation.


A USB microphone for online meetings

Here in the UK the new reality of working in the IT business over the past year has been that we’re all at home working remotely over virtual desktop connections and for someone engaged in software development this is a setup that works well. Having to commute 90 minutes each way into London every day on the train is not something I’ll ever miss.

Team meetings are still an important part of the day though and that meant digging out and dusting off my old webcam, a Logitech something-or-other that works fine in every scenario except when I use it through my company’s Citrix-hosted virtual desktop. The video is fine but the audio sample rate on the VDI is mismatched to the actual rate of the physical device. I sound like Mickey Mouse on helium.

This is a well-known problem and the solution is to change the frequency on the VDI, which requires administrator level access. And that is never going to happen. I could try calling our support desk just to see how long it takes before they realise it’s not company-issued hardware and therefore the ticket has just been closed and is there anything else they can help me with today? No, not going to do that.

So I’ve been muddling through by using a rather useful Android USB microphone app called Wo-Mic. You install a PC server component, connect your phone by cable and voila your PC has a new USB microphone that’s actually your phone. It’s not a bad solution and the audio quality is very good but I’d really like a dedicated microphone that I can just plug in and place on the desk in front of my keyboard.

Any normal person would open Amazon and either buy the cheapest microphone available, or perhaps a Røde if they know decent audio when they hear it. Not me though. This sounds like a project if ever I heard one. A good chance to see if I can build a USB microphone and learn a thing or two along the way.

The design

I came up with two possible options for the design. One would be part-analogue and the other all-digital.

Both options are designed around a MEMS microphone for translating sound pressure levels into an electrical signal. Physically a MEMS microphone comes in a miniature metal ‘can’ that houses the required circuitry inside. The can serves both as physical protection for the sensitive receiver and as electrical protection from interference for the analogue circuitry. There is a small hole drilled in either the top or the bottom of the can to allow sound waves to enter.


This is a TDK ICS43432

If the port is on the bottom then you need to provide a hole of the same size in your PCB. This is the most tricky design for a hobbyist to work with because you absolutely must not get any gunk in that port which means being super-careful where you put flux on the pads that will be very, very close by.


The tiny circuit board on the bottom with those difficult pads

In option 1, the part-analogue design, I would connect a MEMS microphone up to some analogue signal conditioning and amplification circuitry before feeding it to an AD converter and then into an MCU for digital signal processing and output over a USB interface using the USB audio device class.

The tricky parts in this design are the signal conditioning and the ADC. MEMS microphones output a very low AC signal typically around ±1V. This would need to be amplified using an op-amp before sending to an analog-to-digital converter. I’ve modelled this design in LT-Spice and am convinced that I could do it but the devil would be in the implementation details such as noise problems from the mixed-signal circuit board, choice of op-amp and ADC.

Option 2 does away with those tricky analogue signal conditioning parts and uses a digital MEMS microphone. These devices have the analogue processing inside the metal can and their output is a digital I²S signal. From a noise-reduction and signal quality perspective keeping those sensitive analogue parts inside the shielding of the can is the best decision.

I decided to go with Option 2, the all-digital (to me) design.

Selecting a MEMS microphone

ST’s range of digital microphones provide a PDM output which needs some fairly involved but perfectly do-able software decoding in the MCU to get the PCM samples that I need. Best though are the microphones from TDK and Knowles that provide an I²S output. I²S is supported in hardware by many of the STM32 devices (it’s just a specific application of SPI) so the microphone can be hooked up directly to the MCU and we’ll be provided with a constant stream of PCM samples that we can work with directly.

I’ve had mixed success in other projects where I’ve used these microphones. The first build I did was a total write-off because, like a complete idiot, I washed the board after building it, immediately destroying the microphone by getting water in the port. Undeterred, I built another one and it did work, but being an analogue microphone the audio quality wasn’t great because my first attempt at filtering and amplification didn’t hit the mark.

The next board I built was a different design that used a Knowles I²S microphone and I had big problems getting it to reflow to the board. I think the metal can caused problems with heat transfer to the pads underneath. And that’s the crux of the problem – the pads are completely underneath the microphone so it’s impossible to check for bad joints. I decided that going forward I’d simply buy a breakout board with the microphone already on it and design a carrier board around it.

This is the one I decided to use. It’s the INMP441 by TDK, Invensense or whatever they’re calling themselves this week. It’s actually gone NRND now but that doesn’t matter because I’m not going into mass production and I can get these breakout boards on ebay for less than £5.

MCU selection

It’s an STM32 and that’ll come as no surprise to anyone that follows this blog because they’re just so versatile. I need one that has I²S and USB peripherals and will be capable of translating the I²S format into the USB format in real time. Since this is a one-off build overkill is not an issue so I’ve selected the STM32F446RCT7 in the LQFP64 package.

This has 256kB of flash, 128kB of SRAM and can run at up to 180MHz. A lot of those resources are going to go unused but for the sake of a few pounds I’d rather have resources to spare than find I needed more to complete the project.

Detailed design

Here’s a more detailed view of the proposed design.

The STM32 sits at the center of the design and acts as an I²S master, providing the I²S clock and word select (WS) signals and receiving serial data as the response. Here’s a view of the protocol from the INMP441 datasheet.


The protocol in the middle is the one I’ll be using

The STM32 will provide a 48kHz WS signal and since there are 64 clocks per period then the clock will be 3.072MHz. Providing this clock accurately with zero error will require a dedicated external oscillator that you will see in the schematic. The data from the INMP441 is provided as 24-bit signed PCM samples, MSB first and left-justified into 32-bit words. The data bits are offset by one clock from the change in the WS signal and are available to read on the rising edge of the clock. This is known as the ‘Philips’ standard. Presumably the one clock offset was to allow early hardware implementations to use a clock cycle to reset their registers and prepare for a new sample.
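Since each 24-bit sample arrives left-justified in a 32-bit frame, recovering its signed value is a matter of an arithmetic right shift. A minimal sketch (the function name is mine, not from the firmware):

```cpp
#include <cstdint>

// Hypothetical helper: the INMP441 sample sits in the top 24 bits of the
// 32-bit frame. Casting to signed and arithmetic-shifting right by 8
// preserves the sign bit; a plain mask-and-truncate would not.
int32_t decodeSample24(uint32_t frame) {
  return static_cast<int32_t>(frame) >> 8;
}
```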

The I²S peripheral will be connected to the DMA peripheral so it can operate in ‘hands free’ mode. The firmware will receive interrupts when buffers of data are ready to be processed and sent.

A buffer of data from the microphone will be presented as 64-bit samples with 24-bits of data in the left channel and zero in the right channel. ST provide a reference implementation of the USB audio device class that operates on 16-bit samples and I’ve decided not to try to change that, at least not in my first release. Therefore the first task is to downsample the sparse array of 64-bit samples into a new buffer of 16-bit PCM samples.

The next task in the pipeline is to apply any ‘graphic equalizer’ filtering that might be required. For example I might need to suppress high or low frequency noise to clean the signal up. If I’m really lucky the signal will be perfect out-of-the-box but somehow I doubt it.

Next I’ll need to adjust the signal volume (amplitude) according to my preference. The USB audio device class provides for a volume control and PC operating systems expose that in the form of a microphone volume slider in their ‘settings’ control panels.

Finally the fully transformed buffer of PCM signals can be sent to the USB firmware for transmission to the PC.

All of this has to be designed so that the number of samples that we gather in one DMA buffer is sufficiently large that we have enough time to do all the processing before the next buffer is available but cannot exceed the maximum size permitted to be sent in a single USB packet. This is why I’ve selected a 180MHz CPU with DSP instructions — finding that I’m CPU-bound would be a show-stopper for the project.
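As a rough sketch of that sizing arithmetic (constant names are illustrative, not from the firmware): at 48kHz a 10ms buffer holds 480 mono samples, or 960 bytes at 16 bits per sample, which sits comfortably under a 1000-byte USB packet.

```cpp
#include <cstdint>

// Illustrative sizing check: one buffer's worth of samples must be
// processed before the next interrupt and fit in a single USB audio packet.
constexpr uint32_t SAMPLE_RATE_HZ   = 48000;
constexpr uint32_t HALF_BUFFER_MS   = 10;
constexpr uint32_t SAMPLES_PER_HALF = SAMPLE_RATE_HZ * HALF_BUFFER_MS / 1000;  // 480
constexpr uint32_t BYTES_PER_PACKET = SAMPLES_PER_HALF * sizeof(int16_t);      // 960
static_assert(BYTES_PER_PACKET <= 1000, "must fit in one USB audio packet");
```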

Schematic

Here’s the full schematic for this project.




Click for a PDF

The schematic is quite modular in design so let’s have a walk-through of each section.

The power supply and USB connection

5V is delivered to the board over the USB cable, filtered through the usual LC network that I use and connected to a Texas Instruments LP5907 3.3V ultra-low-noise regulator. The USB data signals and the 5V input are passed through an ST Micro USBLC6 ESD protection IC.

The INMP441 microphone

The footprint for the INMP441 is just two rows of 3 female pin headers spaced at 300mil. L/R is pulled down to GND with R5 which should cause the INMP441 to output data in the left channel. Rather than leave this hardwired I also decided to connect it up to the MCU just in case I needed to assert manual control over that line.

The INMP441 specification requires a 100k pull-down on the SD line and when I wrote this schematic my INMP441 was on the slow-boat from China so I decided to include the footprint for R2 and if it turned out that the board included it then I’d leave mine off the final build.

The STM32F446RC

The smallest available LQFP package has 64 pins so lots of them are going to be unused. On the left side we’ve got the I²S signals connected to the I2S3 peripheral. There’s also a simple GPIO input for a physical mute button on board. When muting is enabled the device will ignore DMA interrupts. USB I/O and the necessary SWD programming ports complete the left side of the picture.

Two LED outputs are provided. The blue link LED will light when the USB connection is active and running. If there’s a software crash in the form of a hard-fault then I’ll rapidly flash this LED and to that end I’m providing a reset button that can be used to get me out of this situation without having to unplug the USB cable. The red live LED will light up when data is actively being sent over the audio connection. When software or hardware muting is enabled then this light will go out.

Optimum USB audio quality requires accurate clocks. I2S_CK on PC9 is one such clock. It’s connected to an external 12.288MHz Microchip oscillator. Internally the STM32 can divide this by 256 to get exactly 48kHz and by 4 to get exactly 3.072MHz.
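Spelling out those divisions (names are mine, for illustration):

```cpp
#include <cstdint>

// The 12.288MHz oscillator divides exactly into both I2S clocks.
constexpr uint32_t OSC_HZ    = 12288000;
constexpr uint32_t WS_HZ     = OSC_HZ / 256;   // word select: exactly 48kHz
constexpr uint32_t BITCLK_HZ = OSC_HZ / 4;     // bit clock: exactly 3.072MHz
static_assert(WS_HZ == 48000, "WS must be exactly 48kHz");
static_assert(BITCLK_HZ == WS_HZ * 64, "64 bit-clock cycles per WS period");
```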

The STM32’s core clock will be derived from an external 8MHz crystal that also guarantees an accurate 48MHz clock for the USB peripheral.

Bill of materials

Here’s the full bill of materials for this project.

Identifiers | Value | Quantity | Description | Footprint
C1 | 10n | 1 | Ceramic capacitors | 0603
C2, C7, C8, C9, C10, C11, C15, C16, C18 | 100n | 9 | Ceramic capacitors | 0603
C3, C6, C12 | 4.7µ | 3 | Ceramic capacitors | 0603
C4, C5 | | 2 | Ceramic capacitors | 0603
C13, C14 | 22p | 2 | Ceramic capacitors | 0603
C17 | 22µ | 1 | Electrolytic capacitor | 2.5mm lead pitch
D1 | Live LED | 1 | Red LED | 2012
D2 | Link LED | 1 | Blue LED | 2012
FB1 | BLM18PG221SN1D | 1 | Ferrite bead | 0603
P1 | USB connector | 1 | USB mini-B | custom
P2 | INMP441 | 1 | 2x 3-pin headers | 100mil headers, 300mil spacing between headers
P3 | JST XHP-5 | 1 | Female SWD header | 5x 2.5mm
R1, R2, R5 | 100k | 3 | Chip SMD resistor | 0603
R3 | 330 | 1 | Chip SMD resistor | 0805
R4 | 150 | 1 | Chip SMD resistor | 0805
SW1, SW2 | | 2 | PCB button | custom
U1 | STM32F446RCT7 | 1 | MCU | LQFP64
U2 | LP5907 | 1 | LDO regulator | SOT23-5
U3 | USBLC6-2SC6 | 1 | USB ESD protection | SOT23-6
Y1 | Abracon ABM3B | 1 | 8MHz crystal | custom
Y2 | DSC6011CI2A-012.2880T | 1 | Microchip 12.288MHz oscillator | custom

PCB layout

Before starting the layout I had a quick look at the current JLCPCB prices and could scarcely believe my eyes when I saw that controlled impedance 4-layer boards are available up to 100x100mm for just $8 a pack of five. It wasn’t that long ago when 4 layer boards would run into hundreds of dollars!

Are 4-layer boards going to be the new normal? They are for me, that’s for sure. The benefits of being able to just drop a via when you need power or ground are hard to ignore. The only minor annoyance with JLCPCB’s implementation is that they won’t accept a negative gerber for an internal plane: you have to do everything with polygons and fills.

Here’s the PCB layout.


I hid the polygon pours, including the two internal pours, for these screenshots, hence the rat’s-nest of apparently unconnected nets.

The first internal layer is GND and the second is VDD. The layout of the header pins for the INMP441 is designed so that the hole in its board that allows sound to pass through into the microphone can is facing upwards. I do have some concerns about dust getting into that hole over time so I may have to consider a housing for the microphone with some foam over that area.

The 5-pin SWD connector is a female JST XHP-5. These box connectors have a 2.5mm pitch and mate with a male connector fitted to a custom cable that I’ve made up. I now use these connectors for SWD connections when the board is small and won’t take the much bigger 10×2 100mil headers.

The 3D view is best for checking whether any of the silkscreen labels overlap parts that they should not and making minor changes to their position. Four M3 screw holes round off a very simple physical design measuring around 46mm square.

Building the board

I uploaded my design to the JLCPCB website, selected a black solder mask and used the 3 weeks it took to arrive to source the parts from Mouser.


JLCPCB were the first to offer a matte-black solder mask at a reasonable price. I have one of their early boards and the finish was very similar to a black chalkboard. Apparently that finish suffered from poor adhesion so they’ve since changed it slightly: it’s still matte but with a very slight shine, and it’s more black than grey. It does look very nice and not at all like the horrible glossy, uneven black of old.

My process for building a board is quite slow but I find it reliable. I first tin the pads with solder, then I reapply flux to the pads, then I place the surface mount components on the tinned pads and then I reflow the board in my halogen reflow oven. When the reflow is complete I touch up any problems manually and then solder in the through-hole components by hand. Finally I wash the board using hot soapy water and a toothbrush, rinse it off with cold water and leave it out to dry for at least a day.

Looking good! The reflow was completely successful with no touch-up required. Now I can get on with the firmware development.

Firmware

I wrote the firmware in Ubuntu Linux using STM32CubeIDE. This is basically Eclipse with plugins developed by ST Micro as well as some other open-source plugins. To kick-start the project I used the Cube GUI to configure the peripherals and clocks and then write out a template project. Here’s the peripherals view:

It’s very helpful to be able to use the graphical clock tree configurator to set everything up and know in advance that it’s going to be perfect.

Once Cube has generated the template project then I take it from there. I edit the generated source code to remove the huge ugly (sorry ST) comments and reformat the source to make it more readable.

Debugging

The first time I hooked everything up I was initially pleased to find that the MCU was responding, firmware was being flashed and I was getting regular interrupts from the DMA controller handling the I²S peripheral. On the USB side my microphone was detected by Ubuntu and I was able to capture samples using the free Audacity software.

However, it sounded dreadful. My voice was audible but it was harsh and way too loud. If I got anywhere near the microphone then it would clip badly and harshly. Something was clearly wrong.

The difficulty with debugging audio is that you’re presented with a continuous stream of what you have to assume are valid PCM samples, but what does a correct sample look like? You can’t just look at a buffer of data in the debugger and tell good from bad. I’ve been known to hold the microphone up to my computer speakers whilst playing an online sine-wave generator to see if the captured data would exhibit a constant pattern!

The first thing I did was verify what could easily be probed. The voltages were fine. My oscilloscope showed exactly 48kHz on the WS line and exactly 3.072MHz on the CK line so no problems there.

To rule out the microphone I ordered another one online from a different seller and waited a few days. The new microphone showed exactly the same issue so it was apparent that the problem was elsewhere.

For the next step I decided that I needed to know exactly how the DMA peripheral was delivering the I²S data into memory. I could see from the data in memory that the first 32-bit word had the data and the second was always zero, but was the 24-bit sample within the 32-bit word being byte-swapped, or swapped around as 16-bit half-words? This was a nagging concern that had to be investigated because the DMA peripheral is programmed for 32-bit transfers but the I²S peripheral is inherently a 16-bit device.

I got out an old STM32F4 discovery board and wrote a simple firmware program to act as an I²S slave that would output fixed 24-bit samples. I removed the INMP441 from my board and hooked up the discovery board, splicing in my logic analyser into the middle so I could see what was going on.

This was very enlightening. Not only could I see exactly how a 24-bit value in memory is placed on to the wire by the I²S peripheral in the discovery board but I could also see how that would end up in the receiving board’s memory.

The issue, in the end, was that I was losing the sign bit on the PCM sample during my processing. To downsample from 24 to 16 bits I simply take the top 16-bits, discarding the lower 8. With the sign bit preserved correctly the audio started working.
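The fix can be sketched like this (my naming; assuming the 24-bit sample is held sign-extended in an int32_t):

```cpp
#include <cstdint>

// Keep the top 16 of the 24 significant bits. The arithmetic shift on a
// signed value preserves the sign bit; truncating unsigned data would not.
int16_t downsample24to16(int32_t sample24) {
  return static_cast<int16_t>(sample24 >> 8);
}
```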

Audio processing

ST provide a suite of audio effects expansion software called X-CUBE-AUDIO. This is implemented as a closed-source but freely available package that integrates easily into firmware using consistent APIs designed to be used as part of an audio-processing pipeline.

The first package I was interested in was the Graphic Equalizer (GREQ) package documented in UM1798. The audio that I was sampling was clear and noise-free but because of the location where I sit it tended to sound quite boomy and bass-heavy. If I could attenuate those frequencies then my speaking voice would sound more crisp and natural with less reverberation.


You can select the center frequencies for the equalizer configuration from 5, 8 or 10 preset bands with a maximum adjustment of ±12dB. I selected 10 bands and configured it to attenuate 62, 115 and 215Hz by -6dB and to amplify the other frequencies by +6dB. The reason for the amplification of the higher frequencies is to compensate for an overall attenuation performed by the filter — see UM1798 for details.

This is the resource usage required by the GREQ filter.

To use the filters I simply copied the two header files and the binary library into my project, rebuilt and ran. Surprisingly and very happily it just worked first time. The audio was notably more clear with much less unwanted boomy reverberation.

I will have to leave these settings hardcoded into the firmware. Although the USB audio device class does provide for graphic equalizer control, I have not seen it implemented in any PC operating system, even though I declare support for it in my audio device descriptor and can see it there in the UsbTreeView PC software.

The next part would be volume control. Until now I’d used the normalisation filter in Audacity to bring up the input signal to a loud enough level to listen to critically. This stage in the pipeline is achieved by ST’s Smart Volume Control (SVC) filter documented in UM1642.

It is possible to do dumb amplification by simply multiplying the PCM signals by a constant value and if the levels are low enough you might get away with that without clipping but you’ll also amplify any noise in the signal because you’ll treat low levels where there is no useful sound the same as the higher levels. ST’s filter takes into account the dynamic range of the input signal to give more consideration to the peaks without touching the troughs.
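For contrast, that 'dumb amplification' might look like the sketch below, with saturation added so that peaks clip at the int16 limits rather than wrapping around (an illustration, not the firmware's code):

```cpp
#include <cstdint>

// Naive constant gain with saturation. Every sample, including the noise
// floor, is multiplied equally: exactly what the SVC filter improves on.
int16_t applyGain(int16_t sample, int32_t gain) {
  int32_t amplified = static_cast<int32_t>(sample) * gain;
  if (amplified > INT16_MAX) return INT16_MAX;
  if (amplified < INT16_MIN) return INT16_MIN;
  return static_cast<int16_t>(amplified);
}
```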

The configuration for this filter is a simple selection of the gain from -80 to +36dB in 0.5dB steps so the actual limits for the parameter are -160 to +72. This is the resource usage required by the SVC filter.
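That dB-to-parameter mapping can be captured in a small helper (hypothetical, not part of ST's API):

```cpp
#include <cstdint>

// Convert a gain in dB to the SVC parameter value (0.5dB steps, hence x2),
// clamped to the documented -80dB..+36dB range.
int16_t svcGainParameter(float gainDb) {
  if (gainDb < -80.0f) gainDb = -80.0f;
  if (gainDb > 36.0f) gainDb = 36.0f;
  return static_cast<int16_t>(gainDb * 2.0f);
}
```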

Once again this just worked and I was able to hook up the volume control to the USB control input so I could use the volume slider in the Ubuntu settings app to set it on-the-fly.

I have found that I need to apply the full +36dB amplification to get acceptable volume from the microphone. I used the Skype echo test call to check that I’m sounding good and, Skype’s obvious compression of the voice channel aside, it’s all good on the audibility front.

Here’s the important interrupt handler that processes the incoming samples.

/**
 * 1. Transform the I2S data into 16 bit PCM samples in a holding buffer
 * 2. Use the ST GREQ library to apply a graphic equaliser filter
 * 3. Use the ST SVC library to adjust the gain (volume)
 * 4. Transmit over USB to the host
 *
 * We've got 10ms to complete this method before the next DMA transfer will be ready.
 */

inline void Audio::sendData(volatile int32_t *data_in, int16_t *data_out) {

  // only do anything at all if we're not muted and we're connected

  if (!_muteButton.isMuted() && _running) {

    // transform the I2S samples from the 64 bit L/R (32 bits per side) of which we
    // only have data in the L side. Take the most significant 16 bits, being careful
    // to respect the sign bit.

    int16_t *dest = _processBuffer;

    for (uint16_t i = 0; i < MIC_SAMPLES_PER_PACKET / 2; i++) {
      *dest++ = data_in[0];     // left channel has data
      *dest++ = data_in[0];     // right channel is duplicated from the left
      data_in += 2;
    }

    // apply the graphic equaliser filters using the ST GREQ library then
    // adjust the gain (volume) using the ST SVC library

    _graphicEqualiser.process(_processBuffer, MIC_SAMPLES_PER_PACKET / 2);
    _volumeControl.process(_processBuffer, MIC_SAMPLES_PER_PACKET / 2);

    // we only want the left channel from the processed buffer

    int16_t *src = _processBuffer;
    dest = data_out;

    for (uint16_t i = 0; i < MIC_SAMPLES_PER_PACKET / 2; i++) {
      *dest++ = *src;
      src += 2;
    }

    // send the adjusted data to the host

    if (USBD_AUDIO_Data_Transfer(&hUsbDeviceFS, data_out, MIC_SAMPLES_PER_PACKET / 2) != USBD_OK) {
      Error_Handler();
    }
  }
}

/**
 * Override the I2S DMA half-complete HAL callback to process the first MIC_MS_PER_PACKET/2 milliseconds
 * of the data while the DMA device continues to run onward to fill the second half of the buffer.
 */

inline void Audio::I2S_halfComplete() {
  sendData(_sampleBuffer, _sendBuffer);
}

/**
 * Override the I2S DMA complete HAL callback to process the second MIC_MS_PER_PACKET/2 milliseconds
 * of the data while the DMA in circular mode wraps back to the start of the buffer
 */

inline void Audio::I2S_complete() {
  sendData(&_sampleBuffer[MIC_SAMPLES_PER_PACKET], &_sendBuffer[MIC_SAMPLES_PER_PACKET / 2]);
}

The DMA peripheral is configured to transfer 20ms of data into our buffer and to provide 'half-complete' and 'complete' interrupts. Therefore we have 10ms to decode, process and send 10ms of data before the next interrupt happens.

The first stage decodes the 64-bit mono samples from _sampleBuffer into 16-bit interleaved stereo signals in _processBuffer. Stereo is required because the GREQ filter will not operate on mono. To simulate stereo I simply duplicate the left channel into the right.

The second stage calls the GREQ filter to process the data in-place. Nice that these filters can work on data in-place.

The third stage calls the SVC filter to also process the data in-place.

The final stage takes the processed left channel from _processBuffer, copies it into _sendBuffer and calls USBD_AUDIO_Data_Transfer to transmit it. USBD_AUDIO_Data_Transfer has some constraints. It cannot be called more frequently than once per millisecond — check. You must pass it an amount of data that matches the calling frequency — I'm calling it once every 10ms with 480 mono samples which is exactly correct for a 48kHz stream. There may also be an upper cap of 1000 bytes on the packet to match a USB buffer size but that's not documented in the ST source code.

Performance

To measure the performance of the audio transformation and processing pipeline I inserted some debug code to toggle a GPIO pin at the start of the sendData method and then again just after the call to USBD_AUDIO_Data_Transfer. The actual data transfer performed by USBD_AUDIO_Data_Transfer is interrupt-driven so that part is not included in the performance figures. I used my oscilloscope to probe the oscillating GPIO pin and measured the time between the rising and falling edges.

Recall that I have 10ms to do all the work in the interrupt handler. In debug mode processing takes 1.7ms. In release mode it takes 1.5ms. I'm pleasantly surprised by this performance and it does indicate that the opaque audio processing blocks provided by ST Micro perform very well.

Watch the video

I've made a short video that talks about this project and shows the microphone in operation with some sound samples so you can hear it.

It looks better when viewed directly from the YouTube website. Click here for that.

Free gerber files

If you'd like to build this board yourself then you can download the Gerber files from here. These can be uploaded to the JLCPCB website.

Free firmware

It's all available on Github. Click here to go to the repository.

Fixing the USB microphone mute button click


In my previous article I documented the design and build process of my USB microphone based around an STM32F446 MCU. If you haven’t already read it then it’s probably worth catching up now before reading the rest of this article so that you have the necessary context.

The problem

I’ve been using the microphone for a while now and never really noticed that there was a noise issue with the hardware mute button until I recorded a sound file using Audacity that featured me coming in and out of mute. The noise issue is caused by my poor choice of hardware button:

These buttons are cheap PCB mounted momentary press-release buttons that have an audible click both on the press and the subsequent release. Unfortunately, because the button is located close to the INMP441 sensor the click is very audible.

Click here to listen to the problem. I come out of mute at the start and go back in at the end.

Given that I’m using this microphone daily for virtual meetings I can only assume that either the Skype audio is so bad that you can’t differentiate these clicks from the usual audio corruption that Skype randomly applies or my colleagues are too polite to tell me that I’m clicking at them.

The fix

I didn’t want to desolder the button from the board and bodge in a momentary button that doesn’t click because that would ruin the nice neat appearance of the board. Instead I decided to see what I could do in software.

General approach

Instead of always reacting to a button down event I would need to get smarter and react on the upward or downward button transition depending on whether I was going into, or coming out of mute. This needed a redesign of my generic Button class to inform me when either a new up or down state was reached, with a different reaction delay for up and down. Here’s the modified class:

/*
 * This file is part of the firmware for the Andy's Workshop USB Microphone.
 * Copyright 2021 Andy Brown. See https://andybrown.me.uk for project details.
 * This project is open source subject to the license published on https://andybrown.me.uk.
 */

#pragma once

/**
 * A debounced button class
 */

class Button: public GpioPin {

  public:
    enum CurrentState {
      None,
      Up,
      Down
    };

  private:

    static const uint32_t DEBOUNCE_UP_DELAY_MILLIS = 100;
    static const uint32_t DEBOUNCE_DOWN_DELAY_MILLIS = 1;

    enum InternalState {
      Idle,                         // nothing happening
      DebounceUpDelay,              // delaying...
      DebounceDownDelay
    };

    bool _pressedIsHigh;            // The button is electrically HIGH when pressed?
    InternalState _internalState;   // Internal state of the class

    bool _lastButtonReading;        // the last state we sampled
    uint32_t _transitionTime;       // the time of the last transition

  public:
    Button(const GpioPin &pin, bool pressedIsHigh);
    CurrentState getAndClearCurrentState();   // retrieve current state and reset to idle
};

inline Button::Button(const GpioPin &pin, bool pressedIsHigh) :
    GpioPin(pin) {

  _transitionTime = 0;
  _lastButtonReading = false;
  _pressedIsHigh = pressedIsHigh;
  _internalState = Idle;
}

/**
 * Get and reset the current state. This should be called in the main application loop.
 * @return The current state. If the current state is one of the up/down pressed states
 * then that state is returned and then internally reset to none so the application only
 * gets one 'notification' that the button is pressed/released.
 */

inline Button::CurrentState Button::getAndClearCurrentState() {

  // read the pin and flip it if this switch reads high when open

  bool buttonReading = getState();
  if (!_pressedIsHigh) {
    buttonReading ^= true;
  }

  const uint32_t now = HAL_GetTick();

  if (_lastButtonReading == buttonReading) {

    // no change in the button reading, we could be exiting the debounce delay

    switch (_internalState) {

    case DebounceUpDelay:
      if (now - _transitionTime > DEBOUNCE_UP_DELAY_MILLIS) {
        _internalState = Idle;
        return Up;
      }
      break;

    case DebounceDownDelay:
      if (now - _transitionTime > DEBOUNCE_DOWN_DELAY_MILLIS) {
        _internalState = Idle;
        return Down;
      }
      break;

    case Idle:
      break;
    }

    return None;

  } else {

    // button reading has changed, this always causes the state to enter the debounce delay

    _transitionTime = now;
    _lastButtonReading = buttonReading;
    _internalState = buttonReading ? DebounceDownDelay : DebounceUpDelay;

    return None;
  }
}

The user of this class calls getAndClearCurrentState() in the main loop and it will return Up or Down exactly once when there’s a transition and None otherwise.

Going into mute

When going into mute I want muting to happen immediately when the button is pressed down, hopefully before it emits a click. When the button comes back up it won’t matter because we’ll be in the muted state. To get that immediate response I set the debounce delay for a button down transition to 1ms in the Button class.

Coming out of mute

I want the transition out of mute to happen when the button comes up, and I want it to be delayed sufficiently that the click caused by releasing the button is skipped. To achieve that I set the debounce delay for a button up transition to 100ms in the Button class.
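The asymmetric-delay idea boils down to something like this standalone sketch (a simplified rewrite for illustration, not the firmware's Button class):

```cpp
#include <cstdint>

enum class Event { None, Up, Down };

// A Down transition is reported after only 1ms of a stable pressed reading;
// an Up transition waits 100ms so the click of the released button is skipped.
class Debouncer {
  static const uint32_t UP_DELAY_MS = 100;
  static const uint32_t DOWN_DELAY_MS = 1;

  bool _lastReading = false;
  bool _pending = false;
  uint32_t _transitionTime = 0;

public:
  // 'pressed' is the raw pin reading, 'now' a millisecond tick (e.g. HAL_GetTick)
  Event poll(bool pressed, uint32_t now) {
    if (pressed != _lastReading) {        // edge: restart the debounce timer
      _lastReading = pressed;
      _transitionTime = now;
      _pending = true;
      return Event::None;
    }
    if (!_pending) {
      return Event::None;
    }
    const uint32_t delay = pressed ? DOWN_DELAY_MS : UP_DELAY_MS;
    if (now - _transitionTime > delay) {  // stable long enough: report once
      _pending = false;
      return pressed ? Event::Down : Event::Up;
    }
    return Event::None;
  }
};
```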

The above logic is encapsulated in the MuteButton subclass that distills everything down into a single isMuted method to return the current state.

/*
 * This file is part of the firmware for the Andy's Workshop USB Microphone.
 * Copyright 2021 Andy Brown. See https://andybrown.me.uk for project details.
 * This project is open source subject to the license published on https://andybrown.me.uk.
 */

#pragma once

class MuteButton: public Button {

  private:
    volatile bool _muted;
    bool _ignoreNextUp;

  public:
    MuteButton();

    void run();
    bool isMuted() const;
};

inline MuteButton::MuteButton() :
    Button(GpioPin(MUTE_GPIO_Port, MUTE_Pin), false) {

  _muted = false;
  _ignoreNextUp = false;
}

inline void MuteButton::run() {

  Button::CurrentState state = getAndClearCurrentState();

  // check for idle

  if (state == Button::CurrentState::None) {
    return;
  }

  if (state == Down) {

    if (!_muted) {
      _muted = true;
      _ignoreNextUp = true;   // the lifting of the button shouldn't exit mute
    }
  }
  else {

    if (_muted) {

      if (_ignoreNextUp) {

        // this is the lifting of the button that went into mute

        _ignoreNextUp = false;
      }
      else {
        _muted = false;
      }
    }
  }
}

inline bool MuteButton::isMuted() const {
  return _muted;
}

Testing and more fixes

With the above fixes I ran some tests by recording with Audacity and repeatedly pressing the mute button. For the most part it worked as I hoped but sometimes, about 1 in 5, there was still a 'pop' spike at either transition but now it sounded more like a pop due to a problem with the audio encoding rather than the 'click' of the button.

When muted, my code zeroed out the audio buffer before sending it to the host. On the hunch that this hard transition could be causing a 'pop' at the entry into and/or exit from mute, I decided to switch to the soft 'mute' function of ST's Smart Volume Control (SVC) library that I was already using to control volume. The documentation in UM1642 has this to say about the mute function:

The SVC "mute" dynamic parameter mutes the output when set to 1 or has no influence on input signal when set to 0. When enabled, it allows mute the signal smoothly over a frame, avoiding audible artifacts.

This approach appeared to totally solve the problem when coming out of mute, the pop had totally gone no matter how many times I pressed and released the button. However, going into mute (where a fast reaction is required) it only improved slightly on the previous results. I suspect that the algorithm that smooths out the transitions is too slow to catch the pop sound.

To fix this last issue I went back to the method of zeroing out data frames. I found by experimentation that zeroing out the first 500ms of data when transitioning into mute solved the problem in at least 9 out of 10 cases, and I'm happy with that. The core sendData audio interrupt handler now looks like this.

inline void Audio::sendData(volatile int32_t *data_in, int16_t *data_out) {

  // only do anything at all if we're connected

  if (_running) {

    // ensure that the mute state in the smart volume control library matches the mute
    // state of the hardware button. we do this here to ensure that we only call SVC
    // methods from inside an IRQ context.

    if (_muteButton.isMuted()) {
      if (!_volumeControl.isMuted()) {
        _volumeControl.setMute(true);

        // the next 50 frames (500ms) will be zero'd - this seems to do a better job of catching the
        // mute button 'pop' than the SVC filter mute when going into a mute

        _zeroCounter = 50;
      }
    }
    else {
      if (_volumeControl.isMuted()) {

        // coming out of a mute is handled well by the SVC filter

        _volumeControl.setMute(false);
      }
    }

    if (_zeroCounter) {
      memset(data_out, 0, (MIC_SAMPLES_PER_PACKET * sizeof(uint16_t)) / 2);
      _zeroCounter--;
    }
    else {

      // transform the I2S samples from the 64 bit L/R (32 bits per side) of which we
      // only have data in the L side. Take the most significant 16 bits, being careful
      // to respect the sign bit.

      int16_t *dest = _processBuffer;

      for (uint16_t i = 0; i < MIC_SAMPLES_PER_PACKET / 2; i++) {

        // dither the LSB with a random bit

        int16_t sample = (int16_t)(((data_in[0] >> 16) & 0xfffe) | (rand() & 1));

        *dest++ = sample;     // left channel has data
        *dest++ = sample;     // right channel is duplicated from the left
        data_in += 2;
      }

      // apply the graphic equaliser filters using the ST GREQ library then
      // adjust the gain (volume) using the ST SVC library

      _graphicEqualiser.process(_processBuffer, MIC_SAMPLES_PER_PACKET / 2);
      _volumeControl.process(_processBuffer, MIC_SAMPLES_PER_PACKET / 2);

      // we only want the left channel from the processed buffer

      int16_t *src = _processBuffer;
      dest = data_out;

      for (uint16_t i = 0; i < MIC_SAMPLES_PER_PACKET / 2; i++) {
        *dest++ = *src;
        src += 2;
      }
    }

    // send the adjusted data to the host

    if (USBD_AUDIO_Data_Transfer(&hUsbDeviceFS, data_out, MIC_SAMPLES_PER_PACKET / 2) != USBD_OK) {
      Error_Handler();
    }
  }
}

Click here to listen to the audio after these fixes have been applied. I come out of mute at the start and go back in at the end.

Conclusion

Well, obviously I should have thought ahead that a microphone project shouldn't really have noisy components sitting right next to the sensor, but at least I've been able to almost totally fix the problem in software alone, so I'll have confidence using the mute feature in virtual meetings now.

The code fix has just been merged to master on GitHub so if you pull the latest changes you'll get the fixes.

Bluetooth Low Energy and the STM32WB55 MCU


I’m a subscriber to ST’s regular email newsletter and though most of it isn’t interesting to me I did notice that in one of the recent editions they were promoting their wireless range of STM32-based MCUs. As a big fan of the STM32 this caught my eye and coincided with some ideas that were spinning around in my head for new wireless projects.

The series of MCUs in question is the STM32WB55 and so I went off to familiarise myself with what ST were offering.

It turns out that these are very flexible wireless devices. Operating in the 2.4GHz range they are capable of implementing any protocol that ST has decided to provide a wireless stack for, which at the time of writing includes BLE, Bluetooth Mesh, Zigbee and Thread.

The cost of this flexibility is ease of use. These ICs do not provide a simple high-level programming interface that hides the underlying protocols from you. They’re very much like the ST USB peripheral — flexible but expect to get your hands dirty with the details of the protocol. If you’ve ever seen ST’s low-level source code before then you will be feeling a sense of foreboding right about now.

Challenge accepted, as they say on the internet. I set about designing a board that would allow me to experiment with BLE in the form of a wireless, battery-powered temperature sensor. After all, all this RF stuff can’t really be as hard as they say it is; can it? Actually yes, it is. But more on that later.

The wireless temperature sensor design

I decided to support four NTC thermistor-type sensors, each one providing an input to one of the STM32 ADC channels through a simple resistor divider that would allow me to easily calculate the temperature by way of an accurate external reference voltage provided to the MCU.

The MCU will be battery-powered, direct and unregulated, from two AA cells so I’ll need to be aware that the input voltage will fluctuate between about 3.2V and around 2V when the cells are exhausted. Running from battery power means that I’ll need to be mindful of power consumption. My target battery life is a week from a pair of rechargeable AAs, which will only measure about 2.5V when fully charged.

The big variable is BLE itself. I’ve never worked with the protocol before and have only downloaded and skimmed the ST documentation. What I’m planning seems feasible and it’ll be this project that validates that. Even if I run into a brick wall I will have learned a lot along the way.

Off we go then, here’s the schematic that I’ll be building.


Click on the thumbnail for a full-size PDF.

Not very much to it is there? It’s basically just the MCU, temperature sensors, voltage reference and the RF antenna. Oh yes and a couple of LEDs to flash for fun.

The voltage reference

Since I’m going to be using the ADC I need a steady reference voltage. My battery supply will run from about 3.2V for non-rechargeable AAs down to 2V so my selection of reference voltage is 1.8V, provided by a Microchip MCP1501.

The STM32WB55

The schematic for the MCU follows ST’s recommendations. The maximum clock speed for the MCU is 64MHz and to that end there’s a 32MHz crystal on the board. I will actually run the CPU clock slower than 64MHz to save battery power. ST’s software stack requires a low speed clock and so I’ve provided a 32.768kHz crystal for the LSE. The SWDIO and SWCLK lines provide the programming and debug interface.

The power supply setup is a little different to what you might be used to with an STM32. To reduce power consumption there’s an efficient switch-mode power supply on the die and you have to provide external inductors for it. The values are given in the datasheet and are provided by me in the above schematic. The SMPS switches itself out in favour of the LDO when the MCU voltage falls below a certain level.

It makes sense to have an input that will tell me when my battery is going flat so I’m doing that with a simple voltage divider tapped at the center to an ADC input on the MCU. A full 3.2V charge will measure 1.6V at the ADC, falling to 1V as the battery life is exhausted.
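The divider maths is simple enough to sketch. Here's a minimal helper assuming the 12-bit ADC and 1.8V reference described above (the function name is mine; the arithmetic matches what the firmware does later):

```c
#include <stdint.h>

// Sketch of the battery measurement: a 2:1 divider (two equal 510k
// resistors) feeding a 12-bit ADC with a 1.8V reference.
static uint32_t batteryMillivolts(uint16_t adcReading) {

  // voltage at the divider tap in mV, then undo the divide-by-two
  return ((uint32_t)adcReading * 1800u / 4096u) * 2u;
}
```

A reading of 2048 (half scale, 0.9V at the tap) works out to 1800mV at the battery, well into the "flat" region.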


A very flat QFN

The MCU is not offered in a package that has leads, presumably to avoid creating antennas where they should not be. The most hacker-friendly package is the QFN48 in the 256kB STM32WB55CCU6 MCU. This is the thinnest QFN I’ve seen to date. Even the 0603 capacitors tower over it. The market for slimmer and slimmer smart devices is driving IC manufacturers to produce ever more compact packages.

The QFN has an exposed ground pad on the bottom, and it’s the only ground connection so it has to be soldered down. This is the only part of working with a QFN that I dislike. In fact I dislike any package pin where I can’t get a probe in to test continuity after the board has been made.

The temperature sensors

These sensors are simply NTC thermistors that you can get on ebay. Technically speaking they are a 3950 NTC with a beta value of 3380, a nominal resistance of 10k@25C and a dissipation constant of 5mW/C. These values are important when it comes to implementing the conversion software later on.

By placing them in the upper section of a potential divider with a known lower section I can tap off the center to an ADC input, measure the voltage and hence calculate the resistance of the thermistor and from that I can calculate the temperature (more on that below).

Accurate readings depend on the stability of the discrete resistors over time and temperature. It doesn’t matter so much whether they’re a 1% or 5% tolerance part because you can get your most accurate multimeter out and measure them before soldering them to the board; the software can then be programmed to use the measured values, and that’s exactly what I did.

The Antenna

ST have written application notes that provide you with many antenna options. AN5129 explains PCB-printed antennas and provides reference designs and AN5434 goes into even more detail. I’m certainly not short of options. My first attempt was an abject failure but we’ll save my cock-ups for later in this article. This new, improved version uses an Ethertronics M830520 2.4GHz chip antenna filtered by an ST MLPF-WB55-01E3 passive filter network.


Ceramic chip antenna

I really need to talk about that filter IC. It is a tiny, microscopically small, bumpless 6-pad chip-scale IC. It is by far the smallest package I’ve ever had the displeasure to work with. It is, however, a beautiful thing to behold when viewed under the microscope. My microscope is a traditional design that you look into with your eyes and doesn’t have a camera attachment so I couldn’t capture an image that way. This is the best image I could take using a DSLR with a 2:1 macro lens setup.


An 0603 capacitor is included for reference

It is made entirely from glass with a black top and the filter network etched on by some sort of laser or other extremely high-precision manufacturing process. The 6 tiny pads that you can see are on the outer surface of the glass and the connecting network is at least one layer of glass below it. Writing the product code in there in a 100µm font is just showing off.

And I thought working with the QFN was going to be a pain.

Bill of materials

Here’s a full bill of materials for this project.

Identifiers | Value | Quantity | Description | Footprint
----------- | ----- | -------- | ----------- | ---------
C1, C3 | 4.7µ | 2 | Ceramic capacitors | 0603
C2, C4, C5, C6, C7 | 100n | 5 | Ceramic capacitors | 0603
C9, C10 | 10p | 2 | Ceramic capacitors | 0603
C11 | 2.2µ | 1 | Ceramic capacitors | 0603
C12 | 300p | 1 | Ceramic capacitors | 0603
C14 | 100p | 1 | Ceramic capacitors | 0603
D1 | Link LED | 1 | Red LED | 2012
D2 | Power LED | 1 | Red LED | 2012
L1 | 10n | 1 | HK160810NJ-T inductor | 0603
L2 | 10µ | 1 | LQM21FN100M70L | 0805
P1 | HDR2x3 | 1 | 2x3-pin header | 100 mil
P2 | JST XHP-5 | 1 | Female SWD header | 5x 2.5mm
P3 | HDR2x2 | 1 | 2x2-pin header | 100 mil
P4 | 2xAA battery holder | 1 | Keystone Electronics 2462 | custom
R1, R2, R3, R6 | 10k | 4 | SMD resistor | 0805
R4, R5 | 390 | 2 | SMD resistor | 0805
R7, R8 | 510k 1% | 2 | SMD resistor | 0805
S1 | power switch | 1 | MHSS1104 | custom
SW1 | reset | 1 | PCB pushbutton | custom
T1, T2, T3, T4 | JST XHP-2 | 4 | Thermistor connectors | 2x 2.5mm
U1 | STM32WB55CCU6 | 1 | MCU | UFQFPN-48
U2 | MCP1501 1.8V | 1 | Microchip voltage reference | SOT23-6
U3 | MLPF-WB55-01E3 | 1 | Passive filter network | custom
Y1 | NX3225SA-32.000MHZ-STD-CSR-1 | 1 | 32MHz crystal | custom
Y2 | Epson FC135 | 1 | 32.768kHz crystal | 2012

PCB layout

Four-layer impedance-controlled PCBs for $6 from JLCPCB are the new normal now, so with this project having a gigahertz RF component that’s the stackup I chose to use. The inner layers are unbroken ground and power planes, respectively.

The trace leading to the chip antenna is 11.55 mil wide, which gives it a 50Ω impedance. The layout for the chip antenna is taken directly from the Ethertronics datasheet and of course the inner and bottom layers are stripped of copper beneath it. The biggest worry I had was whether the tiny 300µm pads for the MLPF filter would prove too much for the $6 manufacturing process.

PCB Manufacturing

I never bother with the expensive express delivery options and so it took a couple of weeks for my pack of 5 PCBs to arrive from JLCPCB. Here they are.

Thankfully there were no problems with the MLPF pads and the soldermask in between them was intact. Most, but not all, of the soldermask was intact between the QFN pads as well, which will help a little with the construction.

I rather like the new-style black soldermask that JLCPCB are using. To my eyes it looks rather smart and, most importantly, you can see the traces. The old-style glossy black soldermask was much heavier in appearance and completely obscured the traces.

I went ahead and built the board by tinning the pads, balancing the components on the little solder bumps and then running the board through a reflow cycle in my reflow oven. Getting solder on to the QFN ground pad took some effort because of the heatsink effect of the inner ground plane. I had to simultaneously use a heat gun to heat the whole area and a soldering iron to flow the solder out across the pad.

Board testing

Here’s a picture of the board half-built with enough components for me to be able to test that it’s functioning. There’s no point wasting components if the board proves to be a write-off. All the firmware design and test was done with the board in this skeleton state.

Once tested the board can be fully assembled with the remaining components. Now it’s starting to really look the business.

Let’s talk about failure

I’ll let you into a little secret. This isn’t my first attempt at this board. My first attempt was a couple of months ago and it looked like this:

A number of things went wrong with this board that, when taken together made it obvious that I’d have to go back and revise the design for a version 2.

The footprint for the battery holder was wrong. Somehow I’d managed to mis-read the datasheet and place the mounting and terminal holes offset to the left. I blame the technical diagram. Honestly, I think there’s a competition going on between the guys that draw those diagrams to complete them with the fewest number of labelled distances leaving you to get a calculator out to work out the key offsets and sizes!

Secondly, and more seriously, I’d forgotten to hook up the MCU’s VDD net on the schematic to the battery output. That red bodge-wire that you see was put in place to work around that fail. Major facepalm.

Thirdly, I’d forgotten to provide some pins for external power input. Although this design was going to be battery powered I would need pins to provide power from my bench PSU during testing and development. Those yellow bodge wires that you see are working around that omission.

By now I was wondering whether it was possible to get anything else wrong with the design, and of course the answer was yes it was possible. In this first version I’d decided to go with a PCB meandering printed antenna. The dimensions were strictly copied from an ST application note and the associated passive filter network also taken from their application note.

It didn’t work. It simply didn’t show up at all on my smartphone bluetooth monitoring apps. Without a VNA on my bench there was no way I could even attempt a bodge so it was at this point that I had to give up and create version two.

Version two, the black one, fixes all the issues and adds pin headers for the USART/LPUART peripheral because I noticed that the ST code uses those for debug output and they could prove useful. As you’ve seen by now I’ve also abandoned my attempts at a printed antenna and filter network based on discrete components in favour of a chip antenna and filter IC.

Testing version two

I hooked up my ST-Link, connected external power and switched on. My Windows desktop ST-Link application detected that there was something there but refused to ID it, which was slightly worrying. After a bit of Googling it turns out that these STM32WB55 devices are a bit strange and will only be detected by the STM32CubeProgrammer application. This is a cross-platform Java application that works fine on Windows, but on Linux its dependency on an old JavaFX library means that it won’t work with the latest OpenJDK releases. On Linux I’ll be using the command line interface anyway, and on Windows the GUI looks like this:

Programming the STM32WB55

This is not as simple as just plugging in your ST-Link and flashing your masterpiece to the onboard flash memory. There are two processors inside this device, three distinct firmwares and multiple areas of flash and SRAM, some of which are shared between the CPUs and some of which are protected.

The main CPU onboard is a Cortex-M4 running at up to 64MHz. The co-processor, which you cannot access directly, is a Cortex-M0+ running at 32MHz. This CPU runs the RF firmware stack that you upload. The WB55 SKU that I’ve selected comes with 256kB of flash; 512kB and 1MB devices are also available. 256kB of flash and 128kB of SRAM may seem like a lot but the RF firmware stack takes a significant bite out of both, so you may need to buy a larger device than you think you need.

The three firmwares on this device are:

  • A firmware upgrade service used to upgrade the wireless stack. This is known as the FUS.
  • The wireless stack.
  • Your application firmware.

Updating any firmware requires you to use the STM32CubeProgrammer utility. The normal ST-Link utility will not recognise the WB devices — something that caused me to think my board was dead until discovering this little nugget of information on the internet.

Speaking of things you don’t easily find in ST documentation: the SRAM cannot be read back over the SWD connection and will always return 0x00000000, so when you see forum posts and documentation telling you to check the content of an SRAM location, forget it; you can’t. Only boards that have a USB programming connection can read out the SRAM.

The Firmware Upgrade Service (FUS)

This is the first firmware that you’ll have to tackle before you can do anything else. All STM32WB55 devices come with version 0.5.3 of the FUS installed and it must be upgraded before you can flash a wireless stack. At the time of writing version 1.2.0 is the latest version and it is (theoretically) possible to upgrade to it directly. Previously you had to step up through the versions to get to the one you want.

Once upgraded, the FUS cannot be downgraded.

Application note AN5185 is the place to start when researching which versions of the FUS are available and what your upgrade path will be. The SBRV option byte can be read to determine if the FUS is running, and if it needs to be upgraded. In the following examples I’ve aliased /usr/local/STMicroelectronics/STM32Cube/STM32CubeProgrammer/STM32_Programmer.sh to stprog for ease of use.

This command will read the FUS version, and it does work over SWD:

$ stprog -c port=swd mode=UR -r32 0x20010010 1
      -------------------------------------------------------------------
                        STM32CubeProgrammer v2.7.0                  
      -------------------------------------------------------------------

ST-LINK SN  : Unexpected_SN_Format
ST-LINK FW  : V2J37S7
Board       : --
Voltage     : 2.99V
SWD freq    : 4000 KHz
Connect mode: Under Reset
Reset mode  : Hardware reset
Device ID   : 0x495
Revision ID : Rev Y
Device name : STM32WB5x
Flash size  : 256 KBytes
Device type : MCU
Device CPU  : Cortex-M4


Reading 32-bit memory content
  Size          : 4 Bytes
  Address:      : 0x20010010

0x20010010 : 00050300

My FUS version is 00050300 which means that version 0.5.3 is active and I need to upgrade it before I can go any further.
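As the reads in this article show, the version word appears to pack major.minor.sub into its top three bytes. A little helper of my own (not part of any ST tool) makes the decoding explicit:

```c
#include <stdint.h>

// Decode the FUS/stack version word read from 0x20010010.
// Bytes are major.minor.sub from the most significant byte down,
// e.g. 0x00050300 -> 0.5.3 (the low byte carries other flags).
static void decodeFusVersion(uint32_t word,
                             uint8_t *major, uint8_t *minor, uint8_t *sub) {
  *major = (word >> 24) & 0xff;
  *minor = (word >> 16) & 0xff;
  *sub   = (word >> 8) & 0xff;
}
```

Running 0x01010200 through this gives 1.1.2, matching the version check later in the article.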

The FUS binaries are included with all the other firmware stacks in the STM32CubeWB package. You can download the latest from Github.

The firmware binaries and release notes are located in the Projects/STM32WB_Copro_Wireless_Binaries/STM32WB5x directory. As well as choosing the binary you also have to know the base address to flash it. These addresses are contained in a table in the release notes and vary depending on the device that you have and the version that you are flashing.

I first tried to flash the 1.2.0 firmware from the CubeWB 1.11.1 package straight to my 0.5.3 device. After all, the version compatibility table says you can. Didn’t work. I got an error message about version compatibility. Searching around it seems that this is a known issue so until that’s resolved I stepped back to CubeWB version 1.11.0.

That means my upgrade path is now 0.5.3 to 1.0.2 and then 1.0.2 to 1.1.2. Nothing’s ever easy with ST software is it?

I flashed FUS version 1.0.2 to my STM32WB55CCU6 using this command. The programmer output is quite verbose so I’ll strip the banners from the text I include here. 0x0803a000 is that base address I referred to earlier. Be careful not to just cut and paste these flashing commands without verifying your base address first.

$ stprog -c port=SWD -fwupgrade ./stm32wb5x_FUS_fw_1_0_2.bin 0x0803a000 firstinstall=0 -v

Memory Programming ...
Opening and parsing file: 0x495_FUS_Operator.bin
  File          : 0x495_FUS_Operator.bin
  Size          : 8 KBytes
  Address       : 0x08000000 

Erasing memory corresponding to segment 0:
Erasing internal memory sectors [0 1]
Download in Progress:
[==================================================] 100% 

File download complete
Time elapsed during download operation: 00:00:00.919
Application is running, Please Hold on...
Reconnecting...
Reconnected !
Warning: FUS_STATE_IMG_NOT_FOUND, Flash already empty !
Firmware delete Success
Download firmware image at address 0x803a000 ...

Memory Programming ...
Opening and parsing file: stm32wb5x_FUS_fw_1_0_2.bin
  File          : stm32wb5x_FUS_fw_1_0_2.bin
  Size          : 24492 Bytes
  Address       : 0x0803A000 

Erasing memory corresponding to segment 0:
Erasing internal memory sectors [58 63]
Download in Progress:
[==================================================] 100% 

File download complete
Time elapsed during download operation: 00:00:01.325

Firmware Upgrade process started ...

Application is running, Please Hold on...
Reconnecting...
Reconnected !
Reconnecting...
Reconnected !
Firmware Upgrade Success

And let’s verify that it worked:

$ stprog -c port=swd mode=UR -r32 0x20010010 1
Reading 32-bit memory content
  Size          : 4 Bytes
  Address:      : 0x20010010

0x20010010 : 01000200

It seems that it did work and now I have version 1.0.2 installed. Now to upgrade again from 1.0.2 to 1.1.2.

$ stprog -c port=SWD -fwupgrade ./stm32wb5x_FUS_fw.bin 0x0803a000 firstinstall=0
Firmware Upgrade process started ...

Application is running, Please Hold on...
Reconnecting...
Reconnected !
Reconnecting...
Reconnected !
Firmware Upgrade Success

Followed by the obligatory version check:

$ stprog -c port=swd mode=UR -r32 0x20010010 1 
Reading 32-bit memory content
  Size          : 4 Bytes
  Address:      : 0x20010010

0x20010010 : 01010200

Finally we're there and our first fight with ST's tools is over. Now we can flash a wireless stack.

The Wireless stack

A variety of wireless stacks are included with the CubeWB package including support for BLE, Thread and Zigbee. These are provided as encrypted blobs that are decrypted and flashed by the FUS that I just upgraded in the previous paragraph.

The stack that I want is the full BLE stack. The release notes tell me which file I want and what the base address for flashing should be so it should be a simple process of running the programmer command. Let's try it.

The first step is to delete the old firmware. I ran this step on a new device that presumably has no old firmware and it seemed to work.

$ stprog -c port=swd -fwdelete
Memory Programming ...
Opening and parsing file: 0x495_FUS_Operator.bin
  File          : 0x495_FUS_Operator.bin
  Size          : 8 KBytes
  Address       : 0x08000000 


Erasing memory corresponding to segment 0:
Erasing internal memory sectors [0 1]
Download in Progress:
[==================================================] 100% 

File download complete
Time elapsed during download operation: 00:00:00.907
Application is running, Please Hold on...
Reconnecting...
Reconnected !
Warning: FUS_STATE_IMG_NOT_FOUND, Flash already empty !
Firmware delete Success
fwdelete command execution finished

So far so good. Now we can flash the stack.

$ stprog -c port=swd mode=UR -fwupgrade ./stm32wb5x_BLE_Stack_full_fw.bin 0x8016000 firstinstall=1
Firmware Upgrade process started ...

Warning: Option Byte: nSWboot0, value: 0x0, was not modified.
Warning: Option Byte: nboot0, value: 0x1, was not modified.
Warning: Option Byte: nboot1, value: 0x1, was not modified.
Warning: Option Bytes are unchanged, Data won't be downloaded
Succeed to set nSWboot0=0 nboot1=1 nboot0=1 


Memory Programming ...
Opening and parsing file: 0x495_FUS_Operator.bin
  File          : 0x495_FUS_Operator.bin
  Size          : 8 KBytes
  Address       : 0x08000000 


Erasing memory corresponding to segment 0:
Erasing internal memory sectors [0 1]
Download in Progress:


File download complete
Time elapsed during download operation: 00:00:00.939
Application is running, Please Hold on...
Reconnecting...
Reconnected !
Reconnecting...
Reconnected !
Firmware Upgrade Success

The release notes now tell us to 'revert to default OB configuration' so let's do that since we appear to be on a roll.

$ stprog -c port=SWD mode=UR -ob nSWboot0=1 nboot1=1 nboot0=1
UPLOADING OPTION BYTES DATA ...

  Bank          : 0x00
  Address       : 0x58004020
  Size          : 96 Bytes

[==================================================] 100% 

  Bank          : 0x01
  Address       : 0x58004080
  Size          : 8 Bytes

[==================================================] 100% 


PROGRAMMING OPTION BYTES AREA ...
Warning: Option Byte: nboot0, value: 0x1, was not modified.
Warning: Option Byte: nboot1, value: 0x1, was not modified.

  Bank          : 0x00
  Address       : 0x58004020
  Size          : 96 Bytes



Reconnecting...
Reconnected !


UPLOADING OPTION BYTES DATA ...

  Bank          : 0x00
  Address       : 0x58004020
  Size          : 96 Bytes

[==================================================] 100% 

  Bank          : 0x01
  Address       : 0x58004080
  Size          : 8 Bytes

[==================================================] 100% 

OPTION BYTE PROGRAMMING VERIFICATION:

Option Bytes successfully programmed

It's all good. I now have a BLE wireless stack flashed on to the device. I did all this using the command line tools on Ubuntu Linux because I like to keep a repeatable history of what I've done. If you're uncomfortable with the command line and prefer to use a GUI then all these steps can be done using the programmer GUI. The release notes contain the point-and-click instructions that you need to follow.

Developing a BLE application

To start from scratch with a BLE application you can either take a copy of one of the sample applications included with the STM32CubeWB package or you can use the CubeMX GUI to generate boilerplate code for you and you fill in the blanks.

I chose the latter but constantly found myself referring sideways to example projects to see what they were doing, particularly when things weren't going well.

My first attempt was the peer-to-peer sample. That turned out to be a false start because it would require the whole annoying bluetooth pair-connect dance and a custom app to pair with. For a proof-of-concept application that would be too much.

After giving up on the peer-to-peer approach I then decided to use the Eddystone Beacon framework instead. Bluetooth beacons transmit their data continuously at fixed intervals you define. The disadvantage is that the amount and type of data you can transmit is very limited.

The two different types of beacon supported by the STM32_WPAN middleware are Google Eddystone and Apple iBeacon. I have no interest or involvement with anything Apple so I selected Eddystone. Although Google has discontinued their central bluetooth beacon services the protocol remains open source and supported by free beacon locator software on Android devices.

There are three Eddystone beacon sub-types: UID, URL and TLM. UID and URL transmit a GUID and a URL, respectively. Neither are much use to me. TLM (Telemetry) transmits device health information and is much more useful because it includes temperature, voltage and uptime; all of which I can provide. TLM is the protocol I'll implement.
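For reference, here's the unencrypted TLM payload layout from the open Eddystone specification, sketched as a C struct. The field names are mine, not ST's middleware types; multi-byte fields are big-endian on the wire and the temperature is signed 8.8 fixed point:

```c
#include <stdint.h>

// Eddystone-TLM (unencrypted) payload, per the open Eddystone spec.
typedef struct {
  uint8_t  frameType;   // 0x20 identifies a TLM frame
  uint8_t  version;     // 0x00 = unencrypted TLM
  uint16_t batteryMv;   // battery voltage, 1mV per bit
  uint16_t temperature; // beacon temperature, signed 8.8 fixed point
  uint32_t advCount;    // advertising PDU count since power-on
  uint32_t secCount;    // uptime in 0.1 second increments
} EddystoneTlmFrame;

// encode a Celsius reading into the signed 8.8 fixed-point field
static uint16_t tlmTemperature(float celsius) {
  return (uint16_t)(int16_t)(celsius * 256.0f);
}
```

So 25.5°C encodes as 0x1980 and sub-zero temperatures wrap into the top of the 16-bit range in two's complement.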

Application architecture

The STM32WB55 comes with an array of peripheral hardware that those familiar with the STM32 would consider to be very limited. Just one USART and an LPUART. Only four timers and two low power timers. Only one SPI. The reason for this becomes clear when you consider the requirement for low power.

These devices are expected to be battery powered and must eke out every last microwatt of energy to preserve battery life. The bottom line is that most of the time your program will be in one of the low power modes; sleeping, on standby or stopped. Hence the availability of the 'LP' peripherals that will work in some of the power-saving modes.

This is the clock tree that I'm going to be using. The RTC has to have an input and I'll use the LSI (more on that later). The HSE input frequency is the mandatory 32MHz which runs through the PLL to clock HCLK, APB1 and APB2 at 32MHz.

A branch from the output of the PLL runs through a multiplier/divider to clock the ADC at 12MHz. A lower ADC clock is key to getting accurate readings. I also program the ADC to take the maximum 640.5 sampling cycles per conversion for the best accuracy, which means that each channel produces a reading in about 53µs.

There are two key software components that come with the WPAN framework: a sequencer and a hardware timer server, both of which are documented in AN5289.

Sequencer

The sequencer is a library for scheduling function calls. You provide it with a callback function pointer and an ID and then in the future you can request that the function be called with a priority that you specify. Since the sequencer main loop is called by you in your main function its callbacks will be executed in the main (non-interrupt) context.

Timer server

The timer server allows you to register a callback that should happen after an elapsed time either as a one-shot event or repeatedly at the interval you specify. The timer server works off the RTC interrupt so it's active in the low power modes.

The RTC on the STM32WB can be run off a variety of sources including the LSE at 32.768kHz, LSI at 32kHz and HSE/32 at 1MHz. I included an Epson FC135 on my board to use as the LSE but unfortunately it wouldn't start, despite using exactly the same load capacitors on a different board where it works fine.

This meant that I had to use either LSI or HSE/32 as the input source. I'd prefer to use HSE/32 because it'll be more accurate than the internal LSI but unfortunately bugs somewhere in the ST code meant that it ran way too fast and I wasn't sufficiently motivated to find the error.

I was able to get the timer server working with the LSI selected as the input to the RTC by changing the definition for CFG_TS_TICK_VAL in the USER CODE BEGIN Defines section of app_conf.h like this:

#undef CFG_TS_TICK_VAL
#define CFG_TS_TICK_VAL DIVR( (CFG_RTCCLK_DIV * 1000000), LSI_VALUE )

If this modification were not made then the timer server would be based on the 32.768kHz LSE_VALUE, which is close to the 32000Hz LSI_VALUE but not the same, so the timer intervals would drift.
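To make the tick arithmetic concrete, here's a worked sketch. The DIVR rounding-divide macro and the CFG_RTCCLK_DIV value of 16 are assumptions based on the usual CubeMX-generated defaults; check your own app_conf.h for the real definitions:

```c
#include <stdint.h>

// Assumed: rounding divide and the default RTC clock divider of 16.
#define DIVR(x, y)      (((x) + ((y) / 2)) / (y))
#define CFG_RTCCLK_DIV  16u
#define LSI_VALUE       32000u

// tick length in microseconds: 16 * 1000000 / 32000 = 500µs per tick
#define CFG_TS_TICK_VAL DIVR((CFG_RTCCLK_DIV * 1000000u), LSI_VALUE)

// ticks for a 10 second repeat, the interval used with HW_TS_Start below
#define TEN_SECONDS_IN_TICKS  ((1000000u / CFG_TS_TICK_VAL) * 10u)
```

With these values each timer-server tick is 500µs, so a 10 second repeat is 20,000 ticks.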

Application implementation

I decided to sample the battery voltage and the temperature of the first thermistor every 10 seconds; the updated values would then be picked up by the beacon transmitter code. In app_ble.c I started a timer:

  // create a timer for sampling the temperatures at 10s intervals
  HW_TS_Create(CFG_TIM_PROC_ADC, &adcTimerId, hw_ts_Repeated, GetTemperature);
  HW_TS_Start(adcTimerId, (1000000 / CFG_TS_TICK_VAL) * 10);

Then I added the GetTemperature callback method to the same file.

static void GetTemperature(void) {

  // read the battery voltage [0] and the thermistor [1]

  volatile uint16_t adcReadings[2];
  ConversionComplete = FALSE;
  HAL_ADC_Start_DMA(&hadc1, (uint32_t*) adcReadings, 2);
  while (!ConversionComplete)
    ;
  HAL_ADC_Stop_DMA(&hadc1);

  // convert the battery voltage to mV (1.8V reference, 12-bit ADC, 2:1 divider)

  uint32_t batteryVoltage = ((adcReadings[0] * 1800) / 4096) * 2;

  // get the voltage at the thermistor divider tap. 10005 is the measured
  // value of the fixed resistor in the divider

  float voltage = 1.8 * (float) adcReadings[1] / 4096.0;

  // calculate the resistance of the thermistor

  float rt = ((1.8 * 10005) / voltage) - 10005;

  // convert the resistance to a temperature using the beta model

  float temperature = (1 / (0.003354016 + (0.000295858 * log(rt / 10000)))) - 273.15;

  // pack as whole degrees and hundredths (positive temperatures only)

  uint8_t hi = (uint8_t) temperature;
  uint8_t lo = (uint8_t) (100 * (temperature - hi) + 0.5);

  SetBeaconTemperatureAndVoltage((hi << 8) | lo, batteryVoltage);
}

void HAL_ADC_ConvCpltCallback(ADC_HandleTypeDef *hadc) {
  ConversionComplete = TRUE;
}

Sampling multiple ADC channels in one go on the STM32 is most easily done with interrupts and DMA. In this implementation I start the DMA conversion for the two channels I'm interested in and then busy-wait until the interrupt callback in HAL_ADC_ConvCpltCallback sets a volatile global flag to say that the conversion is done.

Because my GetTemperature callback is registered with the timer server and not one of the regular hardware timers it will continue to operate in low power mode when many of the core peripherals are stopped.

The implementation of SetBeaconTemperatureAndVoltage is over in eddystone_tlm_service.c.

void SetBeaconTemperatureAndVoltage(uint16_t temperature, uint16_t batteryVoltage) {
  EddystoneTLM_InitStruct.BeaconTemperature = temperature;
  EddystoneTLM_InitStruct.BatteryVoltage = batteryVoltage;
}

EddystoneTLM_InitStruct is the structure that's periodically transmitted by the beacon. The Eddystone TLM implementation actually transmits a URL and a TLM frame at slightly different time offsets. You'll see when using a Bluetooth scanner that the URL beacon usually shows up first, followed slightly later by an update that adds the sensor values to it.

ST's beacon code, although quite easy to follow, appears to lag behind the rest of the WPAN middleware in maturity. One issue is that there's nowhere to make user modifications that won't be overwritten if you go back to CubeMX, make a change and regenerate the source code. This means that I became very adept at doing a git add ... followed by a regenerate and then a git diff to reinsert what the UI had destroyed.

Testing

Well, I'm pleased to say that with relatively few iterations around the fix-test loop I had a beacon that showed up with the correct values in the Beacon Scanner Android app.

For this test the board was connected to my bench supply set to 2.5V, and my room thermometer was reading 23°C, so I'm happy with the values sensed by the STM32 ADC.

I was pleased to see that the Bluetooth range was about the same as you'd get from any commercial Bluetooth device. That is, it'll go through one or two internal partition walls in a house with ease, and from one floor to another if you're directly above or below it, but a solid brick wall would severely dent its range.

Battery life

I set up a test to see how long a pair of rechargeable AA batteries would last. I used Duracell branded NiMH batteries with an advertised capacity of 2450mAh. They started off fully charged and I left the device transmitting continuously. Occasionally I would check in and record the date & time as well as the measured battery voltage. Here are the results:

Date & time         Millivolts
06/08/2021 21:49    2700
07/08/2021 07:15    2650
07/08/2021 16:00    2642
07/08/2021 23:30    2630
08/08/2021 08:00    2622
08/08/2021 18:00    2616
09/08/2021 11:45    2604
09/08/2021 18:40    2604
10/08/2021 07:15    2600
10/08/2021 17:00    2600
11/08/2021 16:30    2594
12/08/2021 09:00    2586
13/08/2021 10:00    2586
13/08/2021 15:00    2586
14/08/2021 09:30    2586
15/08/2021 18:45    2572
16/08/2021 10:00    2562
16/08/2021 22:00    2560
17/08/2021 07:15    2558
18/08/2021 12:40    2544
18/08/2021 18:15    2544
19/08/2021 07:26    2530
20/08/2021 11:00    2530
21/08/2021 07:40    2496
23/08/2021 12:00    2446
24/08/2021 09:35    2418
24/08/2021 15:00    2404
24/08/2021 22:00    2390
25/08/2021 11:00    2264
25/08/2021 14:30    2116
25/08/2021 16:15    2046
25/08/2021 19:08    1834
25/08/2021 21:13    1832

It seems that I get about 20 good days of performance before the battery voltage tails off rapidly and the device stops transmitting. This could be improved by using non-rechargeable batteries, but I find those wasteful and prefer to use the rechargeables where possible.

Future applications

Bringing all four temperature sensors to bear would be a good goal for a future firmware. I'd like to avoid having to write a custom app to do this so I might explore the Bluetooth health protocol that permits multiple thermometers to advertise their readings without having to pair and connect. The thermometers are meant to be for defined areas of the body but I don't think I'll care too much if, for example, sensor one comes up on an app as my armpit temperature!

Watch the video

If you'd like to see this board in action along with some waffling by myself then you can see the video on YouTube:

Get the source code and the Gerber files

As always, the source code to my firmware is available on GitHub. Click here to go direct to the repo. This project's Gerber files are also available. Click here to download. They are suitable for uploading directly to JLCPCB.
