Saturday, February 25, 2017

LiteESP8266Client: AT Command library on 100 bytes of SRAM!

Two weeks ago, I grumbled about the state of Arduino libraries (they use too much SRAM for stupid reasons).

Last week, I presented a zero-global-SRAM serial logging library.

This week, I'm offering an ESP8266 AT client library that uses 100 bytes of SRAM (if you use my zero-SRAM serial library for serial output) - almost all of it used within the SoftwareSerial library.

More importantly, I'd like this to show off how to do a fairly complex library without using nearly as much SRAM as the alternatives!


If this is something that interests you, read on!




The Problem - SRAM Use!

There are a few different ESP8266 AT client libraries out there - and they're all awful about their SRAM use.

The common ATmega328 chip (used in the Arduino Uno) has 2048 bytes of SRAM - yes, 2KB.  Not 2MB.  Not 2GB.  2048 precious bytes.

The SparkFun ESP8266 AT library is what I started using for playing with the ESP8266 chips - and it uses 684 bytes of my precious SRAM.


The WeeESP8266 Library HTTP GET example uses 728 bytes of SRAM.  It doesn't allocate extra buffers like the SparkFun library, but it stores constant strings in data memory and uses an awful lot of String variables, which can lead to heap fragmentation (leaving less available memory over time).  Since it returns String types, I expect perfectly reasonable looking code stands a good chance of fragmenting the heap.


And I haven't found anything radically better.  All the libraries I can find abuse their data memory privileges in some way or another.

So I did better.

100 bytes of SRAM - for a fully functional ESP8266 client and serial output (if you're using the LiteSerialLogger library I wrote).  Nearly all of that is used inside the SoftwareSerial library; the other 9 bytes are from the millis() timer.


Using the Library

The example given in the README file is probably the way to start, and checking the header for function use would be wise.

Basically, you put the radio in station mode, join to an AP, and then once connected transmit and receive data.

Transmitting is straightforward (the send() or send_progmem() functions, depending on where the data is coming from), but receiving is a tiny bit tricky.  To use as little memory as possible, the receive function will read the data block coming from the radio, allocate that much memory, and copy the data in.  It's the caller's responsibility to free this data - but you should also check to make sure the returned pointer is not null, because that indicates failure to either find a response or to allocate memory.  I considered a few other options, but this works for me, and this library is written, first and foremost, for my needs.

Development

An AT interface library is a reasonably complex library to write, and one thing that helped me immensely was having two serial interfaces to my development device.  The Arduino serial interface uses my LiteSerialLogger class to output data, but I also ran with another serial device monitoring the communications between the Arduino and the ESP8266 (or, rather, monitoring the responses from the ESP8266, because my little adapter won't read from the TX bus).

Because I have a custom shield PCB I've been working on (yes, I'll talk about this in the future - when it's finished), I just used this as my development shield.  This is based on my $10 ESP8266 shield, and the debug header seemed a good idea.


This is certainly not the most complex software package I've worked on in my life, but it's a decently complex Arduino library to write - especially if you're focused on keeping SRAM use to a minimum.

Mostly, I sat down with the datasheet for the ESP8266 and a bunch of scratch paper, diagrammed out what I wanted to do, worked out how to get there, and then wrote code.

While building an Arduino library, it's vitally important to build it on a regular basis (this takes a few seconds in the IDE - there's no excuse for not doing it) and make sure the global memory use counter stays where you expect.  If you add 4 bytes to the class, it's reasonable to expect this to increase by 4 bytes - but not 8, 12, whatever.  And if you don't expect your code to be using any global memory, be sure to track down an increase.

This isn't something that's terribly easy to do at the end.  You have to do it during development, so you have a small chunk of code to look over if something changes unexpectedly.

Program Memory for Strings

Look.  Put your constant strings in program memory.  There is no excuse not to.  It's a Harvard Architecture.  That means separate memory spaces for program and data - and unless you explicitly specify that strings go in the program space, they end up taking up precious data memory, forever!

You'll note all my constant strings (commands, results, etc) are defined like this:

const char ESP8266_COMMAND_CONNECT[] PROGMEM = "CIPSTART=";
...
const char ESP8266_SERIAL_OPTIONS[] PROGMEM = ",8,1,0,0";
...
const char ESP8266_RESPONSE_OK[] PROGMEM = "OK\r\n";
const char ESP8266_RESPONSE_ERROR[] PROGMEM = "ERROR\r\n";
...
const char ESP8266_CONTENT_LENGTH_HEADER[] PROGMEM = "Content-Length: ";

There's a reason!  That "PROGMEM" specifier, as attached to a const char array, means that these strings are all living in program memory.  It takes a bit more work on the programmer side to get them out than just referencing them, but it's worth the SRAM savings, and this is where a lot of the libraries go wrong.

pgm_read_byte_near is a useful function.  So are strcpy_P and strlen_P.  Use them.  If you have no idea what on earth this is, the Arduino references on PROGMEM are a useful start.  Or... if you have no idea about any of this, maybe don't go writing Arduino libraries until you learn.  Embedded programming.  Harvard architecture.  Just saying.
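As a rough sketch of what using those functions looks like (this is not code from the library: EXAMPLE_RESPONSE_OK, equals_progmem and copy_from_progmem are names made up for illustration, and the #else branch is only a fallback so the same file also compiles on a desktop compiler):

```cpp
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__AVR__)
#include <avr/pgmspace.h>
#else
// Host fallback (assumption for off-target builds): flash and RAM share one
// address space here, so the PROGMEM helpers collapse to plain accesses.
#define PROGMEM
#define pgm_read_byte_near(addr) (*(const unsigned char *)(addr))
#define strlen_P(s) strlen(s)
#define strcpy_P(dst, src) strcpy((dst), (src))
#endif

// Declared the same way as the constants above; the name is illustrative.
const char EXAMPLE_RESPONSE_OK[] PROGMEM = "OK\r\n";

// Hypothetical helper: compare a RAM string against a program-memory string
// one byte at a time, without copying the flash string into SRAM.
bool equals_progmem(const char *ram_string, const char *progmem_string) {
  assert(ram_string != NULL && progmem_string != NULL);
  for (size_t i = 0;; i++) {
    char expected = (char)pgm_read_byte_near(progmem_string + i);
    if (ram_string[i] != expected) return false;
    if (expected == '\0') return true;  // Both strings ended together.
  }
}

// Hypothetical helper: copy a program-memory string into a caller-supplied
// buffer, refusing if it would not fit (terminator included).
bool copy_from_progmem(char *buffer, size_t buffer_size,
                       const char *progmem_string) {
  if (strlen_P(progmem_string) + 1 > buffer_size) return false;
  strcpy_P(buffer, progmem_string);
  return true;
}
```

The byte-at-a-time comparison costs no SRAM beyond a couple of stack variables.  The copy variant is for the cases where something really does need the string in RAM, and a short-lived stack buffer keeps that cost temporary.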

Buffers & Matching State Machines

One of the complaints I have with the SparkFun library, and some of the others as well, is that they use another buffer on top of the Serial or SoftwareSerial receive buffer.  This is just wasteful.  In at least some cases, it seems to be there to allow the programmer to use the "strcmp" function and be lazy (that the code refers to handling buffer wraparound as a "todo" backs this opinion).

A far better approach to searching for strings in a buffer is to pop the bytes off, updating the "match" counts until you either find a match or hit the timeout.  Just write a simple state machine to keep track of matches, and you end up with something far more memory efficient.

This is the core loop in my single string matcher.  It checks for an available character, and if there is one, reads it, checks it against the constant string in program memory (pgm_read_byte_near and some pointer math), then returns success if the string is matched (or resets things if the character doesn't match).

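  // Loop until the timeout is reached.  Subtracting start_time first keeps
  // the comparison correct when millis() wraps around.
  while ((millis() - start_time) < timeout_ms) {
    // Only proceed if a character is available.
    if (radio_serial_->available()) {
      char received = radio_serial_->read();
      // If the character matches the expected character in the response,
      // increment the count.  If not, reset things.
      if (received ==
              (char)pgm_read_byte_near(progmem_response_string + matched_chars)) {
        matched_chars++;

        if (matched_chars == response_length) {
          return LITE_ESP8266_SUCCESS;
        }
      } else {
        // Character did not match - reset, but let this character start a
        // new match if it is the first character of the response.
        matched_chars =
            (received == (char)pgm_read_byte_near(progmem_response_string)) ? 1 : 0;
      }
    }
  }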

This is a tight, efficient way of handling reads from an already existing buffer, and you may note that it only uses stack variables - it's not doing anything with heap allocations.
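The same idea extends to watching for more than one response at once, such as "OK" versus "ERROR".  What follows is only a sketch under assumptions, not the library's implementation: ProgmemMatcher, matcher_feed and classify_response are hypothetical names, the bytes come from an in-memory array instead of the radio's serial port, and the #else branch exists so it compiles off-target.

```cpp
#include <stddef.h>
#include <stdint.h>

#if defined(__AVR__)
#include <avr/pgmspace.h>
#else
// Host fallback (assumption for off-target builds).
#define PROGMEM
#define pgm_read_byte_near(addr) (*(const unsigned char *)(addr))
#endif

const char SKETCH_RESPONSE_OK[] PROGMEM = "OK\r\n";
const char SKETCH_RESPONSE_ERROR[] PROGMEM = "ERROR\r\n";

// Hypothetical per-pattern state: a pointer into flash plus a progress count.
struct ProgmemMatcher {
  const char *pattern;  // Program-memory string to look for.
  uint8_t length;       // Pattern length, excluding the terminator.
  uint8_t matched;      // How many leading bytes have matched so far.
};

// Feed one received byte; returns true once the whole pattern has been seen.
bool matcher_feed(ProgmemMatcher *m, char c) {
  if (c != (char)pgm_read_byte_near(m->pattern + m->matched)) {
    // Mismatch: restart, then let this byte try to begin a new match.
    m->matched = 0;
  }
  if (c == (char)pgm_read_byte_near(m->pattern + m->matched)) {
    m->matched++;
    if (m->matched == m->length) {
      m->matched = 0;
      return true;
    }
  }
  return false;
}

// Returns 1 if OK completes first, -1 if ERROR completes first, 0 if neither
// appears.  On hardware the bytes would come from the serial read inside the
// timeout loop rather than from an array.
int classify_response(const char *stream, size_t stream_length) {
  ProgmemMatcher ok = {SKETCH_RESPONSE_OK, 4, 0};
  ProgmemMatcher error = {SKETCH_RESPONSE_ERROR, 7, 0};
  for (size_t i = 0; i < stream_length; i++) {
    if (matcher_feed(&ok, stream[i])) return 1;
    if (matcher_feed(&error, stream[i])) return -1;
  }
  return 0;
}
```

Each extra pattern costs a pointer and two bytes of stack while the function runs, and nothing once it returns.  The restart-on-mismatch rule is good enough for short responses like these; a pattern with a repeated prefix would need a proper failure table.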

Heap Allocations & Return Data

I do allocate data on the heap, explicitly in my "get data" functions, for allocating a character array to return the data.

I considered a variety of ways of returning the data, and this seemed the cleanest.  I allocate one entry on the heap (while taking a max bytes size to limit the amount allocated if required), and the caller needs to free it.
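The ownership contract can be sketched roughly like this.  It is an illustration only: copy_payload and handle_payload are hypothetical names (the real function names are in the library header), the payload comes from an array rather than the radio, and the NUL terminator on the returned buffer is an assumption of this sketch.

```cpp
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical stand-in for a "get data" call: copies at most max_bytes of
// the payload into a freshly allocated, NUL-terminated buffer.  Returns NULL
// if there is no payload or the allocation fails.  The caller owns the result.
char *copy_payload(const char *payload, size_t payload_length,
                   size_t max_bytes) {
  if (payload == NULL || payload_length == 0) return NULL;
  size_t copy_length = payload_length < max_bytes ? payload_length : max_bytes;
  char *result = (char *)malloc(copy_length + 1);
  if (result == NULL) return NULL;
  memcpy(result, payload, copy_length);
  result[copy_length] = '\0';
  return result;
}

// Caller side: check for NULL, use the data, then free it exactly once.
// Returns the number of bytes received, or -1 on failure.
int handle_payload(const char *payload, size_t payload_length,
                   size_t max_bytes) {
  char *data = copy_payload(payload, payload_length, max_bytes);
  if (data == NULL) return -1;
  int received = (int)strlen(data);
  free(data);
  return received;
}
```

Doing a single allocation and freeing it right after use means the heap returns to its previous state after each request, which avoids the slow fragmentation that String-heavy code tends to cause.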

In a system with more memory, I'd certainly consider taking a string variable as a reference, but an Arduino is not such a system.

Also, a number of other interesting libraries (one of the JSON parsing libraries in particular) can work with a char* without needing to copy things, but end up copying data around and using more memory when handed a String.  This really matters when you're starting with 2KB of RAM.

Don't assume that "convenient" is the best way to go - it is in many environments, but the Arduino is a severely limited embedded environment.  I've got a watch with 512MB of RAM and 4GB of flash, and the Arduino Uno has 2KB of RAM and a whopping 32KB of storage.

Results

I'm quite happy with my results here.  With the combination of my zero-SRAM serial logging library and this library, I've taken ESP8266 AT communication from 700+ bytes of SRAM (roughly a third of the available memory) down to 100 bytes - to accomplish, literally, the exact same things.  I call that a win!

So go take a look, and if it's useful, let me know.
