fastjson library for reading/writing JSON in C++

January 28, 2021

Categories: Programming Physics

The case for a fast JSON database

Simulation programs and data acquisition (DAQ) systems for physics experiments are typically driven by databases that contain the numerous parameters controlling the behavior of the program. These can vary from the mundane, such as how many events to simulate for a particular run, to the exotic, such as the neutron capture cross-section of hundreds of isotopes present in a detector. Historically, each set of input data has defined its own format in some contrived ASCII file that, while perhaps not difficult to parse, requires dedicated code to read that specific format. That’s not ideal, and a large push has been made in the last decade to normalize these formats. Fortunately, some modern simulation toolkits, like RAT-PAC, have opted to adopt industry-standard serialization formats like JSON for their databases. I have taken a similar approach for DAQ systems I have written, as seen in the WbLSdaq program I designed to read out CAEN digitizers to HDF5 files for the CHESS experiment at UC Berkeley.

JSON has the attractive features of being human-readable, very simple, and highly structured. It defines a suite of standard types (integers, floats, strings, and booleans), arrays of these types, and an “object” notation similar to a Python dictionary, or a C++ map with string keys and arbitrarily typed values. Most languages ship with tools for reading and writing JSON files; a notable exception is C++, where third-party libraries are required to gain this functionality. Third-party libraries are great, and there are many JSON options out there, but most are missing a critical feature for a human-readable database: the ability to provide inline comments and documentation. In fairness, the JSON specification also does not allow comments, though many JSON readers will happily ignore them.

Third-party JSON libraries for C++ tend to have one of two other undesirable features: they either pull in many other dependencies and a lot of code bloat, or they are quite slow. Reading JSON quickly is typically not a huge concern, but for something like RAT-PAC, with thousands (or more) of JSON documents that have to be read at the start of each simulation, taking more than a few milliseconds per document isn’t very attractive, and it becomes untenable if read times approach a second.

Early in my physics career I decided to write a small, fast JSON reader/writer that addresses all of these concerns while supporting inline comments: fastjson. I’ve used this in several physics-adjacent projects, including RAT-PAC and WbLSdaq, and it is GPL-licensed for anyone else to use as well. A brief overview of usage is given in the following sections.

Using fastjson

The project is available in the fastjson GitHub repository. Despite the disclaimer of “heavy development” in the README.md, fastjson has been stable for 5+ years. Perhaps at some point I will update README.md.

fastjson has no dependencies, uses no fancy modern C++ features, and has two source files: the header json.hh and the source json.cc. Simply add these to your project (or include as a git submodule) and you are good to go.

The components are defined in the json namespace, and three classes exist, along with a few type definitions for the JSON values supported.

namespace json {

    class Value;
    class Reader;
    class Writer;

    //types used by Value
    typedef long int TInteger;
    typedef unsigned long int TUInteger;
    typedef double TReal;
    typedef bool TBool;
    typedef std::string TString;
    typedef std::map<TString,Value> TObject;
    typedef std::vector<Value> TArray;
    
};

The json::Value class wraps all possible JSON values (there are no subclasses). A constructor is defined for each of the types mentioned above. There are get and set methods for each JSON type; the get methods do basic type checking and raise an exception if the underlying type cannot be safely converted to the requested type. The class defines the = operator to assign values, and the [] operator for object (with std::string keys) and array (with size_t indices) access. The [] operator returns json::Value references, which can be modified at will or used as l-values in assignment.

For a json::Value that is an object, getMember, isMember, and getMembers give access to the std::string keys. For a json::Value that is an array, getArraySize and setArraySize give access to the length of the underlying storage.

For those not afraid of C++ templates, there is a templated cast method which will convert to base C++ types, and a templated toVector method to convert arrays into a std::vector.

template <typename T> inline T cast() const;
template <typename T> inline std::vector<T> toVector() const;

For more information, or a better understanding of the possible arguments, see the json.hh header file.
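The type-checked getters and templated conversions described above can be illustrated with a small stand-in. The sketch below is not fastjson's code: MiniValue and its internals are hypothetical, it supports only integers, reals, and arrays, and it borrows the names getArraySize, setArraySize, cast, and toVector purely to mirror the ideas in this section. The real signatures are in json.hh.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for illustration only; this is NOT fastjson's json::Value.
// To stay short it holds only integers, reals, and arrays.
class MiniValue {
public:
    enum Type { TNULL, TINTEGER, TREAL, TARRAY };

    MiniValue() : type(TNULL), integer(0), real(0.0) {}
    MiniValue(long v) : type(TINTEGER), integer(v), real(0.0) {}
    MiniValue(double v) : type(TREAL), integer(0), real(v) {}

    // Type-checked getter: throws rather than silently truncating a real.
    long getInteger() const {
        if (type != TINTEGER) throw std::runtime_error("value is not an integer");
        return integer;
    }

    // Integers widen to double; anything else that is not a real is rejected.
    double getReal() const {
        if (type == TINTEGER) return static_cast<double>(integer);
        if (type != TREAL) throw std::runtime_error("value is not a number");
        return real;
    }

    // Turns this value into an array of n null elements (or resizes it).
    void setArraySize(std::size_t n) { type = TARRAY; items.resize(n); }

    std::size_t getArraySize() const {
        if (type != TARRAY) throw std::runtime_error("value is not an array");
        return items.size();
    }

    // Returns a reference, so elements can be assigned in place.
    MiniValue &operator[](std::size_t i) {
        if (type != TARRAY) throw std::runtime_error("value is not an array");
        return items.at(i);
    }

    // Converts to a base C++ type; specialized below for long and double.
    template <typename T> T cast() const;

    // Converts an array element by element using cast<T>().
    template <typename T> std::vector<T> toVector() const {
        std::vector<T> out;
        out.reserve(getArraySize());
        for (std::size_t i = 0; i < items.size(); ++i) out.push_back(items[i].cast<T>());
        return out;
    }

private:
    Type type;
    long integer;
    double real;
    std::vector<MiniValue> items;
};

template <> inline long MiniValue::cast<long>() const { return getInteger(); }
template <> inline double MiniValue::cast<double>() const { return getReal(); }
```

With this sketch, an array whose elements were assigned 2L and 3L converts through toVector<double>() to a vector holding 2.0 and 3.0, while calling getInteger() on a value constructed from 1.5 throws std::runtime_error.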

Reading with fastjson

The json::Reader class handles all of your JSON reading needs after being initialized from a C++ stream or string. getValue fills the referenced json::Value with the next value parsed and returns true, or returns false if there is nothing left. It was designed to be very fast, and implements the JSON spec with a few nonstandard extensions, such as comments with // or /* */ and hexadecimal notation.

//parses JSON values from a stream
class Reader {
    public:
        //Reads entire stream into internal buffer immediately
        Reader(std::istream &stream);

        //Copies the entire string into an internal buffer
        Reader(const std::string &str);

        ~Reader();

        //Returns the next value in the stream
        bool getValue(Value &result);

    protected:
        //Positional data in the stream data (gets garbled during parsing)
        char *data,*cur,*lastbr;
        int line;

        //Converts an escaped JSON string into its literal representation
        std::string unescapeString(std::string string);

        //Helpers to read JSON types
        Value readNumber();
        Value readString();
        Value readObject();
        Value readArray();

        void skipComment();

};
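To make the nonstandard extensions concrete, here is a minimal, self-contained sketch of how // and /* */ comments can be skipped and how a hexadecimal integer literal can be read. It is not the fastjson implementation, which handles comments inside its own parser; the function names stripJsonComments and parseIntegerLiteral are hypothetical, and the sketch assumes hexadecimal values carry a 0x or 0X prefix.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical helper: removes // and /* */ comments that appear outside of
// string literals, leaving everything else (including newlines) untouched.
std::string stripJsonComments(const std::string &in) {
    std::string out;
    out.reserve(in.size());
    bool inString = false;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char c = in[i];
        if (inString) {
            out += c;
            if (c == '\\' && i + 1 < in.size()) {
                out += in[++i];  // keep the escaped character verbatim
            } else if (c == '"') {
                inString = false;
            }
        } else if (c == '"') {
            inString = true;
            out += c;
        } else if (c == '/' && i + 1 < in.size() && in[i + 1] == '/') {
            // Line comment: skip to the end of the line, keeping the newline.
            while (i < in.size() && in[i] != '\n') ++i;
            if (i < in.size()) out += '\n';
        } else if (c == '/' && i + 1 < in.size() && in[i + 1] == '*') {
            // Block comment: skip through the closing "*/".
            i += 2;
            while (i + 1 < in.size() && !(in[i] == '*' && in[i + 1] == '/')) ++i;
            ++i;  // land on the '/' of "*/" (or past the end if unterminated)
        } else {
            out += c;
        }
    }
    return out;
}

// Hypothetical helper: reads a decimal integer, or a hexadecimal one when the
// token starts with 0x or 0X.
long parseIntegerLiteral(const std::string &token) {
    bool hex = token.size() > 2 && token[0] == '0' &&
               (token[1] == 'x' || token[1] == 'X');
    return std::strtol(token.c_str(), nullptr, hex ? 16 : 10);
}
```

The string-literal tracking matters: a value such as "x//y" must survive intact, so comment markers are only honored outside of quotes.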

Writing with fastjson

The json::Writer class creates human-readable, field-per-line JSON output with nice indentation of objects. Initialize it with the output stream the JSON should be written to, and pass values intended for output to the putValue method.

//writes JSON values to a stream
class Writer {
    public:
        //Only writes to the stream when requested
        Writer(std::ostream &stream);

        ~Writer();

        //This produces JSON compliant output at the expense of:
        //***Unsigned integers get printed as base 10 numbers, and the next parser may truncate into signed
        //Ultimately produces object-indented text with value-per-line mentality with arrays on a single line
        //which is similar enough to how RATDB looks without too much effort.
        void putValue(const Value &value);

    protected:
        //The stream to write to
        std::ostream &out;

        //Converts a literal string to its escaped representation
        std::string escapeString(std::string string);

        //Helper to write a value to the stream
        void writeValue(const Value &value, const std::string &depth = "");

};
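The output style described above (one field per line, arrays kept on a single line, strings escaped) can be sketched without the library. The code below is an illustration under stated assumptions rather than fastjson's Writer: escapeJsonString and formatFlatObject are hypothetical names, the four-space indent is an arbitrary choice, and only a flat object of integer arrays is handled.

```cpp
#include <cstddef>
#include <cstdio>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: converts a literal string to its JSON-escaped form.
std::string escapeJsonString(const std::string &in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(in[i]);
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\b': out += "\\b"; break;
            case '\f': out += "\\f"; break;
            case '\n': out += "\\n"; break;
            case '\r': out += "\\r"; break;
            case '\t': out += "\\t"; break;
            default:
                if (c < 0x20) {
                    // Remaining control characters need the \uXXXX form.
                    char buf[7];
                    std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(c));
                    out += buf;
                } else {
                    out += in[i];
                }
        }
    }
    return out;
}

// Hypothetical helper: formats a flat object with one field per line and each
// array on a single line.
std::string formatFlatObject(const std::map<std::string, std::vector<long> > &fields,
                             const std::string &indent = "    ") {
    std::ostringstream out;
    out << "{\n";
    std::size_t written = 0;
    for (std::map<std::string, std::vector<long> >::const_iterator it = fields.begin();
         it != fields.end(); ++it, ++written) {
        out << indent << '"' << escapeJsonString(it->first) << "\": [";
        for (std::size_t i = 0; i < it->second.size(); ++i) {
            if (i > 0) out << ", ";
            out << it->second[i];
        }
        out << ']';
        if (written + 1 < fields.size()) out << ',';  // no trailing comma
        out << '\n';
    }
    out << "}\n";
    return out.str();
}
```

Keeping arrays on one line while breaking objects across lines keeps long numeric tables compact without hiding the field names, which is the trade-off the Writer comments describe.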