nicexprs.h: a single header file ExpressJs-like middleware API on nghttp2 asio library #409

Open
45 of 59 tasks
t2ym opened this issue Apr 25, 2021 · 0 comments

t2ym commented Apr 25, 2021

Gists - Version [0.0.10] - 2021-05-18

  • nicexprs.h - The header file
    • script to get clean code
cat nicexprs.h |\
sed -e 's/^#include /@include /' |\
sed -e 's/^#\(.* NICEXPRS_H\)/@@@@\1/' |\
sed -e 's/^[ ]*$/==BLANK LINE==/' |\
cpp -P -C -D STREAMING_SUPPORT=1 -D GENERATOR_STREAM=1 -D BOOST_FILTER=1 |\
sed -e 's/^==BLANK LINE==//' |\
sed -e 's/@@@@/#/'  |\
sed -e 's/^@include /#include /' |\
awk -- '/^\/[*][ ]*$/, /\/\/ NICEXPRS_H/'

Etymology (obvious to Japanese speakers)

  • nicexprs.h

  • < nice + express + c++ header ext

  • < imitated express in c++

  • nice

  • < ni-se (にせ or 偽) fake, imitated

  • < ni-se (にせ or 似せ) conjunctive form of ni-su (にす or 似す)

  • < ni-su (にす or 似す) old form of nise-ru (似せる) to make resemble

  • < ni (に or 似) resembling; conjunctive form of ni-ru (にる or 似る)

    • + su (す) old form of the causative auxiliary verb se-ru (せる)
  • < ni-ru (にる or 似る) to resemble

Status - PoC research & development

To be used for HTTP server on Android #408

Current Raw Performance: As of 0.0.3

~51,000 req/sec with a single worker thread with Ubuntu 20.04 on Ryzen 7 3700X

  • Measured by h2load via 1Gbps LAN with 16 threads and 256 clients

~12,800 req/sec with 8 worker threads with Android 11 on Pixel 4a (Snapdragon 730G)

  • Measured by h2load via WiFi with 16 threads and 64 clients

"HELLO, WORLD" Web App (extracted from example)

#include "nicexprs.h"
int main(int argc, char *argv[]) {
...
    middleware_cb raw_body = [](request &req, response &res){
      auto method = req.method();
      if (method == "POST" || method == "PUT") {
        auto body = std::make_shared<std::ostringstream>();
        req.on_data([&req, body](const uint8_t *data, std::size_t len){
          if (len == 0) { // EOF
            req.body = body->str();
            req.next();
          }
          else {
            body->write(reinterpret_cast<const char *>(data), len);
          }
        });
      }
      else {
        req.next();
      }
    };

    middleware_cb uppercase_middleware = [](request &req, response &res){
      //boost::to_upper(req.body); // locale-aware feature-rich uppercasing
      // ASCII uppercasing; taking unsigned char keeps ::toupper well-defined for bytes >= 0x80
      std::transform(req.body.cbegin(), req.body.cend(), req.body.begin(), [](unsigned char c){ return static_cast<char>(::toupper(c)); });
      req.header().emplace("x-uppercase", header_value{ std::string("UPPERCASED") });
      res.on_push([](response &res, std::string &method, std::string &raw_path_query, header_map &header){
        //boost::to_upper(raw_path_query);
        std::transform(raw_path_query.cbegin(), raw_path_query.cend(), raw_path_query.begin(), [](unsigned char c){ return static_cast<char>(::toupper(c)); });
        header.emplace("x-uppercase", header_value{ std::string("UPPERCASED") });
      });
      res.on_response([](response &res){
        //boost::to_upper(res.body);
        std::transform(res.body.cbegin(), res.body.cend(), res.body.begin(), [](unsigned char c){ return static_cast<char>(::toupper(c)); });
        res.header().emplace("x-uppercase", header_value{ std::string("UPPERCASED") });
        auto it = res.header().find("content-length");
        auto value = std::to_string(res.body.size());
        if (it != res.header().end()) {
          it->second.value = std::move(value);
        }
        else {
          res.header().emplace("content-length", header_value{ std::move(value), false });          
        }        
        res.next();
      });
      req.next();
    };

    middleware_cb hello_handler = [](request &req, response &res){
      res.write_head(200, {{"foo", {"bar"}}});
      res.end("hello, world\n");
    };

    boost::system::error_code ec;

    std::string addr = argv[1];
    std::string port = argv[2];
    std::size_t num_threads = std::stoi(argv[3]);
    auto server_ptr = std::make_shared<http2>();
    http2 &server = *server_ptr;

    web_app_ptr app = (*web_app::create()) // create shared_ptr to web_app object
      .use(raw_body) // buffer request body to req.body
      .use(uppercase_middleware) // PoC middleware
      .get("/hello2", hello_handler) // hello, world
      .mount("/", server) // mount at / on http2 server
      .ptr(); // ptr() is equivalent to shared_from_this()
  
    boost::asio::ssl::context tls(boost::asio::ssl::context::sslv23);
    tls.use_private_key_file(argv[4], boost::asio::ssl::context::pem);
    tls.use_certificate_chain_file(argv[5]);

    nghttp2::asio_http2::server::configure_tls_context_easy(ec, tls);

    if (server.listen_and_serve(ec, tls, addr, port, true)) {
      std::cerr << "error: " << ec.message() << std::endl;
    }
    else {
      server.join();
    }
}
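
The `use(...)` / `req.next()` flow in the example above can be illustrated with a much smaller, synchronous sketch. Everything here (`sketch_context`, `sketch_middleware`, `sketch_chain`, `run_sketch_chain`) is a hypothetical stand-in invented for illustration; it is not how nicexprs.h is implemented, and it ignores asynchronous callbacks, routing, and HTTP/2 entirely.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration; these are NOT the nicexprs.h types.
struct sketch_context {
  std::string body;            // data handed from one middleware to the next
  std::function<void()> next;  // advances to the following middleware
};

using sketch_middleware = std::function<void(sketch_context &)>;

class sketch_chain {
public:
  // Register a middleware and return *this so calls can be chained fluently.
  sketch_chain &use(sketch_middleware mw) {
    assert(mw && "middleware must be callable");
    middlewares_.push_back(std::move(mw));
    return *this;
  }

  // Run the chain synchronously. A middleware that does not call ctx.next()
  // ends the chain, like a handler that sends the response.
  void handle(sketch_context &ctx) const {
    std::size_t index = 0;
    ctx.next = [this, &ctx, &index]() {
      if (index < middlewares_.size()) {
        middlewares_[index++](ctx);
      }
    };
    ctx.next();
    ctx.next = nullptr;  // the lambda refers to locals of this call, so drop it
  }

private:
  std::vector<sketch_middleware> middlewares_;
};

// Example chain: append, uppercase, then a terminal handler that wraps the body.
inline std::string run_sketch_chain(const std::string &input) {
  sketch_chain chain;
  chain
    .use([](sketch_context &ctx) { ctx.body += "!"; ctx.next(); })
    .use([](sketch_context &ctx) {
      std::transform(ctx.body.begin(), ctx.body.end(), ctx.body.begin(),
                     [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
      ctx.next();
    })
    .use([](sketch_context &ctx) { ctx.body = "[" + ctx.body + "]"; });
  sketch_context ctx{input, nullptr};
  chain.handle(ctx);
  return ctx.body;
}
```

Because `next` only advances an index, a middleware can run code both before and after the rest of the chain, which is the property the uppercase middleware relies on when it registers response callbacks before calling `req.next()`.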

Dependencies

Features

  • HTTP2 server middleware framework on top of the nghttp2 asio library
  • No changes to the base nghttp2 asio library (basically)
  • Similar to ExpressJs API
  • Compatible with C++14 and later

Notes on generator_streambuf implementation:

  • At first, read and write buffers were implemented as 2 separate generator_streambuf instances
    • gptr() of output streambuf and pptr() of input streambuf are unused
sink <-(overflow)-|output streambuf| <- filter - |input streambuf| <-(underflow)- source
                                   pptr()        gptr()

  • According to the C++11 standard, std::streambuf need not use one contiguous buffer for both input and output

    • The input pointer set (eback(), gptr(), egptr()) and the output pointer set (pbase(), pptr(), epptr()) do NOT necessarily interfere with each other
  • This led to the idea that the 2 streambuf instances can be combined into a single streambuf instance

    • The read and write operations, i.e., the buffering mechanisms, serve the UPPER LAYER filter instead of external input/output, while the source and sink are connected as the 2 underlying DEVICES
Conceptual diagram (some inaccuracies remain compared with the implementation):
          +-----------------------+
          |      do_filter()      |
          +-----------------------+
              | write        ^
              v              | readsome
          +-----------------------+
          |   generator_stream    |
          +-----------------------+
              | sputn        ^
              v              | sgetn
          +-----------------------+
          |  generator_streambuf  |
          +-----------------------+
              | sync         ^
              v              | underflow
          +---------+  +----------+
return <- |  sink   |  |  source  | <- generator_cb
          +---------+  +----------+

Ordinary std::streambuf implementation: (basic_filebuf may have 2 separate buffers for read and write pointers)

sink <-read- | already read | filled buffer | to be filled | <-write- source
             eback()        gptr()           egptr()
             pbase()                         pptr()         epptr()

generator_streambuf implementation:

  Input buffer:
    _M_gbuf (std::string)
    |already read | filled buffer | to be filled | <-underflow- source
    eback()       gptr()           egptr()        end of _M_gbuf

  Output buffer:
   non-overflowing mode:
             generator_cb buffer                   overflow buffer
sink <-sync- | written | to be filled |     +      | to be filled                   |  
             buf        pptr()         epptr()     _M_pbuf_overflow
             =_M_pbuf                  =buf+len
             =pbase() 

   overflowing mode:
             generator_cb buffer                   overflow buffer
sink <-sync- | written (filled)       |      +     | written         | to be filled |  
             buf=_M_pbuf               buf+len     pbase()            pptr()         epptr()
                                                   =_M_pbuf_overflow
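
The idea of one streambuf with independent get and put areas can be sketched with the standard `std::streambuf` interface alone. This is a simplified illustration, not the `generator_streambuf` implementation: the names `duplex_sketch_buf`, `source_fn`, `sink_fn`, and `uppercase_through_sketch` are hypothetical, both buffers have a fixed size, and there is no overflow buffer or deferred handling.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <iostream>
#include <streambuf>
#include <string>
#include <utility>

// Hypothetical sketch: one streambuf, two unrelated buffers.
class duplex_sketch_buf : public std::streambuf {
public:
  // source fills up to n bytes and returns the count (0 means end of input)
  using source_fn = std::function<std::size_t(char *, std::size_t)>;
  using sink_fn = std::function<void(const char *, std::size_t)>;

  duplex_sketch_buf(source_fn source, sink_fn sink, std::size_t size = 8)
    : source_(std::move(source)), sink_(std::move(sink)),
      gbuf_(size, '\0'), pbuf_(size, '\0') {
    assert(size > 0);
    setg(&gbuf_[0], &gbuf_[0], &gbuf_[0]);       // empty get area: first read calls underflow()
    setp(&pbuf_[0], &pbuf_[0] + pbuf_.size());   // empty put area over a different buffer
  }
  duplex_sketch_buf(const duplex_sketch_buf &) = delete;
  duplex_sketch_buf &operator=(const duplex_sketch_buf &) = delete;

protected:
  int_type underflow() override {                // refill the get area from the source
    if (gptr() < egptr()) return traits_type::to_int_type(*gptr());
    const std::size_t n = source_(&gbuf_[0], gbuf_.size());
    if (n == 0) return traits_type::eof();
    setg(&gbuf_[0], &gbuf_[0], &gbuf_[0] + n);
    return traits_type::to_int_type(*gptr());
  }
  int_type overflow(int_type ch) override {      // put area full: drain it to the sink
    flush_put_area();
    if (!traits_type::eq_int_type(ch, traits_type::eof())) {
      *pptr() = traits_type::to_char_type(ch);
      pbump(1);
    }
    return traits_type::not_eof(ch);
  }
  int sync() override { flush_put_area(); return 0; }

private:
  void flush_put_area() {
    const std::size_t n = static_cast<std::size_t>(pptr() - pbase());
    if (n > 0) sink_(pbase(), n);
    setp(&pbuf_[0], &pbuf_[0] + pbuf_.size());
  }
  source_fn source_;
  sink_fn sink_;
  std::string gbuf_;  // backs eback()/gptr()/egptr()
  std::string pbuf_;  // backs pbase()/pptr()/epptr()
};

// A filter in the upper layer: read from the source side, write uppercased bytes to the sink side.
inline std::string uppercase_through_sketch(const std::string &input) {
  std::size_t offset = 0;
  std::string output;
  duplex_sketch_buf buf(
    [&input, &offset](char *dst, std::size_t n) {
      const std::size_t count = std::min(n, input.size() - offset);
      std::copy_n(input.data() + offset, count, dst);
      offset += count;
      return count;
    },
    [&output](const char *src, std::size_t n) { output.append(src, n); });
  std::iostream stream(&buf);
  char c;
  while (stream.get(c)) {
    stream.put(static_cast<char>(std::toupper(static_cast<unsigned char>(c))));
  }
  stream.clear();  // get() set eofbit/failbit; clear them so flush() reaches sync()
  stream.flush();
  return output;
}
```

The point of the sketch is only that reads never touch `pbuf_` and writes never touch `gbuf_`, so a single `std::iostream` can pull from one device and push to another.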

Features in example server

  • Kotlin-style compiletime::trimIndent() for html templates
  • search parameter parser in custom_helper
  • body json parser in custom_helper
  • route parameter parser /route/:param1/:param2 in custom_helper
    • store parameters in helper.route_parameters
  • html template usage
    • Using inja - not mandatory
      • How to get: curl -L -O https://raw.githubusercontent.com/pantor/inja/master/single_include/inja/inja.hpp
    • simplify template_handler by moving async operations to backend_handler
  • pseudo-backend in custom_helper
    • pseudo-backend with 100ms delay
    • backend_handler sample to simplify async operations in template_handler
        .use(raw_body)
        .get("/route2/:param1/:param2")
          .use(route_parser)
          .use(backend_handler)
          .use(template_handler)
          .parent()
  • blocking threaded task without blocking asynchronous network I/O operations
    • threaded_task_handler sample to delegate blocking tasks to pooled threads
    • robust locking scheme for req.helper object
      • hand results of threaded tasks via promise instead of writing helper objects directly
      • verify no memory leaks on premature closing of streams
    • hand an error code to next(err) if necessary
    • helper.post_threaded_task to encapsulate inter-thread communication
  • static_file_handler sample
    • port the sample callback from asio-sv2.cc
    • add basic content-type header
    • detect accept-encoding request header(s) to judge if content-encoding: gzip is acceptable
    • support content-encoding: gzip for original.ext with pre-gzipped original.ext.gz files
    • detect if-modified-since request header and respond 304 Not Modified if necessary
    • redirect .../path to .../path/ if docroot/.../path is a directory
    • use .../path/index.html if docroot/.../path is a directory
    • adaptive buffering
  • client certificate authentication - research in progress at Gist nghttp2 ssl patch
    • Note: Typical use cases: On-premise servers where client certificates can be distributed via Active Directory or other secure scheme in the organization.
  • streaming filters
    • no-operation streaming filter example
    • no resize streaming filter example
    • line number filter
    • boost gzip compressor filter
    • uppercasing streaming filter
    • digest trailer streaming filter
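
As an illustration of the route parameter item in the list above (`/route/:param1/:param2`), a matcher could look roughly like the following. The actual `route_parser` in `custom_helper` is not shown in this issue, so `split_path_sketch` and `match_route_sketch` are hypothetical names, and the sketch assumes the path has no query string and needs no percent-decoding.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Split a path on '/', dropping empty segments ("/a//b/" -> {"a", "b"}).
inline std::vector<std::string> split_path_sketch(const std::string &path) {
  std::vector<std::string> segments;
  std::size_t begin = 0;
  while (begin <= path.size()) {
    std::size_t end = path.find('/', begin);
    if (end == std::string::npos) end = path.size();
    if (end > begin) segments.push_back(path.substr(begin, end - begin));
    begin = end + 1;
  }
  return segments;
}

// Match path against a pattern such as "/route2/:param1/:param2".
// On success, params is replaced with the captured values; on failure it is left untouched.
inline bool match_route_sketch(const std::string &pattern, const std::string &path,
                               std::map<std::string, std::string> &params) {
  assert(!pattern.empty() && pattern[0] == '/');  // this sketch expects absolute patterns
  const std::vector<std::string> pattern_segments = split_path_sketch(pattern);
  const std::vector<std::string> path_segments = split_path_sketch(path);
  if (pattern_segments.size() != path_segments.size()) return false;

  std::map<std::string, std::string> found;
  for (std::size_t i = 0; i < pattern_segments.size(); ++i) {
    const std::string &expected = pattern_segments[i];
    if (expected.size() > 1 && expected[0] == ':') {
      found[expected.substr(1)] = path_segments[i];  // ":name" captures the whole segment
    } else if (expected != path_segments[i]) {
      return false;                                  // literal segment mismatch
    }
  }
  params = std::move(found);
  return true;
}
```

Matching `/route2/abc/123` against `/route2/:param1/:param2` would then yield `param1 = abc` and `param2 = 123`, which is the kind of data the list describes storing in `helper.route_parameters`.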

TODOs

  • support multiple on_close callbacks invoked in an appropriate order
  • cache_handler middleware sample to handle on-memory caching for multiple worker threads
  • logger_handler middleware sample to emit logs with appropriate queueing for multiple worker threads
  • support streaming filters
    • streaming filters as a chain of stream_cb callbacks
    • std::iostream adaptor class generator_stream
    • boost::iostreams::filtering_streambuf<output> adaptor class boost_filtering_streambuf_adaptor
    • wrap stream_cb as response_cb if one or more response_cb are included in response_filters
      • stream_cb works in buffered mode like response_cb
    • propagation of latent deferred status of streaming filters is controlled by wrapper generator_cb callbacks
  • TBD
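
To make the "streaming filters as a chain of stream_cb callbacks" item more concrete, here is a rough sketch of chunk-wise filters composed in order. The real `stream_cb` signature is not shown in this issue, so `chunk_filter_sketch` and the three helper functions are assumptions made purely for illustration; the sketch has no end-of-stream signalling and no deferred state.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical filter type: takes one chunk, returns the transformed chunk.
using chunk_filter_sketch = std::function<std::string(const std::string &)>;

// Stateful filter: prefixes every line with its number, even when a line spans chunks.
inline chunk_filter_sketch make_line_number_filter_sketch() {
  return [line = 1, at_line_start = true](const std::string &chunk) mutable {
    std::string out;
    for (char c : chunk) {
      if (at_line_start) {
        out += std::to_string(line++) + ": ";
        at_line_start = false;
      }
      out.push_back(c);
      if (c == '\n') at_line_start = true;
    }
    return out;
  };
}

// Stateless filter: ASCII uppercasing of each chunk.
inline chunk_filter_sketch make_uppercase_filter_sketch() {
  return [](const std::string &chunk) {
    std::string out(chunk);
    std::transform(out.begin(), out.end(), out.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return out;
  };
}

// Compose filters so that each chunk flows through them in registration order.
inline chunk_filter_sketch chain_filters_sketch(std::vector<chunk_filter_sketch> filters) {
  return [filters = std::move(filters)](const std::string &chunk) mutable {
    std::string data(chunk);
    for (auto &filter : filters) data = filter(data);
    return data;
  };
}
```

Keeping the line counter inside the closure is what lets the filter work on arbitrary chunk boundaries without buffering the whole body, which is the main difference from a buffered `response_cb`-style filter.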

Issues

  • [design issue] Is it effective to add generator_stream::preempt(std::streamsize size, char *&out_beg, char *&out_end) and generator_stream::commit(char *out_beg, char *out_end) methods to write to output buffer directly?
  • [design issue] Should the input buffer in generator_streambuf be expanded if no filterable chunk is found in full buffer?
    • [mitigation] compromised filtering processes can be implemented in do_peek() and do_filter()
  • The default behavior of missing accept-encoding request header is incorrect
  • Segmentation fault on popping from empty std::list<quadple>
    • Reproducible only with certain -O2 optimized builds; unoptimized builds avoid the crash only by luck
  • [design issue?] middleware chain is (unexpectedly?) valid even after fallback
    app
      .get("/path")
        .use(middlewareA)
        .get("/subpath1", responding_middlewareB)
        .parent()
      .all(fallback_middleware) // on /path/subpath2, middlewareA is still in effect when fallback_middleware runs
  • [regression] Empty response of deferred response
    • Root Cause: On resume of a deferred response, buffer_body is not called again, and the first response filter is unexpectedly called prematurely
    • Fix: Add res.is_reading_body and call buffer_body again on resume if the body is still being read
  • Segmentation Fault without #define DUMP_MIDDLEWARE_ITEM 1
    • Work around SIGSEGV by outputting to ostringstream even when DUMP_MIDDLEWARE_ITEM is not defined
  • + is not decoded as a space in custom_helper.decode_url()
  • Server does not stop (i.e. server.join() blocks) on a signal, whose handler calls server.stop(), until all HTTP/2 sessions are disconnected from browsers
    • Is this by design?
    • Forceful disconnection methods should be explored
  • static_file_handler seems to percent-decode each path twice
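
The two URL decoding issues in the list above (`+` not decoded as a space, and a path apparently being percent-decoded twice) can be illustrated with a single-pass decoder. This is not the code of `custom_helper.decode_url()`, which is not shown here; `hex_value_sketch` and `percent_decode_sketch` are hypothetical names, and the `plus_as_space` flag reflects the usual convention that `+` means a space only in form-encoded query strings, not in paths.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Value of a hexadecimal digit, or -1 if c is not one.
inline int hex_value_sketch(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Decode %XX escapes exactly once. When plus_as_space is true, '+' becomes ' '
// (query-string convention); pass false for path components.
// Returns false on a malformed escape and leaves out untouched.
inline bool percent_decode_sketch(const std::string &in, bool plus_as_space,
                                  std::string &out) {
  std::string decoded;
  decoded.reserve(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    const char c = in[i];
    if (c == '%') {
      if (i + 2 >= in.size()) return false;  // truncated escape such as "%2"
      const int hi = hex_value_sketch(in[i + 1]);
      const int lo = hex_value_sketch(in[i + 2]);
      if (hi < 0 || lo < 0) return false;    // non-hex escape such as "%zz"
      decoded.push_back(static_cast<char>(hi * 16 + lo));
      i += 2;                                // skip the two hex digits
    } else if (c == '+' && plus_as_space) {
      decoded.push_back(' ');
    } else {
      decoded.push_back(c);
    }
  }
  out = std::move(decoded);
  return true;
}
```

Decoding once matters because `%2520` should become the literal text `%20`; running the decoder a second time would turn it into a space, so a request could reach a file name other than the one the client actually encoded.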

Change Log

To be converted to CHANGELOG.md in a dedicated project

[Unreleased]

Added

Changed

Removed

Fixed

[0.0.10] - 2021-05-18

Added

  • Support streaming filters if STREAMING_SUPPORT macro is defined as truthy
    • response::on_response(response_cb on_header, stream_cb cb)
      • register a streaming filter callback with its corresponding on-header callback
    • If STREAMING_SUPPORT macro is undefined or 0, nicexprs.h is almost the same as the previous version 0.0.9
  • Define std::iostream adaptor class generator_stream if GENERATOR_STREAM macro is defined as truthy
  • Define boost_filtering_streambuf_adaptor class if BOOST_FILTER macro is defined as truthy

Changed

  • Change the type of response::response_filters from std::list<response_cb> to std::list<filter_item>
    • filter_item contains response_cb for buffered filtering,
      or stream_cb for streamed filtering and another response_cb for header filtering in streaming

Removed

  • Remove unimplemented web_app::static_file()

[0.0.9] - 2021-05-17

Fixed

  • Fix the segmentation fault issue on popping an item from empty std::list<quadple>

[0.0.8] - 2021-05-10

Fixed

  • Fix the regression issue of empty deferred response

[0.0.7] - 2021-05-04

Added

  • Bypass response buffering if response filters are empty

[0.0.6] - 2021-05-02

Added

  • Support multiple on_close callbacks (std::list<close_cb> response::on_close_callbacks), invoked in reverse order of registration

Removed

  • response::is_on_close_set
  • response::on_close_

[0.0.5] - 2021-04-28

Added

  • Add web_app.get(std::string path) and web_app.post(std::string path)

[0.0.4] - 2021-04-27

Added

  • Add extensible req.helper to store and manipulate data per request

[0.0.3] - 2021-04-25

Changed

  • Work around SIGSEGV when DUMP_MIDDLEWARE_ITEM macro is not defined

[0.0.2] - 2021-04-25

Added

  • DUMP_MIDDLEWARE_ITEM macro to switch on/off dumping middleware_items

[0.0.1] - 2021-04-25

Added

  • Initial PoC version as a single header file
  • Subject to drastic changes
@t2ym t2ym changed the title [nicexprs.h] nicexprs.h: a single header file ExpressJs-like middleware API on nghttp2 asio library nicexprs.h: a single header file ExpressJs-like middleware API on nghttp2 asio library Apr 25, 2021
@t2ym t2ym pinned this issue Apr 11, 2022