Thursday 23 March 2017

gRPC is very interesting

MapGuide in its current form is a whole bucket of assorted libraries and technologies:
  • We use FDO for spatial data access
  • We use ACE (Adaptive Communication Environment) for:
    • Basic multi-threading primitives like mutexes, threads, etc
    • TCP/IP communication between the Web Tier and the Server Tier
    • Implementing a custom RPC layer on top of TCP/IP sockets. All of the service layer methods you use in the MapGuide API? They're all basically RPC calls sent over TCP/IP for the MapGuide Server to invoke its server-side eqvivalent. Most of the other classes that you pass into these service methods are essentially messages that are serialized/deserialized through the TCP/IP sockets. When you think about it, the MapGuide Web API is merely an RPC client for the MapGuide Server, which itself is an RPC server that does the actual work
  • We use Berkeley DBXML for the storage of all our XML-based resources
    • We have an object-oriented subset of these resource types (Feature Sources, Layer Definitions, Map Definitions, Symbol Definitions) in the MdfModel library with XML serialization/parsing code in the MdfParser library
    • Our Rendering and Stylization Engine work off of these MdfModel classes to render the maps that you see on your viewer
  • We use xerces for XML reading/writing XML in and out of DBXML
  • We use a custom modified (and somewhat ancient) version of SWIG to generate wrappers for our RPC client so that you can talk to the MapGuide Server in:
    • .net
    • Java
    • PHP
So why do I mention all of this?

I mention this, because I've recently been checking out gRPC, a cross-platform, cross-language RPC framework from Google.

And from what I've seen so far, gRPC could easily replace and simplify most of the technology stack we're currently using for MapGuide:
  • ACE? gRPC is the RPC framework! The only reason we'd keep ACE around would be for multi-threading facilities, but the C++ standard library at this point would be adequate enough to replace that as well
  • DBXML/MdfModel/xerces? gRPC is driven by Google Protocol Buffers.
    • Protobuf messages are strongly typed classes that serialize/deserialize into compact binary streams and is more efficient and faster than slinging around XML. Ever bemoan the fact you have to currently work with XML to manipulate maps/layers/etc? In .net you are reprieved if you use the Maestro API (where we provide strongly-typed classes for all the resource XML types), but for the other languages you have to figure out how to use the XML APIs/services provided by Java/PHP to work with the XML blobs that the MapGuide API gives and expects. With protobuf, you have none of these problems.
    • Protobuf messages can evolve in a backward-compatible manner
    • Because protobuf messages are already strongly-typed classes, it makes MdfModel/MdfParser redundant if you get the Rendering/Stylization engine to work against protobuf messages for maps/layers/symbols/styles/etc
    • If we ever wanted to add support for Mapbox Vector Tiles (which seems to be the de-facto vector tile format), well the spec is protobuf-based so ...
    • Protobuf would mean we no longer deal in XML, so we don't need Xerces for reading/writing XML and DBXML as the storage database (and all its cryptic error messages that can bubble up from the Resource Service APIs) can be replaced with something simpler. We may not even need a database at this point. Dumping protobuf messages to a structured file system could probably be a simpler solution
  • SWIG? gRPC and protobuf can already generate service stubs and protobuf message classes in the languages we currently target:
    • .net
    • Java
    • PHP
    • And if we wanted, we can also instantly generate a gRPC-based MapGuide API for:
      • node.js
      • Ruby
      • Python
      • C++
      • Android Java
      • Objective-C
      • Go
    • The best thing about this? All of this generated code is portable in their respective platforms and doesn't involve native code interop through "flattened" interfaces of C code wrapping the original C++ code, which is what SWIG ultimately does for any language we want to generate wrapper bindings out of. If it does involve native code interop, it's a concern that is taken care of by the respective gRPC/protobuf implementation for that language.
  • Combine a gRPC-based MapGuide Server with grpc-gateway and we'd have an instant REST API to easily build a client-side map viewer out of
  • gRPC works at a scale that is way beyond what we can currently achieve with MapGuide currently. After all, this is what Google uses themselves for building their various services
If what I said above doesn't make much sense, consider a practical example.

Say we had our Feature Service (which as a user of the MapGuide API, you should be familiar with) as a gRPC service Definition

// Message definitions for the request/response types below are omitted for brevity but basically every request and
// response type mentioned below will have eqvivalent protobuf message classes automatically generated along with
// the service

// Provides an abstraction layer for the storage and retrieval of feature data in a technology-independent way.
// The API lets you determine what storage technologies are available and what capabilities they have. Access
// to the storage technology is modeled as a connection. For example, you can connect to a file and do simple
// insertions or connect to a relational database and do transaction-based operations.
service FeatureService {
    // Creates or updates a feature schema within the specified feature source.
    // For this method to actually delete any schema elements, the matching elements
    // in the input schema must be marked for deletion
    rpc ApplySchema (ApplySchemaRequest) returns (BasicResponse);
    rpc BeginTransaction (BeginTransactionRequest) returns (BeginTransactionResponse);
    // Creates a feature source in the repository identified by the specified resource
    // identifier, using the given feature source parameters.
    rpc CreateFeatureSource (CreateFeatureSourceRequest) returns (BasicResponse);
    rpc DeleteFeatures (DeleteFeaturesRequest) returns (DeleteFeaturesResponse);
    // Gets the definitions of one or more schemas contained in the feature source for particular classes.
    // If the specified schema name or a class name does not exist, this method will throw an exception.
    rpc DescribeSchema (DescribeSchemaRequest) returns (DescribeSchemaResponse);
    // This method enumerates all the providers and if they are FDO enabled for the specified provider and partial connection string.
    rpc EnumerateDataStores (EnumerateDataStoresRequest) returns (EnumerateDataStoresResponse);
    // Executes SQL statements NOT including SELECT statements.
    rpc ExecuteSqlNonQuery (ExecuteSqlNonQueryRequest) returns (ExecuteSqlNonQueryResponse);
    // Executes the SQL SELECT statement on the specified feature source.
    rpc ExecuteSqlQuery (ExecuteSqlQueryRequest) returns (stream DataRecord);
    // Gets the capabilities of an FDO Provider
    rpc GetCapabilities (GetCapabilitiesRequest) returns (GetCapabilitiesResponse);
    // Gets the class definition for the specified class
    rpc GetClassDefinition (GetClassDefinitionRequest) returns (GetClassDefinitionResponse);
    // Gets a list of the names of all classes available within a specified schema
    rpc GetClasses (GetClassesRequest) returns (GetClassesResponse);
    // Gets a set of connection values that are used to make connections to an FDO provider that permits multiple connections.
    rpc GetConnectionPropertyValues (GetConnectionPropertyValuesRequest) returns (GetConnectionPropertyValuesResponse);
    // Gets a list of the available FDO providers together with other information such as the names of the connection properties for each provider
    rpc GetFeatureProviders (GetFeatureProvidersRequest) returns (GetFeatureProvidersResponse);
    // Gets the locked features.
    rpc GetLockedFeatures (GetLockedFeaturesRequest) returns (stream FeatureRecord);
    // Gets all available long transactions for the provider
    rpc GetLongTransactions (GetLongTransactionsRequest) returns (GetLongTransactionsResponse);
    // This method returns all of the logical to physical schema mappings for the specified provider and partial connection string
    rpc GetSchemaMapping (GetSchemaMappingRequest) returns (GetSchemaMappingResponse);
    // Gets a list of the names of all of the schemas available in the feature source
    rpc GetSchemas (GetSchemasRequest) returns (GetSchemasResponse);
    // Gets all of the spatial contexts available in the feature source
    rpc GetSpatialContexts (GetSpatialContextsRequest) returns (GetSpatialContextsResponse);
    // Inserts a new feature into the specified feature class of the specified Feature Source
    rpc InsertFeatures (InsertFeaturesRequest) returns (stream FeatureRecord);
    // Selects groups of features from a feature source and applies filters to each of the groups according to the criteria set in the aggregate query option supplied
    rpc SelectAggregate (SelectAggregateRequest) returns (stream DataRecord);
    // Selects features from a feature source according to the criteria set in the query options provided
    rpc SelectFeatures (SelectFeaturesRequest) returns (stream FeatureRecord);
    // Set the active long transaction name for a feature source
    rpc SetLongTransaction (SetLongTransactionRequest) returns (BasicResponse);
    // Connects to the Feature Provider specified in the connection string
    rpc TestConnection (TestConnectionRequest) returns (TestConnectionResponse);
    // Executes commands contained in the given command set
    rpc UpdateFeatures (UpdateFeaturesRequest) returns (UpdateFeaturesResponse);
    // Updates all features that match the given filter with the specified property values
    rpc UpdateMatchingFeatures (UpdateMatchingFeaturesRequest) returns (UpdateMatchingFeaturesResponse);
}

Running this service definition through the protoc compiler with grpc plugin gives us:
  • Auto-generated (and strongly-typed) protobuf classes for all the messages. ie: The request and response types for this service
  • An auto-generated FeatureService gRPC client ready to use in the language of our choice
  • An auto-generated gRPC server stub for FeatureService in the language of our choice ready for us to "fill in the blanks". For practical purposes, we'd generate this part in C++ and fill in the blanks by mapping the various service operations to their respective FDO APIs and its return values to our gRPC responses.
And at this point, we'd just need a simple C++ console program that bootstraps gRPC/FDO, registers our gRPC service implementation, start the gRPC server on a particular port and we'd have a functional Feature Service implementation in gRPC. Our auto-generated Feature Service client can connect to this host and port to immediately start talking to it.

The only real work is the "filling in the blanks" on the server part. Everything else is taken care of for us.

Extrapolate this to the rest of our services (Resource, Rendering, etc) and we basically have a gRPC-based MapGuide Server.

Also filling in the blanks is a conceptually simple exercise as well:
  • Feature Service - Pass down the APIs in FDO.
  • Rendering Service - Setup up FDO queries based on map/layers visible and pass query results to the Rendering/Stylization engine.
  • Resource Service - Read/write protobuf resources to some kind of persistent storage. It doesn't have to be something complex like DBXML, it can be as simple as a file system (that's what mg-desktop does for its resource service implementation btw)
  • Tile Service - It's just like the rendering service, but you're asking the Rendering/Stylization engine to render tile-sized content.
  • KML Service - Just like rendering service, but you're asking the Rendering/Stylization engine to render KML documents instead of images.
  • Drawing Service - Do we still care about DWF support? Well if we have to support this, it's just passing down to the APIs in DWF Toolkit.
  • Mapping Service - It's a mish-mash of tapping into the Rendering/Stylization engine and/or the DWF Toolkit.
  • Profiling Service - Just tap into whatever tracing/instrumentation APIs provided by gRPC.
Now because gRPC is cross-language, nothing says we have to use C++ for all the service implementations, it's just that most of the libraries and APIs we'd be mapping into are already in C++, so in practical terms we'd stick to the same language as well.

Front this with grpc-gateway, and we basically have our RESTful mapagent to build a map viewer against.

There's still a few unknowns:
  • How do we model file uploads/downloads?
  • Can server-side service implementations call other services?
Google's vast set of gPRC definitions for their various web services can answer the first part. The other part, will need more playing around.

The thought of a gRPC-based MapGuide Server is very exciting!

No comments: