Thursday 19 October 2017

The journey of porting the MapGuide Maestro API to .net standard

So what prompted the push to port the MapGuide Maestro API to .net standard was Microsoft recently releasing a whole slate of developer goodies:
Of particular relevance to this subject of this post, is .net standard 2.0.

For those who don't know, .net standard is (you guessed it) a versioned standard by which one can write portable and cross-platform class libraries against that will work in any .net runtime environment that supports the version of .net standard that you are targeting. If you do Android development, this is similar to API levels.

.net standard is of interest to me as the MapGuide Maestro API at the moment is a set of class libraries that target the full .net Framework. Having it target .net standard instead would give us guaranteed cross-platform portability across .net runtime environments that support .net standard (Mono) and/or supporting platforms that would never have been possible before in the past (.net Core/Xamarin/UWP)

I tried an attempt at porting the Maestro API to earlier versions of .net standard, with mixed success:
  • The ObjectModels library was able to be ported to .net standard 1.6, but required installing many piecemeal System.* nuget packages to fill in the missing APIs.
  • Maestro API itself could not be ported due to reliance of XML schema functionality and HttpWebRequest, that no version of .net standard before 2.0 supported.
  • Maestro API had upstream dependencies (eg. NetTopologySuite) that were not ported to .net standard.
  • More importantly, the bits I were able to port across (ObjectModels), I couldn't run their respective (full-framework) unit test libraries from the VS test explorer due to cryptic assembly loading errors due to the assembly manifest of the various piecemeal System.* assemblies not matching their assembly reference. With no way to run these tests, the porting effort wasn't worth continuing.
Around this time, I heard of what the upcoming (at the time) .net standard 2.0 would bring to the table:
  • Over 2x the API surface of netstandard1.6, including key missing APIs needed by the Maestro API like the XML schema APIs and HttpWebRequest
  • A compatibility mode for full .net Framework. If this works as hoped, it means we can skip waiting on upstream dependencies like NetTopologySuite and friends needing to have netstandard-compatible ports and use the existing versions as-is.
Given the compelling points of .net standard 2.0 and mixed results with porting to the (then) current iteration on .net standard, I decided to put these porting efforts on ice and wait until the time when .net standard 2.0 and its supporting tooling comes out.

Now that .net standard 2.0 and supporting tooling came out, it was time to give this porting effort another try ... and I could not believe how much less painful the whole process was! This was basically all I had to do to port the following libraries to .net standard 2.0:

Preparation Work

To be able to use our (ported to .net standard 2.0) MaestroAPI in the (full framework) Maestro windows application, we needed to first re-target all affected project files to target .net Framework 4.6.1, as this is the minimal version of the full .net framework that supports .net standard 2.0

OSGeo.FDO.Expressions

This is a class library that uses the Irony grammar parser to parse FDO expression strings to an object oriented form. Maestro uses this library to be able to analyze FDO expressions for validation purposes (eg. You don't have a FDO expression that references a property that doesn't exist).

My process of converting the existing full framework csproj file to .net standard was to basically just replace the entire contents of the original csproj file with the minimum required content for a .net standard 2.0 class library.


1
2
3
4
5
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>netstandard2.0</TargetFramework>
  </PropertyGroup>
</Project>

That's right, the content of a minimal .net standard 2.0 class library is just 5 lines of XML! All .cs files are implicitly included now when building this project, which greatly contributes to the simplicity of the new csproj format.

Now obviously this project file as-is won't compile as we need to reference Irony and use VS2017 to regenerate the resx string bundles and source link shared assembly info files. After those changes were made, the project builds with the only notable warning being NU1701, which is the warning emitted by the new tooling when we reference full framework libraries/packages from a netstandard2.0 class library (that the new tooling allows us to do for compatibility purposes).

It was around this time that I discovered that someone has made a netstandard-compatible port of Irony, so we replaced the existing Irony reference with the netstandard-compatible port instead. This library was now fully ported across.

ObjectModels

This is the class library that describes all of our XML resources in MapGuide as strongly-typed classes with full XML (de)serialization support to and from both forms at various schema versions.

The original porting attempt targeted netstandard 1.6. While this was mostly painless, I had to reference tons of piecemeal System.* nuget packages, which then flowed down to anything that was referencing it.

For this attempt, we target .net standard 2.0 using the same technique of pasting a minimal netstandard2.0 class library template into the existing csproj file. Like the previous attempt, building this project failed due to dependencies on System.Drawing as a result of usages of System.Drawing.Font. Further analysis shows that we were using Font as a glorified DTO. So it was just a case of adding a new type that carried the same properties we were capturing with the System.Drawing.Font objects that were being passed around.

Due to referencing the NETStandard.Library metapackage by default, this attempt did not require referencing piecemeal System.* nuget packages like the previous attempts. So that's another library ported across.

MaestroAPI

Now for the main event. Maestro API needed to be netstandard-compatible otherwise this whole porting effort is a waste. The previous attempt (to target netstandard1.6) was cut short as APIs such as XML Schema support was not there. For .net standard 2.0, these missing APIs are back, so porting across MaestroAPI should be a much simpler affair.

And indeed it was.

Just like the ObjectModels porting effort, we hit some snags around references to System.Drawing. Unlike ObjectModels, we were using full blown Images and Bitmaps from System.Drawing and not things like Fonts which we were just using to sling font information around.

To address this problem a new full framework library (OSGeo.MapGuide.MaestroAPI.FxBridge) was introduced where classes that were using these incompatible types were relocated to. There was also service interfaces that returned System.Drawing.Image objects (IMappingService). These APIs have been modified to return raw System.IO.Stream objects instead, with the FxBridge library providing extension methods to "polyfill" in the old APIs that returned images. Thus, code that used these affected APIs can just reference the FxBridge library in addition to MaestroAPI and be able to work as before.

After sectioning off these incompatible types to the FxBridge library, the next potential roadblock in our porting efforts was our upstream dependencies. In particular, we were using NetTopologySuite, GeoAPI and Proj.NET to give Maestro API a strongly-typed geometry model and some basic coordinate system transformation capabilities. These were all full framework packages, meaning our previous porting attempt (to target netstandard1.6) was stopped in its tracks.

Because netstandard2.0 has a full-framework compatibility shim, we were able to reference these existing packages with the standard NU1701 compatibility warnings spat out by NuGet. However, since the previous porting attempt, the authors of NetTopologySuite, GeoAPI and Proj.NET have released netstandard-compatible (albeit prerelease) versions of their respective libraries, so as a result we were able to fully netstandard-ify all our dependencies as well.

However, we had to turn off strong naming of our assembly in the process because our upstream dependencies did not offer strong-named netstandard assemblies.

And with that, the Maestro API was ported to .net standard 2.0

MaestroAPI HTTP Provider

However, the Maestro API would not be useful without a functional HTTP provider to communicate with the mapagent. So this library also needed to be netstandard-compatible.

The previous porting attempt (to netstandard1.6) was roadblocked because the HTTP provider uses HttpWebRequest to communicate with the mapagent. While we could have just replaced HttpWebRequest with the newer HttpClient, that would require a full async/await-ification of the whole code base and then having to deal properly with the leaky abstractions known as SynchronizationContext and ConfigureAwait to ensure our async/await-ified HTTP provider is usable in both ASP.net and desktop windows application contexts without it deadlocking on one or the other.

While having a fully async HTTP provider is good, I wanted to have a functional one first before undertaking the task of async/await-ifying it. The development effort involved was such that it was better to just wait for .net standard 2.0 to arrive (where HttpWebRequest was supported) than to try to modify the HTTP provider to use HttpClient.

And just like the porting of the ObjectModels/MaestroAPI projects, this was a case of taking the existing csproj file, replacing the contents with the minimal netstandard class library template and manually adding in the required references and various settings until the project built again.

Caught in a snag

So all the key parts of the Maestro API have been ported across to .net standard 2.0 and the code all builds, so now it was time to run our unit tests to make sure everything was still green.

All green they were indeed. All good! Now to run the thing.

Most things seemed to work until I validated a Map Definition and got this message.



Assembly manifest what? I have no idea! This error is also thrown when I use any part of the MaestroAPI that uses NetTopologySuite -> GeoAPI.

My first port of call was to look at this known issue and try all the workarounds listed:
  • Force all our projects to use PackageReferences mode for installing/restoring nuget packages
  • Enable automatic binding redirect generation on all executable projects
After trying these workarounds, the assembly manifest errors still persisted. At this point I was stuck and was on the verge of giving up on this porting effort until some part of my brain told me to take a look at the assemblies that were in the output directory.

Since the error in question referred to GeoAPI.dll, I'd thought I'd crack that assembly open in ILSpy and see what interesting information I could find about this assembly.



Well this was certainly most interesting! Why is a full-framework GeoAPI.dll being copied out? The only direct consumer of GeoAPI (OSGeo.MapGuide.MaestroAPI.dll) is netstandard2.0, and it is referencing the netstandard target of GeoAPI.

Here's a diagram of what I was expecting to see:



After digging around some more it appears from observation that there is a bug (or is it feature?) in MSBuild where given a nuget package that offers both netstandard and full-framework targets, it will prefer the full-framework target over the netstandard one. This means in the case of GeoAPI, because our root application is a full-framework one, MSBuild chose the full-framework target offered by GeoAPI instead of the netstandard one.

So what's the assembly manifest error all about? The FusionLog property of the exception reveals the answer.



GeoAPI is strong-named for full-framework. GeoAPI is not strong-named for netstandard. The assembly manifest error is because our netstandard-targeting MaestroAPI references the netstandard target of GeoAPI (not strong-named), but because our root application is a full-framework one, MSBuild gave us a full-framework GeoAPI assembly instead. At runtime, .net could not reconcile that a strong-named GeoAPI was being loaded when our netstandard-targeting MaestroAPI was references the netstandard GeoAPI that is not strong named. Hence the assembly manifest error.

Multi-targeting for the ... win?

Okay, so now we know why it's happening, what can we do about it? Well, the other major thing that the new MSBuild and csproj file format gives us is the ability to easily multi-target the project for different frameworks and runtimes.

By changing the TargetFramework element in our project to TargetFrameworks (plural) and specifying a semi-colon-delimited list of TFMs, we now have a class library that can build for each one of the TFMs specified.

For example, a netstandard 2.0 class library like this:

1
2
3
4
5
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFramework>netstandard2.0</TargetFramework>
  </PropertyGroup>
</Project>

Can be made to multi-target like this:

1
2
3
4
5
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <TargetFrameworks>netstandard2.0;net461</TargetFrameworks>
  </PropertyGroup>
</Project>

If MSBuild insists on giving us full-framework dependencies if given the choice between full-framework and netstandard (when both are compatible), then the solution is to basically multi-target the MaestroAPI class library so that we offer 2 flavors of the assembly:
  • A full-framework one (net461) that will be selected by MSBuild if the consuming application is a full-framework one.
  • The netstandard one (netstandard2.0) that will be selected by MSBuild if the consuming application is .net Core, Xamarin, etc.
Under this setup MSBuild will choose the full-framework Maestro API over the netstandard one when building the Maestro windows application. Since we're now building for multiple frameworks/runtimes and explictly targeting full-framework again, we can re-activate strong naming on the full-framework (net461) target, ensuring the full-framework dependency chain of MaestroAPI is fully strong-named (as it was before we started this porting effort), and our assembly manifest error goes away when running unit tests and the Maestro application itself whenever we hit functionality that uses GeoAPI/NetTopologySuite.

So the problem is effectively solved, but the whole process feels somewhat anti-climactic.

I mean ... the whole premise of .net standard and why I wanted to port MaestroAPI to target it was the promise of one unified target (an interface if you will) with many supporting runtimes (ie. various implementations of this interface). Target the standard and your class library will work across the supporting runtimes, in theory.

Unfortunately in practice, strong-naming (and MSBuild choosing full-framework targets over netstandard, even if both are compatible) was the leaky abstraction that threw a monkey wrench on this whole concept, especially if some targets are strong-named and some are not. Having to multi-target the Maestro API as a workaround feels unnecessary.

But at the end of the day, we still achieved our goal of a netstandard-compatbile Maestro API that can be used in .net Core, Xamarin, etc. We just had to take a very long detour to get from A to B and all I can think of was: Was this (multi-targeting) absolutely necessary?

Some Changes and Compromises

Although we now have a .net standard and full framework compatible versions of the Maestro API, we have to make some changes and compromises around the developer and acquisition experience for this to work in a cross-platform .net world.

1. For reasons previously stated, we have to disable strong-naming of the Maestro API for the .net standard target. This is brought upon us by our upstream dependencies (the netstandard flavors of GeoAPI and NetTopologySuite), which we can't do anything about. The full framework target however is still strong-named as before.

2. The SDK package in its current form will most likely go away. This is because turning Maestro API into a .net standard library forces us to use nuget packages as the main delivery mechanism, which is a good thing because nobody should be manually referencing assemblies in this day and age for consuming libraries. The tooling now is just so brain-dead simple that we have no excuse to not make nuget packages. No SDK package also means that we can look at alternative means of generating API documentation (docfx looks like a winner), instead of Sandcastle as making CHM files is kind of pointless and the only reason I made CHM files was to bundle it with the SDK package.

The sample code and supporting tools that were previously part of the SDK package will be offloaded to a separate GitHub repository that I'll announce in due course. I'll need to re-think the main ASP.net code sample as well, because the old example required:

  • Manually setting up a web application in local IIS (not IIS Express)
  • Manually referencing a whole bunch of assemblies
  • Needing to run Visual Studio as administrator to debug the code sample due to the local IIS constraint.

These are things that should not be done in 2017!

3. Because nuget packages are the strongly preferred way of consuming libraries, it meant that having the HTTP provider as a separate library just complicates things (having to register this provider in ConnectionProviders.xml and automating it when installing its theoretical nuget package). The Maestro API on its own is pretty useless without the HTTP provider anyways, so in the interest of killing two birds with one stone, the HTTP provider has been integrated into the Maestro API assembly itself. This means that you don't even need ConnectionProviders.xml unless you need to use the (mg-desktop wrapper) local connection provider, or you need to use a (roll your own wrapper around the official MapGuide API) local-native connection provider.

4. The CI machinery needed some adjustments. I couldn't get OpenCover to work against our newly ported netstandard libraries using (dotnet test) as the runner, so I had to momentarily disable the OpenCover instrumentation while the unit tests ran in AppVeyor. But as a result of needing to multi-target MaestroAPI (for reasons already stated), I decided on this CI matrix:

  • Use AppVeyor to run the Maestro API unit tests for the full-framework target on Windows. Because we're running the tests under a full-framework runner, the OpenCover instrumentation can be restored, allowing us to upload code coverage reports again to coveralls.io
  • Use TravisCI to run the Maestro API unit tests for the netstandard target under .net Core 2.0 on Linux. The whole motivation for netstandard-izing MaestroAPI was to get it to run on these non-windows .net platforms, so let TravisCI handle and verify/validate that aspect for us. We have no code coverage stats here, but surely that can't be radically different than the code coverage states had we run the same test suite on Windows with OpenCover instrumentation.
Where to from here?

Now that the porting efforts have been completed, the next milestone release should follow shortly. 

This milestone will probably only concern the application itself as the SDK story needs revising and I don't want that to hold up on a new release of Maestro (the application).

A simpler MgCooker tile seeding process

I don't know if you've ever seen the guts of the tile seeding code used by MgCooker, it isn't the most prettiest of things, but it for the most part works.

Besides some cosmetic restructuring of the code, I haven't really touched this part of Maestro ever.

Consider the history of this tiling code. It originated around 2009. Things we now take for granted like async/await and the Task Parallel Library probably didn't exist around that time, so you had no choice but to dive deep into wait handles, auto-reset events and manual thread management.

If I had to write MgCooker from scratch today, I'd cook up (pun intended) probably something like this

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;
...
public class TileSeeder
{
    public void SeedTiles()
    {
        List<(int row, int col, int scale)> tiles = ComputeTileRequestList();
        int total = tiles.Count;
        int rendered = 0;
        var sw = new Stopwatch();
        sw.Start();

        //The magic sauce that takes multi-threads our tile seeding and takes care of all our multi-threading concerns!
        Parallel.ForEach(tiles, (tile) => 
        {
            //Send a HTTP request to GETTILE mapagent API with tile.row, tile.col and tile.scale
            ...
            Interlocked.Increment(ref rendered);
            Console.WriteLine($"Rendered {rendered}/{total} tiles");
        });

        //And this method blocks too, so if we get to this point, the tiling process has finished.
        sw.Stop();
        Console.WriteLine($"Rendered {rendered} tiles in {sw.Elapsed}");
    }
}

Isn't this much easier to read and comprehend?

The implementation of the ComputeTileRequestList method referenced here is omitted for brevity, but for the implementation we can just reuse what is in the current iteration of MgCooker. Most of the settings in MgCooker mainly affects the generation of the list of row/col/scale anyways.

The core multi-threading "render/cache all these tiles" is just one simple Parallel.ForEach method baked right in the .net Framework itself!

MgCooker is overdue for a rewrite anyways. I just didn't really think it would be so conceptually simple with today's .net libraries and C# language constructs!