Thursday, 11 June 2020

Announcing: FDO Toolbox 1.4.1

This release should dramatically improve stability of FDO Toolbox and the new FdoCmd CLI when working with SQLite data files.

In fixing the stability problem, it also finally gave me the ultimate insight into the proper usage patterns around the FDO .net API. I had made my concerns known over a decade earlier about the flakiness of the FDO .net wrapper and the various answers given didn't quite give me a solid set of "rules of thumb" to avoid.

What lit the light bulb was looking the FDO SQLite provider codebase once more. I noticed that the main connection class not only implements the main connection interface, it also implements several other interfaces for convenience. The other insight was that there are several places in the FDO Toolbox code that it would be possible that a reference to a capability or a property dictionary (off of the capability, or some other top-level connection property) would out-live the underlying connection which meant that when it came time for the .net GC to start cleaning up references, it would call the finalizers on these references which would then proceed to subtract the ref count on the underlying native pointer.

But because in the case of SQLite the connection implements several FDO interfaces, we may be subtracting a ref count of a pointer to a non-connection interface, but the underlying implementation is actually the connection itself, causing an access violation from tearing down the same connection more than once or tearing down something that is connected to the torn down connection.

I didn't have full evidence to confirm that the above was indeed the case, but it was a solid enough theory that was backed by my observations in running isolated snippets of C# code using the FDO API targeting the SQLite provider.

Here's an example of a crashing snippet:


 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
using OSGeo.FDO.ClientServices;
using OSGeo.FDO.Commands;
using OSGeo.FDO.Commands.DataStore;
using OSGeo.FDO.Connections;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace FdoCrash
{
    class Program
    {
        static void Main(string[] args)
        {
            var conn = FeatureAccessManager.GetConnectionManager().CreateConnection("OSGeo.SQLite");
            using (conn)
            {
                if (HasCommand(conn, CommandType.CommandType_CreateDataStore, "Creating data stores", out var _))
                {
                    using (var cmd = (ICreateDataStore)conn.CreateCommand(CommandType.CommandType_CreateDataStore))
                    {
                        var dict = cmd.DataStoreProperties;
                        foreach (string name in dict.PropertyNames)
                        {
                            Console.WriteLine("{0}", name);
                        }
                    }
                }
            }
        }

        static bool HasCommand(IConnection conn, CommandType cmd, string capDesc, out int? retCode)
        {
            retCode = null;
            if (Array.IndexOf<int>(conn.CommandCapabilities.Commands, (int)cmd) < 0)
            {
                //WriteError("This provider does not support " + capDesc);
                //retCode = (int)CommandStatus.E_FAIL_UNSUPPORTED_CAPABILITY;
                return false;
            }
            return true;
        }
    }
}


And here's the same snippet, refactored to the point it does not crash with System.AccessViolationException:



 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
using OSGeo.FDO.ClientServices;
using OSGeo.FDO.Commands;
using OSGeo.FDO.Commands.DataStore;
using OSGeo.FDO.Connections;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace FdoCrash
{
    class Program
    {
        static void Main(string[] args)
        {
            var conn = FeatureAccessManager.GetConnectionManager().CreateConnection("OSGeo.SQLite");
            using (conn)
            {
                if (HasCommand(conn, CommandType.CommandType_CreateDataStore, "Creating data stores", out var _))
                {
                    using (var cmd = (ICreateDataStore)conn.CreateCommand(CommandType.CommandType_CreateDataStore))
                    {
                        using (var dict = cmd.DataStoreProperties) //Dispose the property dictionary asap
                        {
                            foreach (string name in dict.PropertyNames)
                            {
                                Console.WriteLine("{0}", name);
                            }
                        }
                    }
                }
            }
        }

        static bool HasCommand(IConnection conn, CommandType cmd, string capDesc, out int? retCode)
        {
            retCode = null;
            using (var cmdCaps = conn.CommandCapabilities) //Dispose the command capabilities asap
            {
                if (Array.IndexOf<int>(cmdCaps.Commands, (int)cmd) < 0)
                {
                    //WriteError("This provider does not support " + capDesc);
                    //retCode = (int)CommandStatus.E_FAIL_UNSUPPORTED_CAPABILITY;
                    return false;
                }
            }
            return true;
        }
    }
}


And with this snippet, I have come to what I confidently feel should be the "best practices" to using the FDO .net API.

Firstly, dispose of any reference to a top-level connection property (or a sub-property that hangs off of that property) as soon as you are done with it. You must do whatever you can to eliminate the possibility of such references out-living the main connection if/when the .net GC goes to cleanup. The C# using keyword helps streamlining this a lot. Other FDO objects do not need such aggressive disposal (the key point is that they don't directly hang off of the main connection or one of its top-level properties), but you should probably do it anyways out of habit. A thing that may result in confusion (it's confused me for about a decade!) is that disposing a .net FDO object is not the same as disposing a FDO object in C++. In C++, disposing is actual deleting/deallocation of memory, in .net disposing is just subtracting the refcount from the underlying C++ pointer (a way to tell it we're done with the object on the .net side). And because nearly every FDO class/interface exposed to .net implements the IDisposable interface, you are encouraged to dispose early and often once you're done with them.

Secondly, avoid compound statements (some.property.PropertyOrMethod) on anything involving classes/interfaces from the FDO .net API. These are the rules for working with its C++ API (to prevent memory leaks from not housing intermediate parts of a compound statement in smart pointers) and we should probably follow the same rules on the .net side just to be safe.

With these 2 rules established, I did a sweep of the FDO Toolbox codebase to make sure these 2 rules were being adhered to and the end result was that our powershell test harness no longer crashes when involving the SQLite provider, which objectively gave me confidence that this issue was finally addressed.

And that's the little (and hopefully insightful) side-story to pad out this blog post :)

Download

Friday, 5 June 2020

Announcing: FDO Toolbox 1.4

After 5+ years of inactivity, the dam of pain points and personal frustrations finally burst and resulted in 2 months of solid enhancements and quality-of-life improvements, culminating in a new release of FDO Toolbox that I am pleased to finally announce.

Here's the significant changes of this release

FDO Toolbox is now 64-bit only

You should all be running a 64-bit version of Windows by now, so there's no real point trying to make a 32-bit version available.

Bulk Copy Enhancements

The Bulk Copy feature of FDO Toolbox has undergone many enhancements and quality-of-life improvements to make it ever more robust in getting spatial data out of one spatial data store and into another. This post covers all the significant changes.

New FdoCmd command-line tool

This release includes FdoCmd, a much more powerful and flexible command-line tool that replaces the existing FdoInfo.exe and FdoUtil.exe tools.

FDO 4.1

This release of FDO Toolbox ships with FDO 4.1 (r7973). This is pretty much equivalent to the FDO that ships with MapGuide Open Source 3.1.2 with extra PostgreSQL and MySQL provider enhancements made after the 3.1.2 release.

Improved file extension to provider inference

Most of you are probably accustomed to dragging and dropping a .sdf file or a .shp file into the Object Explorer of FDO Toolbox and it automatically creating a respective SDF or SHP FDO connection.

Unfortunately, the list of file extensions that work like this was a hard-coded list. Drag/drop a:
  • .geojson file
  • .csv file
  • .tab file
  • etc
And expect a FDO connection to be created? This was not possible until this release. For this release, we no longer hard-code a list of file extensions, we delegate that out to a new and external FileExtensionMappings.xml file. This file (which you can edit) defines all the file extensions a FDO connection can be created from and this file helps to service:
  • The file drag and drop functionality of the Object Explorer
  • The --from-file command line argument of any FdoCmd verb that requires FDO connection details
Removal of various cruft

5 years is a lot of time in the world of technology so naturally when coming back to this codebase, I took a look at things that were either obsolete or done real half-heartedly and gave them the axe. This includes:

1. Removing specialized connection UIs for Autodesk Oracle and SQL Server FDO providers. I have not touched an Autodesk Geospatial product in many years (where these providers are included), so whether this feature still works or not I do not know, nor do I want to invest in the resource to know (I'm long gone from the Autodesk reseller/partner game, so it's not like I have easy access to these products to find out). The existing open source SQL Server and Oracle providers are more than adequate for this task.

2. Removing specialized connection UI for the legacy PostGIS provider. If you don't know what this is, this was the FDO provider for PostGIS that was replaced by the more robust OSGeo.PostgreSQL provider around the FDO 3.5 timeframe. The legacy provider itself has long since been removed, but the UI for this was still present. Not anymore.

3. Removal of Sequential Process support. This feature (which I probably never documented, but you may see references to it in various UI menus) was an XML-based definition around wrapping calls to the old FdoInfo/FdoUtil CLI tools. Obviously, with the consolidation of these tools into FdoCmd, the Sequential Process support was broken and since it is much simpler to just write your own powershell wrapper around the FdoCmd tool, the choice to axe Sequential Process support was an easy one.

4. Removing all semblances of scripting and extensibility. My original ambitions for FDO Toolbox was for it to be a fully customizable and scriptable Windows GUI application. These ambitions were better realized in MapGuide Maestro, but were left half-baked in FDO Toolbox. Having the opportunity to revisit this codebase, I've come to the conclusion that FDO Toolbox doesn't need customization or an integrated scripting engine.

It is a tool with a singular set of purposes:
  • I want to peek/inspect some spatial data
  • I want to query some spatial data
  • I want to get data in/out of some spatial data source
And in this frame of reference, it was clear that adding in a half-baked IronPython scripting engine was overkill. Maestro has legitimate use cases for an integrated scripting engine, whereas for FDO Toolbox I struggle to find such use cases that cannot be satisfied with this release. In terms of scripting/automation, FdoCmd + powershell is already a combination that can address this niche. So as a result, this release of FDO Toolbox no longer include the scripting engine UI and has closed off the addin system. There just isn't a strong need for such features. 

This release also no longer includes API documentation for the FDO Toolbox core library. Coming back to this codebase, I've found that this library is really just a thin-wrapper on top of the FDO .net API and there isn't much value that the library adds on top. You are better off just using the FDO .net APIs directly or using the new FdoCmd + powershell combination.

In closing

This release has addressed all of my personal pain points and frustrations built up in the last 5 years since the last release of FDO Toolbox, and hopefully it does the same for you!