Thursday, 28 May 2020

Introducing FdoCmd, the missing CLI for FDO

Fixing long time annoyances is not the only standout feature of the upcoming 1.4 release of FDO Toolbox. This release will also introduce a new command-line tool that replaces the existing FdoInfo.exe and FdoUtil.exe tools and combines their functionality into a single, flexible, script-friendly CLI tool.

The original FdoInfo/FdoUtil tools were implemented with the goal of getting a basic command-line experience for FDO functionality without much thought into proper best practices for how CLI tools should behave and work in a shell scripting context.

In the many years since, having accumulated experience in using various CLI tools (in particular, the git command-line client) and .net libraries coming into existence like CommandLine, which makes creating .net CLI applications in the behavioral template of git and friends dead-simple, I saw the re-activation of FDO Toolbox development as an opportunity to do away with FdoInfo/FdoUtil and build a proper CLI tool that can act as a true command-line companion for FDO Toolbox.

And thus FdoCmd.exe was born.

To understand why FdoCmd.exe came to be, it probably helps to explain what my high-level goals were for FdoCmd.exe
  • It has to verb-based.
  • It has to be mostly self-documenting and its functionality easily discover-able.
  • Its output has to be in a form that allows for it to be easily capture/parsing in shell scripts. Since FDO Toolbox is a windows-only tool, we are primarily referring for FdoCmd.exe to be easily driven by Powershell script, powershell being the predominant shell scripting environment for Windows.
  • It has to have functional parity with the core features provided by its GUI counterpart. The normal things you do in FDO Toolbox (the windows application) should be also doable in FdoCmd.exe.
Each goal I'll be explaining in greater detail below.

Verb-based terminology

FdoCmd.exe is a verb-based CLI. What we mean by this is that the first argument you pass to FdoCmd.exe is the verb that you want to run followed by arguments specific to that particular verb. The most famous example of CLI tools that follow a verb-based pattern is the git command line client. 

When using the git command-line client, every command always starts with "git":
  • git add somefile.txt
  • git commit -m "some commit message"
  • git reset --hard
  • git merge origin/master
FdoCmd.exe follows this same pattern for all the functionality it provides.

Self-documenting and discoverability

The verb-based design contributes to the discover-ability of FdoCmd.exe. What do we mean by this? If you run FdoCmd.exe without arguments, you effectively get a "directory listing" of all supported verbs.



From here you know what verbs to use, and you can drill down further. Want to know what arguments you need for the list-schemas verb? Just run FdoCmd.exe list-schemas and the tool will tell you.


And for some verbs, we even provide some examples.



It is this discoverability that lends to the self-documenting nature of the tool. There's really no need to write a dedicated set of documentation for this tool when the tool can tell you itself how its meant to be used.

Shell-scripting friendliness

FdoCmd.exe was designed with the requirement that it should be easy to write shell scripts around the tool. On windows, the predominant shell scripting environment is Powershell, so the outputs of FdoCmd.exe is such that it can be easily parsed and processed in powershell code.

For example, listing the classes of a data store prints each class name line by line.


Which is easily captured as a string array when invoked within a powershell script

1
2
3
4
$classNames = & .\FdoCmd.exe list-classes --provider OSGeo.SHP --connect-params DefaultFileLocation D:\fdo-trunk\Providers\SHP\TestData\Sheboygan
foreach ($className in $className) {
    Write-Host "Doing something with: $className"
}

Where applicable, FdoCmd.exe provides CSV output for its verb, which when combined with the ConvertFrom-CSV cmdlet allows powershell to easily consume FdoCmd.exe output for particular verbs as an array of objects. For example, you can query out features as CSV (along with the usual querying features like filtering and selective output of properties)


Which easily allows for powershell scripts to read features as arrays of objects.

1
2
3
4
5
6
7
8
9
$records = & .\FdoCmd.exe query-features --class Parcels --from-file D:\fdo-trunk\Providers\SHP\TestData\Sheboygan\Parcels.shp --schema Default --filter "RNAME LIKE 'SCHMITT%' AND RYEAR > 1950" --properties FeatId RNAME RYEAR RBILAD RCITY RZIP --format CSV | ConvertFrom-CSV
foreach ($record in $records) {
   $fid = $item.FeatId
   $name = $item.RNAME
   $addr = $item.RBILAD
   $city = $item.RCITY
   $zip = $item.ZIP
   # Do stuff with these values
}

For convenience, querying features through FdoCmd also supports GeoJSON output for easy conversion of feature data to GeoJSON.

If you recall my previous post on splitting out a SHP file to individual GeoJSON files per feature, FdoCmd.exe has all the tools (pun intended) to now do this in a single powershell script!


1
2
3
4
5
6
7
8
9
$srcPath="C:\TestData\national-esri-fe2019\COM_ELB_region.shp"
$featIds = & .\FdoCmd.exe query-features --from-file $srcPath --class COM_ELB_region --properties FeatId --computed-properties FileName "concat(lower(Elect_div), '_new_0_', lower(State))" --format CSV | ConvertFrom-CSV
foreach ($item in $featIds) {
    $fid = $item.FeatId
    $fn = $item.FileName
    $outPath = "C:\TestData\electorates_geojson\$fn.geojson"
    & .\FdoCmd.exe query-features --from-file $srcPath --class COM_ELB_region --filter "FeatId = $fid" --format GeoJSON | Out-File -FilePath $outPath
    Write-Host "Saved: $outPath"
}

Functional parity with FDO Toolbox

Anything you normally do in FDO Toolbox, you should be able to do in FdoCmd.exe as well.
  • Creating/Destroying data stores
  • Introspecting required provider parameters to connect, create, destroy data stores.
  • FDO provider registration/unregistration
  • Walk the structure (schemas/classes) of any data store
  • Listing/creating/destroying spatial contexts
  • Exporting/Applying FDO schemas 
  • Querying features in a feature class
  • Run SQL commands on data stores where the provider supports SQL commands
  • Easily bulk copy features from a feature class to a target data store
  • Executing existing bulk copy / join definition files
Along with some convenience functionality that is exclusive to FdoCmd.exe:
  • Get a total count of features in any feature class
  • Get the extent of any feature class
  • CSV or GeoJSON output for feature/SQL query commands
Where to from here?

As far as I am concerned, when FDO Toolbox 1.4 is released (coming real soon!). The FdoCmd.exe tool that ships with it will be functionally complete.

The only thing that can be improved on at that point would be to port it over to C++ and built against the native FDO APIs (instead of the windows-only .net wrapper APIs), so it can be part of the FDO native binary set, whether it's on Windows or Linux. That way, even if there won't be FDO Toolbox for Linux, at least you'll have access to its functionally equivalent CLI tool.

Monday, 25 May 2020

Fixing and improving long-standing FDO Toolbox annoyances

After 5+ years of inactivity, the FDO Toolbox project has roared back to life.

What prompted this burst of development activity? Years upon years of eating my own dogfood in my day job. Every-time I needed to load spatial data? Whip out the toolbox. Every-time I needed to take a peek at some spatial data to see what I'm dealing with? Whip out the toolbox. But those many years of self-usage also revealed many annoyances I had with using the tool and it was a constant battle between "just fix the problem already" and "we can workaround it".

For the longest time, "we can workaround it" had always been the winner, but those annoyances and pain points have built up to a critical mass of annoyance where I've finally decided to nip these problems in the bud once and for all.

So for this post, I'm going to outline some common scenarios and how the upcoming 1.4 release of FDO Toolbox will make these scenarios much simpler to carry out.

Scenario: Connecting to an RDBMS server with lots of data stores

Ever connect to a SQL Server that has lots of databases within (eg. 50+) and the current connect UI forces you to wait for the FDO provider to enumerate all those databases to load into a combo-box where you then have to pick and sift through to find your database in question.


Or heaven forbid, one of these 50+ databases is slightly misconfigured security-wise, and blows up the data store enumeration process. The UI in its current form has no escape hatch. You literally cannot make a connection to this SQL Server through the UI provided as a result. Now the actual bug is no longer an issue because the upcoming release will include FDO 4.1, whose SQL Server provider has this fix, but this scenario is being used as an example of the failings of the current connect UI.

Most of the time, you probably already know the name of the database you want to connect to, so in the upcoming release, that combo-box is now editable, allowing you to bypass this whole enumerating all the databases by putting in the name of the database directly and going straight to connect



Scenario: Setting up a bulk copy involving lots of files

Say you made a blank SQL Server database, and you want to load in these data files

How would you do this in current FDO Toolbox? It would be as follows:
  1. Connect to the SQL server database
  2. Connect to your data files
  3. Create a new bulk copy
  4. For each data file, add a copy task and pick the source feature class to copy and its destination target
  5. Review all the copy task options. Did you not want to copy/map all the properties? Well, you'll have to un-map them one-by-one.
  6. This is probably a once off thing, but the UI doesn't allow you to execute this right away. You have to give the bulk copy a name and save it out to a bulk copy task.
  7. Then you can execute this by right clicking the task on the Tasks tree and executing it.
  8. Grab a coffee/tea while the bulk copy does its thing
Now depending on how many files you are trying to load in, step 4 and 5 of this process are the most tedious step of the lot whose tediousness scales up with the number of files you're trying to load in.

In the upcoming release, we're going to streamline this process by giving you a new UI do specify multiple copy tasks at once!



Here's a gif of how long it takes to set up a full bulk copy of multiple SHP files into SQL Server.



Could you imagine how long this would've taken in current FDO Toolbox? What's more, you don't even have to save the task to execute it. There's now an Execute button in the bulk copy editor UI for you to execute the bulk copy in-place!



Scenario: I want a quick peek at the data

I use SQL Server Management Studio a lot when working with SQL Server databases directly and one of the features I take for granted in this tool is the ability to quickly preview the first 1000 results of any table I select.

I yearned for such a feature when going back to FDO Toolbox to take a quick peek at some spatial data and being annoyance at the absence of such a feature. So in the upcoming release, I've implemented such a feature!



Scenario: Copying data with non-standard/foreign projections into SQL Server

A common dataset I use when testing out FDO Toolbox or MapGuide is the Parcels.shp that comes with the SHP FDO Provider test suite.

So imagine my surprise that when I set up a bulk copy to load this file into a blank SQL Server db, that the process completely fails!



The actual logged error in question being:



This error is thrown by the SQL Server provider and reveals a strange corner case with blank SQL Server databases. The key thing to note was that this database was created outside of FDO Toolbox (I created it through SQL Server Management Studio). When you create a fresh database in SQL Server through FDO Toolbox like so:


You don't actually get a fully blank database in SQL Server, if you look at it through alternate tools you'll see this little table called f_scinfo


What is this table? This table is used to store FDO spatial context information about geometry columns/properties. So when we start the bulk copy into this blank database, the FDO provider creates this table automatically, which leads to the error at hand.

The error is due to the fact that when creating a spatial context in SQL Server, it tries to resolve a matching entry in the sys.spatial_reference_systems system table. Failing that, it tries to do an in-memory lookup of its local copy of ExtendedCoordSys.txt if you have one. If no matching entry is found through either means, it throws the aforementioned error. The Parcels.shp file is actually in the (LL/EPSG:4326) projection, but the name and WKT presented from the SHP file does not resolve to a coordinate system that SQL Server knows about.

So with the problem identified, the upcoming release presents the solution. For created copy tasks, there is a new option node available.


That when right-clicked presents an option to show a new spatial context override UI where you can replace the CS name/wkt from the source with a name/wkt that is known to SQL Server. When bulk copy creates its spatial contexts, it will use the name/wkt from the override settings if specified.


If you've ever used the Coordinate System Override feature in MapGuide Feature Sources, this is effectively the same concept but used for bulk copying in FDO Toolbox: A way to "fix" problematic coordinate systems in the source.

Alternatively, if you don't want to create a spatial context from source during the bulk copy and want to reuse an existing spatial context already on the target connection instead, we now provide that option as well.


Right-clicking this node presents a list of available target spatial contexts to map to.

Scenario: Copying data out of Oracle (through the King.Oracle provider)

If you have an Oracle database, then you are probably familiar with the quirks that come with reading data out of it through the King.Oracle FDO provider.
  1. The weird tilde notation for feature class names (unless you set up the KingFdoClass table)
  2. Spatial contexts always starting with "OracleSrid"
If you ever want to bulk copy data out of Oracle through this provider, these quirks live on in the data stores you've copied to, which may complicate any application code you have that may be working against this data store as the king oracle-isms have seeped into the data store.

In the upcoming release, through the multiple copy task UI, you now have the ability to specify the name of the target feature class if creating one. No more tildes!


Similarly, on the new spatial context override UI already shown, you may nominate a new name for the spatial context to be created. No more "OracleSrid" either!

In closing

This is not the full summary of what will be coming in the upcoming 1.4 release, but these features and improvements are the ones I believe will have the most impact in making the things you currently use FDO Toolbox for that much easier.

Sunday, 24 May 2020

Announcing: vscode-map-preview 0.5.2

This release future-proofs the extension by using the recommended webview APIs and adding a CSP to the generated preview HTML used to display the map.

In the grand scheme, everything should be working as before.

Special thanks to Matt Bierner of the VSCode team for the heads up on this issue.