Get a subset of shape file features based on field values

Russ · March 22, 2016, 7:29am

I have a requirement to read in an arbitrary shape file, and based on a variable set of fields, find all the shape features with matching field values.  This matching subset of shape features will then be placed in an InMemoryFeatureLayer and displayed in the map.

I can get almost where I need to be by using FeatureSource.GetAllFeatures, but it’s pretty clunky having to basically iterate checking feature field values.  And, the returningColumnNames arguments are case sensitive with no insensitive option/overload.  Since I can’t be sure what the casing may be, that becomes problematic as well.  There are other options for query operations, but none that I can get to work.

So, assume I have a shape file loaded with column names “Column1” and “Column2”, in arbitrary upper/lower casing, and I want all features “where column1=‘foo’ and COLUMN2=‘bar’” (note the varied casing). How would I code that?

Also note that on one file I may need to get value matches from “Column1” and “Column2”, on another file I may need to match on “Col1” and “Name”, the next only matches on “Address”, and so on.  So the number of columns to match, as well as the column names, will vary arbitrarily.

I’ve looked through samples and forum posts, but have not yet found an answer.

Troy · March 22, 2016, 9:31am

Hi Russ,

For shape file column searches, we provide an OLE DB search using a T-SQL statement, and I think it can solve the case-sensitivity issue. Please try the method below:

DataTable dt = featureSource.ExecuteQuery("select * from tablename where upper(column1) = upper('foo')");

Then we can use the returned result to get the features later, for example with GetFeaturesByColumnValue.
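
Because the set of columns varies from file to file, the statement has to be assembled at run time. The following is only a minimal sketch of that step: `DbfQueryBuilder` and `BuildSelect` are hypothetical names, not part of the map API, and the only thing handed to ExecuteQuery would be the resulting string. It assumes the provider accepts `upper()` as in the statement above, and that the column names come from the file's own column list, since they are inserted verbatim.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of the map API): builds a select statement
// that compares each column's value case-insensitively.
public static class DbfQueryBuilder
{
    public static string BuildSelect(
        string tableName,
        IEnumerable<KeyValuePair<string, string>> criteria)
    {
        List<KeyValuePair<string, string>> pairs = criteria.ToList();
        if (pairs.Count == 0)
        {
            throw new ArgumentException(
                "At least one column/value pair is required.", nameof(criteria));
        }

        // Double any single quote so a value such as O'Brien stays one literal.
        IEnumerable<string> conditions = pairs.Select(pair =>
            "upper(" + pair.Key + ") = upper('" + pair.Value.Replace("'", "''") + "')");

        return "select * from " + tableName + " where " + string.Join(" and ", conditions);
    }
}
```

The conditions are joined with "and", so a row has to match every pair. Passing one pair for an “Address” file and two pairs for a “Col1”/“Name” file uses the same call.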

Please let us know if you have any questions.

Thanks,

Troy

Russ · March 22, 2016, 9:31am

I think I’m still missing a connection.

I was already aware of the ability to ExecuteQuery to get a data table containing rows that match on case insensitive column names.

But it seems Shape DBF files don’t have any sort of identity field, unless one is explicitly created, which I cannot count on since the source for these files is somewhat arbitrary.

So, with ExecuteQuery I can get a set of all columns for rows from the DBF that match my criteria.  How do I convert those into the matching feature?  Without an identity field/column I can’t just use GetFeaturesByColumnValue.  For one thing, I don’t necessarily have any one field that uniquely identifies the feature.  Even with the multiple fields matching in the query, nothing guarantees that each composite forms a unique key, and it’s entirely likely that that composite will exist multiple times in the result set.  But it wouldn’t matter if it did because GetFeaturesByColumnValue only uses a single column.

So the only way I can see going forward is to use GetAllFeatures specifying the desired columns, and then iterate to find (again) the ones that match the query criteria.  But then that means the query results are actually useless other than maybe to determine that there are NO matches at all and avoid the massive iterative search.  Ultimately, we would use the query result only to provide case sensitive (one would hope?) column names for passing to GetAllFeatures.

And that’s basically the “pretty clunky” solution I described in the first post.

For what it’s worth, I was just playing with it again, and it appears that I was mistaken when I thought it cared about casing.  I’ve tried it a few ways and it keeps working.  No idea what I did that I misunderstood(?) as being case sensitive.

But that still leaves me nowhere except avoiding the initial query.  Unfortunately, GetFeaturesByColumnValue does not let me specify more than one column value.  This would be ideal if I could specify a collection of fields and values to match them all, but it does not.

So, my evolving understanding is that I must use GetAllFeatures to get what may be a rather large collection of features, and then use one of several techniques to get what I need from that collection.  Is there really no better and more efficient way?
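
For reference, the iterate-and-filter approach described above can be sketched with LINQ roughly as follows. This is an illustration under assumptions, not the library's own method: `FeatureFilter` and `WhereAllMatch` are hypothetical names, and each feature is represented only by a dictionary of its column values so the sketch stays independent of the map types. Column names and values are both compared ignoring case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: keeps the items whose column values match every
// column/value pair in the criteria.
public static class FeatureFilter
{
    public static List<T> WhereAllMatch<T>(
        IEnumerable<T> items,
        Func<T, IDictionary<string, string>> getColumnValues,
        IDictionary<string, string> criteria)
    {
        return items.Where(item =>
        {
            // Re-key the values so "column1" finds "Column1" or "COLUMN1".
            // Assumes no two columns differ only by case (the copy would throw).
            var values = new Dictionary<string, string>(
                getColumnValues(item), StringComparer.OrdinalIgnoreCase);

            // A missing column counts as a non-match.
            return criteria.All(c =>
                values.TryGetValue(c.Key, out string actual) &&
                string.Equals(actual, c.Value, StringComparison.OrdinalIgnoreCase));
        }).ToList();
    }
}
```

With the map API this might be called as WhereAllMatch(allFeatures, f => f.ColumnValues, criteria), assuming the feature type exposes its values as a string dictionary. It is still a full scan of whatever GetAllFeatures returned, so it removes the explicit loop but not the cost.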

Ethan · March 22, 2016, 9:31am

Hi Russ,

I don’t think there is a better way to query on multi-column value conditions without an “Id” column. If you want high-performance querying, you must build an “Id” column and fill it.

Thanks,

Russ · March 22, 2016, 9:31am

That does seem to be the case, but doesn’t seem like it has to be that way.  The query options work just fine for “compound keys”.  The problem is that there is no path from record/row matches back to matching feature.  I don’t have control over the shape files that customers use.  If there were a unique identity field, then I could get that with a query then turn around and go back through GetFeaturesByColumnValue using that identity field value.

Anyway, when each feature is paired with a database record, it sure seems like there should be a better way to get from an arbitrary data record to its associated feature.  The mapping exists within the data format, it’s just not exposed through your APIs.  So the only option left is to get everything and loop through either explicitly or via LINQ.

Ethan · March 22, 2016, 9:31am

Hi Russ,

If there is no related column such as an “Id” column, and you can’t get the record’s index within the full set of records, then the DBF record can’t be related back to the SHP file feature. I guess that you can first query for the record’s index within the full set of records, then invoke “GetFeatureById” to get the target feature.

Thanks,

Russ · March 22, 2016, 9:31am

Yes, I’ve basically given up on using the features provided by the map and moved on with a solution somewhat as you describe.

My ultimate solution was to get all features with all values and post process the whole thing from my own in-memory collection using a combination of LINQ and a parallel in-memory index (really more a topology tree based on the composite keys) making ongoing processing fast and efficient as well as flexible.  It just seemed like the map features were so close to providing what I needed, I just hated to go the route of a complete duplication of the entire dataset for a multitude of shape files.  But in the end it works very well and made subsequent related work much easier and faster to implement.
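
As a rough illustration of the in-memory index idea, here is a minimal sketch. It is not the actual implementation described above: it swaps the tree for a flat dictionary keyed on the joined column values, and `CompositeKeyIndex`, `Add` and `Find` are hypothetical names. It assumes the values never contain the unit separator character used to join them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical index: groups items by the values of a chosen set of columns.
// Several items may share one composite key, so each key maps to a list.
public sealed class CompositeKeyIndex<T>
{
    private const char Separator = '\u001F'; // assumed absent from the data
    private readonly string[] keyColumns;
    private readonly Dictionary<string, List<T>> buckets =
        new Dictionary<string, List<T>>(StringComparer.OrdinalIgnoreCase);

    public CompositeKeyIndex(params string[] keyColumns)
    {
        this.keyColumns = keyColumns;
    }

    public void Add(T item, IDictionary<string, string> columnValues)
    {
        // Look columns up ignoring case; a missing column contributes "".
        var values = new Dictionary<string, string>(
            columnValues, StringComparer.OrdinalIgnoreCase);
        string key = MakeKey(
            keyColumns.Select(c => values.TryGetValue(c, out string v) ? v : null));

        if (!buckets.TryGetValue(key, out List<T> bucket))
        {
            bucket = new List<T>();
            buckets[key] = bucket;
        }
        bucket.Add(item);
    }

    // Key values must be given in the same order as the key columns.
    public IReadOnlyList<T> Find(params string[] keyValues)
    {
        return buckets.TryGetValue(MakeKey(keyValues), out List<T> bucket)
            ? (IReadOnlyList<T>)bucket
            : Array.Empty<T>();
    }

    private static string MakeKey(IEnumerable<string> parts)
    {
        return string.Join(Separator.ToString(), parts.Select(p => p ?? string.Empty));
    }
}
```

One index would be built per shape file with that file's key columns, filled in a single pass over the features, and then queried repeatedly without rescanning.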

Thanks for all your help.  This may not have produced the answer I had hoped for, but you did confirm that I wasn’t reimplementing the wheel just because I was missing some key point.

Troy · March 22, 2016, 9:31am

Hi Russ,

It’s our pleasure to assist you. We hope your project goes well.

Thanks,
Troy