Calling GetAllFeatures on a very large FeatureSource

I am getting an ESRI “General function failure” exception when I call GetAllFeatures() on a geodatabase table with roughly a million features in it (I know this is a huge dataset).

The reason I need all of the features is that I have implemented an attribute table and need to load the features into it in the order they appear in the geodatabase table. I am not too stressed about the exception, as I am sure that even if it didn’t fail it would take an extremely long time to fetch that many features from the GDB.

My initial thought is to adopt a lazy-loading approach where I grab only the first couple thousand features and load more on an as-needed basis. Where I am stuck is finding ThinkGeo functionality that lets me get a given number of features in the order they appear in the geodatabase.

Looking over the existing ‘GetFeatureBy…’ methods, I see lots of spatial query options and a couple for column contents and feature IDs, but I am unsure how to translate this into something that is data-agnostic (i.e., that will work on any large geodatabase).

In an ideal world, something like GetFeaturesByRows(0, 3000) followed by GetFeaturesByRows(3001, 6000) would be great for my needs, but I am open to other ideas or solutions.

Thanks,
Sean Jamieson

Hi Sean, I had a similar issue and solved it by limiting the returned columns to none, so that essentially only the feature IDs come back, and then splitting those IDs into batches to fetch the rows one batch at a time. Maybe see if that works for you?

Hi Julian,

That sounds promising, I will give that a go after lunch here.

Thank you!

Let me know if that works. I assume it would be cleaner if there were an internal batch API, though, as that would cut down on a lot of function calls.

Hi guys,

Julian’s approach will help but still has room for improvement: GetAllFeatures() always fetches geometries, so it will still be slow even when the returned columns are limited to none.

Please use FeatureSource.GetFeatureIds() instead, which returns only the IDs without any geometry data.

However, I found that FileGeoDatabaseFeatureSource didn’t have its own implementation of this method, so it was falling back to the base class which calls GetAllFeatures() internally. We’ve just fixed it in v15.0.0-beta006 — pull the latest and give it a try.

@Julian_Thoms, again thanks for chiming in and sharing your thoughts!

Thanks,
Ben

Just to follow up — the following FeatureSources already have GetFeatureIdsCore() implemented:

SqlServerFeatureSource, PostgreSqlFeatureSource, SqliteFeatureSource, ShapeFileFeatureSource, MultipleShapeFileFeatureSource, TabFeatureSource, WkbFileFeatureSource, NauticalChartsFeatureSource, GridFeatureSource, DelimitedFeatureSource, FileGeoDatabaseFeatureSource

The following don’t have it yet, and we plan to add it in a future version:
GdalFeatureSource, CadFeatureSource
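
For anyone landing on this thread later, the ID-batching idea discussed above could be sketched roughly as follows. This is only a minimal sketch: `FeatureIdPager` is a hypothetical helper, not part of ThinkGeo, and it deliberately has no ThinkGeo dependency so that it can be run on its own. The commented usage at the bottom assumes that GetFeatureIds() returns the IDs in table order and that GetFeaturesByIds accepts a list of IDs plus a ReturningColumnsType; both are worth verifying against the version you are on.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not a ThinkGeo type): takes a one-time snapshot of
// feature IDs and hands them out in fixed-size pages for lazy loading.
public sealed class FeatureIdPager
{
    private readonly List<string> _ids;
    private readonly int _pageSize;

    public FeatureIdPager(IEnumerable<string> ids, int pageSize)
    {
        if (ids == null) throw new ArgumentNullException(nameof(ids));
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));
        _ids = ids.ToList();   // preserves the order in which the IDs were supplied
        _pageSize = pageSize;
    }

    // Number of pages, rounding up so that a partial last page is counted.
    public int PageCount => (_ids.Count + _pageSize - 1) / _pageSize;

    // Returns the IDs for the zero-based page index; the last page may be shorter.
    public IReadOnlyList<string> GetPage(int pageIndex)
    {
        if (pageIndex < 0 || pageIndex >= PageCount)
            throw new ArgumentOutOfRangeException(nameof(pageIndex));
        int start = pageIndex * _pageSize;
        int count = Math.Min(_pageSize, _ids.Count - start);
        return _ids.GetRange(start, count);
    }
}

// Assumed usage with ThinkGeo (unverified sketch, check the signatures):
//   featureSource.Open();
//   var pager = new FeatureIdPager(featureSource.GetFeatureIds(), 3000);
//   var firstRows = featureSource.GetFeaturesByIds(pager.GetPage(0), ReturningColumnsType.AllColumns);
```

The attribute table would then request the next page only when the user scrolls past the rows already loaded, so the full geometry and column data for all one million features never has to be fetched at once.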

Excellent to hear, Ben!

I am partway through implementing a solution that uses GetFeaturesByIds and have not tested it yet, but it seems you have already gotten ahead of my next roadblock (most of our large data is in GDBs).

I will get on the newest beta and give it a go once I have it all implemented on my end.

Thanks again @Ben and @Julian_Thoms, this was a huge help!

You are always welcome!