
Querying features from a large ShapeFile

Hi,


I am trying to obtain a subset of features from a large shapefile based on an attribute value. The shapefile has about 800,000 records in it. My current approach is outlined in the following code snippet:


 
public IEnumerable<Feature> GetFeaturesByColumnValues(string columnName, ICollection<string> columnValues)
{
    var featureCollection = new List<Feature>();
    FeatureLayer.Open();
    var columns = FeatureLayer.FeatureSource.GetColumns();
    foreach (var column in columns)
    {
        if (String.Equals(column.ColumnName, columnName))
        {
            foreach (var value in columnValues)
            {
                var features = FeatureLayer.FeatureSource.GetFeaturesByColumnValue(columnName, value);
                featureCollection.AddRange(features);
            }
            break;
        }
    }
    FeatureLayer.Close();
    return featureCollection;
}
I tried to retrieve about 600 records from the feature source, and the code took about 3 minutes to finish. That is far too slow to be acceptable. Is there a better way to do it?

Would creating an attribute index help? If so, could you tell me how to do it using the ThinkGeo API?

Amritayan,


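Sorry for the delay; it is the Spring Festival holiday here.

If you want to create an index for a .shp file, you can use the static ShapeFileFeatureSource.BuildIndexFile method. However, I don't think it will solve your problem: that index file is a spatial index, used for queries by bounding box, not for attribute queries. You can try the following code instead; I think it should be faster. It makes a single pass over the DBF records and checks each value against a lookup set, instead of calling GetFeaturesByColumnValue once per value.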

public IEnumerable<Feature> GetFeaturesByColumnValues(string columnName, ICollection<string> columnValues)
{
    var featureCollection = new List<Feature>();
    // A HashSet gives constant-time lookups and tolerates duplicate input values.
    var expectedValues = new HashSet<string>(columnValues);

    FeatureLayer.Open();
    var columns = FeatureLayer.FeatureSource.GetColumns();
    foreach (var column in columns)
    {
        if (String.Equals(column.ColumnName, columnName))
        {
            ShapeFileFeatureSource source = (ShapeFileFeatureSource)FeatureLayer.FeatureSource;
            int count = source.GetCount();
            // Scan every record once; ids run from 1 to count.
            for (int i = 1; i <= count; i++)
            {
                string id = i.ToString();
                string tempDbfValue = source.GetDataFromDbf(id, columnName);
                if (expectedValues.Contains(tempDbfValue))
                {
                    // Load the feature only for matching records, without attribute columns.
                    Feature feature = source.GetFeatureById(id, ReturningColumnsType.NoColumns);
                    feature.ColumnValues.Add(columnName, tempDbfValue);
                    featureCollection.Add(feature);
                }
            }
            break;
        }
    }

    FeatureLayer.Close();
    return featureCollection;
}
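
If the same column is queried many times, one option is to read it once and keep an in-memory lookup from attribute value to record ids, which acts as a simple attribute index. The following is only a minimal sketch under stated assumptions: the class AttributeLookup and its Build and Find methods are hypothetical names, not part of the ThinkGeo API, and a plain list of strings stands in for the DBF column values.

```csharp
using System;
using System.Collections.Generic;

public static class AttributeLookup
{
    // Builds a map from attribute value to the ids of the records that carry it.
    // 'columnValues' is a hypothetical stand-in for one DBF column held in memory,
    // in record order.
    public static Dictionary<string, List<string>> Build(IReadOnlyList<string> columnValues)
    {
        var index = new Dictionary<string, List<string>>();
        for (int i = 0; i < columnValues.Count; i++)
        {
            // Dictionary keys cannot be null, so treat a missing value as empty.
            string value = columnValues[i] ?? string.Empty;
            if (!index.TryGetValue(value, out List<string> ids))
            {
                ids = new List<string>();
                index[value] = ids;
            }
            // Use 1-based ids, matching the ids used in the loop above.
            ids.Add((i + 1).ToString());
        }
        return index;
    }

    // Returns the ids of all records whose value is one of 'wanted'.
    public static List<string> Find(Dictionary<string, List<string>> index, IEnumerable<string> wanted)
    {
        var result = new List<string>();
        // De-duplicate the requested values so no id is returned twice.
        foreach (string value in new HashSet<string>(wanted))
        {
            if (index.TryGetValue(value, out List<string> ids))
            {
                result.AddRange(ids);
            }
        }
        return result;
    }
}
```

Building the lookup still costs one full pass over the column, but every later query only touches the requested values. The returned ids could then be passed to GetFeatureById, as in the code above.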


 
 
If you have any more questions, feel free to let me know.
 
Thanks.
 
Yale