Hey everyone,
I've been doing a bit of development with the ThinkGeo platform, and a few days ago I ran into a snag. I was using a fairly large amount of data (6900 columns, 3100 rows) to dynamically render shapes against class break values. The trouble, as ThinkGeo has pointed out, is that the InMemoryLayer wasn't really designed with sizes like that in mind. So ThinkGeo came to our rescue and helped us create an InMemoryLayer that uses an R-tree index to speed things up a bit (see the separate topic for the code on that one).

This helped a good bit, but the performance still wasn't on par with an indexed shapefile. After testing, I realized the problem is an inefficiency in the way ThinkGeo renders its class breaks. I can't see the actual method, so I can only guess, but I suspect that for each feature it performs a GetColumnByName, or something to that effect, grabs the value, matches it against the breaks, and then moves on to the next shape. Unfortunately, when you get to data tables of the size I'm working with, that slows everything down. A lot. So, here's what we came up with. It's a bit of a hack, and I'm sure ThinkGeo can do better, but I thought I'd share it for them and anyone else interested.
Step 1.
Don't store the data in the layer, only the shape and the ID. Instead, we created a custom class that is essentially a collection of hashtables. Each column is a hashtable, and each entry is the shapeID and the value.
Public Class DataContainer
    ' Holds one Hashtable per column: the outer keys are column names, and each
    ' value is a Hashtable mapping FeatureID -> feature value for that column.
    Private Columns As New Hashtable
    Private WorkingHashtable As Hashtable ' Cache the last column requested to improve efficiency
    Private WorkingTableName As String

    Public Sub Add(ByVal ColumnName As String, ByVal FeatureID As String, ByVal Value As String)
        Dim ColumnHash As Hashtable = CType(Columns(ColumnName), Hashtable)
        If ColumnHash Is Nothing Then
            ' First value for this column, so create its hashtable
            ColumnHash = New Hashtable
            Columns.Add(ColumnName, ColumnHash)
        End If
        ColumnHash.Add(FeatureID, Value)
    End Sub

    Public Function GetValue(ByVal ColumnName As String, ByVal FeatureID As String) As Double
        ' Class breaks ask for the same column over and over, so reuse the
        ' cached hashtable instead of looking it up for every feature
        If ColumnName <> WorkingTableName Then
            WorkingHashtable = CType(Columns(ColumnName), Hashtable)
            WorkingTableName = ColumnName
        End If
        If WorkingHashtable Is Nothing OrElse WorkingHashtable(FeatureID) Is Nothing Then
            Return 0
        End If
        Return Convert.ToDouble(WorkingHashtable(FeatureID))
    End Function
End Class
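Populating the container is just a pair of loops over your source data. Here's a rough sketch of how that might look if your raw values live in an ADO.NET DataTable; the sourceTable variable and the "FeatureId" column name are placeholders for whatever your own schema looks like:

'Example only: load values into the DataContainer from a DataTable
'(sourceTable is assumed to be a System.Data.DataTable holding the raw values)
Dim mData As New DataContainer
For Each row As DataRow In sourceTable.Rows
    Dim featureId As String = row("FeatureId").ToString()
    For Each col As DataColumn In sourceTable.Columns
        'Skip the ID column itself; every other column becomes a value column
        If col.ColumnName <> "FeatureId" Then
            mData.Add(col.ColumnName, featureId, row(col).ToString())
        End If
    Next
Next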
Step 2.
Use the IndexedInMemoryFeatureLayer from the other post, and return only the shapes inside the current bounding box. In the code below, mData is an instance of the DataContainer class above.
'Create the columns for the output feature source
Dim fsc As New Collection(Of FeatureSourceColumn)
fsc.Add(New FeatureSourceColumn("DValue"))

'Create the output layer for the feature source
Dim outputlayer As New IndexedInMemoryFeatureLayer(fsc)
outputlayer.Open()
outputlayer.EditTools.BeginTransaction()

'Get the features in the bounding box and add them to the output layer,
'copying only the single value we want to render into the "DValue" column
For Each item As Feature In ActiveLayer.FeatureSource.GetFeaturesForDrawing(BoundingBox, 256, 256, New Collection(Of String))
    Dim r As New Feature(item.GetShape())
    r.ColumnValues("DValue") = mData.GetValue(AmendedColumnName, item.Id).ToString()
    outputlayer.EditTools.Add(r)
Next
outputlayer.EditTools.CommitTransaction()
Step 3.
Use the newly created layer and add your class break styles to it, setting the column they render against to the "DValue" column we created to hold the data we want to display.
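If you haven't wired up class breaks before, a minimal sketch of that step might look like the following. The break values and colors are placeholders, and I'm assuming the IndexedInMemoryFeatureLayer exposes the same ZoomLevelSet styling that the standard InMemoryFeatureLayer does:

'Example only: render the output layer against the "DValue" column
'Break values and colors are placeholders; adjust them to your own data
Dim dValueStyle As New ClassBreakStyle("DValue")
dValueStyle.ClassBreaks.Add(New ClassBreak(0, AreaStyles.CreateSimpleAreaStyle(GeoColor.StandardColors.LightGreen)))
dValueStyle.ClassBreaks.Add(New ClassBreak(50, AreaStyles.CreateSimpleAreaStyle(GeoColor.StandardColors.Yellow)))
dValueStyle.ClassBreaks.Add(New ClassBreak(100, AreaStyles.CreateSimpleAreaStyle(GeoColor.StandardColors.Red)))

outputlayer.ZoomLevelSet.ZoomLevel01.CustomStyles.Add(dValueStyle)
outputlayer.ZoomLevelSet.ZoomLevel01.ApplyUntilZoomLevel = ApplyUntilZoomLevel.Level20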
Essentially, this condenses the objects so that they can be rendered much more efficiently. It takes far less time to build the new layer and render the class breaks against it than it does to render the original layer directly. In my tests, the render speed was roughly 10 times faster.
Finally, this will not work for everyone, since loading this much data into hashtables up front is slow. In my case that doesn't matter, as my web server persists the created shapes and data layers for weeks at a time with no alterations. It also won't help you much if you only have 50 columns. But if you want to use vast amounts of data and get away from shapefiles and their 255-column limit, this is the fastest way we've seen to do it. Perhaps ThinkGeo might even be able to fold some performance enhancements like this into future revisions; I think the InMemoryLayer can have many more uses this way.
If anyone has any questions, feel free to ask!
Regards,
Grant Hamm
Developer,
AWhere, Inc.