David, I have uploaded a PowerPoint Slide Show that illustrates our application and the requirements we have from a mapping engine. Within the 20 or 30 slides, there are 5 or 6 that explicitly illustrate why I have this specific question on rendering performance.
Brief summary: We are a company that builds software for "Precision Agriculture". This is the application of GIS and GPS technology to crop production. Our customers have GPS on their machines, and they log operating information every second as they drive through a field. As such, in a 1/2 mile by 1/2 mile field (160 acres), there might be 150,000 logged data records. Our goal is to display that data on a map, such that the user can see high and low yield areas, product application rates, soil density, nutrient analysis, etc. for any area in the field. This is what we call Dense point data. 90% of the time, we display all of the data for a field within the same map frame. The ability to select a specific subset as we zoom in is a low priority. An example of a map of "raw" cotton yield data is shown on the 10th slide, entitled "Field Op Records".
On the 13th slide, entitled Spatially Aware Sensors, you can see that we do have reason to zoom in on a map at times. And when we do, we are displaying a set of dots that are sized dynamically to reflect the operating width of the machine, and overlapped in time series so that it is easy to see in what direction the machine was travelling. These are the gray dots. The green dots represent the operation of just a single part of the machine, as farmers may have one variety in each half of a planter. Our special skill within the industry is our ability to map and display these individual operating parts of a machine.
We also deal with Sparse data... manually pulled soil test sites within a field, or things like soil type and soil testing polygons.
For Dense and Sparse sites, we sometimes generate surfaces. These are persisted within our system as layers of many small rectangular polygons, with the polygons that are intersected by the containing field boundary being irregularly shaped. These are the 5,000 to 50,000 record surface layers to which I was referring in the original post. There is an illustration of one of these on the IntelliCalc Multi-Layer Processing slide. That's an old slide, from our COM system, so the cells are about 10 meters, and were rendered with outlines. In our new .Net stuff, the cells are 2 to 5 meters, and rendered w/o outlines. But, in any case, you can see that we are displaying them all at one time, in most cases.
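For what it's worth, here is roughly how we produce those irregular edge cells with NTS. This is only a sketch of the clipping step (the class and method names are mine), written against current NetTopologySuite namespaces; older NTS/GeoAPI builds expose IGeometry instead of Geometry, but the calls are the same:

```csharp
// Sketch: clip rectangular surface cells to the field boundary with NTS,
// keeping interior cells whole and replacing edge cells with their intersection.
using System.Collections.Generic;
using NetTopologySuite.Geometries;

public static class SurfaceCellClipper
{
    public static List<Geometry> ClipCellsToBoundary(IEnumerable<Polygon> cells, Geometry fieldBoundary)
    {
        var clipped = new List<Geometry>();
        foreach (var cell in cells)
        {
            if (fieldBoundary.Contains(cell))
            {
                clipped.Add(cell);                            // interior cell: keep the square as-is
            }
            else if (fieldBoundary.Intersects(cell))
            {
                var piece = fieldBoundary.Intersection(cell); // edge cell: irregular remainder
                if (!piece.IsEmpty)
                    clipped.Add(piece);
            }
            // cells entirely outside the boundary are dropped
        }
        return clipped;
    }
}
```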
On the FieldOps data, for the first five years we used this package, we drew this data directly into the map window ourselves, using GDI. On a 1ghz machine, we could render 50,000 yield points within a couple of seconds. That was compared to 10 to 15 seconds with the out-of-the-box renderers drawing from a shapefile. We were finally able to contract with the vendor to write an optimized renderer plug-in. They took their grid renderer and made it available for vector layers, and added a parameter for a size field when rendering dots. This let us render size and color in the same pass. On a 2ghz machine today, I can render 50,000 sites in about 1/2 second. The issue is the time it takes to copy my data into their required recordset for rendering. That takes about 3 seconds, after they gave us a bulk load method; it was 10 seconds before that. However, in their architecture, once I have created that recordset, I'm done... as the user zooms in and out, or pans, the rendering happens automatically w/o a requirement that I reload the recordset. I've accomplished this in MapSuite by caching the feature collection that I build. I don't know how hard you work against that collection, though.
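To be concrete, the caching pattern I'm using is essentially the following. The class and method names here are mine, not MapSuite's; in MapSuite this logic would live inside a custom FeatureSource:

```csharp
// Sketch: build the feature collection once from our in-memory dataset and hand
// the same cached list back on every redraw, so pan/zoom never rebuilds it.
using System;
using System.Collections.Generic;
using NetTopologySuite.Geometries;

public sealed class CachedFeatureProvider
{
    private readonly Func<IReadOnlyList<Geometry>> _buildFeatures; // expensive: runs once
    private IReadOnlyList<Geometry> _cache;

    public CachedFeatureProvider(Func<IReadOnlyList<Geometry>> buildFeatures)
    {
        _buildFeatures = buildFeatures;
    }

    // Called on every zoom/pan; only the first call pays the build cost.
    public IReadOnlyList<Geometry> GetFeatures()
    {
        if (_cache == null)
            _cache = _buildFeatures();
        return _cache;
    }

    // Call when the user navigates to a different field/cropzone.
    public void Invalidate()
    {
        _cache = null;
    }
}
```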
In our .Net world, we store these 50,000 records as a single compressed blob in a database (or in a flat file), and read them into a .Net dataset. The geometry is represented as an NTS geometry object. In a separate question, I'll ask about that. Once I have the "blob" out of the database or flat file, I can instantiate my populated dataset in less than a second. That becomes my datasource. How long it takes to get the blob depends on whether it is coming down the wire from a web service, opening a local flat file, etc.
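The read path looks roughly like this. The column names ("Geometry", "Yield"), the GZip compression, and the record framing are my assumptions for illustration; the real blob layout is whatever our serializer wrote:

```csharp
// Sketch: decompress the blob and materialize an in-memory table with NTS
// geometries (read from WKB) plus the attribute values we theme on.
using System.Data;
using System.IO;
using System.IO.Compression;
using NetTopologySuite.Geometries;
using NetTopologySuite.IO;

public static class FieldOpBlobReader
{
    public static DataTable ReadFieldOps(byte[] compressedBlob)
    {
        var table = new DataTable("FieldOps");
        table.Columns.Add("Geometry", typeof(Geometry));
        table.Columns.Add("Yield", typeof(double));

        var wkbReader = new WKBReader();
        using var zip = new GZipStream(new MemoryStream(compressedBlob), CompressionMode.Decompress);
        using var reader = new BinaryReader(zip);

        int count = reader.ReadInt32();                       // record count written by our serializer
        for (int i = 0; i < count; i++)
        {
            int wkbLength = reader.ReadInt32();
            Geometry point = wkbReader.Read(reader.ReadBytes(wkbLength));
            double yield = reader.ReadDouble();
            table.Rows.Add(point, yield);
        }
        return table;
    }
}
```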
As you review our slides, you will see that we operate from a business navigational tree. A farmer has farms, farms have fields, and fields have cropzones. Cropzones have these FieldOp records and soil test layers, etc. The user will navigate around by clicking on those nodes. We expect no longer than a 3 second response for displaying the "yield map" for a large field after the user clicks on the node. It may be longer the first time they click on the node if we have to cache the data locally from a remote source.
Clearly my other question about filtering from a feature source was related to having a layer with all of the field boundaries for the farmer, themed in one manner, the boundaries of the currently selected farm themed in a second manner, and the boundary of the currently selected field themed in a third manner. If the user clicks on a farm node, then there is no field layer, but I wanted to keep the placeholder on the map so I didn't have to reorder layers when they go back and click on a different field (or a different farm). As the farm and field layers are subsets of the full grower (farmer) layer, I wanted to only have one copy of the data and filter it for the subset. We now know of two ways to handle that efficiently.
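Just so we're picturing the same thing, the "one copy, filter for the subset" idea is no more than this (the Boundary record and its Id properties are hypothetical, and this ignores how each subset gets themed):

```csharp
// Sketch: keep all of the grower's field boundaries in one cached list and
// derive the farm-level and field-level layers by filtering, rather than
// duplicating the geometry per layer.
using System.Collections.Generic;
using System.Linq;
using NetTopologySuite.Geometries;

public sealed class Boundary
{
    public int FarmId { get; set; }
    public int FieldId { get; set; }
    public Geometry Shape { get; set; }
}

public sealed class BoundaryCache
{
    private readonly List<Boundary> _allGrowerBoundaries;   // the single copy of the data

    public BoundaryCache(List<Boundary> allGrowerBoundaries)
    {
        _allGrowerBoundaries = allGrowerBoundaries;
    }

    public IEnumerable<Boundary> Grower() => _allGrowerBoundaries;
    public IEnumerable<Boundary> Farm(int farmId) => _allGrowerBoundaries.Where(b => b.FarmId == farmId);
    public IEnumerable<Boundary> Field(int fieldId) => _allGrowerBoundaries.Where(b => b.FieldId == fieldId);
}
```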
All of our data is stored as WGS84 Lon/Lat, and we dynamically project it into UTM NAD83 at display. On our COM stuff, we could run a bulk transform on the high-frequency data and project 50,000 sites in about .2 seconds. It was amazing. Proj4Net appears to be pretty fast, but I'm not sure whether you run a bulk transform in your drawing stuff or not. That's kind of where I'm looking for a flow diagram. Where does the transform occur, and is it performed each time the map is redrawn? It would appear that it is, since you ask for a new FeatureCollection as the map is zoomed and panned. So, that likely means that in my FeatureSource, I would like to be aware of the target projection, and do the projection one time as I build the cached feature collection, preventing you from doing it over and over again?
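What I have in mind is a one-time bulk transform while the cached feature collection is built, something like the sketch below. The transform delegate is a stand-in for whatever Proj4Net (or other) transform we actually hold; I'm deliberately not guessing at its API here:

```csharp
// Sketch: run the WGS84 -> UTM projection over the whole point set once, up
// front, so nothing needs to re-project on each redraw.
using System;
using System.Collections.Generic;
using NetTopologySuite.Geometries;

public static class BulkProjector
{
    public static List<Point> ProjectOnce(
        IReadOnlyList<Point> lonLatPoints,
        Func<double, double, Coordinate> lonLatToUtm,   // e.g. wraps the projection engine's transform
        GeometryFactory utmFactory)
    {
        var projected = new List<Point>(lonLatPoints.Count);
        foreach (var p in lonLatPoints)
            projected.Add(utmFactory.CreatePoint(lonLatToUtm(p.X, p.Y)));
        return projected;                               // cache this; never re-project per redraw
    }
}
```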
We have been using NTS and GeoAPI since we started our port to .Net two years ago. We have a strong collection of supporting geometry and GIS objects that we have written around them (surfacing, topology cleaners, more robust intersections and unions, etc.). I won't be using any of the ThinkGeo components for geometry processing, per se... just presentation.
So.... let me see what I've missed on your questions: I think I covered the first 4.
I am somewhat familiar with a Grid. But, we do have a need to display the original raw GPS sites in some instances, and the grid is not applicable there. And, when we do build surfaces, we build them square to the geographic coordinate system, but want to display them in a projected coordinate system. Is that an option? And, then there is the issue of the "edge" cells of the grid needing to be clipped to an overlying field boundary. That's not an option, is it? These are the issues that have kept us from pursuing the Grid approach in the past.
My personal preference is to theme surfaces and sites with a smooth gradient legend, and yes... if it were known that this was happening, then we could calculate the index prorated between min and max values, and dive into the correct break. We also have the option for "custom" breaks, though, and many of our customers want to define that pH values between 5.5 and 5.8 are red, 5.8 and 6.1 are yellow, etc. So, we could optimize the renderer for indexing in a gradient theme, but we cannot assume that is always what the user will want to use.
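To spell out the distinction I'm making, here are the two lookup paths side by side as a sketch (names are mine). The gradient case is pure arithmetic, so no search is needed; only the custom-break case has to hunt for the right class:

```csharp
// Sketch: prorated index into an evenly spaced gradient legend vs. a lookup
// against user-defined custom breaks (e.g. pH 5.5-5.8 = red, 5.8-6.1 = yellow).
public static class LegendIndex
{
    // Gradient legend: index computed directly from min/max, no search.
    public static int GradientIndex(double value, double min, double max, int classCount)
    {
        if (max <= min) return 0;
        double t = (value - min) / (max - min);
        int index = (int)(t * classCount);
        return index < 0 ? 0 : (index >= classCount ? classCount - 1 : index);
    }

    // Custom breaks: 'breaks' holds the ascending upper bound of each class.
    public static int CustomBreakIndex(double value, double[] breaks)
    {
        for (int i = 0; i < breaks.Length; i++)
            if (value <= breaks[i]) return i;
        return breaks.Length - 1;
    }
}
```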
I do think the option of having all cells that are between 5.5 and 5.8 be in one layer, 5.8 to 6.1 in another layer, etc is quite intriguing. My datasource is always going to be all 50,000 points, but each layer could filter for only the appropriate cells. I need to understand more about how to implement a Layer object that is a collection of layers, and get them all drawn. I think this could be very promising.
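To illustrate what I have in mind for the per-break layers (the Cell type and bucket shape are my assumptions, not a MapSuite construct), the partitioning itself is cheap and only needs to happen once per field; it reuses the CustomBreakIndex helper from the sketch above:

```csharp
// Sketch: partition the datasource once into per-break buckets, so each
// sub-layer draws only its bucket while the underlying data stays in one place.
using System.Collections.Generic;
using NetTopologySuite.Geometries;

public sealed class Cell
{
    public Geometry Shape { get; set; }
    public double Value { get; set; }        // e.g. pH or yield
}

public static class BreakPartitioner
{
    // breaks[] holds the ascending upper bound of each class (see LegendIndex above).
    public static List<Cell>[] Partition(IEnumerable<Cell> allCells, double[] breaks)
    {
        var buckets = new List<Cell>[breaks.Length];
        for (int i = 0; i < buckets.Length; i++) buckets[i] = new List<Cell>();

        foreach (var cell in allCells)
            buckets[LegendIndex.CustomBreakIndex(cell.Value, breaks)].Add(cell);

        return buckets;                       // bucket i feeds the layer themed for class i
    }
}
```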
My only connection to a database (or file system) is to get the data the first time the grower requests it, and to the extent possible, keep it cached in our in-memory .Net typed datasets.
"Does the searching for the class break really take enough time to warrant this discussion?" Very good point. All I know is that using other packages in the past, the rendering was extremely slow compared to what we could do. I'm still testing some of this stuff in MapSuite. 15,000 points is not an issue. 75,000 points hung my box. And then I had to get back to my day job. But I'll be exploring more this weekend, and was asking questions based upon lots of experience with other packages, and a little with MapSuite.
Sharing the code for the class break style would let me see how you are doing it, and I really think you will find that caching the last value found and testing it first on the next point could be a big help, if you are not doing that. But, I'm not clear whether I would have the opportunity to write my own alternative class break style by deriving from one of your classes or not. It appears your components are very extensible in many areas, but I don't really find that magic document that defines the full scope of your extensibility :) That's not a complaint. I hear the same thing from people that integrate my stuff into their applications, too :)
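Here is the "cache the last break" idea as a standalone sketch (the names are mine, not ThinkGeo's). The reason it pays off is that GPS points arrive in spatial/temporal order, so consecutive values usually land in the same class and the fast path skips the search entirely:

```csharp
// Sketch: remember the last class break that matched and test it first on the
// next value, falling back to a scan (or binary search) only on a miss.
public sealed class StickyBreakLookup
{
    private readonly double[] _breaks;   // ascending upper bound of each class
    private int _lastIndex;

    public StickyBreakLookup(double[] breaks)
    {
        _breaks = breaks;
    }

    public int Find(double value)
    {
        // Fast path: does the value still fall in the class we hit last time?
        double upper = _breaks[_lastIndex];
        double lower = _lastIndex == 0 ? double.MinValue : _breaks[_lastIndex - 1];
        if (value > lower && value <= upper) return _lastIndex;

        // Slow path: linear scan (a binary search works just as well here).
        for (int i = 0; i < _breaks.Length; i++)
        {
            if (value <= _breaks[i]) { _lastIndex = i; return i; }
        }
        _lastIndex = _breaks.Length - 1;
        return _lastIndex;
    }
}
```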
I'm not ready to explore tile caching. I think we have lots of other options, first.
Thanks, David. I appreciate the design brainstorming. I hope I have provided some useful information to help flesh out the discussion. I'll be happy to visit by phone, but the background I've provided here should help that conversation if we need it.
The document that you have someone starting on the rendering workflow is really the most significant piece of documentation that would help me know where to spend optimization resources. If that document can highlight the workflow steps as "overridable", that would be perfect. I guess any method you have called *Core is overridable?