Help or Suggestion please

Rose · March 22, 2016, 7:17am

Hi,

In a PostgreSQL database, we have a Polyline table which stores the vehicle's tracks with timestamp of starting date and ending date. We also have a Waypoint table which stores the points that make up the polylines. Each point has a timestamp when it is stored and a TrackID that links to the specific track in the polyline table. This is a live database, as the data coming in from the serial port, we append the points to the existing tracks or create a new track on-the-fly. This database is growing every minute.

We designed an Analyser tool using MapSuite desktop edition to allow the user to select a subset of the tracks from the database to do further analysis. One of the analysis is to display the tracks in a time frame, say from Tuesday to Wednesday. Some tracks may start on Monday and end on Friday, for those tracks, we only want to display the part from Tuesday to Wednesday. My question is what the best way to crop those tracks from MapSuite perspective given the vast volumn of data?

I thought I could retrieve the points data within the time frame from the Waypoint table and create the tracks on-the-fly. But there will be a big performance issue as we have so much data.

Any help or suggestion is greatly appreciated!

Thanks in advance

Rose

David · March 22, 2016, 8:30am

Rose,

   This has one foot on the server and one foot on the client.  My gut level feeling is to do it at the SQL side if things.  It may be that the performance is not good becasue you do not have an index on the timestamps and thus it is slow to find the points within the time frame.  I would look there first.  The only reason I say it is because it is closer to the source and if you can handle it there then you waste the lease amount of resources.  For example if you do it on the client then you need to read the disk more to get the extra records, they need to be transported on the network, then the client has to deal with them and do more processing, it is not so effective.

   Having said all of that i do understand that the database is usually a big bottleneck as technologies for scaling databases are just not that accessible to most people and putting stress on your database usually slows down everything related to the operation or not.  In the case where you could not do anything on the back end I would handle this with maybe a style or maybe a inherited feature source.  In the case of the style you would need to make sure you fetched the timestamp in the RequiredColumnNames of the style and then inside you would build your tracks and made sure to order them in some kind of sorted list.  You can then shave off points from the start and finish and still know the order of the points are good as they are sorted.

   The downside of the Style is that if you do spatial operations like how far is my vehicle to the truncated track then it will not return what you want because the feature source still has the points you want removed.  The best wway if you need this kind of support is to inherit from the PostGreFeatureSource and then override the GetFeatureInsideBoundingBox.  In there call the base and get the results.  Next sort the points like I mentioned above and cull out any of the points outside of the time window you want.  You can add the start and stop times as properties to your new feature source.  In this way we can pass back form the override just the features as you want them.  The GetFeaturesIndideBoundingBox is the key one as all of the other spatial query type of call use that at their root.  If you modify that one they others will fall into place.

I am not sure if this is what you wanted but its out there anyway.  Let me know what you think and if you want to explore any of these.

David

Rose · March 22, 2016, 8:30am

David:

Thanks for your reply.

In my PostgreSQL database, the Waypoints table is partitioned based on the Timestamp, it is one table per month. So this should be fast to retrieve the points within the specified time window.

If I inherit from the PostgreFeatureSource , regarding to cutting out any points outside of the time window in the GetFeatureInsideBoundingBox , I am thinking to imple it in this way:

For each track do following step:

1. Retrieve the first and last points within the time window from the Waypoints table

2. Get the track geometry as WKT

3. Go through the WKT and compare the point coordinates with the first and last point coordinates got in Step 1, remove the points before the first point coordinates, remove the points after the last point.

The problem is everytime the map extent is changed, it will go back to the database to retrieve the tracks and crop the tracks again, this might be slower.

Another way I am thinking, forget the Polyline table, I only retrieve the points within the time window from the Waypoints table, and create the tracks on-the-fly and store the created tracks in an InMemoryFeatureLayer. From the database perspective, this will give the database a big relief, it does not need to create the tracks. Will this work and better relatively speaking from the MapSuite perspective?

Thanks

David · March 22, 2016, 8:30am

Rose,

   If you throw out the poly line table and just query the points then that might be fast enough for your database to handle in real-time.  Then you can just on the FeatureSource side create the polyline as that should be really fast.

Just to make sure I understand your SQL would be something like: Select * from Waypoints where timestamp > xx and timestamp < xxx

If so then this should be pretty fast for the database as long as the timestamp is indexed.

Of course caching them in an InMemeoryFeatureLayer isn’t a bad idea either.  In this way you could control what is cached and what is not.  I guess the only thing is can you query all the various vehicles at one time and cache it or would it be by the extent.  Once the extent changes you need to re-query?  If you had to re-query and dint have so many vehicles you could include a “not in” in your query so once you query a vehicles trail for that period you could modify the SQL to exclude it.  Of course this has limits as to the SQL size.

   If you did the in memory stuff I would suggest once you have the cache loaded you called the BuildIndex() method to build and in memory R-Tree for the records to make the query fast for them as you pan around.  It is a little complicated not know all of the requirements.  If this is one vehicle at a time for analysis them in memory is the way to go.

David

Rose · March 22, 2016, 8:30am

David:

Am I right regarding to cutting out the track points outwith the time window?

The SQL query to query the Waypoints table will be sth. like this: Select * from Waypoints where Timestamp>=xx and Timestamp<=xxx and Type=xxxx etc.

The timestamp is indexed, so the database will use the timestamp to filter the data first, then further filter the selected data based on other criteria. I think this part should be fast.

I am going to use the InMemoryFeatureLayer to cache the tracks. One thing I am not sure here is, when I query the Waypoints table, although I did not specify the extent, the PostgreFeatureLayer will only retrieve the points within Map's extent. How to modify this? So the PostgreFeatureLayer will retrieve all the points which match the search criteria even some points are out of the map's extent? In this way, this will be a one time only task, when the map extent changed, I don't need to re-query the data, only if the user's search criteria changed, I need to re-query the data.

Thanks very much.

Rose

ThinkGeo · March 22, 2016, 8:30am

Rose,

As I am understanding this problem, following codes shows you how to only fetch data only the first time:
//Create your own cached data featureSource. public class CustomPostgreFeatureSource : PostgreSqlFeatureSource { private bool isDataFetched; private Collection<Feature> cachedFeatures; public bool IsDataFetched { get { return isDataFetched; } } public CustomPostgreFeatureSource() : this(string.Empty, string.Empty, string.Empty) { } public CustomPostgreFeatureSource(string connectionString, string tableName, string featureIdColumn) : this(connectionString, tableName, featureIdColumn, -999) { } public CustomPostgreFeatureSource(string connectionString, string tableName, string featureIdColumn, int srid) : base(connectionString,tableName,featureIdColumn,srid) { isDataFetched = false; } public CustomPostgreFeatureSource(NpgsqlConnection connection, string tableName, string featureIdColumn) : this(connection, tableName, featureIdColumn, -999) { } public CustomPostgreFeatureSource(NpgsqlConnection connection, string tableName, string featureIdColumn, int srid) : base(connection,tableName,featureIdColumn,srid) { isDataFetched = false; } protected override Collection<Feature> GetFeaturesInsideBoundingBoxCore(RectangleShape boundingBox, IEnumerable<string> returningColumnNames) { if (!isDataFetched) { cachedFeatures = base.GetAllFeatures(ReturningColumnsType.AllColumns); isDataFetched = true; } return cachedFeatures; } } //Create your own featureLayer. public class CustomPostgreFeatureLayer : PostgreSqlFeatureLayer { public CustomPostgreFeatureLayer() : this(string.Empty, string.Empty, string.Empty, -999) { } public CustomPostgreFeatureLayer(string connectionString, string tableName, string featureIdColumn) : this(connectionString, tableName, featureIdColumn, -999) { } public CustomPostgreFeatureLayer(string connectionString, string tableName, string featureIdColumn, int srid) : base() { FeatureSource = new CustomPostgreFeatureSource(connectionString, tableName, featureIdColumn, srid); SetupTools(); } public CustomPostgreFeatureLayer(NpgsqlConnection connection, string tableName, string featureIdeColumn) : this(connection, tableName, featureIdeColumn, -999) { } public CustomPostgreFeatureLayer(NpgsqlConnection connection, string tableName, string featureIdeColumn, int srid) : base() { FeatureSource = new CustomPostgreFeatureSource(connection, tableName, featureIdeColumn, srid); SetupTools(); } } //Useage. CustomPostgreFeatureLayer postgreLayer = new CustomPostgreFeatureLayer(connectString, "rail", "oid");//fllkaa40 postgreLayer.CommandTimeout = 6000; postgreLayer.ZoomLevelSet.ZoomLevel01.DefaultLineStyle = LineStyles.Railway1; postgreLayer.ZoomLevelSet.ZoomLevel01.ApplyUntilZoomLevel = ApplyUntilZoomLevel.Level20;

Any more qustions just feel free to let me know.

Thanks.

Yale

David · March 22, 2016, 8:30am

Yale,

   Your example shows loading all of the features.  I think what Rose wants to do is to is to modify the query for get all features and add some extra stuff like excluding points that are within a date range.  I think what you want to explore is the ExecutingSql event or something like that.  Whenever a sql call is made we raise and event and we pass along the SQL we are going to execute.  You can modify that sql and pass in your own.

Below is the link to show you how to use it, we also have a sample.  I think you were on this thread as well.

gis.thinkgeo.com/Support/DiscussionForums/tabid/143/aff/21/aft/6377/afv/topic/afpgj/1/Default.aspx#11624

David