ThinkGeo Team,
Has WmsServer been upgraded to V13?
Thanks,
Dennis
Hey @Dennis,
Unfortunately, the WmsServer edition is now considered a legacy product. I believe the latest available version is v10.6.1. What we recommend going forward is either to upgrade to an XYZ tile server using WebAPI, or to stick with what you have now, since it’s well established and likely will not need to change.
Thanks,
Kyle
hi Kyle,
See screen-capture below for WmsServer version.
I’m asking because, as you might recall, our WMS provider implemented a new system last August, which continues to give us issues. Every week more or less we’ll get an IIS Worker Process that goes to 100% CPU and locks up the system. I have been unable to determine the cause and our provider maintains that it is not their issue.
Can you offer any advice on how one might detect what is going on? I have lots of logging, but unfortunately it provides no clues.
Thanks,
Dennis
Hey @Dennis,
Yeah, that’s all slowly coming back to me now. And your version is the latest beta package available too, so you’re up to date in that regard.
So, adding the connection timeout ultimately didn’t help things then. Do you have the logs split out or identifiable by process id? We could take a look at the logs from that process and see where it’s at, what request was sent to the WMS provider, and look at your timers to see what else we can do.
I wonder whether, if we manually flood the WMS provider with a bunch of these requests from the logs, we could recreate the issue in a controlled environment with a debugger attached. Then we could see what’s hanging and whether we can force close the connection to the provider to get things “unstuck”.
Thanks,
Kyle
hi Kyle,
I did add the connection timeout timer, but it did not help. It’s currently set at 25 seconds.
The logs do not contain ProcessId, but I should be able to add it.
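A minimal sketch of stamping each log line with the worker process id, assuming hand-rolled logging; the `LogStamp` class and the line layout are hypothetical, not part of WmsServer:

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class LogStamp
{
    // The PID does not change for the life of a worker process, so read it once.
    private static readonly int ProcessId = Process.GetCurrentProcess().Id;

    // Prefixes a message with a UTC timestamp, the process id, and the managed thread id.
    public static string Format(string message)
    {
        return string.Format(
            "{0:o} pid={1} tid={2} {3}",
            DateTime.UtcNow,
            ProcessId,
            Thread.CurrentThread.ManagedThreadId,
            message);
    }
}
```

With the PID on every line, the entries written by the w3wp.exe instance that is stuck at 100% CPU can be filtered out of the combined log. If log4net is already in use, the same value can be stored once at startup with `log4net.GlobalContext.Properties["pid"] = Process.GetCurrentProcess().Id;` and printed with `%property{pid}` in the pattern layout.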
I was also thinking of creating an application that read historical logs and flooded the Provider with requests so it could be recreated. Did not consider being in debug mode at the time, which is a great idea.
What’s interesting, though, is that the Provider tells me that they see no issue at all on their side. I believe this is a loading issue and their new system is unable to handle the load from our requests. Because we use a central server to funnel all our requests to them, I’m thinking there are bursts of requests that they can’t handle. If you recall, all this worked just fine under the Provider’s old legacy product. The only difference on my side is that the URL has a new format, which had to be supported, and that was the only code change.
I would really appreciate your help with this. I will talk more with the Provider and also see about adding ProcessId to the logs.
I’ll let you know what they say.
Thanks,
Dennis
hi Kyle,
I’ve met with the WMS provider, and they are open to the idea of manually flooding the server with requests except that they would like advance notice. My idea was to write a simulation application that would read our logging records, extract the URL and then send to the provider.
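A rough sketch of such a simulation tool, under stated assumptions: each log record is assumed to contain the full provider URL on one line, and `WmsReplay`, `maxConcurrent`, and the console output format are all hypothetical names chosen for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading;
using System.Threading.Tasks;

public static class WmsReplay
{
    // Assumption: the URL is the first http(s) token on the line and contains no spaces.
    private static readonly Regex UrlPattern = new Regex(@"https?://\S+", RegexOptions.Compiled);

    // Pulls the request URL out of each log line; lines without a URL are skipped.
    public static IEnumerable<string> ExtractUrls(IEnumerable<string> logLines)
    {
        foreach (string line in logLines)
        {
            Match match = UrlPattern.Match(line);
            if (match.Success) yield return match.Value;
        }
    }

    // Sends the URLs with at most maxConcurrent requests in flight at once.
    public static async Task ReplayAsync(IEnumerable<string> urls, int maxConcurrent, TimeSpan timeout)
    {
        using (var client = new HttpClient { Timeout = timeout })
        using (var gate = new SemaphoreSlim(maxConcurrent))
        {
            var pending = new List<Task>();
            foreach (string url in urls)
            {
                await gate.WaitAsync();
                pending.Add(SendOneAsync(client, gate, url));
            }
            await Task.WhenAll(pending);
        }
    }

    // Issues one GET and reports status and elapsed time; failures are logged, not rethrown.
    private static async Task SendOneAsync(HttpClient client, SemaphoreSlim gate, string url)
    {
        Stopwatch watch = Stopwatch.StartNew();
        try
        {
            using (HttpResponseMessage response = await client.GetAsync(url))
            {
                Console.WriteLine("{0} {1} ms {2}", (int)response.StatusCode, watch.ElapsedMilliseconds, url);
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine("FAILED {0} ms {1} {2}", watch.ElapsedMilliseconds, ex.GetType().Name, url);
        }
        finally
        {
            gate.Release();
        }
    }
}
```

A run might look like `await WmsReplay.ReplayAsync(WmsReplay.ExtractUrls(System.IO.File.ReadLines("wms.log")), 8, TimeSpan.FromSeconds(25));`, where the file name and numbers are placeholders. Raising `maxConcurrent` step by step is one way to approximate the bursts the central server produces, and the slow or failed lines in the output show how the provider behaves under that load.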
After speaking with them, it is definitely a loading issue where they are unable to process our requests as efficiently as they did under their legacy system. I believe this is then causing issues within my ASP.NET application, and I would think somewhere in the WmsServer APIs. Maybe there needs to be more error/retry/abort logic in the APIs.
Would you be open to allowing me to look at the source code? I’m thinking I might be able to spot potential problem areas just by looking at the code.
In one WmsServer thread I read something about being able to use Log4Net to do logging. Is that possible and do you think it would provide any useful information?
Also, they suggested possibly switching to WMTS, as it is a lightweight standard. I believe WmsServer accommodates WMTS, does it not? If so, do you have any example code?
Thanks,
Dennis
Hey @Dennis,
Yeah, if the only thing that changed was the URL to their just-updated service, then it seems logical that their service isn’t set up to handle high volumes, and that should probably be addressed. But that’s an issue only they can resolve.
I can see a case where the provider is unable to handle the requests and drops them in a way we do not expect, which causes our side to hang. When they refuse to process your request, do they know how they drop that request? That would be important information to have in order to come up with a fix within our APIs. Or, if they are able to hang up the connection in the manner we expect, that could also resolve the issue (and potentially help their other clients using their service).
I’m not opposed to sharing the WmsRasterLayer class with you, but I’ll have to upload it via our ticket system and not on the public forums. I created a ticket addressed to you on our helpdesk.thinkgeo.com website. Let me know if you have received it.
We have a WPF sample on setting up a WMTS layer here. Setting it up in WmsServer should be extremely similar since it just needs the ThinkGeo.MapSuite.Layers.Wmts package from nuget.
Log4Net, I believe with my limited experience, would provide more information out of the box than hand rolled logging. That should give you the ProcessId automatically in their logging object. The only downside to it is that it is a bit of a learning curve to get used to. So I can’t say whether or not the effort is worth it.
Thanks,
Kyle
hi Kyle,
Yes, all they did from our perspective is provide us with a new URL to point to their new product.
They did show me examples of errors related to our requests. I will ask them details about the errors and how the connection gets dropped or how they recover from the error condition.
Implementing WmtsLayer does look very similar to WmsRasterLayer. I would simply duplicate the paradigm used for WmsRasterLayer. Since the clients all use WmsRasterLayer is it possible to transform the received WmtsLayer into a WmsRasterLayer before sending to the client?
I’ve downloaded the code in the ticket and will take a look.
I’m very familiar with Log4Net, and in fact I use it within my ASP.NET application. I had thought that WmsServer had Log4Net for its own logging that was somehow enabled via Web.config.
Thanks,
Dennis
Hey @Dennis,
Yeah, I assume you would want to keep the clients the way they were so you won’t have to update all of them. So, yes, you would implement the WmtsLayer similar to how the WmsRasterLayer is designed, but when the client requests a large area that covers multiple WMTS tiles, you’ll need to split those requests and then stitch up the resulting images into one image the client expects.
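To illustrate the splitting step, here is a small sketch that works out which tiles cover a requested bounding box. It assumes a square tile grid whose origin is the top-left corner and whose rows count downward, as WMTS tile matrices are commonly laid out; `WmtsTiling`, `TileRange`, and the parameter names are hypothetical and not ThinkGeo APIs:

```csharp
using System;

public struct TileRange
{
    public long MinCol, MaxCol, MinRow, MaxRow;

    // Number of tiles that would have to be requested and stitched together.
    public long Count
    {
        get { return (MaxCol - MinCol + 1) * (MaxRow - MinRow + 1); }
    }
}

public static class WmtsTiling
{
    // originX/originY: top-left corner of the tile matrix in map units.
    // tileSpan: width (and height) of one tile in map units at this zoom level.
    public static TileRange Cover(double minX, double minY, double maxX, double maxY,
                                  double originX, double originY, double tileSpan)
    {
        // Small tolerance so a box that ends exactly on a tile edge
        // does not pull in the neighbouring tile.
        const double eps = 1e-9;
        return new TileRange
        {
            MinCol = (long)Math.Floor((minX - originX) / tileSpan + eps),
            MaxCol = (long)Math.Floor((maxX - originX) / tileSpan - eps),
            MinRow = (long)Math.Floor((originY - maxY) / tileSpan + eps),
            MaxRow = (long)Math.Floor((originY - minY) / tileSpan - eps)
        };
    }
}
```

Each tile in the range would then be fetched from the WMTS service and drawn into one output image at its column/row offset before the result is cropped to the client's requested extent. `Count` also gives a quick estimate of how many provider requests a single client request would fan out into.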
One thing that concerns me about moving to WMTS, however, is that I recall you requesting smaller tiles to their server in the past and you had the same exact issue with the amount of requests going through to their server. Is their WMTS service running on some other server stack that would resolve that?
I think in v9 we had log4net on the WmsServer, but with the move to v10, I believe it was dropped because I’m not seeing any reference to it anywhere in the source code. The closest thing we got is the MapSuiteDebugger class that logs events that occur on the trace info. But I think the only thing you might see on the trace is “WMS SERVER: Not Aligned with TileMatrix” in certain circumstances. So, I’m not sure that’s very helpful to you.
Thanks,
Kyle
hi Kyle,
Yes, your memory is correct in the smaller tile requests that we make due to our use of MultipleTile Overlay.
It sounds like too much effort to switch to WMTS plus our provider will not guarantee that WMTS won’t have the same issue. So based on all this I’m not inclined to switch at this point.
What I would like to do is prevent IIS Worker Process from going to 100% and locking-up the CPU.
Do you think if there was more error checking/recovery on the WebRequest/WebResponse invocations the 100% CPU could be avoided?
Dennis
Hey @Dennis,
Yeah, I would think that the issue would persist even when switching to WMTS as well. Just based on how I understand their servers behave now. I would say the effort is not worth the risk of it not helping.
If the error checking was able to pinpoint what exactly was causing the request to hang, then I think we can figure out how to deal with it. The code surrounding the HTTP request itself is extremely straightforward. You’ll see that on line 228 in WmsRasterSource.cs. We fire the sending request event, then actually send the HTTP request, then fire the sent request event. There might be additional properties that we can set on the headers or on the HttpWebRequest object to avoid this. I know we already set the Timeout property on the request, but maybe setting the ReadWriteTimeout as well would help? You could try setting that in the SendingWebRequest event. But it’s just a guessing game until we know why it’s hanging in the first place.
Not that this would solve the core of the issue, but another workaround would be to setup a CPU trigger on your AppPool in IIS to kill the worker once it reaches high CPU after a period of time.
Thanks,
Kyle
hi Kyle,
Thanks for the tip on setting ReadWriteTimeout, that’s an easy change so I might just do that.
I’ve experimented already with KillW3wp on high CPU. I can simulate the condition and it does reset things, but my client application is then having trouble getting the Capabilities so I’m in the process of determining why. Yes, this is not a solution to the problem, but it is the next best thing.
Dennis
Kyle,
I spoke too soon. The ReadWriteTimeout is not available in the SendingWebRequest event, at least not that I can see. So how would this get set?
Thanks,
Dennis
Hey @Dennis,
Oh yeah, you have to cast e.WebRequest into an HttpWebRequest to gain access to it:
((HttpWebRequest)e.WebRequest).ReadWriteTimeout = 60000; // value is in milliseconds
I forgot that you had set up that KillW3wp previously. Kind of weird that the GetCapabilities is causing you trouble though. Usually, that’s called once to get the info, unless maybe the layer is being recreated on each request from the client?
Thanks,
Kyle
Morning Kyle,
Casting does indeed provide access to HttpWebRequest.ReadWriteTimeout.
The default is 300,000ms, which seems excessive. A value in the range of 30,000ms-60,000ms seems more appropriate.
There are actually two GetCapabilities requests. There is the request from the server to our 3rd party provider, which is made once for each of the 16 map configurations (NumberOfMapConfigurationsPerWorkerProcess).
The other is from the client to our server to request our server Capabilities, which is requested initially upon startup. However, if there is no response from our server then the client will wait 90 seconds and then attempt a GetCapabilities again. It is in this timeout state that GetCapabilities fails. As long as imagery is being requested and returned GetCapabilities is not requested. There is most likely a bug in my timeout logic.
Thanks,
Dennis
Hey @Dennis,
Oh, I see, it’s the client’s GetCapabilities that fails. That makes more sense then.
As for the ReadWriteTimeout, I think it makes sense to set it to 30-60 seconds and see if that improves things.
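A minimal sketch of applying both timeouts in one place, assuming the handler receives the request as a `WebRequest` (as `e.WebRequest` does above); the `WmsRequestTimeouts` helper and the values shown are illustrative only:

```csharp
using System;
using System.Net;

public static class WmsRequestTimeouts
{
    // Both values are in milliseconds.
    // Timeout bounds the wait for the response to start (GetResponse / GetRequestStream);
    // ReadWriteTimeout bounds each read or write on the request/response streams.
    public static void Apply(WebRequest request, int timeoutMs, int readWriteTimeoutMs)
    {
        // ReadWriteTimeout only exists on HttpWebRequest, so cast defensively
        // and leave any other request type untouched.
        HttpWebRequest http = request as HttpWebRequest;
        if (http != null)
        {
            http.Timeout = timeoutMs;
            http.ReadWriteTimeout = readWriteTimeoutMs;
        }
    }
}
```

Inside the SendingWebRequest handler this would be a single call such as `WmsRequestTimeouts.Apply(e.WebRequest, 25000, 30000);`. The `as` cast avoids an InvalidCastException if the event is ever raised with a non-HTTP request.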
Thanks,
Kyle
hi Kyle,
The ASP.NET application I developed is based on the ThinkGeo example of WmsServer project, which revolves around IIS.
In my situation, are IIS and a website required? My ASP.NET application is really just a proxy server, and a website is not required.
Is it possible to remove IIS and treat it strictly as a proxy server?
Thanks,
Dennis
Hey @Dennis,
While you could technically remove IIS and just run the proxy app on its own, I don’t think I would recommend that since IIS provides security features and instance pooling automatically in the background. I could see a world where you replace IIS with Apache or Nginx, but since they are primarily Linux focused, I think finding good documentation and support on it might be difficult. But I’ve never used those in Windows so I could be wrong. I’d choose Nginx as a personal favorite over Apache any day if you were going to explore that route.
Also, I think the IIS term of “websites” is pretty outdated. I’d say it was accurate back in the early days, before everyone started using HTTP for remote communication instead of UDP or message queues. Really it just refers to any application that communicates via the Web. In fact, I’d say that 95% of ThinkGeo’s “websites” we run on IIS are just WebAPI applications that never return actual HTML. Just JSON, XML, or image data, sort of like your proxy server.
On the subject of cutting down complexity, I could see two other options:
Thanks,
Kyle
hi Kyle,
I will heed your advice and leave it in IIS.
Thanks for the education on use of “websites”. This really is not my area of expertise.
I could run the clients directly to the WMS Provider (with API Key). However, the proxy server is used because none of our client workstations have Internet access and so this is the only path for WMS imagery. Also, WmsServer does Tile Caching, which is then available to all client workstations.
This morning I deployed the change:
((HttpWebRequest)e.WebRequest).ReadWriteTimeout = 30000;
Sometimes the 100% CPU for an IIS Worker Process occurs only once within a week, sometimes two weeks so I have to let this run for at least two weeks before I can say it made a difference.
Couple more questions–>>
How does IIS determine which IIS Worker Process is to handle any given HTTP Request?
What is the relation, if any, between the ThinkGeo Web.config AppSetting of NumberOfMapConfigurationsPerWorkerProcess and the IIS AppPool Maximum Worker Processes?
Currently NumberOfMapConfigurationsPerWorkerProcess is set to 16 and Maximum Worker Processes is set to 4.
My logging shows the ASP.NET application being initialized 4 times, which corresponds to the AppPool setting. The LayerPlugIn GetMapConfigurationCore is called 16 times.
Should these two properties be set to the same value?
Thanks,
Dennis
Hey @Dennis,
Hopefully that timeout helps relieve some of the issues.
NumberOfMapConfigurationsPerWorkerProcess is the number of MapConfigurations that we hold in a queue to process requests as they come in, so we do not have to recreate them for each request. The default is 4. We cache these by the layer name, CRS, and style. More will be created if demand requires it. So, if you are getting a lot of requests in, we will create more instances of the MapConfiguration to use; each one is taken off the queue while it processes a request and then placed back into the queue once it’s done. Usually, this is for WmsServers with lots of different layers and supported CRSs, or high-demand servers with low CPU but high memory.
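The take-from-queue, return-to-queue behaviour described above can be pictured with a small keyed pool. This is an illustrative sketch only, not ThinkGeo's actual implementation; `KeyedPool`, its key format, and the factory delegate are all hypothetical:

```csharp
using System;
using System.Collections.Concurrent;

public sealed class KeyedPool<T> where T : class
{
    // One queue of idle instances per key (for example "layer|crs|style").
    private readonly ConcurrentDictionary<string, ConcurrentQueue<T>> _idle =
        new ConcurrentDictionary<string, ConcurrentQueue<T>>();

    private readonly Func<string, T> _factory;

    public KeyedPool(Func<string, T> factory)
    {
        _factory = factory;
    }

    // Takes an idle instance off the queue, or creates a new one if none is free.
    public T Rent(string key)
    {
        ConcurrentQueue<T> queue = _idle.GetOrAdd(key, _ => new ConcurrentQueue<T>());
        T item;
        return queue.TryDequeue(out item) ? item : _factory(key);
    }

    // Puts the instance back so a later request with the same key can reuse it.
    public void Return(string key, T item)
    {
        _idle.GetOrAdd(key, _ => new ConcurrentQueue<T>()).Enqueue(item);
    }
}
```

A request handler would build the key with something like `string.Join("|", layerName, crs, style)`, call `Rent`, draw the map, and call `Return` in a `finally` block. A pool like this lives in process memory, so with several IIS worker processes each process would hold its own set of instances.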
As for your worker processes in IIS, that’s the number of instances of your WmsServer that are running. Having more allows IIS to load balance requests properly and allows for more parallel responses. I’m seeing that you should have 4 per CPU. So bumping that up probably wouldn’t hurt. But I would hold off on that until we know whether the ReadWriteTimeout helped things first.
Overall, the number of IIS workers is more impactful than the number of MapConfigurations for parallel request handling.
Thanks,
Kyle