I meant to write this a long time ago, but somehow it never really got off the ground. What follows is the story of an EPiServer site that was on and off the net for half a year or longer, and what I've learned in the process.
<day id="1" />
We've gathered all the data from the client. We know they have implemented custom "skins" (basically controls that brand the mini sites based on the domain under which the site is being displayed). Quite a cool solution. Also, since they were struggling with speed, they have implemented a custom mini taxonomy based on Lucene.Net to speed things up. Yet the site is terribly slow and, from time to time, keeps showing the "Application is busy under initialization" message so familiar to a developer.
<note time="11.00" description="The bits are in place" />
Finally got the bits onto my machine after 2 days of struggling to get a non-corrupted database and what seems to be the complete source code. Let's try to get the site going.
<note time="11.15" description="I have the site running - more or less..." />
The problem is as follows: the site keeps going up and down, and nothing is reported in the EPiServer logs no matter how high I crank the log level. The page is fine for about 3 requests, then goes down. Lucene.Net seems to be throwing errors… Sounds like a candidate. Could there be a race condition in the code?
<note time="13.00" description="Talk to the client - this can be a tough call" />
The internal development team has tried to stabilize the site for half a year with no luck. They don't want to influence us, but they point out a couple of places in the code and what has been changed recently. Somehow none of those looks like a good candidate for the problems. I'm going to need hard evidence.
<note time="13.45" description="The call is over, back to the code" />
So the site is going down. I need the exact reason… Setting up diagnostic points to be able to tell exactly what is happening and when:
Application_Start – determine when application startup has occurred,
Application_Error – log any errors not logged otherwise,
Application_End – log the application domain unload and discover the reason for the unload,
Application_BeginRequest – find out how many requests are being made and to which URLs.
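The diagnostic points listed above could be wired up roughly like this in Global.asax.cs. This is only a minimal sketch, not the site's actual code: it assumes a plain `HttpApplication` subclass (an EPiServer site would normally derive from EPiServer's own global class), and it uses `System.Diagnostics.Trace` as a stand-in for whatever logger the project really uses. The `requestCount` field and the `[diagnostics]` prefix are made up for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Web;

// Hypothetical Global.asax code-behind; class and field names are illustrative.
public class Global : HttpApplication
{
    // Shared across all HttpApplication instances in the app domain.
    private static int requestCount;

    protected void Application_Start(object sender, EventArgs e)
    {
        // Marks the moment a fresh app domain has been initialized.
        Interlocked.Exchange(ref requestCount, 0);
        Trace.TraceInformation("[diagnostics] Application started at {0:o}", DateTime.UtcNow);
    }

    protected void Application_BeginRequest(object sender, EventArgs e)
    {
        // Counts requests since the last start and records where they go.
        int n = Interlocked.Increment(ref requestCount);
        Trace.TraceInformation("[diagnostics] Request #{0}: {1}", n, Request.RawUrl);
    }

    protected void Application_Error(object sender, EventArgs e)
    {
        // Catches anything that was not logged elsewhere.
        Exception ex = Server.GetLastError();
        Trace.TraceError("[diagnostics] Unhandled error: {0}", ex);
    }

    protected void Application_End(object sender, EventArgs e)
    {
        // Fires when the app domain unloads; the request count shows how long it lived.
        Trace.TraceWarning("[diagnostics] Application ended after {0} requests", requestCount);
    }
}
```

With something like this in place, each restart shows up in the log as an "ended" line followed by a new "started" line, with the number of requests served in between.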
<note time="14.45" description="Pulling hair - four requests and a restart" />
Apparently the application starts, handles 4 requests and reloads. There are no errors in the logs, everything is fine, the app domain just goes down and reinitializes! But why does it decide to restart?!
<note time="15.15" description="The evidence is coming" />
A couple of Google searches later, I've found a way to determine the reason for the domain to unload:
    // Requires: using System; using System.Reflection; using System.Web;
    protected void Application_End(object sender, EventArgs e)
    {
        // HttpRuntime keeps the shutdown details in private fields,
        // so the only way to read them is through reflection.
        HttpRuntime runtime = (HttpRuntime)typeof(HttpRuntime).InvokeMember(
            "_theRuntime",
            BindingFlags.NonPublic | BindingFlags.Static | BindingFlags.GetField,
            null, null, null);
        if (runtime == null)
            return;

        // Human-readable reason for the app domain unload.
        string shutDownMessage = (string)runtime.GetType().InvokeMember(
            "_shutDownMessage",
            BindingFlags.NonPublic | BindingFlags.Instance | BindingFlags.GetField,
            null, runtime, null);

        // Stack trace captured when the shutdown was initiated.
        string shutDownStack = (string)runtime.GetType().InvokeMember(
            "_shutDownStack",
            BindingFlags.NonPublic | BindingFlags.Instance | BindingFlags.GetField,
            null, runtime, null);

        Logger.Error(string.Format(
            "[Cognifide diagnostics] Application Ends because {0}, \n at:\n{1} ",
            shutDownMessage, shutDownStack));
    }
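For comparison, ASP.NET also exposes a coarser version of the same information through a public property, `HostingEnvironment.ShutdownReason`, which returns an `ApplicationShutdownReason` enum value. The sketch below is an illustration rather than what was used on this site: `ShutdownDiagnostics` is a made-up helper name and `Trace` again stands in for the real logger. It gives no message or stack trace, so the reflection approach above remains the more detailed one.

```csharp
using System.Diagnostics;
using System.Web;
using System.Web.Hosting;

// Hypothetical helper, meant to be called from Application_End.
public static class ShutdownDiagnostics
{
    public static void LogShutdownReason()
    {
        // Public API: no reflection needed, but only an enum value is available.
        ApplicationShutdownReason reason = HostingEnvironment.ShutdownReason;
        Trace.TraceWarning("[diagnostics] Application ends, reason: {0}", reason);
    }
}
```

Calling `ShutdownDiagnostics.LogShutdownReason()` as the first line of `Application_End` is enough to get the enum value into the log.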
<note time="16.00" description="The evidence is confusing" />
So the reason for the application restart is most confusing: "Recompilation limit of 15 reached". This basically means that the limit ASP.NET sets on the number of dynamic recompilations before it restarts the application has been exceeded, but what does that REALLY mean?
Being an ASP.NET developer, you must have stumbled upon the fact that the pages and controls in the aspx/ascx files are in the end compiled into DLLs and served that way to speed up rendering. This is practically the biggest contributor to the fact that ASP.NET sites don't start instantaneously but rather make you wait while csc.exe starts and exits numerous times in the background.
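The Framework has no way of unloading an individual assembly once it is loaded, as it basically has no way of telling whether some code is still using it. Thus every time an ascx or aspx file changes, it gets recompiled into a new assembly that is loaded next to the old one, and the only way to get rid of the leftovers is to unload the whole application domain, which is why there is a limit on recompilations. Yet investigations with Process Monitor do not reveal any changes in the physical folder while the site is running, so nothing there.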
<note time="19.30" description="Getting the clue" />
And then it strikes me… what about the controls used for the site skinning? How are they served? How does it work?
Turns out the solution is quite clever. The skins are implemented using a Virtual Path Provider and alternate between different folders (different styles) based on which site or which section of the site you are on. Very nice, but there is one simple problem with it. The site makes a round-trip to itself to get them, and the controls are served every time as if they were dynamically generated, without any form of caching. Thus every time the site asks for them, they are treated as new entities and recompiled. The site does not realize that they have not changed.
<note time="20.00" description="The solution" />
The solution turns out to be simple and succinct to a degree that makes some people laugh and others cry. I've also heard that it is illegal in some countries to fix a problem with a single line of code. The solution is:
Provide the necessary cache dependency for the virtual path provider.
Namely adding the following lines to their Virtual Path Provider code:
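    // Requires: using System; using System.Collections; using System.Web.Caching;
    public override CacheDependency GetCacheDependency(
        string virtualPath, IEnumerable virtualPathDependencies, DateTime utcStart)
    {
        if (IsVirtual(virtualPath))
        {
            // Skin controls served by this provider get an explicit dependency,
            // so they are no longer treated as changed on every request.
            // CacheNoDependency is not a System.Web.Caching type; presumably
            // it is a custom CacheDependency subclass defined in the project.
            return new CacheNoDependency(virtualPath);
        }

        // Everything else falls through to the next provider in the chain.
        return Previous.GetCacheDependency(virtualPath, virtualPathDependencies, utcStart);
    }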
<note time="21.00" description="The Report" />
Pretty happy with myself, I'm writing a report to the client outlining the problem and the solution, and heading home.
<day id="2" />
Needless to say, there was some serious dancing the next day in the client's office. Some virtual beers were sent our way, minstrels wrote a couple of songs in our honor and obviously there was a cake waiting for us.
<lessonLearned />
I see a lot of VPPs being used to serve active content like ascx and aspx files. If you decide to do that, don't forget to properly set up the cache dependency.
This entry was posted on Friday, November 5th, 2010 at 2:21 pm and is filed under .Net Framework, ASP.NET, EPiServer, Software Development, Web applications.