Using Analog C:Amie Edition to generate live statistics for inclusion on web pages

System Requirements:

  • Analog C:Amie Edition

The Problem:

While it isn’t very common these days, and is generally considered somewhat crass, there are occasions where you might want to display page views or hit counts directly on a web page. For example, you may have an administrative view for your website, be running a competition and need to track hits on specific files, or just want to tell the world how popular you are.

This article outlines how to use Analog C:Amie Edition’s XML mode to do this.

More Info

By default, most users configure Analog to generate either HTML output for display or Computer output for piping into ReportMagic. There are, however, other output formats, as originally implemented by Stephen Turner. One of these exports the statistics as XML, which you can parse for live statistical data and ultimately include on a web page.

How-to

The how-to is split into the following sections:

  1. What is not covered here
  2. Pre-requisites
  3. File System Considerations
  4. Configure the Analog Server Settings
  5. Running Analog on a Schedule
  6. Displaying the Page Statistics
  7. Final Considerations

What is not covered here

This guide is specifically written to demonstrate the configuration of Analog and associated web scripts running under IIS on Windows, used in conjunction with ASP 3. The same process is equally possible in PHP, JSP, Ruby or .NET; however, only ASP 3 VBScript examples are given.

Pre-requisites

Before you begin, you will need to be running Analog C:Amie Edition 6.04 or higher, as there is a bug in the XML DTD of lower versions of both Analog and Analog C:Amie Edition.

File System Considerations

You need to consider where you place the XML output from Analog. While placing it in the public web root is entirely possible, this makes it potentially available to anyone who becomes aware of its existence. Its location can easily be revealed while debugging, or by an error message raised by the parser, the file system or file system permissions.

It is therefore suggested that you keep the output XML file outside of the publicly accessible web root. A good candidate for this would be in the same location as your analog.cfg file.
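If you want to sanity-check a planned output location before committing to it, the rule above can be sketched offline in a few lines. This is a Python illustration rather than part of the ASP solution; the function name is_inside_web_root is hypothetical, and the paths are the ones assumed in this article.

```python
from pathlib import PureWindowsPath


def is_inside_web_root(file_path, web_root):
    """Return True if file_path is the web root itself or lies beneath it."""
    target = PureWindowsPath(file_path)
    root = PureWindowsPath(web_root)
    # PureWindowsPath compares case-insensitively, like the Windows file system.
    return target == root or root in target.parents


if __name__ == "__main__":
    web_root = "D:\\sites\\www.domain.com\\web\\"
    # The location suggested above sits next to analog.cfg, outside the web root.
    print(is_inside_web_root("d:\\sites\\www.domain.com\\analog.xml", web_root))
```

The check is purely textual: it does not touch the disk and does not resolve ".." components, so it is only a quick guard against an obviously public location.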

For the rest of this example the following structure will be assumed:

Public Web Root:                                     D:\sites\www.domain.com\web\
Log Files:                                           D:\sites\www.domain.com\logs\W3SVC1\
Configuration (.cfg) and Output Destination (.xml):  D:\sites\www.domain.com\
Analog Executable:                                   D:\parser\Analog\Analog.exe

Configure the Analog Server Settings

In most cases, you will probably want to maintain your existing statistics output in HTML format for casual browsing. Therefore it is necessary to create a second Analog configuration file which will generate the required output.

The configuration file for Analog’s XML process can be simpler than the main one, although some customisation to reduce parsing time and minimise the size of the XML output should be considered essential.

Assuming that an existing Analog .cfg file exists at D:\sites\www.domain.com\analog.cfg, we will create a new .cfg file at D:\sites\www.domain.com\analog-xml.cfg.

The .cfg file shown below is a sample XML output configuration for our example site. The LOGFILE, OUTFILE, HOSTNAME and HOSTURL lines contain the paths and names that you will need to change.

# Analog C:Amie Edition XML Statistics Configuration
# Version 1.0.2
# See http://www.c-amie.co.uk/ for updates and more
LOGFILE d:\sites\www.domain.com\logs\w3svc1\*.log
OUTPUT XML
OUTFILE d:\sites\www.domain.com\analog.xml
HOSTNAME "Analog Test Site"
HOSTURL http://www.domain.com/
IMAGEDIR "images/"

# Reports Enabled/Disabled List
GENERAL OFF #General Summary
YEARLY OFF #Yearly Report
QUARTERLY OFF #Quarterly Report
MONTHLY OFF #Monthly Report
WEEKLY OFF #Weekly Report
DAILYREP OFF #Daily Report
DAILYSUM OFF #Daily Summary
HOURLYREP OFF #Hourly Report
HOURLYSUM OFF #Hourly Summary
WEEKHOUR OFF #Hour of the Week Summary
QUARTERREP OFF #Quarter-Hour Report
QUARTERSUM OFF #Quarter-Hour Summary
FIVEREP OFF #Five-Minute Report
FIVESUM OFF #Five-Minute Summary
HOST OFF #Host Report
REDIRHOST OFF #Host Redirection Report
FAILHOST OFF #Host Failure Report
ORGANISATION OFF #Organisation Report
DOMAIN OFF #Domain Report
REQUEST ON #Request Report
DIRECTORY OFF #Directory Report
FILETYPE OFF #File Type Report
SIZE OFF #File Size Report
PROCTIME OFF #Processing Time Report
REDIR OFF #Redirection Report
FAILURE OFF #Failure Report
REFERRER OFF #Referrer Report
REFSITE OFF #Referring Site Report
SEARCHQUERY OFF #Search Query Report
SEARCHWORD OFF #Search Word Report
INTSEARCHQUERY OFF #Internal Search Query Report
INTSEARCHWORD OFF #Internal Search Word Report
REDIRREF OFF #Redirected Referrer Report
FAILREF OFF #Failed Referrer Report
BROWSERREP OFF #Browser Report
BROWSERSUM OFF #Browser Summary
OSREP OFF #Operating System Report
VHOST OFF #Virtual Host Report
REDIRVHOST OFF #Virtual Host Redirection Report
FAILVHOST OFF #Virtual Host Failure Report
USER OFF #User Report
REDIRUSER OFF #User Redirection Report
FAILUSER OFF #User Failure Report
STATUS OFF #Status Code Report

# Referring URL Report
REFLINKINCLUDE *
REFREPEXCLUDE http://domain.com/*
REFREPEXCLUDE http://www.domain.com/*
REFFLOOR 10r

# Referring Site
REFSITEEXCLUDE http://domain.com/

# Request Report
REQFLOOR 1r
REQEXCLUDE *.jpg
REQEXCLUDE *.gif
REQEXCLUDE *.png
REQEXCLUDE *.bmp
REQEXCLUDE *.class
REQEXCLUDE *.css
REQEXCLUDE *.js

REQEXCLUDE *.ico

# Status Code Report
304ISSUCCESS ON # Includes 304 errors on the request report

# Custom Exclusions
FILEEXCLUDE /stats/*

Optimisation

It is important to disable any and all unwanted reports. This will speed up processing of the log files, reduce the size of the output file and reduce the expense of processing and reading data into your website later on. If you are going to follow this example and are only interested in accessing the hit count, then the only report that you need to turn on is the REQUEST report.

Additionally, you should configure the REQFLOOR and REQEXCLUDE options to optimise the scope of the output. In most cases REQFLOOR will be set to 1r (1 request) while exclusions should include files that you have no intention or ability to measure. For example, if you are only going to display the hit count of the currently visible .asp file then everything apart from .asp can be excluded from the report.

Running Analog on a Schedule

Now that Analog has been configured, you need to run it. You can run the process manually, or from Task Scheduler, using the command:

"d:\parser\analog\Analog.exe" +gd:\sites\www.domain.com\analog-xml.cfg

This merges any master configuration file with the custom XML .cfg file and generates the output as instructed in analog-xml.cfg.

If you wish to schedule it as part of a larger stats run process, then following the example from my “Using Analog C:Amie Edition to provide automatic statistics on a multi-site production IIS 4.0, 5.0, 5.1, 6.0, 7.0, 7.5 or 8.0 web server” guide, the following script can be used to automate the generation of both the HTML and the XML output:

@echo off
cls
SET ALGROOT=d:\parser\Analog\
SET WEBROOT=d:\sites

FOR /f "tokens=*" %%A IN ('dir /b d:\sites') DO (
    echo Looking for: analog.cfg in %%A

    IF EXIST "%WEBROOT%\%%A\analog.cfg" (
        echo Found analog.cfg for %%A
        IF NOT EXIST "%WEBROOT%\%%A\web\stats" md "%WEBROOT%\%%A\web\stats"
        echo.
        REM HTML report
        "%ALGROOT%analog.exe" +g%WEBROOT%\%%A\analog.cfg > "%WEBROOT%\%%A\analog.log"
        echo.

        IF EXIST "%WEBROOT%\%%A\analog-xml.cfg" (
            REM XML report, logged separately so that the HTML run's log is not overwritten
            "%ALGROOT%analog.exe" +g%WEBROOT%\%%A\analog-xml.cfg > "%WEBROOT%\%%A\analog-xml.log"
        ) ELSE (
            echo ANALOG XML NOT FOUND
        )

    ) ELSE (
        echo ANALOG CONFIG FILE NOT FOUND IN "%WEBROOT%\%%A\"
        echo.
    )
)

This script searches all sub-folders of d:\sites for the presence of an analog.cfg file e.g. d:\sites\www.mydomain.com\analog.cfg. If it finds one it ensures that there is an appropriate output directory (\web\stats) and then runs the Analog C:Amie Edition executable using the baseline configuration (this is implicit) and the local site-level analog.cfg configuration file (explicit) to produce the report.

It will then repeat the process, looking for analog-xml.cfg and if present will generate the associated XML output.
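If you would like to check which sites the script will pick up before scheduling it, the same discovery logic can be sketched as a dry run. The following is a Python illustration only, not part of the batch solution: the function name plan_analog_runs is hypothetical, the folder layout is the one assumed in this article, and nothing is executed; it merely returns the command lines that would be run.

```python
from pathlib import Path


def plan_analog_runs(web_root, analog_exe):
    """Return the Analog command lines the batch script would run (dry run)."""
    commands = []
    for site in sorted(Path(web_root).iterdir()):
        if not site.is_dir():
            continue
        html_cfg = site / "analog.cfg"
        if not html_cfg.is_file():
            # Same rule as the batch script: without analog.cfg the site is
            # skipped entirely, even if an analog-xml.cfg is present.
            continue
        commands.append([str(analog_exe), "+g" + str(html_cfg)])
        xml_cfg = site / "analog-xml.cfg"
        if xml_cfg.is_file():
            commands.append([str(analog_exe), "+g" + str(xml_cfg)])
    return commands
```

Calling plan_analog_runs with the sites folder and the path to Analog.exe returns one entry per report that would be generated, in site order, which makes it easy to spot a site whose analog-xml.cfg is being ignored because its analog.cfg is missing.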

Displaying the Page Statistics

Moving forward with our example, we will now have an XML file ready to re-parse, located at d:\sites\www.domain.com\analog.xml. The next step is to create a piece of code that will match up the currently displayed web page with its entry in the XML output. The steps to do this are:

  1. Normalise the current page URL
  2. Filter the current page URL
  3. Load the XML file
  4. Query the XML file
  5. Display the output

Normalise the current page URL

Your web server’s log file records whatever the browser sends; the structure of the request is often entirely at the mercy of the web user or web service sending it. You should therefore take precautions to protect the parser and increase the likelihood of finding a match in the XML file.

The following obtains the current URL from ASP’s ServerVariables collection, trims leading and trailing spaces, replaces any remaining spaces with “+” and ensures that we only search the XML file using lower-case URLs.

Dim strLocalPath
strLocalPath = Request.ServerVariables("URL")
strLocalPath = Trim(strLocalPath)
strLocalPath = Replace(strLocalPath," ", "+")
strLocalPath = LCase(strLocalPath)

Filter the current page URL

Assuming that you are using ASP 3, by default your index page will be identified as “default.asp” (for example www.domain.com/folder/default.asp). It is, however, possible for a client to reach the same page using “/” (for example www.domain.com/folder/ or www.domain.com/folder). Which of these is logged depends entirely on which of the three (equally valid) options the client chose to use. In theory this should be normalised; however, Analog may record impressions for www.domain.com/folder/default.asp, www.domain.com/folder/ and www.domain.com/folder separately. We therefore need to ensure that when we query the XML file later on, we query for all valid combinations on your server.

The simplest way to achieve this is to start building a query that assumes we will need all permissible options in order to obtain a valid result.

Dim strSearchSelector
' Normalise the Path String, generate the XPath OR Statement
if ((strLocalPath = "/") OR (strLocalPath = "/default.asp")) then
    strSearchSelector = "col=""/"" or col=""/default.asp"""
else
    if (Right(strLocalPath, 12) = "/default.asp") then
        strLocalPath = Left(strLocalPath, (Len(strLocalPath) - 12))
    end if
    if (Right(strLocalPath, 1) = "/") then
        strLocalPath = Left(strLocalPath, (Len(strLocalPath) - 1))
    end if
    strSearchSelector = "col=""" & strLocalPath & """ or col=""" & strLocalPath & "/"" or col=""" & strLocalPath & "/default.asp"""
end if

In the above code we first check to see if the user is requesting the absolute website home page (either www.domain.com/ or www.domain.com/default.asp) and create a select statement accordingly.

If we are not on the absolute home page, we remove the trailing “/default.asp” from the URL and we remove the trailing “/” from the URL and then construct a select statement that will query for all three possible combinations:

  1. www.domain.com/folder
  2. www.domain.com/folder/
  3. www.domain.com/folder/default.asp
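To experiment with the normalisation and filtering rules away from the web server, the two steps can be ported to a few lines of Python. This is only a sketch for checking the logic offline: the function name build_search_selector and its default_doc parameter are hypothetical, and the VBScript shown in this article remains the code that runs under ASP.

```python
def build_search_selector(local_path, default_doc="/default.asp"):
    """Build the XPath 'or' predicate covering every spelling of one page URL."""
    # Normalise: trim spaces, encode inner spaces as "+", force lower case.
    path = local_path.strip(" ").replace(" ", "+").lower()

    # The site home page only has two spellings.
    if path in ("/", default_doc):
        return 'col="/" or col="{0}"'.format(default_doc)

    # Reduce any other URL to its bare folder form ...
    if path.endswith(default_doc):
        path = path[: -len(default_doc)]
    if path.endswith("/"):
        path = path[:-1]

    # ... then list all three spellings of it.
    return 'col="{0}" or col="{0}/" or col="{0}{1}"'.format(path, default_doc)


if __name__ == "__main__":
    print(build_search_selector("/Folder/Default.asp"))
```

Like the VBScript version, the sketch does not escape double quotes, so a URL containing one would produce an invalid predicate.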

Load the XML file

Now that we have a search query, it is time to load the XML file and configure Microsoft XML to perform the leg work required to find our up to three possible URL combinations for the current page.

Dim xmlDoc
set xmlDoc = Server.CreateObject("MSXML2.DOMDocument.6.0")
xmlDoc.async = False
xmlDoc.setProperty "ServerHTTPRequest", true
xmlDoc.setProperty "ProhibitDTD", false ' Required for ~MSXML 6 (default in 6 = true; <6 = false)
xmlDoc.setProperty "SelectionLanguage","XPath"
xmlDoc.validateOnParse = false
' Load the specified XML file (returns XML output)
xmlDoc.load("d:\sites\www.domain.com\analog.xml")

In the above example, a Microsoft XML 6.0 object is created and configured to use XPath for our query and to permit the Analog DTD. Setting ServerHTTPRequest makes MSXML use its server-safe HTTP component and requires async to be False; it matters when a document is loaded over HTTP and is harmless for the local file path used here.

xmlDoc.load() copies the XML file from the file system, parses it into memory and permits us to progress to the data extraction phase.

Query the XML file

As mentioned above, the weapon of choice in this example is XPath. In a somewhat complicated way, XPath allows us to walk the DOM hierarchy of an XML file, applying selection filters as we traverse the file towards our desired result.

Without wishing to overcomplicate analysis on how this works or the processes involved, the XPath query required to extract a hit count from the Analog C:Amie Edition 6.04+ DTD is:

/analog-data/report[@name='rep_req']/row[@level='1'][col='/folder/default.asp']/col[@name='col_reqs']

Or in a more logical format:

  1. Select the analog-data root element
  2. Select the report sub tree where the attribute name = rep_req
  3. Expand all elements named row where the attribute level = 1
  4. Keep only the rows that have a sub element named col whose data (value) equals /folder/default.asp
  5. For each matching row, return the value of its col sub element where the attribute name = col_reqs
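The walk described in the steps above can be tried out against a tiny hand-written document. The following Python sketch is an illustration under assumptions: the sample XML is invented to fit the XPath query in this article and is not real Analog output, the column name col_file is hypothetical, and the function name hits_for is made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: only the element and attribute names that appear in
# the XPath query come from this article; everything else is invented.
SAMPLE = """\
<analog-data>
  <report name="rep_req">
    <row level="1">
      <col name="col_reqs">120</col>
      <col name="col_file">/folder/default.asp</col>
    </row>
    <row level="1">
      <col name="col_reqs">30</col>
      <col name="col_file">/folder/</col>
    </row>
  </report>
</analog-data>
"""


def hits_for(root, url):
    """Return the col_reqs text of every level-1 request row matching url."""
    # ElementTree paths are relative to the element they are run on, so the
    # leading /analog-data step is implied by calling findall on the root.
    query = (
        "report[@name='rep_req']/row[@level='1']"
        "[col='{0}']/col[@name='col_reqs']".format(url)
    )
    return [node.text for node in root.findall(query)]


if __name__ == "__main__":
    print(hits_for(ET.fromstring(SAMPLE), "/folder/default.asp"))
```

Python's ElementTree only implements a subset of XPath and has no "or" operator, so this sketch looks up one spelling of the URL at a time; MSXML, as used in the ASP code, accepts the combined "or" predicate directly.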

In our code, we substitute the value of strSearchSelector into the query so that the rows for all three valid path combinations are returned:

Dim xmlResult
set xmlResult = xmlDoc.selectNodes("/analog-data/report[@name='rep_req']/row[@level='1'][" & strSearchSelector & "]/col[@name='col_reqs']")

Display the Output

The final task is to display the request count. Our xmlResult object does not contain a single value for this; it may contain up to three values, one for each of our valid paths, or more if there is an anomaly in the data.

In order to obtain the actual page request count, we need to add these values together:

Dim i
Dim lngOut
Dim strOut
lngOut = 0
if (xmlResult.length > 0) then
    for i = 0 to (xmlResult.length - 1)
        ' Skip anything that is not a number
        if (IsNumeric(xmlResult(i).text)) then
            lngOut = (lngOut + CLng(xmlResult(i).text))
        end if
    next
    strOut = CStr(FormatNumber(lngOut,0))
else
    strOut = "0"
end if

In this code segment, if there are no valid results in the XML file, we return “0”. Otherwise, we add the value of each result to the variable lngOut. We then convert this number into a formatted string (adding commas as the thousands separator) and can return the actual page request count as defined by Analog.
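The aggregation step is easy to verify in isolation. The sketch below is a Python port for offline checking only; the function name total_requests is hypothetical. It is slightly stricter than the VBScript: it only counts strings made up entirely of digits, whereas IsNumeric also accepts signs and decimals, and it always uses a comma as the thousands separator, whereas FormatNumber follows the server's regional settings.

```python
def total_requests(values):
    """Sum the numeric strings in values and format the total with commas."""
    total = 0
    for text in values:
        text = text.strip()
        # Ignore anything that is not a plain non-negative whole number.
        if text.isdigit():
            total += int(text)
    return "{:,}".format(total)


if __name__ == "__main__":
    # Three spellings of the same page, plus one anomalous entry.
    print(total_requests(["1200", "30", "4", "n/a"]))
```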

The complete code sample

Bringing the above segments together, the final algorithm should look something like this:

' Analog XML Stats Printer Function
' © C:Amie. All Rights Reserved. 1996 - 2013
' http://www.c-amie.co.uk/
' Not for resale or use in commercial profit making activities. Use of this script sample is permitted as long as attribution is maintained.

' Call the function and write the value
Response.Write("The Request Count Is: " & getPageHits(Request.ServerVariables("URL")) )

Function getPageHits(ByVal strLocalPath)
    Dim i
    Dim xmlDoc
    Dim xmlResult
    Dim lngOut
    Dim strSearchSelector
    Dim strOut

    lngOut = 0

    strLocalPath = Trim(strLocalPath)
    strLocalPath = Replace(strLocalPath," ", "+")
    strLocalPath = LCase(strLocalPath)

    ' Normalise the Path String, generate the XPath OR Statement
    if ((strLocalPath = "/") OR (strLocalPath = "/default.asp")) then
        strSearchSelector = "col=""/"" or col=""/default.asp"""
    else
        if (Right(strLocalPath, 12) = "/default.asp") then
            strLocalPath = Left(strLocalPath, (Len(strLocalPath) - 12))
        end if

        if (Right(strLocalPath, 1) = "/") then
            strLocalPath = Left(strLocalPath, (Len(strLocalPath) - 1))
        end if
        strSearchSelector = "col=""" & strLocalPath & """ or col=""" & strLocalPath & "/"" or col=""" & strLocalPath & "/default.asp"""
    end if

    set xmlDoc = Server.CreateObject("MSXML2.DOMDocument.6.0")
    xmlDoc.async = False
    xmlDoc.setProperty "ServerHTTPRequest", true
    xmlDoc.setProperty "ProhibitDTD", false ' Required for ~MSXML 6 (default in 6 = true; <6 = false)
    xmlDoc.setProperty "SelectionLanguage","XPath"
    xmlDoc.validateOnParse = false
    ' Load the specified XML file (returns XML output)
    xmlDoc.load("d:\sites\www.domain.com\analog.xml")
    if (xmlDoc.parseError.errorCode <> 0) then
        strOut = "Error in parsing XML file"
    else
        set xmlResult = xmlDoc.selectNodes("/analog-data/report[@name='rep_req']/row[@level='1'][" & strSearchSelector & "]/col[@name='col_reqs']")
        if (xmlResult.length > 0) then
            for i = 0 to (xmlResult.length - 1)
                if (IsNumeric(xmlResult(i).text)) then
                    lngOut = (lngOut + CLng(xmlResult(i).text))
                end if
            next
            strOut = CStr(FormatNumber(lngOut,0))
        else
            strOut = "0"
        end if
        set xmlResult = nothing
    end if
    set xmlDoc = nothing
    getPageHits = strOut
End Function

Final Considerations

Now that you have the ability to extract statistics from Analog, you can get creative. You can pass any URL into the function, for example getPageHits("/downloads/myfile.exe"), to extract the request count for content that is not directly attached to the current URL of the page.

You should also consider how frequently you want to generate updated statistics. You could run Analog continually in a loop to generate near real-time statistical data, or run it every 2 hours, daily, or anything in between. Windows Task Scheduler (or cron under Unix) is the easiest way to schedule updates in most cases. Don’t forget to balance the processing load on the server against the size of the log file store that you are scanning. You may also want to limit the logs included in the statistics to the “last year” or “last month”. The logs for C:Amie (not) Com go back to January 2002, and HPC:Factor’s repository is even older than that, so it would be utterly impractical for a production web server to undertake real-time scanning activities.

With a dedicated log parser server however…?

See Also

View: Analog C:Amie Edition Configuration File Generator

View: Using Analog C:Amie Edition to provide automatic statistics on a multi-site production IIS 4.0, 5.0, 5.1, 6.0, 7.0, 7.5 or 8.0 web server

Using Analog CE to provide automatic statistics on a multi-site IIS Server

System Requirements:

  • Windows NT 4.0 Workstation
  • Windows 2000 Professional
  • Windows XP
  • Windows NT 4.0 Server
  • Windows 2000 Server
  • Windows Server 2003, R2
  • Windows Server 2008, R2
  • Windows Server 2012
  • Windows Server 2016
  • Windows Server 2019
  • Internet Information Services 4.0, 5.0, 5.1, 6.0, 7.0, 7.5, 8.0, 10
  • Analog CE

The Problem:

This article discusses how to automate the dissemination of statistics data to multiple web site clients on a typical, single instance production IIS web server. The article discusses the configuration of Analog and the file system considerations required to automate the delivery of statistics to all IIS web sites hosted on the server instance.

More Info

In the default Analog CE sample config file, Analog is configured to support a single web site through a single instance of the Analog executable.

In a larger IIS web host environment, a single IIS server may potentially host hundreds or even thousands of individual web sites. In this model, it is necessary to tweak the configuration of Analog to act as a single instance, multi-target model – instead of the default single instance, single target model.

How-to

The how-to is split into the following sections:

  1. What is not covered here
  2. Pre-requisites
  3. File System Considerations
  4. Prepare Analog
  5. Configure the Analog Server Settings
  6. Configure the Analog Site Settings
  7. Running Analog on a Schedule

What is not covered here

This guide is specifically written to demonstrate the configuration of Analog. To that end it does not cover the installation, setup or configuration of IIS in either a manual or an automatic capacity.

This guide will provide you with a scalable solution towards running Analog on a production IIS server, however most administrators will wish to integrate this process into the creation of web sites on IIS.

Pre-requisites

You will need to ensure that scripts run through an account with appropriate permissions to be able to write to the output folders on your web server. If you are using UAC on your server, ensure that you execute the scripts with elevation.

File System Considerations

By default, IIS separates a site’s web root from the logs associated with that web site instance: the web site root is based under c:\inetpub, and the log files that Analog needs are stored under c:\windows\system32\LogFiles\<Instance>.

Hosting providers should move these files to a different location, one that is preferably separate from the operating system volume. In most cases, these will be accessible by the tenant. How you wish to do this will depend primarily on whether you want your client users to gain access to the logs or not.

A suggestion for how to structure your web sites on disk in both scenarios is shown below. In both examples the client would FTP directly into d:\sites\<domain> to upload or download content.

User has access to Logs

d:
- \sites
-   \www.domain.com
-      \web
-        \stats
-      \logs
-        \W3SVC1
-   \www.website.net
-      \web
-        \stats
-      \logs
-        \W3SVC2

In this example the client has access to their own log files (presumably over SFTP). This allows them to download their logs for offline analysis; however, you will also generate an Analog report for them into the \stats folder under their web site root folder (\web).

User does not have access to Logs

d:
- \sites
-   \www.domain.com
-      \stats
-   \www.website.net
-      \stats
- \logs
-   \W3SVC1
-   \W3SVC2

In this example, log files are available to internal users only; you plan to use these to generate a report for the client using Analog into the \stats folder.

Prepare Analog

To fit in with the file-system structure outlined above, this guide will assume that the following file system structure is in use:

d:
- \parser
-   \Analog
-   \ReportMagic
- \sites
-   \www.domain.com
-      \web
-        \stats
-      \logs
-        \W3SVC1
-   \www.website.net
-      \web
-        \stats
-      \logs
-        \W3SVC2

Copy the contents of the latest Analog CE binary zip file into d:\parser\Analog.

 

Configure the Analog Server Settings

Analog’s global analog.cfg file must be reconfigured to be site independent so that it becomes the global configuration file. This is achieved by removing all references to a particular web site to create a baseline configuration. In essence, you are creating the default values for all Analog executions that will be used unless overridden in the local config.

Open the analog.cfg file in Notepad.

Using the Analog CE Configuration File Generator output as an example, you should remove the lines that refer to one particular web site, such as LOGFILE, OUTFILE, HOSTNAME and HOSTURL, to form a baseline configuration.

# Analog CE Baseline Statistics Configuration
# Version 1.0.1
# See http://www.c-amie.co.uk/ for updates and more
LOGFILE c:\windows\system32\logfiles\w3svc1\*.log
OUTPUT HTML
OUTFILE c:\inetpub\wwwroot\stats\index.html
HOSTNAME "Analog Test Site"
HOSTURL http://www.domain.com/
IMAGEDIR "images/"
STYLESHEET images/analog.css

# Reports Enabled/Disabled List
ALL ON
ALLCHART ON
GENERAL ON #General Summary
YEARLY ON #Yearly Report
QUARTERLY ON #Quarterly Report
MONTHLY ON #Monthly Report
WEEKLY ON #Weekly Report
DAILYREP ON #Daily Report
DAILYSUM ON #Daily Summary
HOURLYREP ON #Hourly Report
HOURLYSUM ON #Hourly Summary
WEEKHOUR ON #Hour of the Week Summary
QUARTERREP ON #Quarter-Hour Report
QUARTERSUM ON #Quarter-Hour Summary
FIVEREP ON #Five-Minute Report
FIVESUM ON #Five-Minute Summary
HOST ON #Host Report
REDIRHOST ON #Host Redirection Report
FAILHOST ON #Host Failure Report
ORGANISATION ON #Organisation Report
DOMAIN ON #Domain Report
REQUEST ON #Request Report
DIRECTORY ON #Directory Report
FILETYPE ON #File Type Report
SIZE ON #File Size Report
PROCTIME ON #Processing Time Report
REDIR ON #Redirection Report
FAILURE ON #Failure Report
REFERRER ON #Referrer Report
REFSITE ON #Referring Site Report
SEARCHQUERY ON #Search Query Report
SEARCHWORD ON #Search Word Report
INTSEARCHQUERY ON #Internal Search Query Report
INTSEARCHWORD ON #Internal Search Word Report
REDIRREF ON #Redirected Referrer Report
FAILREF ON #Failed Referrer Report
BROWSERREP ON #Browser Report
BROWSERSUM ON #Browser Summary
OSREP ON #Operating System Report
VHOST ON #Virtual Host Report
REDIRVHOST ON #Virtual Host Redirection Report
FAILVHOST ON #Virtual Host Failure Report
USER ON #User Report
REDIRUSER ON #User Redirection Report
FAILUSER ON #User Failure Report
STATUS ON #Status Code Report

# Referring URL Report
REFLINKINCLUDE *
REFREPEXCLUDE http://domain.com/*
REFREPEXCLUDE http://www.domain.com/*
REFFLOOR 1r

# Referring Site
REFSITEEXCLUDE http://domain.com/
REFSITEEXCLUDE http://www.domain.com/

# Request Report
REQFLOOR 1r
REQEXCLUDE *.jpg
REQEXCLUDE *.gif
REQEXCLUDE *.png
REQEXCLUDE *.bmp
REQEXCLUDE *.class
REQEXCLUDE *.js

# Status Code Report
304ISSUCCESS ON # Includes 304 errors on the request report

# Redirected Referrers Report
REDIRREFLINKINCLUDE *

# Failed Referrers Report
FAILREFLINKINCLUDE *

# Browser Summary
SUBBROW */*

# Operating System Report
OSCHARTEXPAND Windows

# Default Documents
# PAGEINCLUDE *.shtml
PAGEINCLUDE *.asp
PAGEINCLUDE *.aspx
# PAGEINCLUDE *.jsp
# PAGEINCLUDE *.cfm
# PAGEINCLUDE *.pl
# PAGEINCLUDE *.php
# PAGEINCLUDE *.rb

# Custom Exclusions
#FILEEXCLUDE /admin/*

# Robots & Crawlers
ROBOTINCLUDE REGEXPI:robot
ROBOTINCLUDE REGEXPI:spider
ROBOTINCLUDE REGEXPI:crawler
ROBOTINCLUDE Baiduspider/*
ROBOTINCLUDE bingbot/*
ROBOTINCLUDE Googlebot*
ROBOTINCLUDE Infoseek*
ROBOTINCLUDE msnbot*
ROBOTINCLUDE Scooter*
ROBOTINCLUDE *Slurp*
ROBOTINCLUDE Ultraseek*
ROBOTINCLUDE *Validator*
ROBOTINCLUDE YandexBot/*
ROBOTINCLUDE YodaoBot/*

# Search Engines
SEARCHENGINE http://*altavista.*/* q
SEARCHENGINE http://*yahoo.*/* p
SEARCHENGINE http://*google.*/* q,as_q,as_epq,as_oq
SEARCHENGINE http://*bing.*/* q
SEARCHENGINE http://*lycos.*/* query
SEARCHENGINE http://*aol.*/* query
SEARCHENGINE http://*excite.*/* search
SEARCHENGINE http://*go2net.*/* general
SEARCHENGINE http://*metacrawler.*/* general
SEARCHENGINE http://*msn.*/* MT
SEARCHENGINE http://*hotbot.com/* MT
SEARCHENGINE http://*netscape.*/* search
SEARCHENGINE http://*looksmart.*/* key
SEARCHENGINE http://*infoseek.*/* qt
SEARCHENGINE http://*webcrawler.*/* search,searchText
SEARCHENGINE http://*goto.*/* Keywords
SEARCHENGINE http://*snap.*/* keyword
SEARCHENGINE http://*dogpile.*/* q
SEARCHENGINE http://*bbc.*/* q
SEARCHENGINE http://*askjeeves.*/* ask
SEARCHENGINE http://*ask.*/* ask
SEARCHENGINE http://*aj.*/* ask
SEARCHENGINE http://*directhit.*/* qry
SEARCHENGINE http://*alltheweb.*/* query
SEARCHENGINE http://*naver.*/* query
SEARCHENGINE http://*northernlight.*/* qr
SEARCHENGINE http://*nlsearch.*/* qr
SEARCHENGINE http://*dmoz.*/* search
SEARCHENGINE http://*newhoo.*/* search
SEARCHENGINE http://*netfind.*/* query,search,s
SEARCHENGINE http://*/netfind* query
SEARCHENGINE http://*/pursuit query
SEARCHENGINE http://*mamma.*/* query
SEARCHENGINE http://*ixquick.*/* metasearch.pl
SEARCHENGINE http://*vivisimo.*/* search
SEARCHENGINE http://*mysearch.*/* searchfor

# Static Internet Documents
TYPEOUTPUTALIAS .html ".html [Hypertext Markup Language]"
TYPEOUTPUTALIAS .htm ".htm [Hypertext Markup Language]"
TYPEOUTPUTALIAS .shtml ".shtml [Server-parsed HTML]"
TYPEOUTPUTALIAS .ps ".ps [PostScript]"
TYPEOUTPUTALIAS .gz ".gz [Gzip compressed files]"
TYPEOUTPUTALIAS .tar.gz ".tar.gz [Compressed archives]"
TYPEOUTPUTALIAS .txt ".txt [Plain text Documents]"
TYPEOUTPUTALIAS .cdf ".cdf [Channel Definition File]"

# Scripting & Dynamic Internet Content Files
TYPEOUTPUTALIAS .asp ".asp [Active Server Pages]"
TYPEOUTPUTALIAS .aspx ".aspx [Active Server Pages .net]"
TYPEOUTPUTALIAS .cgi ".cgi [CGI scripts]"
TYPEOUTPUTALIAS .pl ".pl [Perl scripts]"
TYPEOUTPUTALIAS .css ".css [Cascading Style Sheets]"
TYPEOUTPUTALIAS .class ".class [Java class files]"
TYPEOUTPUTALIAS .hqx ".hqx [Macintosh archives]"
TYPEOUTPUTALIAS .jsp ".jsp [Java Server Pages]"
TYPEOUTPUTALIAS .cfm ".cfm [Cold Fusion]"
TYPEOUTPUTALIAS .php ".php [PHP Hypertext Processor]"
TYPEOUTPUTALIAS .js ".js [JavaScript code]"
TYPEOUTPUTALIAS .dll ".dll [Dynamic Link Library]"
TYPEOUTPUTALIAS .asa ".asa [Web Server Scripting Configuration]"
TYPEOUTPUTALIAS .url ".url [Windows Internet Shortcut]"
TYPEOUTPUTALIAS .lnk ".lnk [Windows Explorer Shortcut]"
TYPEOUTPUTALIAS .ini ".ini [Configuration Settings File]"
TYPEOUTPUTALIAS .log ".log [Log Files]"
TYPEOUTPUTALIAS .diz ".diz [DIZ Text File]"
TYPEOUTPUTALIAS .inc ".inc [SSI Inclusion File]"
TYPEOUTPUTALIAS .xml ".xml [eXtensible Markup Language File]"
TYPEOUTPUTALIAS .rdf ".rdf [Resource Description Framework File]"
TYPEOUTPUTALIAS .rb ".rb [Ruby script file]"

# Image Files
TYPEOUTPUTALIAS .jpg ".jpg [JPEG graphics]"
TYPEOUTPUTALIAS .jpeg ".jpeg [JPEG graphics]"
TYPEOUTPUTALIAS .jpe ".jpe [JPEG graphics]"
TYPEOUTPUTALIAS .gif ".gif [GIF graphics]"
TYPEOUTPUTALIAS .gfa ".gfa [GIF graphics]"
TYPEOUTPUTALIAS .png ".png [Portable Network Graphics]"
TYPEOUTPUTALIAS .bmp ".bmp [BitMap]"
TYPEOUTPUTALIAS .bmz ".bmz [BitMap]"
TYPEOUTPUTALIAS .dib ".dib [BitMap]"
TYPEOUTPUTALIAS .rle ".rle [BitMap]"
TYPEOUTPUTALIAS .2bp ".2bp [Windows CE 4 Tone BitMap]"
TYPEOUTPUTALIAS .ico ".ico [Icon File]"
TYPEOUTPUTALIAS .tif ".tif [Tag Image File Format]"
TYPEOUTPUTALIAS .tiff ".tiff [Tag Image File Format]"
TYPEOUTPUTALIAS .wmf ".wmf [Windows Metafile (ClipArt)]"
TYPEOUTPUTALIAS .pct ".pct [Macintosh PICT]"
TYPEOUTPUTALIAS .pict ".pict [Macintosh PICT]"
TYPEOUTPUTALIAS .pcz ".pcz [Macintosh PICT Compressed]"
TYPEOUTPUTALIAS .pcd ".pcd [Kodak Photo CD]"
TYPEOUTPUTALIAS .pcx ".pcx [PC Paintbrush]"
TYPEOUTPUTALIAS .cdr ".cdr [Corel Draw]"
TYPEOUTPUTALIAS .cgm ".cgm [Computer Graphics Metafile]"
TYPEOUTPUTALIAS .eps ".eps [Encapsulated PostScript]"
TYPEOUTPUTALIAS .fpx ".fpx [FPX Format]"
TYPEOUTPUTALIAS .wpg ".wpg [WordPerfect Graphics]"
TYPEOUTPUTALIAS .mix ".mix [Picture IT! Format]"
TYPEOUTPUTALIAS .psd ".psd [Adobe Photoshop Document]"

# Multimedia Audio, Video & Misc
TYPEOUTPUTALIAS .wav ".wav [WAV sound files]"
TYPEOUTPUTALIAS .avi ".avi [AVI movies]"
TYPEOUTPUTALIAS .arc ".arc [Compressed archives]"
TYPEOUTPUTALIAS .mid ".mid [MIDI sound files]"
TYPEOUTPUTALIAS .midi ".midi [MIDI sound files]"
TYPEOUTPUTALIAS .rmi ".rmi [MIDI sound files]"
TYPEOUTPUTALIAS .ivf ".ivf [Indeo Video Format movie]"
TYPEOUTPUTALIAS .aif ".aif [AIFF sound files]"
TYPEOUTPUTALIAS .aifc ".aifc [AIFF sound files]"
TYPEOUTPUTALIAS .aiff ".aiff [AIFF sound files]"
TYPEOUTPUTALIAS .au ".au [AU sound files]"
TYPEOUTPUTALIAS .snd ".snd [AU sound files]"
TYPEOUTPUTALIAS .mp3 ".mp3 [MP3 sound files]"
TYPEOUTPUTALIAS .m4a ".m4a [MPEG 4 Audio]"
TYPEOUTPUTALIAS .m4v ".m4v [MPEG 4 Video]"
TYPEOUTPUTALIAS .mov ".mov [Quick Time movie]"
TYPEOUTPUTALIAS .mpg ".mpg [MPEG movie]"
TYPEOUTPUTALIAS .mpeg ".mpeg [MPEG movie]"
TYPEOUTPUTALIAS .m1v ".m1v [MPEG 1 Video]"
TYPEOUTPUTALIAS .mp2v ".mp2v [MPEG 2 Video]"
TYPEOUTPUTALIAS .mpe ".mpe [MPEG movie]"
TYPEOUTPUTALIAS .wax ".wax [Windows Media Audio Extension]"
TYPEOUTPUTALIAS .wvx ".wvx [Windows Media Video Extension]"
TYPEOUTPUTALIAS .m3u ".m3u [MPEG 3 Audio]"
TYPEOUTPUTALIAS .wma ".wma [Windows Media Audio]"
TYPEOUTPUTALIAS .wmv ".wmv [Windows Media Video]"
TYPEOUTPUTALIAS .ra ".ra [Real Audio File]"
TYPEOUTPUTALIAS .ram ".ram [Real Audio Media]"
TYPEOUTPUTALIAS .asf ".asf [Microsoft Advanced Streaming Format]"
TYPEOUTPUTALIAS .asx ".asx [Microsoft Advanced Streaming Extensions]"
TYPEOUTPUTALIAS .pdf ".pdf [Adobe Portable Document Format]"
TYPEOUTPUTALIAS .swf ".swf [Adobe Flash Object]"

# Microsoft Office & Pocket Office
TYPEOUTPUTALIAS .mdb ".mdb [Microsoft Access Database]"
TYPEOUTPUTALIAS .accdb ".accdb [Microsoft Access 2007 Database]"
TYPEOUTPUTALIAS .ppv ".ppv [PowerPoint Viewer File]"
TYPEOUTPUTALIAS .ppt ".ppt [PowerPoint File]"
TYPEOUTPUTALIAS .pptx ".pptx [PowerPoint File]"
TYPEOUTPUTALIAS .xls ".xls [Excel SpreadSheet]"
TYPEOUTPUTALIAS .xlsx ".xlsx [Excel 2007 SpreadSheet]"
TYPEOUTPUTALIAS .doc ".doc [Microsoft Word Document]"
TYPEOUTPUTALIAS .docx ".docx [Microsoft Word 2007 Document]"
TYPEOUTPUTALIAS .pwd ".pwd [Microsoft Pocket Word Document]"
TYPEOUTPUTALIAS .pwt ".pwt [Microsoft Pocket Word Template]"
TYPEOUTPUTALIAS .pxl ".pxl [Microsoft Pocket Excel Document]"
TYPEOUTPUTALIAS .pxt ".pxt [Microsoft Pocket Excel Template]"
TYPEOUTPUTALIAS .rtf ".rtf [Rich Text Format]"
TYPEOUTPUTALIAS .pub ".pub [Publisher File]"
TYPEOUTPUTALIAS .mps ".mps [Microsoft Pocket Streets Map]"
TYPEOUTPUTALIAS .psm ".psm [Microsoft Pocket Automap Streets]"
TYPEOUTPUTALIAS .ics ".ics [vCalendar / iCalendar File]"

# Databases
TYPEOUTPUTALIAS .db ".db [DataBase File]"
TYPEOUTPUTALIAS .csv ".csv [CSV File]"
TYPEOUTPUTALIAS .dbf ".dbf [Database File]"

# Executables, Installations, Execution & Applications
TYPEOUTPUTALIAS .msi ".msi [Microsoft Installer Package]"
TYPEOUTPUTALIAS .cab ".cab [Cabinet Archive]"
TYPEOUTPUTALIAS .bat ".bat [Batch Files]"
TYPEOUTPUTALIAS .com ".com [Compiled Executable]"
TYPEOUTPUTALIAS .exe ".exe [Executables]"
TYPEOUTPUTALIAS .zip ".zip [Zip archives]"
TYPEOUTPUTALIAS .rar ".rar [Rar archives]"
TYPEOUTPUTALIAS .hlp ".hlp [Windows Help Files]"
TYPEOUTPUTALIAS .chm ".chm [Compiled HTML Help]"
TYPEOUTPUTALIAS .dat ".dat [Internet Explorer Installer Data File]"
TYPEOUTPUTALIAS .dll ".dll [Dynamic Link Library]"

# Misc
TYPEOUTPUTALIAS .bin ".bin [Binary File]"
TYPEOUTPUTALIAS .iso ".iso [CD-ROM/DVD-ROM Image file]"
TYPEOUTPUTALIAS .nrg ".nrg [Nero Burning ROM CD-ROM/DVD-ROM Image file]"
TYPEOUTPUTALIAS .ida ".ida [IIS default.ida - Code Red II Attack]"
TYPEOUTPUTALIAS .reg ".reg [Windows Registry META Data]"

SUBTYPE *.gz,*.Z

To save time, disk space and bandwidth, it is recommended that you edit the style sheet location inside the baseline configuration so that it points to a common location rather than a per-site copy. The default entry is:

STYLESHEET images/analog.css

For example, you could create an administrative resource on a separate web site to host these resources, e.g.:

STYLESHEET http://static.mycompany.com/analog.css

Note: This shared resource site will be equally useful if deploying Report Magic stats along with Analog (not covered in this guide).

 

Configure the Analog Site Settings

Now that you have created a common baseline for Analog’s global config, the next step is to configure Analog for each of your sites.

Navigate to d:\sites\www.domain.com and create a file called analog.cfg.

Open this configuration file and this time enter only the information that you removed from the baseline configuration, ensuring that you tailor the settings to this particular web site. For example:

# Default Local Web Statistics Configuration

# Version 1.0.1
LOGFILE d:\sites\www.domain.com\logs\W3SVC1\*.log
OUTFILE d:\sites\www.domain.com\stats\index.html
HOSTNAME "Statistics For Domain.com"

#HOSTURL http://www.domain.com/

IMAGEDIR "images/"

# Referring URL
REFLINKINCLUDE *

# Referring Site

# Referring Site Alias
#REFALIAS http://domain.com/* http://www.domain.com/

#Request Report
REQFLOOR 10r
REQEXCLUDE *.jpg
REQEXCLUDE *.gif
REQEXCLUDE *.png
REQEXCLUDE *.bmp
REQEXCLUDE *.class
REQEXCLUDE *.js

# Custom Exclusions
FILEEXCLUDE /stats/*

You should automate the creation of this configuration file as part of your site creation activities. The fastest way to achieve this is to have your website creation script write the file to d:\sites\<web site>\analog.cfg.

In the above example, a default Analog CE HTML report will be generated in d:\sites\www.domain.com\stats\ (http://www.domain.com/stats/ to the user).

Ensure that you repeat the creation of this file for all web sites on your server.
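As a rough illustration of automating the creation of the site-level file, the Python sketch below renders a configuration from a template and writes it out as analog.cfg. It is only one possible approach and assumes your site creation script is able to call Python. The function names, the log_subdir parameter and the trimmed-down template are hypothetical, and the IIS log folder (W3SVC1 in the example above) will differ from site to site.

```python
from pathlib import Path

# Trimmed-down template based on the site-level example in this article.
# {root}, {site} and {log_subdir} are placeholders filled in per web site.
SITE_CONFIG_TEMPLATE = """\
# Default Local Web Statistics Configuration
LOGFILE {root}\\{site}\\logs\\{log_subdir}\\*.log
OUTFILE {root}\\{site}\\stats\\index.html
HOSTNAME "Statistics For {site}"
IMAGEDIR "images/"
REFLINKINCLUDE *
REQFLOOR 10r
FILEEXCLUDE /stats/*
"""


def render_site_config(site: str, root: str = r"d:\sites",
                       log_subdir: str = "W3SVC1") -> str:
    """Return the text of a site-level analog.cfg for one web site."""
    return SITE_CONFIG_TEMPLATE.format(
        root=root.rstrip("\\"), site=site, log_subdir=log_subdir
    )


def write_site_config(site_dir: Path, site: str, **options: str) -> Path:
    """Write analog.cfg into site_dir and return the path of the new file."""
    cfg_path = site_dir / "analog.cfg"
    cfg_path.write_text(render_site_config(site, **options))
    return cfg_path
```

A site creation script could then call, for instance, write_site_config(Path(r"d:\sites\www.domain.com"), "www.domain.com", log_subdir="W3SVC2"), passing whichever log folder IIS assigned to the new site.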

Running Analog on a Schedule

Now that Analog has been configured, the last step is to schedule it to run. The simplest way to do this is to use a batch script and the Windows task scheduler.

With some creative scripting (see the example below) you can configure the scheduled job to crawl through the d:\sites folder looking for site-level analog.cfg files and pass each one to the Analog CE executable.

The example below should be saved via Windows Notepad as d:\parser\GenerateStats.cmd

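@echo off
cls

REM Folder containing analog.exe, with a trailing backslash
SET ALGROOT=d:\parser\Analog\
REM Parent folder holding one sub-folder per web site
SET WEBROOT=d:\sites

FOR /f "tokens=*" %%A IN ('dir /b /ad "%WEBROOT%"') DO (
echo Looking for: analog.cfg in %%A
IF EXIST "%WEBROOT%\%%A\analog.cfg" (
echo Found analog.cfg for %%A
REM Create the output folder only if it is missing
IF NOT EXIST "%WEBROOT%\%%A\web\stats" md "%WEBROOT%\%%A\web\stats"
echo.
REM Send both normal output and error messages to analog.log
"%ALGROOT%analog.exe" +g"%WEBROOT%\%%A\analog.cfg" > "%WEBROOT%\%%A\analog.log" 2>&1
echo.
) ELSE (
echo ANALOG CONFIG FILE NOT FOUND IN "%WEBROOT%\%%A\"
echo.
)
)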

This script searches each sub-folder of d:\sites for an analog.cfg file, e.g. d:\sites\www.mydomain.com\analog.cfg. If it finds one, it ensures that the output directory (\web\stats) exists and then runs the Analog CE executable using the baseline configuration (implicitly) together with the site-level analog.cfg (explicitly, via the +g switch) to produce the report.

The report is written to the OUTFILE location specified in the site-level analog.cfg, so make sure that path points at the output directory the script creates.

Any Analog parser errors will be written into the d:\sites\www.mydomain.com folder as analog.log, allowing for analysis by the client.

In Windows Task Scheduler, create a basic task that will run d:\parser\GenerateStats.cmd every day, every 12 hours, every 6 hours or at whatever interval you require.

Once complete, as you add new web sites (and their associated site-level analog.cfg files) or remove old ones, they will automatically be picked up and passed through the statistics engine without any administrative intervention.
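Before relying on the schedule, it can be useful to check which site folders the job will actually pick up. The Python sketch below mirrors only the discovery step of the batch script and does not run Analog. The function name find_site_configs is hypothetical and is not part of Analog CE.

```python
from pathlib import Path


def find_site_configs(web_root: Path) -> tuple[list[str], list[str]]:
    """Return (folders with analog.cfg, folders without), sorted by name."""
    found: list[str] = []
    missing: list[str] = []
    for entry in sorted(web_root.iterdir()):
        if not entry.is_dir():
            continue  # only site folders are of interest, as in the batch script
        if (entry / "analog.cfg").is_file():
            found.append(entry.name)
        else:
            missing.append(entry.name)
    return found, missing
```

Calling find_site_configs(Path(r"d:\sites")) returns two lists: the sites that would be processed on the next run and the sites that still need a site-level analog.cfg.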

See Also

View: Analog CE Configuration File Generator