How to enable the Analog Internal Search Query Report for WordPress

Did you realise that Analog CE can report in searches made through internal website search engines as well as external ones? This article discusses how to connect the WordPress search function to the Analog CE Internal Search Reports.

 

Keeping an eye on your users

Knowing what your users are searching for should be a core part of your SEO strategy. The Analog CE Internal Search Query and Internal Search Word reports can be a useful tool in understanding user interest and habits.

Analog CE’s two Internal Search reports are disabled by default. To enable them, you must edit your site config file. If you are using a Global/Site config file structure, you should use the local site config file. Not the Global config file. The exception should be if your Analog CE install is being used for WordPress web-farming.

 

Edit your Analog CE configuration file

Before you can edit the config file you must know the path to your WordPress install. The path should be relative to the root domain of your website.

Tip: Locate the xmlrpc.php, wp-config.php and wp-cron.php files for your install on your web server. This is the root folder of your WordPress installation.

For example. Assuming that your root domain is www.mysite.com:

  • If your wp-config.php is located at www.mysite.com/wp-config.php, your Internal Search Engine Path is “/”.
  • Alternately, if your wp-config.php is found at www.mysite.com/blog/wp-config.php, your Internal Search Engine Path is “/blog/”.

 

Edit your Analog CE configuration file by adding the following lines:

INTSEARCHQUERY ON
INTSEARCHENGINE <Internal Search Engine Path> s
INTSEARCHQUERYFLOOR 10r
INTSEARCHWORDFLOOR 10r

For example:

INTSEARCHQUERY ON
INTSEARCHENGINE / s
INTSEARCHQUERYFLOOR 10r
INTSEARCHWORDFLOOR 10r

This config sample will:

  1. Enable the report
  2. Tells Analog CE how to find the search term (the user provided value of ‘s’)
  3. Sets the Search Query report to only show searches that occur 10 or more times
  4. Configures the Search Word report to only show keywords that occur 10 or more times

 

The next time Analog CE runs, you will have “Internal Search Query Report” and “Internal Search Word Report” sections. If you are running Report Magic, they will be at the bottom of the side-navigation.

Creating a Link Anonymiser Service for Analog CE’s ANONYMIZERURL setting

This article discusses how to create an link anonymiser service redirector to make use of the Analog CE 6.0.16+ ANONYMIZERURL setting.

 

Why use an anonymiser?

If a user clicks a link on an Analog CE “Requesting Site” or “Requesting URL” report. The users web browser will send a HTTP Referrer header with the request to download the web page; this request will include the full URL or your Analog CE report. The receiving server will likely log the request, allowing its owner to see where the request originated.

This may expose the Internet or Intranet URL or your stats page to the target website owner. They may in-turn inadvertantly publicise it via their own statistics page and/or link-back tracker service. This makes it possible for other agents, including competitors, search engines and malicious users to discover information about your website. Worse your web server may become the target of SEO spammers.

 

What is SEO spam?

SEO spam is the practice of attempting to improve a website/page position on a search engine by creating ‘false’ links into that website. If your referring site/URL report is public it is possible for a malicious actor to artificially position one or more URLs on the report. This is achieved through a manipulated HTTP GET request containing a HTTP Referrer header with the URL/site that they want to inject onto your report. After making several hundred requests in this fashion the spammer will wait for the report to be updated. After confirming that their site has appeared in the report, they submit your statistics page(s) to search engines.

Once compromised, it is likely that your exploitability will be recorded in one or more botnets and will see wider exploitation.

 

Why create your own anonymiser?

You can use public anonymiser services such as anonymizer.info or anon.to with Analog CE using one of the code samples below.

ANONYMIZERURL https://anon.to/?

ANONYMIZERURL https://anonymizer.info/?

This may not be acceptable to you, or your organisational security policy. Firstly because while the owner of the resultant web server will not discover the true origin of the request, the public anonymiser service will. Secondly, there is no contract assuring service availability or the privacy of its log files. Finally, it is inevitable that the service is going to profit from your transaction. Advertising placement is likely, creating a delay in the redirect.

 

Code your own Anonymous Link Redirector

The following code snippets can be used to program your own basic redirector using service side scripting technology.

 

ASP 3 / Classic ASP

<%@LANGUAGE="VBSCRIPT" CODEPAGE="65001" EnableSessionState="False"%>
<% Option Explicit %>
<%
  Response.Status = "302 Found"
  call Response.AddHeader("Location", Request.QueryString)
  Response.End()
%>

Save the file as redirector.asp and add the following to your Analog CE global configuration file:

ANONYMIZERURL http://my-server.domain.com/redirector.asp?

 

ASP.net

<%@ Page Language="C#" %>
<script runat="server">
  private void Page_Load(object sender, EventArgs e)
  {
    Response.Redirect(HttpContext.Current.Request.ServerVariables["QUERY_STRING"], true);
  }
</script>

Save the file as redirector.aspx and add the following to your Analog CE global configuration file:

ANONYMIZERURL http://my-server.domain.com/redirector.aspx?

 

PHP

<?php
  header('Location: ' . $_SERVER['QUERY_STRING'], true, 302);
  exit;
?>

Save the file as redirector.php and add the following to your Analog CE global configuration file:

ANONYMIZERURL http://my-server.domain.com/redirector.php?

 

Conclusion

The above code samples illustrate how to create a redirector in several different languages. The redirector URL will be sent to the destination server however the originating statistics page will now be protected. This protects your Analog CE stats pages from prying eyes while reducing the risk of SEO spamming.

Analog’s Search Query Report is empty or sparsely populated

If you are an Analog CE user, you may have noticed that the Search Query and Search Word reports suggest that your site has been receiving little to no search engine traffic. The Search reports may be empty or sparsely populated with results. This article discusses the cause of the problem.

 

What are the External Search Reports?

The premise of the external (rather than internal) search report is to show you what a visitor typed into a search engine before landing on your site. The Search Query report shows the entire search phrase used, while the Search Word report creates a word count report for anything de-marked by a space.

They are intended to help you identify what your users are searching for as part of your SEO activities. Unfortunately, over the last couple of years the data available on the report has been declining. On some sites, the report may have disappeared altogether as no data is available to Analog CE.

 

Fixing the empty or sparsely populated report

There are three reasons for the decline in report quality. They are outlined below in order of lest to most severe.

 

Missing Search Engines

The default config file in the Analog CE source release is a clone of the original Analog 6 config file. It (and your own) config files have not been updated with modern search engines. Consequently traffic from Bing and DuckDuck Go (and others) will be wholly absent.

I will be including an updated config file sample in Analog CE starting with Analog CE 6.0.16. Analog CE users will need to replace or merge this into your running configuration files. To perform this manually, you must add relevant SEARCHENGINE entries to your global/site config files.

 

HTTP > HTTPS

As a result of security issues and the gross mismanagement of private data transiting public networks by the CIA. The world made a rapid switch from HTTP to HTTPS in 2015. The default Analog 6 config file only included Search Engine detection definitions for unencrypted HTTP connections. Consequently Analog CE has stopped reporting traffic from any HTTPS search engine source.

Starting with Analog CE 6.0.16, I will fix this in the bundled sample file. You will need to merge the SEARCHENGINE changes into you respective config files.

 

Privacy & Commercialisation of Analytics

The crux of the problem stems from Search Engines no-longer send the search query in their referrer header. Some see this as being for legitimate privacy reasons, others for commercial benefit. DuckDuck Go would certainly argue the former. For Google, the reason is less clear. With up to 98% market share, Google is effectively forcing webmasters to use Google Search Console and/or Google Analytics to understand their user needs. In turn, this feeds Google with more accurate data on user activity for ad-profiling. It also ensures that you as a webmaster are creating the web that Google wants, above any other concern. Any webmaster threatened with de-listing due to mobile device incompatibility will understand this.

The following sample log line is from the 5th April 2015

2015-04-05 15:38:29 W3SVC5 web02 xxx.xxx.xxx.xxx GET /downloads/msie/ie60sp1/ - 80 - yyy.yyy.yyy.yyy HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+4.01;+Windows+98) - http://www.google.nl/search?q=internet+explorer+download+windows+98&ie=ISO-8859-1&hl=nl&source=hp&gbv=1 www.hpcfactor.com 200 0 0 10747 331 406

This is a similar one from 4th June 2019

2019-06-04 00:17:31 W3SVC5 web01 xxx.xxx.xxx.xxx GET /hardware/devices/specification.asp d=142 443 - yyy.yyy.yyy.yyyy HTTP/2 Mozilla/5.0+(Linux;+Android+6.0;+HM-G552-FL+Build/MRA58K)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/56.0.2924.87+Mobile+Safari/537.36 - https://www.google.com/ www.hpcfactor.com 200 0 0 12534 491 495

If you scroll to the right, you will notice that in the earlier file, the user searched for “Internet Explorer Download Windows 98”. In the latter file, Google only informs the server that the request came from google.com. The context of the search is no longer offered.

As a consequence, the data required to create the Search Query report is not available to Analog CE. This is the reason why the Analog Search Query Report is empty or sparsely populated. It is not a bug. Rightly or wrongly, it is a symptom of the modern web.

Unable to update NuGet or Packages in Powershell due to “WARNING: Unable to download the list of available providers. Check your internet connection.”

When attempting to install or update PowerShell Modules, NuGet or NuGet packages in PowerShell 5. You receive one or more of the following errors

WARNING: Unable to resolve package source 'https://www.powershellgallery.com/api/v2/'.

The underlying connection was closed: An unexpected error occurred on a receive.

WARNING: Unable to download the list of available providers. Check your internet connection.

Equally, you may receive the same error when attempting to run a WGET or an Invoke-WebRequest command e.g.

wget https://www.google.com/

You are unable to install/update the software component or make an outbound internet connection.

This issue may be especially prevalent on IIS installations serving HTTPS websites.

The Fix

Conventional troubleshooting is fairly well documented on-line

  1. Ensure that you are actually able to open a https webpage in a web browser
  2. Ensure that your DNS is working correctly.
  3. Check to see whether wget can connect to a non-https site e.g.
    wget http://www.google.com/
  4. Check to see whether or not you need to use a Proxy server. If so, you must configure PowerShell to use your Proxy Server before you proceed. This may require you to to configure PowerShell with your Proxy Server credentials.
    $webclient=New-Object System.Net.WebClient
    $webclient.Proxy.Credentials = [System.Net.CredentialCache]::DefaultNetworkCredentials

A less obvious issue to explore related to the default operating system security configuration for using SSL.

More Info

By default, Windows Server and Windows client will allow SSL3, TLS 1.0, TLS 1.1 and TLS 1.2. The .net Framework is also configured to allow these protocols, and, by default, any outbound request for a SSL site will attempt to use SSL3/TLS 1.0 as its default protocol.

In secure environments, where system administrators have enabled recommended best practice on Windows systems to disable the use of SSL1, 2,3 and TLS 1.0. PowerShell is not currently clever enough to internally compare its configuration to that of the operating system. consequently, when attempting to make an outbound https request in such an environment. PowerShell will attempt to use one of the older protocols which has been disabled by the operating system’s networking stack. Instead of re-attempting the request using a higher protocol. PowerShell will fail the request with one of the error messages listed at the beginning of the article.

As NuGet and Update-Module both attempt to make connections to Microsoft servers using HTTPS, they too will fail.

Encountering this issue on a SSL enabled IIS install will be more common, as it is more likely that system administrators will have applied best practice and disabled legacy encryption protocols on these servers. their public facing, high visibility should demand such a response.

To fix the issue there are two options:

  1. Reconfigure and reboot the system to re-enable client use of TLS 1.0 (and possibly SSL3) via
    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\<protocol>\Client

    DisabledByDefault = 0
    Enabled = ffffffff (hex)

  2. Alternatively, you must set-up each PowerShell environment so that the script itself knows not to use the legacy protocol versions. This is achieved via the following code which restricted PowerShell to only using TLS 1.1 and TLS 1.2.
    [System.Net.ServicePointManager]::SecurityProtocol = [System.Net.SecurityProtocolType]'Tls11,Tls12'