<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>eantics web design</title>
	<atom:link href="http://www.eantics.co.uk/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.eantics.co.uk</link>
	<description>The current website and blog of eantics Ltd, a web design and web application development agency in Cheshire, UK</description>
	<pubDate>Tue, 06 Jan 2009 21:13:36 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6</generator>
	<language>en</language>
			<item>
		<title>SQL injection reinvented in 2008</title>
		<link>http://www.eantics.co.uk/sql-injection-reinvented-in-2008/</link>
		<comments>http://www.eantics.co.uk/sql-injection-reinvented-in-2008/#comments</comments>
		<pubDate>Tue, 06 Jan 2009 21:13:36 +0000</pubDate>
		<dc:creator>Oliver Phillips</dc:creator>
		
		<category><![CDATA[web design]]></category>

		<category><![CDATA[asprox]]></category>

		<category><![CDATA[sql injection]]></category>

		<guid isPermaLink="false">http://www.eantics.co.uk/?p=557</guid>
		<description><![CDATA[In 2008 a very clever type of SQL injection attack defaced many thousands of websites with malware that could infect the computers of anyone who visited them. It was christened ASProx.
SQL injection is an old and well-known threat. It involves entering SQL into a form field, or passing it as [...]]]></description>
			<content:encoded><![CDATA[<p>In 2008 a very clever type of SQL injection attack defaced many thousands of websites with malware that could infect the computers of anyone who visited them. It was christened ASProx.</p>
<p>SQL injection is an old and well-known threat. It involves entering SQL into a form field, or passing it as part of a querystring, with the aim of modifying the SQL statement itself. Normally the input would form part of the criteria in the executed statement; an SQL injection attack instead changes the nature of the statement that the website script executes.  Not a good thing!</p>
<p>Historically such attacks have been relatively easy to prevent.  Good validation of input on the client and, more importantly, on the server (client-side checks alone can be bypassed) allows the syntax required to create such an attack to be intercepted, thereby rendering it harmless.  Any web designer worth their salt would have in place one or more routines that checked submitted input before allowing it to become part of the SQL executed in their application, so it was a manageable threat.</p>
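<p>A minimal sketch of such a routine in ASP/VBScript, for the simple case of a value that should be a whole number. The page, the <code>id</code> parameter and the <code>products</code> table are made up for the example.</p>
<p><code>' Hypothetical page: product.asp?id=5<br />
Dim id, sql, re<br />
id = Request.QueryString("id")<br />
' Unsafe: with ?id=5;DROP TABLE products the input becomes part of the statement itself<br />
' sql = "SELECT name FROM products WHERE id = " &amp; id<br />
Set re = New RegExp<br />
re.Pattern = "^[0-9]{1,9}$" ' digits only, short enough to fit in a Long<br />
If re.Test(id) Then<br />
  sql = "SELECT name FROM products WHERE id = " &amp; CLng(id)<br />
Else<br />
  ' Anything else never reaches the database<br />
  Response.Status = "400 Bad Request"<br />
  Response.End<br />
End If</code></p>
<p>The point of the design is that the check is a whitelist: it describes what valid input looks like, and rejects everything else, instead of trying to list every dangerous character.</p>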
<p>Then came ASProx, a new form of injection attack to which ASP scripts and MS SQL Server were particularly vulnerable; it could bypass established validation routines and &#8220;appear&#8221; to be valid input, but once executed as part of your SQL script it could do pretty much what it wanted with your database, and usually with any web pages built from the data in it.</p>
<p>A couple of things set this apart from traditional injection attacks.</p>
<ul>
<li>It was automated. The attack was delivered by a bot that searched for potentially vulnerable ASP pages and would run until the payload (malicious code) was detected on the web pages. The bot would then move on to the next website.  The big deal here is that the bot would come back, day after day, so even if you cleaned up your database, without some further protection in your script it would be compromised again and again.</li>
<li>The attacks used fast fluxing, which basically means the source of the attack is difficult to identify because the bot uses infected PCs as proxies.</li>
<li>The syntax of the injection attack was unusual.  It was encoded, so that quite a long SQL script could be passed in the querystring while slipping past the usual injection checks.</li>
</ul>
<p>I wrote a basic function that tests for a couple of ASProx signature traits. It&#8217;s ideal to wrap around your database connection code/include: if ASProx can&#8217;t connect, it cannot do any damage!</p>
<p>I&#8217;ve provided the code below, as I noticed some people are charging for similar help!</p>
<p><code><span style="color: #000000;">Function noASProx(thequerystring)<br />
  ' Querystring injection check for ASProx.<br />
  ' URLDecode is not built into VBScript; it is assumed to be a helper defined elsewhere on the page.<br />
  Dim strquery<br />
  ' Decode the querystring and strip spaces so that "EXEC (" is caught as well as "EXEC("<br />
  strquery = Replace(URLDecode(thequerystring), " ", "")<br />
  If InStr(UCase(strquery), "EXEC(") &gt; 0 Or Len(strquery) &gt; 500 Then<br />
    noASProx = 0 ' infected (or suspiciously long) string<br />
  Else<br />
    noASProx = 1 ' clean string<br />
  End If<br />
End Function</span></code></p>
<p>The full querystring needs to be passed to this function using Request.ServerVariables("QUERY_STRING").  If 1 is returned, allow access to your database connection code/include; if it&#8217;s not, don&#8217;t!</p>
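<p>A minimal sketch of the wrapping, assuming your connection code lives in a routine called <code>OpenDatabase</code> (a made-up name for this example; use whatever your own include provides). QUERY_STRING is the IIS server variable that holds the raw querystring.</p>
<p><code>If noASProx(Request.ServerVariables("QUERY_STRING")) = 1 Then<br />
  OpenDatabase ' hypothetical routine that opens your connection<br />
Else<br />
  ' Refuse the request before any database code runs<br />
  Response.Status = "403 Forbidden"<br />
  Response.End<br />
End If</code></p>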
<p>If you&#8217;ve already been hit by ASProx, put the above in place. Then I&#8217;d recommend connecting to your SQL Server database using an MS Access Project. In doing so you can, without needing to write a complicated stored procedure, perform a &#8220;find and replace&#8221; on each table in your database.  You find the malicious payload code, which (normally) starts with &lt;script&gt; and ends with &lt;/script&gt;, and replace it with nothing, thereby deleting it.</p>
<p>The people behind ASProx are clever: they found a weakness and they exploited it. I don&#8217;t condone it, but I often wonder why people with the intellect these guys obviously have choose to use it in this way. I wish I had their talent.  Legitimately applied to what I do, I&#8217;d be a millionaire!</p>
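<p>If you would rather do the clean-up in script, the same &#8220;find and replace&#8221; idea can be sketched with the VBScript RegExp object. This is only an illustration working on a single field value; the function name is made up, and it removes every script block it finds, so check what the payload in your own data actually looks like (and whether any of your fields legitimately contain script tags) before deleting anything.</p>
<p><code>Function StripScriptTags(fieldvalue)<br />
  Dim re<br />
  ' Leave Null database values untouched<br />
  If IsNull(fieldvalue) Then<br />
    StripScriptTags = fieldvalue<br />
    Exit Function<br />
  End If<br />
  Set re = New RegExp<br />
  re.Pattern = "&lt;script[\s\S]*?&lt;/script&gt;" ' from an opening tag to the nearest closing tag<br />
  re.IgnoreCase = True<br />
  re.Global = True ' remove every occurrence, not just the first<br />
  StripScriptTags = re.Replace(fieldvalue, "")<br />
End Function</code></p>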
]]></content:encoded>
			<wfw:commentRss>http://www.eantics.co.uk/sql-injection-reinvented-in-2008/feed/</wfw:commentRss>
		</item>
		<item>
		<title>VAT rate reduction woes</title>
		<link>http://www.eantics.co.uk/vat-rate-reduction-woes/</link>
		<comments>http://www.eantics.co.uk/vat-rate-reduction-woes/#comments</comments>
		<pubDate>Tue, 06 Jan 2009 20:15:20 +0000</pubDate>
		<dc:creator>Oliver Phillips</dc:creator>
		
		<category><![CDATA[blogging]]></category>

		<category><![CDATA[vat]]></category>

		<category><![CDATA[vat rate reduction]]></category>

		<category><![CDATA[web design]]></category>

		<guid isPermaLink="false">http://www.eantics.co.uk/?p=555</guid>
		<description><![CDATA[My first post of 2009, happy new year all.  So far I&#8217;ve spent the first 2 days back at work (full time) dealing with queries about revised monthly billing.  Some of it via eantics, most of it via IFA Portals my other business interest which specialises in websites for the financial services sector.
Talking to other [...]]]></description>
			<content:encoded><![CDATA[<p>My first post of 2009, happy new year all.  So far I&#8217;ve spent the first 2 days back at work (full time) dealing with queries about revised monthly billing.  Some of it via eantics, most of it via IFA Portals, my other business interest, which specialises in websites for the financial services sector.</p>
<p>Talking to other small business owners I&#8217;ve met, few, er well none actually, would cite the December 1st 2008 VAT reduction as a good thing.  It has cost small businesses money to administer and account for the change.  At both eantics and IFA Portals we chose to implement the change and make sure that we passed the rate reduction back to clients, most of whom pay monthly and <strong>aren&#8217;t</strong> VAT registered so can&#8217;t claim back the input tax.  We&#8217;re not talking about a fortune, as 2.5% of the average net monthly invoice is not much, but it was the right thing to do so we did it.</p>
<p>However, the costs involved in time, postage etc. are proving to be significant, not to mention the opportunity cost: how we might otherwise have spent the time, e.g. marketing, on the phone to clients, or pitching for new business.</p>
<p>Our clients are mostly paying via standing order, which we can&#8217;t control, but IFA Portals has a new direct debit facility so we wrote out to all our clients explaining the change, and that we needed the direct debit mandate back.  We also mentioned the standing order needed cancelling by them - as we can&#8217;t do it.</p>
<p>So we get the direct debits back and calculate their first payment for January, which is a curious mix of VAT calculated at 15% for the month ahead and a rebate for the month of December, as they paid VAT at 17.5% (there was no way we could respond by December 1st 2008). It&#8217;s also pro rata, as the direct debits are collected on the 1st of each month but standing orders fell whenever their bank set them up.  Once we knew the charge we wrote to inform them, as you must with direct debits.  We need to write to them again to say what their regular payment from February 1st will be but, that aside, job done we thought.</p>
<p>Not quite. We&#8217;re taking a lot of calls and emails querying the new amounts, as they are not the usual amount, and in some cases calls and emails from people accusing us of charging them twice! Yes, you guessed it, they didn&#8217;t cancel their standing order despite the instructions we sent, so now we&#8217;re into a round of issuing cheques for the overpayments, with the associated bank costs that go with that.</p>
<p>I absolutely understand why people have forgotten to cancel their standing orders; it&#8217;s Christmas for one, but more importantly the financial services sector is badly hit by the current economic state, and our clients have more important things on their mind.  So we&#8217;ll refund all the overpayments promptly, for the same reason we thought it important to pass on the VAT reduction in the first place.</p>
<p>I anticipate much more of the same for the rest of January, and I&#8217;m left wondering what Darling thought he would achieve by reducing the rate of VAT.  Businesses like eantics and IFA Portals that bill on a monthly basis have incurred significant costs and time out from the day job, at exactly the time when small businesses&#8217; access to cash (credit) is more restricted than ever. Small businesses&#8217; prospects for acquiring much new business in the short term look average to poor, which means small business owners must work harder at marketing and client acquisition (and retention!), but instead they&#8217;re messing about with VAT reductions!</p>
<p>Not a good recipe for kickstarting the economy in my book.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.eantics.co.uk/vat-rate-reduction-woes/feed/</wfw:commentRss>
		</item>
		<item>
		<title>When you can&#8217;t use MS Excel Web Query</title>
		<link>http://www.eantics.co.uk/when-you-cant-use-ms-excel-web-query/</link>
		<comments>http://www.eantics.co.uk/when-you-cant-use-ms-excel-web-query/#comments</comments>
		<pubDate>Tue, 23 Dec 2008 12:05:35 +0000</pubDate>
		<dc:creator>Oliver Phillips</dc:creator>
		
		<category><![CDATA[desktop]]></category>

		<category><![CDATA[scripting]]></category>

		<category><![CDATA[MS Excel]]></category>

		<category><![CDATA[web query]]></category>

		<guid isPermaLink="false">http://www.eantics.co.uk/?p=552</guid>
		<description><![CDATA[MS Excel Web Query is one of my favourite bits of kit. I used it extensively when I worked in Finance, and I still use it regularly at eantics Ltd for some applications. It basically lets you bring data off a web page and use it in your workbook. That could be internet, intranet, or [...]]]></description>
			<content:encoded><![CDATA[<p>MS Excel Web Query is one of my favourite bits of kit. I used it extensively when I worked in Finance, and I still use it regularly at eantics Ltd for some applications. It basically lets you bring data off a web page and use it in your workbook. That could be internet, intranet, or just a local html page.</p>
<p>Not all websites are set up in such a way that you can select just the data on the page you want.  Some will only let you grab all the page, and some frustratingly appear to let you grab the page but don&#8217;t return anything in your query.</p>
<p>If you need that data for use in your Excel application this is less than satisfactory, so it&#8217;s time to play hardball and leverage the power of VBScript with your own home-grown scraper.</p>
<p>I&#8217;m not going to do a step by step, as anyone reading this is probably able to grab the concepts and do it themselves, so here are the fundamentals.</p>
<p>You may have heard of Ajax.  In Windows ASP applications the engine room of Ajax is the XMLHTTP object, and it&#8217;s this object that we will use to power our scraper.  This ActiveX object will grab an entire page of HTML and hold it in memory.</p>
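<p>A minimal sketch of the fetch, assuming a made-up address and storing the result in a variable called <code>mypage</code>. There is no error handling here, so an unreachable server will raise a run-time error.</p>
<p><code>Dim http, mypage<br />
Set http = CreateObject("MSXML2.XMLHTTP")<br />
' The third argument is False so the script waits for the response<br />
http.Open "GET", "http://www.example.com/prices.html", False<br />
http.Send<br />
If http.Status = 200 Then<br />
  mypage = http.responseText ' the whole page as one string<br />
Else<br />
  mypage = ""<br />
End If</code></p>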
<p>All you need to do is work out how to parse that data to get at the bits you need. This is where the split function comes in, which will split the data into an array based on a separator you choose. The first split you might want to make would break the page into rows, and you do this by splitting on chr(10), the line feed character.</p>
<p><code>mynewarray = split(mypage,chr(10))</code></p>
<p>You then need to work with that array. A good first step is to learn how big it is:</p>
<p><code>mynewarraysize = ubound(mynewarray)</code></p>
<p>The array starts from 0 when created via the split function, and ubound returns the highest index, so if 12 is returned you know your array has 13 elements.</p>
<p>So what now?  Well, you could output this into a worksheet, using a loop based on the array size to write to cells, or you could perform further split functions to get at the data you want.  You might also use replace to strip HTML tags that you don&#8217;t need in your output.</p>
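<p>A rough sketch of that last step, assuming the code runs as a macro inside Excel, that <code>mynewarray</code> has been filled by the split shown earlier, and that the output goes to column A of the active sheet. The tags being stripped are just examples.</p>
<p><code>Dim i, cellvalue<br />
For i = 0 To UBound(mynewarray)<br />
  cellvalue = mynewarray(i)<br />
  ' Strip tags you don't want in the output<br />
  cellvalue = Replace(cellvalue, "&lt;td&gt;", "")<br />
  cellvalue = Replace(cellvalue, "&lt;/td&gt;", "")<br />
  ' Arrays from split start at 0, worksheet rows start at 1<br />
  ActiveSheet.Cells(i + 1, 1).Value = Trim(cellvalue)<br />
Next</code></p>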
<p>I might post some more on this at some point, if people want to know more.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.eantics.co.uk/when-you-cant-use-ms-excel-web-query/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Accessibility in web design</title>
		<link>http://www.eantics.co.uk/accessibility-in-web-design/</link>
		<comments>http://www.eantics.co.uk/accessibility-in-web-design/#comments</comments>
		<pubDate>Fri, 19 Dec 2008 12:44:29 +0000</pubDate>
		<dc:creator>Oliver Phillips</dc:creator>
		
		<category><![CDATA[web design]]></category>

		<category><![CDATA[accessibility]]></category>

		<category><![CDATA[web standards]]></category>

		<guid isPermaLink="false">http://www.eantics.co.uk/?p=547</guid>
		<description><![CDATA[Website accessibility is a big area. Legislation exists in both the USA and UK which mandates that websites should meet certain accessibility design criteria, so that they may be accessed by people with disabilities.  So far it&#8217;s not often enforced in the UK, but businesses small and large are taking note!
The good news is that if [...]]]></description>
			<content:encoded><![CDATA[<p>Website accessibility is a big area. Legislation exists in both the USA and UK which mandates that websites should meet certain accessibility design criteria, so that they may be accessed by people with disabilities.  So far it&#8217;s not often enforced in the UK, but businesses small and large are taking note!</p>
<p>The good news is that if a website is designed to W3C standards for both its HTML markup and its CSS, it is already a long way down the road to being accessible.  Here&#8217;s a useful tool you can use to evaluate just how <a title="How accessible is your website" href="http://sipt07.si.ehu.es/evalaccess2/index.html">accessible your website is</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.eantics.co.uk/accessibility-in-web-design/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Cliksee and Oopsydoorway live</title>
		<link>http://www.eantics.co.uk/cliksee-and-oopsydoorway-live/</link>
		<comments>http://www.eantics.co.uk/cliksee-and-oopsydoorway-live/#comments</comments>
		<pubDate>Fri, 19 Dec 2008 12:32:49 +0000</pubDate>
		<dc:creator>Oliver Phillips</dc:creator>
		
		<category><![CDATA[news]]></category>

		<category><![CDATA[free stuff]]></category>

		<category><![CDATA[web design]]></category>

		<guid isPermaLink="false">http://www.eantics.co.uk/?p=545</guid>
		<description><![CDATA[A while back we added the sandpit section to our website.  We&#8217;ve now included two projects in there: Cliksee, an email marketing campaign tracking application, and Oopsydoorway, a doorway page spam detection tool that allows you to identify competitor websites gaining unfair advantage in Google&#8217;s index through the use of geographically tailored URIs.  Both are free [...]]]></description>
			<content:encoded><![CDATA[<p>A while back we added the sandpit section to our website.  We&#8217;ve now included two projects in there: <a title="Cliksee - Email Marketing Campaign Tracking" href="http://www.cliksee.co.uk">Cliksee</a>, an email marketing campaign tracking application, and <a title="Oopsyfdoorway Doorway Spam Detection" href="http://www.oopsydoorway.co.uk">Oopsydoorway</a>, a doorway page spam detection tool that allows you to identify competitor websites gaining unfair advantage in Google&#8217;s index through the use of geographically tailored URIs.  Both are free to use.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.eantics.co.uk/cliksee-and-oopsydoorway-live/feed/</wfw:commentRss>
		</item>
		<item>
		<title>I need to update this more often!</title>
		<link>http://www.eantics.co.uk/i-need-to-update-this-more-often/</link>
		<comments>http://www.eantics.co.uk/i-need-to-update-this-more-often/#comments</comments>
		<pubDate>Fri, 19 Dec 2008 12:25:37 +0000</pubDate>
		<dc:creator>Oliver Phillips</dc:creator>
		
		<category><![CDATA[blogging]]></category>

		<guid isPermaLink="false">http://www.eantics.co.uk/?p=543</guid>
		<description><![CDATA[I&#8217;m conscious that an old blog is probably worse than no blog, so there&#8217;s a need to keep it fresh and keep posting.  I visit some blogs and there are pages and pages of info, they seem to post every day&#8230;.  Where do they find the time is what I want to know.
]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m conscious that an old blog is probably worse than no blog, so there&#8217;s a need to keep it fresh and keep posting.  I visit some blogs and there are pages and pages of info, they seem to post every day&#8230;.  Where do they find the time is what I want to know.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.eantics.co.uk/i-need-to-update-this-more-often/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Pay monthly for your website design</title>
		<link>http://www.eantics.co.uk/pay-monthly-for-your-website-design/</link>
		<comments>http://www.eantics.co.uk/pay-monthly-for-your-website-design/#comments</comments>
		<pubDate>Wed, 17 Dec 2008 19:01:59 +0000</pubDate>
		<dc:creator>Oliver Phillips</dc:creator>
		
		<category><![CDATA[news]]></category>

		<category><![CDATA[web design]]></category>

		<category><![CDATA[web design runcorn]]></category>

		<category><![CDATA[web design warrington]]></category>

		<guid isPermaLink="false">http://www.eantics.co.uk/?p=539</guid>
		<description><![CDATA[eantics is now offering a monthly payment option on new website design commissions.  We&#8217;ll charge 25% of our quote upfront, with the remainder invoiced over a period of 6 months.  Please contact us for more information.
]]></description>
			<content:encoded><![CDATA[<p>eantics is now offering a monthly payment option on new website design commissions.  We&#8217;ll charge 25% of our quote upfront, with the remainder invoiced over a period of 6 months.  Please <a title="Contact Us" href="http://www.eantics.co.uk/contact/">contact us</a> for more information.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.eantics.co.uk/pay-monthly-for-your-website-design/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Oopsydoorway - Doorway Page Detection</title>
		<link>http://www.eantics.co.uk/oopsydoorway-doorway-page-detection/</link>
		<comments>http://www.eantics.co.uk/oopsydoorway-doorway-page-detection/#comments</comments>
		<pubDate>Fri, 12 Dec 2008 09:12:58 +0000</pubDate>
		<dc:creator>admin</dc:creator>
		
		<category><![CDATA[sandpit]]></category>

		<category><![CDATA[work]]></category>

		<category><![CDATA[doorway page spam detection]]></category>

		<guid isPermaLink="false">http://www.eantics.co.uk/?p=531</guid>
		<description><![CDATA[Oopsydoorway is a simple beta application to detect doorway page spam in Google&#8217;s index.  It uses the Google AJAX API, XMLHTTP and a database of 1000 towns to test websites for geographically tailored URIs. Use it to report websites gaining unfair advantage by not following the Google Webmaster Guidelines.
]]></description>
			<content:encoded><![CDATA[<p>Oopsydoorway is a simple beta application to detect doorway page spam in Google&#8217;s index.  It uses the Google AJAX API, XMLHTTP and a database of 1000 towns to test websites for geographically tailored URIs. Use it to report websites gaining unfair advantage by not following the Google Webmaster Guidelines.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.eantics.co.uk/oopsydoorway-doorway-page-detection/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Update on Doorway Page Detection Application</title>
		<link>http://www.eantics.co.uk/update-on-doorway-page-detection-application/</link>
		<comments>http://www.eantics.co.uk/update-on-doorway-page-detection-application/#comments</comments>
		<pubDate>Thu, 11 Dec 2008 19:46:54 +0000</pubDate>
		<dc:creator>Oliver Phillips</dc:creator>
		
		<category><![CDATA[web design]]></category>

		<category><![CDATA[doorway page spam detection]]></category>

		<guid isPermaLink="false">http://www.eantics.co.uk/?p=525</guid>
		<description><![CDATA[I posted this recently http://www.eantics.co.uk/doorway-page-spam-detection/
The application has come on a bit since then. I&#8217;ve basically changed it so it searches on main towns; most doorway pages are created on main town names rather than place names, which could include hamlets and villages.  This takes a database of 26,000 down to just under 1,000 and makes the load on [...]]]></description>
			<content:encoded><![CDATA[<p>I posted this recently <a rel="nofollow" href="http://www.eantics.co.uk/doorway-page-spam-detection/">http://www.eantics.co.uk/doorway-page-spam-detection/</a></p>
<p>The application has come on a bit since then. I&#8217;ve basically changed it so it searches on main towns; most doorway pages are created on main town names rather than place names, which could include hamlets and villages.  This takes a database of 26,000 down to just under 1,000 and makes the load on my servers more bearable. The downside is it doesn&#8217;t show just how spammy some websites are!  Believe me, some have a doorway page created for every one of those 26,000, based on the results the spider returned!</p>
<p>It&#8217;s now also progressed beyond a spider: it will be a web application you can use in the next few hours.  It&#8217;s quite simple, and rough around the edges, but I intend to post the code and the database for others who want to build something like this or help make it better.  It&#8217;s written in ASP and uses a MySQL database on my server (now that&#8217;s what you call a mashup). I&#8217;ve named it oopsydoorway.  You can see the <a title="Doorway Page detection" href="http://www.oopsydoorway.co.uk">doorway page detection application</a> here.</p>
<p>Current issues</p>
<ul>
<li><strong>Badly configured websites.</strong>  If a page does not exist, a 404 error should be returned.  This would tell the application there was no page there.  But many websites (particularly blog apps), it seems, return a 200 OK and not a 404 Not Found response.  This gets them listed as a potential doorway page when they might not be, as the server might simply be configured incorrectly.  On the other hand they might be; returning a 200 OK response is a known spam technique.</li>
<li><strong>Directories.</strong>  Of all the websites that might have a valid reason to serve up geographically tailored URIs, directories are probably top of the list.  Is it spam or is it good categorisation for the purposes of indexing their results? To be fair - probably the latter.  But Jo Smo Webhost, based in Accrington and working from his bedroom, isn&#8217;t running a directory so his doorway pages are spam.</li>
<li><strong>Encoded URLs.</strong>  It&#8217;s missing a few; I don&#8217;t know why at the moment, but I will look when I get a chance.</li>
</ul>
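<p>One way to tell a genuine page from a fake 200 is to request an address that cannot sensibly exist on the site and see what comes back. A minimal sketch, assuming MSXML is available on the server; the function names and the nonsense path are made up for the example, there is no error handling, and some servers reject HEAD requests, in which case GET would be needed instead.</p>
<p><code>Function GetStatus(url)<br />
  Dim http<br />
  Set http = CreateObject("MSXML2.ServerXMLHTTP")<br />
  http.Open "HEAD", url, False ' HEAD asks for the status and headers only<br />
  http.Send<br />
  GetStatus = http.Status<br />
End Function<br />
<br />
' True when the site answers 200 for a page that should not exist<br />
Function ReturnsFake200(siteroot)<br />
  ReturnsFake200 = (GetStatus(siteroot &amp; "/zz-no-such-page-zz.html") = 200)<br />
End Function</code></p>
<p>A site that fails this test cannot be judged on status codes alone, so its results would need to be flagged for a manual look.</p>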
<p>I&#8217;ll post the code and the database data once I&#8217;ve got the app live and made a few tweaks.  I&#8217;d like the app to be able to score a website, e.g. if it returns 2 geographically tailored URIs it&#8217;s hardly criminal, but 10, 15 or 20 is a different matter.  I&#8217;d also like the app to be able to complete the Google spam report.  I don&#8217;t think this can be done via their API, so I might have to settle for cut and paste.</p>
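<p>The scoring could start out as simply as the sketch below. The function name, the labels and the exact cut-off points are assumptions for illustration only, loosely based on the numbers mentioned above; they are not what the application does today.</p>
<p><code>Function SpamRating(tailoredcount)<br />
  ' tailoredcount = number of geographically tailored URIs found on the site<br />
  If tailoredcount &gt;= 10 Then<br />
    SpamRating = "likely doorway spam"<br />
  ElseIf tailoredcount &gt; 2 Then<br />
    SpamRating = "worth a closer look"<br />
  Else<br />
    SpamRating = "probably fine"<br />
  End If<br />
End Function</code></p>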
]]></content:encoded>
			<wfw:commentRss>http://www.eantics.co.uk/update-on-doorway-page-detection-application/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Doorway Page (spam) Detection?</title>
		<link>http://www.eantics.co.uk/doorway-page-spam-detection/</link>
		<comments>http://www.eantics.co.uk/doorway-page-spam-detection/#comments</comments>
		<pubDate>Tue, 09 Dec 2008 00:01:26 +0000</pubDate>
		<dc:creator>Oliver Phillips</dc:creator>
		
		<category><![CDATA[scripting]]></category>

		<category><![CDATA[web design]]></category>

		<category><![CDATA[doorway page]]></category>

		<category><![CDATA[seo]]></category>

		<guid isPermaLink="false">http://www.eantics.co.uk/?p=518</guid>
		<description><![CDATA[Anyone following my blog will know my frustration, shared by my clients, and many thousands of other webmasters no doubt, in relation to doorway page spamming.  It is against the Google Webmaster Guidelines, yet it is dominant in many search results.
My stance is DON&#8217;T go that route  - instead build a better resource, write more unique and high quality content; [...]]]></description>
			<content:encoded><![CDATA[<p>Anyone following my blog will know my frustration, shared by my clients, and many thousands of other webmasters no doubt, in relation to doorway page spamming.  It is against the Google Webmaster Guidelines, yet it is dominant in many search results.</p>
<p>My stance is DON&#8217;T go that route  - instead build a better resource, write more unique and high quality content;  it&#8217;ll get the links and give the search engines something they can&#8217;t find elsewhere.</p>
<p>But others in my sector, who are building their web marketing strategy around doorway pages (and given my &#8220;sector,&#8221;  they are probably doing it for their clients too)  are,  in many cases,  ranking higher than my site!  It doesn&#8217;t lend much credence to my theories - not in <strong>my </strong>Clients&#8217; eyes I can tell you!</p>
<p>I&#8217;ve submitted spam reports in the past, but in my opinion a problem with them is that you don&#8217;t get any feedback.  Certainly, the sites I&#8217;ve reported are still in the results, where they were before, as far as I can see, so at best I&#8217;m left assuming that my report contributed to the algorithm, or at worst wondering if anyone acted on it, or even saw it.</p>
<p>So, I&#8217;ve just got hold of a database with 26000+  UK place names in it.   But what to do with it?</p>
<p>Well, I could punch out 26000 different doorway pages (and the rest) for my web design website and probably get much more traffic and many more enquiries than I do now. They say if you can&#8217;t beat them, join them, but I&#8217;ve no plans to.  Knowing my luck, I&#8217;d more likely get bounced out of Google!</p>
<p>Instead, I thought I&#8217;d tackle the problem head on.  Given that Google can&#8217;t identify doorway pages to my satisfaction, I thought I&#8217;d have a bash. My ultimate aim: a simple website that allows anyone to search in their business and geographic sector, identify the spam results, and submit spam reports to Google.</p>
<p>So far I have built three spiders:</p>
<p>The first looked for links to doorway pages.  That didn&#8217;t fly. Doorway pages often don&#8217;t have any link out to them. The only links are inbound to the homepage.  This renders spidering ineffective as a means to determine the extent of the doorway pages.</p>
<p>Next I went straight to the sitemap. Another blank. Doorway pages, would you believe, often don&#8217;t appear in the website&#8217;s sitemap.  So there are no clues in there either.</p>
<p>This is not easy&#8230;. change of tack I think.</p>
<p>Actually, my third spider works! It uses that database of 26000+ UK place names I mentioned earlier. It picks out geographically tailored URLs (at the moment based on a seed), and then looks for similar pages using place name substitution in that page&#8217;s URL.  For example, take a search for &#8220;web design warrington&#8221;. It would check the database, determine that Warrington is in Cheshire, then test that same page for all other Cheshire place names.</p>
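<p>The substitution step can be sketched as below. The function name and the example address are made up, the short list of Cheshire towns stands in for a database lookup, and it assumes multi-word place names appear in URLs in lower case with hyphens, which will not be true of every site.</p>
<p><code>Function BuildCandidates(seedurl, seedplace, places)<br />
  Dim i, slug, urls()<br />
  ReDim urls(UBound(places))<br />
  For i = 0 To UBound(places)<br />
    ' Assumed URL style: lower case, spaces replaced by hyphens<br />
    slug = Replace(LCase(places(i)), " ", "-")<br />
    ' vbTextCompare makes the match on the seed case-insensitive<br />
    urls(i) = Replace(seedurl, seedplace, slug, 1, -1, vbTextCompare)<br />
  Next<br />
  BuildCandidates = urls<br />
End Function<br />
<br />
Dim candidates<br />
candidates = BuildCandidates("http://www.example.com/web-design-warrington.html", "warrington", Array("Runcorn", "Chester", "Crewe"))<br />
' candidates(0) is "http://www.example.com/web-design-runcorn.html"</code></p>
<p>Each candidate address is then requested, and the response code decides whether it counts as a tailored page.</p>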
<p>You can spot the problem? Other than it being resource intensive on my servers, it&#8217;s quite unfair to websites that are not employing spammy optimisation techniques. If they feature the seed (&#8220;warrington&#8221; in the above case) in their domain, directory or file names, they&#8217;ll be fed just as many URLs as the spammy ones, but they&#8217;ll return 404s instead of 200 responses. Not ideal!</p>
<p>However, so far I have uncovered a couple of websites that are absolutely taking the michael.   Yes, if your logs show 200s on all your pages in a 3 minute time frame, it was me.   Love the fake 404 pages that return 200s as if someone messed up the server configuration. Very nice, though: if you mistype the placename you get a proper 404!</p>
<p>I&#8217;m onto something though. It needs refining: it needs a way to determine the most likely placenames that will be used to spam based on the location already found, some sort of distance calculation. Then I think it&#8217;d be fairer to the clean websites, as it could give up after the obvious keywords had been tried, thereby not filling their logs with irrelevant 404s, and it wouldn&#8217;t take my server away from me for minutes at a time!</p>
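<p>One candidate for that distance calculation is the haversine formula, sketched below. It assumes each place in the database has a latitude and longitude in decimal degrees, which may not be the case for every place name list, and the function name is made up. Places could then be tried nearest first, stopping after a chosen radius.</p>
<p><code>Function DistanceKm(lat1, lon1, lat2, lon2)<br />
  Dim pi, rad, dlat, dlon, a<br />
  pi = 4 * Atn(1)<br />
  rad = pi / 180 ' degrees to radians<br />
  dlat = (lat2 - lat1) * rad<br />
  dlon = (lon2 - lon1) * rad<br />
  a = Sin(dlat / 2) ^ 2 + Cos(lat1 * rad) * Cos(lat2 * rad) * Sin(dlon / 2) ^ 2<br />
  ' VBScript has no arcsine, so Atn is used: asin(x) = Atn(x / Sqr(1 - x * x)), with x = Sqr(a)<br />
  If a &gt;= 1 Then<br />
    DistanceKm = 6371 * pi ' opposite sides of the globe<br />
  Else<br />
    DistanceKm = 6371 * 2 * Atn(Sqr(a) / Sqr(1 - a)) ' 6371 km is the mean radius of the Earth<br />
  End If<br />
End Function</code></p>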
<p>I&#8217;m going to keep at this and I&#8217;d love to hear about any similar attempts to build spam busting spiders too.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.eantics.co.uk/doorway-page-spam-detection/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
