Archive for the ‘Software Development’ Category

Article: Introduction to Information Retrieval - Search Engines

Monday, January 8th, 2007

This article aims to provide readers with an overview of the very basics of information retrieval. Understanding these principles can help you to optimise your website content for the search engines and also help you to analyse search engine algorithm changes. However, the details in this article are not intended to describe how modern search engines work, as they use many additional factors, including link analysis.

Information retrieval (IR) is the science of searching for documents and for information within documents. Information retrieval techniques form some of the most fundamental elements of web search engine technology. This article will discuss information retrieval in the context of search engines.

(more…)

Calling SpamAssassin Programmatically Using C#

Thursday, December 21st, 2006

We have recently been developing a program which involves retrieving emails into an ASP.NET system written in C#.

We quickly realised that we would also need a spam filter to stop spam messages clogging up the system. First, we created a black list and a white list so that we could pre-approve or reject emails from certain addresses.
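As a rough illustration of the black list / white list step, here is a minimal sketch in C#. The names SenderLists, SenderStatus, Approve, Block and Classify are hypothetical and do not come from our actual system. The sketch also assumes that the white list takes priority when an address appears on both lists, and that addresses are compared case-insensitively.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical result of the pre-check: only NeedsScoring messages
// would go on to be scored by a spam filter.
public enum SenderStatus { Approved, Rejected, NeedsScoring }

// Hypothetical helper holding both lists in memory.
public class SenderLists
{
    // Case-insensitive sets, so "Bob@Example.com" matches "bob@example.com".
    private readonly HashSet<string> whiteList =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);
    private readonly HashSet<string> blackList =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);

    public void Approve(string address) { whiteList.Add(address.Trim()); }

    public void Block(string address) { blackList.Add(address.Trim()); }

    public SenderStatus Classify(string address)
    {
        string key = address.Trim();

        // Assumption: the white list wins if an address is on both lists.
        if (whiteList.Contains(key)) return SenderStatus.Approved;
        if (blackList.Contains(key)) return SenderStatus.Rejected;

        // Unknown sender: leave the decision to the spam score.
        return SenderStatus.NeedsScoring;
    }
}
```

With lists like these, only messages classified as NeedsScoring have to be passed on for scoring, which keeps known senders out of the slower check.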

Next, we needed some way of retrieving a spam score (or probability) for the remaining messages. We looked immediately to SpamAssassin and easily got it working on our Windows server using the excellent how-to guide: Using SpamAssassin with Win32.

However, we did not want the mail to be filtered at the mail server level. Instead, we wanted to spam check the messages in our system as they were retrieved and store their spam scores.

(more…)