Crash Course in using Linux Command Line to comb through DansGuardian Web Proxy Filter log files

 

First order of business is getting you an SSH client.  We have had good luck with PuTTY, a free SSH client available at: http://www.chiark.greenend.org.uk/~sgtatham/putty/

Log into your proxy server. This example was run on a Mandrake Multi-Network Firewall 8.2; you should be able to adapt it to your specific environment.

Become root:

Type “su”   (without quotes.  Unless specified otherwise, always leave out the quote marks when actually typing in the command.)

Input root’s password when prompted.

Once you are root, the prompt changes from a $ symbol to a # symbol to signify the change.

Now you can browse your log files from the command line.  Here are a few examples of what you can do, and how to do it:

A:  See in “real time” the web site traffic going through your firewall proxy server.

[root@localhost jallen]# cd /var/log/dansguardian/    (Tip: type “dans” and then press the Tab key, and the shell will auto-complete the rest of “dansguardian” for you. This trick works with any command, file, or path name. A real finger-saver!)

[root@localhost dansguardian]# ls -la

total 614128

drwx------    2 squid    squid        4096 May 19  2004 ./

drwxr-x--x   19 root     adm          4096 Jan 20 04:03 ../

-rw-------    1 squid    squid    40026 Jan 20 13:45 access.log

[root@localhost dansguardian]#

Notice the file called access.log.  This is where all the web proxy traffic is logged.

To see the traffic in real time:

[root@localhost dansguardian]# tail -f access.log

2005.1.20 13:55:59  10.0.4.255 http://search.ebay.com/72-chevy_W0QQfkrZ1QQfromZR8/  GET 85076

2005.1.20 13:56:00  10.0.4.255 http://include.ebaystatic.com/aw/pics/css/finding/borders.css/  GET 650

2005.1.20 13:56:00  10.0.4.255 http://ebay.doubleclick.net/adi/ebay.us.search/keywords;tile=1;dcopt=ist;list=all;kw=72+chevy;sz=468x60;ord=1106247624000;/  GET 392

….etc…..

To exit this view, press Control-C; otherwise it will keep scrolling forever, showing you traffic as it goes through the firewall.

B:  Viewing ONLY the denied sites in real time.

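[root@localhost dansguardian]# tail -f access.log | grep -F '*DENIED*'

(For clarity, it is “ tail -f access.log | grep -F '*DENIED*' ”.  The | is the pipe symbol, usually found above your Enter key.  The single quotes around *DENIED* ARE typed in this command: they stop the shell from treating the asterisks as filename wildcards, and -F tells grep to match the text literally.)

Then it will sit there and show only the traffic that is being denied, for example: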

2005.1.20 14:00:46  10.0.0.15 http://webpdp.gator.com/4/placement/1236/ *DENIED* Banned site: gator.com GET 0

2005.1.20 14:01:02  10.0.0.15 http://webpdp.gator.com/4/placement/1236/ *DENIED* Banned site: gator.com GET 0

Press Control-C to exit when you’re finished.
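If you only care about one workstation, the same filter can be narrowed to a single client IP. The following is a minimal sketch, not part of DansGuardian: the function name denied_for_ip is made up for illustration, and --line-buffered assumes GNU grep, which is what most Linux systems ship.

```shell
# Hypothetical helper: keep only denied requests from one client IP.
# It reads log lines on stdin, so it works after "tail -f" or on a saved file.
denied_for_ip() {
    # -F matches fixed text, so the * and . characters are taken literally.
    # --line-buffered (GNU grep) passes each match along immediately.
    # The spaces around $1 keep 10.0.0.15 from also matching 10.0.0.150.
    grep --line-buffered -F '*DENIED*' | grep --line-buffered -F " $1 "
}

# Live use (Control-C to stop):
#   tail -f access.log | denied_for_ip 10.0.0.15

# Quick check against two sample lines; only the 10.0.0.15 line is printed.
printf '%s\n' \
  '2005.1.20 13:56:00  10.0.4.255 http://include.ebaystatic.com/aw/pics/css/finding/borders.css/  GET 650' \
  '2005.1.20 14:00:46  10.0.0.15 http://webpdp.gator.com/4/placement/1236/ *DENIED* Banned site: gator.com GET 0' |
  denied_for_ip 10.0.0.15
```

Because the helper reads from standard input, it can also be pointed at a saved file with “denied_for_ip 10.0.0.15 < access.log”.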

C:  Logging all the denied sites to a separate file.

[root@localhost dansguardian]# cat access.log | grep -F '*DENIED*' > denied-sites.txt  (This is a single line.  Do type the single quotes around *DENIED* here; they keep the shell from treating the asterisks as wildcards.)

This will create a new file in the same directory called denied-sites.txt that lists ALL the *DENIED* entries from the time the log file was first created up to the time the command was executed.

To view this file:

[root@localhost dansguardian]# less denied-sites.txt (or whatever filename you chose for the output; it can be anything.txt)

You can scroll this file using your arrow keys, and then type q to quit back to command line.
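Once the denied entries are in their own file, a short pipeline can tally which banned sites come up most often. This is only a sketch under assumptions: the function name count_banned_sites is invented here, and it only recognizes lines whose reason reads “Banned site: …” as in the examples above. The third sample line is the hotbar.com entry from Appendix A joined back into a single line.

```shell
# Hypothetical helper: count how often each banned site appears.
# Reads log lines on stdin; prints "count site", most frequent first.
count_banned_sites() {
    # sed keeps only the site name that follows "*DENIED* Banned site: ".
    # sort groups identical names, uniq -c counts them, sort -rn ranks them.
    sed -n 's/.*\*DENIED\* Banned site: \([^ ]*\) .*/\1/p' |
        sort | uniq -c | sort -rn
}

# Real use:
#   count_banned_sites < denied-sites.txt

# Sample run on three lines adapted from the excerpts in this article.
printf '%s\n' \
  '2005.1.20 14:00:46  10.0.0.15 http://webpdp.gator.com/4/placement/1236/ *DENIED* Banned site: gator.com GET 0' \
  '2005.1.20 14:01:02  10.0.0.15 http://webpdp.gator.com/4/placement/1236/ *DENIED* Banned site: gator.com GET 0' \
  '2005.1.20 14:24:25  10.0.0.141 http://dynamic.hotbar.com/dynamic/hotbar/disp/3.0/sitedisp.dll?GetSDF&Dom=simplysentiments.com&Path=%2f&SiteVer=16/ *DENIED* Banned site: hotbar.com GET 0' |
  count_banned_sites
```

Lines denied for other reasons, such as keyword matches, are skipped by this pattern rather than counted.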

To exit the command line, type exit twice.  Once to exit the root account, then again to exit your account.

This concludes a quick lesson in viewing log files!  You can contact us if you have any questions.

APPENDIX A:  How to read the log file:

Here’s an example of a site that has been denied with the line broken down.  This is supposed to be a single line, but has wrapped because it is so long.  Every line starts with a date. Any line that does NOT have a date is a continuation of the previous line.

2005.1.20 14:24:25  10.0.0.141 http://dynamic.hotbar.com/dynamic/hotbar/disp/3.0/sitedisp.dll?GetSDF&Dom=simplysentiments.com&Path=%2f&SiteVer=16

/ *DENIED* Banned site: hotbar.com GET 0

Breakdown:

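2005.1.20 14:24:25  This is the date and time.  The date is year.month.day without leading zeros (so January 20, 2005 is 2005.1.20), followed by the time in 24-hour format.

10.0.0.141  This is the internal IP address of the computer that made the request.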

http://dynamic.hotbar.com/dynamic/hotbar/disp/3.0/sitedisp.dll?GetSDF&Dom=simplysentiments.com&Path=%2f&SiteVer=16     This is the URL that is going through the firewall that is being *DENIED* in this case.

*DENIED*   The action being taken.  (The “/” in front of it on the wrapped line is just the tail end of the URL.)

Banned site: hotbar.com  The reason for being denied.  In this case, this is a site in the black list.  If it is being denied by a keyword combination being found, it will list them.  I hope you’re not easily offended by cuss words, because that’s what will be listed!

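GET 0   GET is the HTTP request method.  The number after it is the size of the response in bytes.  It is 0 here because the request was denied and nothing was retrieved; for allowed requests, such as the eBay lines shown earlier, it is the number of bytes returned.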
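The breakdown above can be reproduced on the command line by letting awk split a log line on whitespace. This is a rough sketch, not an official parser: the function name show_fields is hypothetical, and it assumes the URL contains no spaces and that the method and trailing number are always the last two fields, as they are in every example in this article.

```shell
# Hypothetical helper: print the main fields of each log line on stdin.
show_fields() {
    awk '{
        print "date: "   $1        # e.g. 2005.1.20
        print "time: "   $2        # 24-hour clock
        print "client: " $3        # internal IP address
        print "url: "    $4        # assumes the URL has no spaces
        print "method: " $(NF-1)   # second-to-last field, e.g. GET
        print "number: " $NF       # trailing number, the last field
    }'
}

# Sample run on one of the denied lines shown earlier.
printf '%s\n' \
  '2005.1.20 14:00:46  10.0.0.15 http://webpdp.gator.com/4/placement/1236/ *DENIED* Banned site: gator.com GET 0' |
  show_fields
```

The action and reason sit between the URL and the method and can contain spaces, which is why this sketch counts the last two fields from the end of the line instead of from the front.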


APPENDIX B:  Useful Linux Commands:

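ls -la  lists all files in the current directory, including hidden ones, and shows permissions, owner, file size in bytes, and modification date.  (The “total” line at the top is counted in blocks.)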

cd /directory   Changes the current working directory to /directory.

pwd  “print working directory”  shows where you are.

cat  filename.txt  outputs the text file to your screen, or to an output file. (“cat filename.txt > 2ndfile.txt” is an example of outputting to a file.)

          cat filename.txt |grep keyword   prints only the lines with the “keyword” found in it to your screen.

          cat filename.txt|grep keyword > keywords-found.txt  like above,  but outputs to a file called keywords-found.txt that you can read later.

less  filename.txt  opens filename.txt for reading and lets you scroll with the arrow keys.  q to quit.

tail  filename.txt  shows the last few lines of filename.txt (10 by default).

tail -f filename.txt  opens filename.txt and keeps it open, showing new lines as they are written to the file.  Useful for viewing log files.  Control-C to quit.

exit  quits the command line session.  If you are root, it leaves the root account and returns you to your regular user account.  Type it again to log out of your user account and actually exit.
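To practice these commands without touching the real logs, they can be tried in a scratch directory. The sketch below is only an illustration: the file names sample.log and found.txt are made up, the two sample lines are copied from the excerpts earlier in this article, and mktemp and grep -c (which counts matching lines) are standard tools not covered above.

```shell
# Work in a scratch directory so nothing under /var/log is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Two sample lines in the same shape as the real access.log.
cat > sample.log <<'EOF'
2005.1.20 13:56:00  10.0.4.255 http://include.ebaystatic.com/aw/pics/css/finding/borders.css/  GET 650
2005.1.20 14:00:46  10.0.0.15 http://webpdp.gator.com/4/placement/1236/ *DENIED* Banned site: gator.com GET 0
EOF

pwd                                        # shows the scratch directory
ls -la                                     # sample.log is listed here
cat sample.log | grep gator > found.txt    # keep only lines containing "gator"
tail -n 1 sample.log                       # last line of the file
found_count=$(grep -c gator found.txt)     # number of lines saved in found.txt
echo "lines mentioning gator: $found_count"

# Clean up: step back out and remove the scratch directory.
cd - > /dev/null
rm -rf "$workdir"
```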

  ©SomethingCool.com LLC 2000-2003, All rights reserved. Do not reproduce or redistribute without prior consent of authors