This is cache of http://taosecurity.blogspot.com/2008/06/logging-web-traffic-with-httpry.html. Cache is the snapshot of article that we took when we index feed.
Logging Web Traffic with Httpry
2008-06-13 17:46:00 by Richard Bejtlich in TaoSecurity
 
I don't need to tell anyone that a lot of interesting command-and-control traffic is sailing through our Web proxies right now. I encourage decent logging for anyone using Web proxies. Further below are three example entries from a Squid access.log. They use the native "squid" format with fields for referer and user-agent tacked onto the end.

Incidentally, here is a diff of my Squid configuration against the original file, which shows how I set up Squid.

r200a# diff /usr/local/etc/squid/squid.conf /usr/local/etc/squid/squid.conf.orig
632,633c632,633
< acl our_networks src 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16
< http_access allow our_networks
---
> #acl our_networks src 192.168.1.0/24 192.168.2.0/24
> #http_access allow our_networks
936c936
< http_port 172.16.2.1:3128
---
> http_port 3128
1990,1992d1989
< logformat squid-extended %ts.%03tu %6tr %>a %Ss/%03Hs %<st %rm %ru %un %Sh/%<A %mt "%{Referer}>h" "%{User-Agent}>h"
<
<
2022c2019
< access_log /usr/local/squid/logs/access.log squid-extended
---
> access_log /usr/local/squid/logs/access.log squid
2216c2213
< strip_query_terms off
---
> # strip_query_terms on
3056d3052
< visible_hostname r200a.taosecurity.com

If you worry that I'm exposing this to the world, don't worry too much. I find the value of having this information in a place where I can find it outweighs the possibility that someone will use this data to exploit me. There are much easier ways to do that, I think.

Here are the three log entries. The first record shows a Google query for the term "dia", where the referer was a query for "fbi". The second record is a Firefox prefetch of the first record. The third record is a request for a .gif.

1213383786.614 255 192.168.2.103 TCP_MISS/200 9263
GET http://www.google.com/search?hl=en&client=firefox-a&rls=
com.ubuntu%3Aen-US%3Aofficial&hs=Hqt&q=dia&btnG=Search -
DIRECT/64.233.169.103 text/html "http://www.google.com/search
?q=fbi&ie=utf-8&oe=utf-8&aq=t&rls=com.ubuntu:en-US:official&client=firefox-a"
"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.14) Gecko/20060601
Firefox/2.0.0.14 (Ubuntu-edgy)"

1213383786.704 76 192.168.2.103 TCP_MISS/200 2775
GET http://www.google.com/pfetch/dchart?s=DIA -
DIRECT/64.233.169.147 image/gif
"http://www.google.com/search?hl=en&client=firefox-a&rls=com.ubuntu%3A
en-US%3Aofficial&hs=Hqt&q=dia&btnG=Search" "Mozilla/5.0 (X11; U; Linux
i686; en-US; rv:1.8.1.14) Gecko/20060601 Firefox/2.0.0.14 (Ubuntu-edgy)"

1213383786.717 81 192.168.2.103 TCP_MISS/200 1146
GET http://www.google.com/images/blogsearch-onebox.gif -
DIRECT/64.233.169.99 image/gif "http://www.google.com/search?hl=en
&client=firefox-a&rls=com.ubuntu%3Aen-US%3Aofficial&hs=Hqt&q=dia&btnG=Search"
"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.14) Gecko/20060601
Firefox/2.0.0.14 (Ubuntu-edgy)"
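
To make the field layout concrete, here is a minimal Python sketch that splits one squid-extended entry into named fields. It assumes each entry sits on a single line in the real access.log (the records above are wrapped for display), and the field names such as elapsed_ms and peer are my own labels, not Squid's. The sample is the first record rejoined by hand.

```python
import re
from datetime import datetime, timezone
from urllib.parse import urlsplit, parse_qs

# Field order follows the squid-extended logformat in the diff above.
# The group names are illustrative labels, not official Squid names.
SQUID_EXTENDED = re.compile(
    r'(?P<ts>\d+\.\d{3})\s+'                       # %ts.%03tu
    r'(?P<elapsed_ms>\d+)\s+'                      # %6tr
    r'(?P<client>\S+)\s+'                          # %>a
    r'(?P<squid_status>[A-Z_]+)/(?P<http_status>\d{3})\s+'  # %Ss/%03Hs
    r'(?P<size>\d+)\s+'                            # %<st
    r'(?P<method>\S+)\s+'                          # %rm
    r'(?P<url>\S+)\s+'                             # %ru
    r'(?P<user>\S+)\s+'                            # %un
    r'(?P<hierarchy>[A-Z_]+)/(?P<peer>\S+)\s+'     # %Sh/%<A
    r'(?P<mime>\S+)\s+'                            # %mt
    r'"(?P<referer>[^"]*)"\s+'                     # "%{Referer}>h"
    r'"(?P<user_agent>[^"]*)"'                     # "%{User-Agent}>h"
)

def parse_squid_extended(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = SQUID_EXTENDED.match(line.strip())
    if m is None:
        return None
    rec = m.groupdict()
    # Squid logs seconds since the epoch; convert to an aware UTC datetime.
    rec["time"] = datetime.fromtimestamp(float(rec["ts"]), tz=timezone.utc)
    # Query terms are only present because strip_query_terms is off.
    rec["query"] = parse_qs(urlsplit(rec["url"]).query)
    return rec

# First record from above, rejoined onto one line (my reconstruction).
sample = (
    '1213383786.614 255 192.168.2.103 TCP_MISS/200 9263 '
    'GET http://www.google.com/search?hl=en&client=firefox-a&rls='
    'com.ubuntu%3Aen-US%3Aofficial&hs=Hqt&q=dia&btnG=Search - '
    'DIRECT/64.233.169.103 text/html "http://www.google.com/search'
    '?q=fbi&ie=utf-8&oe=utf-8&aq=t&rls=com.ubuntu:en-US:official&client=firefox-a" '
    '"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.14) Gecko/20060601 '
    'Firefox/2.0.0.14 (Ubuntu-edgy)"'
)

record = parse_squid_extended(sample)
print(record["time"].isoformat(), record["client"], record["query"].get("q"))
```

A regular expression is used rather than a plain split because the referer and user-agent fields contain spaces. A user-agent that itself contains a double quote would defeat this pattern, so treat it as a starting point only.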

What if you're a security person who can't access Web logs, but you have an NSM sensor in the vicinity? You might use Bro to log this activity, but last year I found something much simpler, written by Jason Bittel: Httpry.

r200a# httpry -h
httpry version 0.1.3 -- HTTP logging and information retrieval tool
Copyright (c) 2005-2008 Jason Bittel
Usage: httpry [ -dhpq ] [ -i device ] [ -n count ] [ -o file ] [ -r file ]
[ -s format ] [ -u user ] [ 'expression' ]

-d run as daemon
-h print this help information
-i device listen on this interface
-n count set number of HTTP packets to parse
-o file write output to a file
-p disable promiscuous mode
-q suppress non-critical output
-r file read packets from input file
-s format specify output format string
-u user set process owner
expression specify a bpf-style capture filter

Additional information can be found at:
http://dumpsterventures.com/jason/httpry

In the following example I run Httpry against a trace of the traffic taken when I visited the site shown in the Squid logs earlier.

r200a# httpry -i bge0 -o /tmp/httprytest3.txt -q -u richard
-s timestamp,source-ip,x-forwarded-for,direction,dest-ip,method,host,
request-uri,user-agent,referer,status-code,http-version,reason-phrase
-r /tmp/test3.pcap
r200a# cat /tmp/httprytest3.txt

# httpry version 0.1.3
# Fields: timestamp,source-ip,x-forwarded-for,direction,dest-ip,method,host,
request-uri,user-agent,referer,status-code,http-version,reason-phrase

06/13/2008 15:03:06 68.48.240.186 - > 64.233.169.103
GET www.google.com /search?hl=en&client=firefox-a&rls=com.ubuntu
%3Aen-US%3Aofficial&hs=Hqt&q=dia&btnG=Search Mozilla/5.0
(X11; U; Linux i686; en-US; rv:1.8.1.14) Gecko/20060601 Firefox/2.0.0.14
(Ubuntu-edgy) http://www.google.com/search?q=fbi&ie=utf-8&
oe=utf-8&aq=t&rls=com.ubuntu:en-US:official&client=firefox-a -
HTTP/1.0 -

06/13/2008 15:03:06 64.233.169.103 - < 68.48.240.186
- - - - - 200 HTTP/1.0 OK

06/13/2008 15:03:06 68.48.240.186 192.168.2.103 > 64.233.169.147
GET www.google.com /pfetch/dchart?s=DIA Mozilla/5.0
(X11; U; Linux i686; en-US; rv:1.8.1.14) Gecko/20060601 Firefox/2.0.0.14
(Ubuntu-edgy) http://www.google.com/search?hl=en&client=
firefox-a&rls=com.ubuntu%3Aen-US%3Aofficial&hs=Hqt&q=dia&btnG=Search -
HTTP/1.0 -

06/13/2008 15:03:06 68.48.240.186 192.168.2.103 > 64.233.169.99
GET www.google.com /images/blogsearch-onebox.gif Mozilla/5.0
(X11; U; Linux i686; en-US; rv:1.8.1.14) Gecko/20060601 Firefox/2.0.0.14
(Ubuntu-edgy) http://www.google.com/search?hl=en&client=
firefox-a&rls=com.ubuntu%3Aen-US%3Aofficial&hs=Hqt&q=dia&btnG=Search -
HTTP/1.0 -

06/13/2008 15:03:06 64.233.169.147 - < 68.48.240.186
- - - - - 200 HTTP/1.0 OK
06/13/2008 15:03:06 64.233.169.99 - < 68.48.240.186
- - - - - 200 HTTP/1.0 OK

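As you can see, the format here is request-reply, although the last four records are request, request, reply, reply. Httpry appears to log each HTTP message as it is seen on the wire rather than pairing replies with requests, so replies to near-simultaneous requests can show up out of step.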
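
For anyone who wants to post-process this output, here is a rough Python sketch that turns Httpry records into dictionaries keyed by the names in the "# Fields:" header. It assumes that the real output file puts each record on one line with tab-separated values (the listing above is wrapped and its whitespace collapsed for display); if your build uses a different delimiter, pass it in. The function name parse_httpry is hypothetical.

```python
def parse_httpry(lines, delimiter="\t"):
    """Yield one dict per record, keyed by the '# Fields:' header names.

    Assumes one record per line and tab-separated values (an assumption
    about the on-disk format, not something shown in the wrapped listing).
    """
    fields = None
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("# Fields:"):
            fields = line[len("# Fields:"):].strip().split(",")
            continue
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and other comment lines
        if fields is None:
            raise ValueError("no '# Fields:' header seen before data")
        yield dict(zip(fields, line.split(delimiter)))

# Two records from the listing above, rebuilt by hand with tab separators.
header = ("# Fields: timestamp,source-ip,x-forwarded-for,direction,dest-ip,"
          "method,host,request-uri,user-agent,referer,status-code,"
          "http-version,reason-phrase")
request = "\t".join([
    "06/13/2008 15:03:06", "68.48.240.186", "192.168.2.103", ">",
    "64.233.169.99", "GET", "www.google.com",
    "/images/blogsearch-onebox.gif",
    "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.14) "
    "Gecko/20060601 Firefox/2.0.0.14 (Ubuntu-edgy)",
    "http://www.google.com/search?hl=en&client=firefox-a&rls="
    "com.ubuntu%3Aen-US%3Aofficial&hs=Hqt&q=dia&btnG=Search",
    "-", "HTTP/1.0", "-",
])
reply = "\t".join([
    "06/13/2008 15:03:06", "64.233.169.99", "-", "<", "68.48.240.186",
    "-", "-", "-", "-", "-", "200", "HTTP/1.0", "OK",
])

records = list(parse_httpry(["# httpry version 0.1.3", header, request, reply]))
requests = [r for r in records if r["direction"] == ">"]
replies = [r for r in records if r["direction"] == "<"]
print(len(requests), "request(s),", len(replies), "reply(ies)")
```

Splitting on the direction field is the simplest way to separate requests from replies. Matching a reply to its request would need more than these fields provide, since no ports are logged here.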

Although I first tried Httpry straight from the source code, in this case I tested an upcoming FreeBSD port created by my friend WXS. If you give Httpry a try, let me know what you think and how you like to invoke it on the command line. I plan to daemonize it in production and run it against a live interface, not traces.
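
As a starting point for that discussion, here is a small Python sketch that assembles a daemonized command line from the options listed in the help output above (-d, -q, -i, -o, -u, -s and a trailing capture filter). The helper name httpry_daemon_argv, the log path /var/log/httpry.log and the filter 'tcp port 80' are illustrative assumptions, not tested production settings.

```python
import shlex

def httpry_daemon_argv(interface, logfile, user, fields=None, bpf=None):
    """Build an argument list for a daemonized Httpry on a live interface.

    Only options shown in the 'httpry -h' output are used.
    """
    argv = ["httpry", "-d", "-q", "-i", interface, "-o", logfile, "-u", user]
    if fields:
        argv += ["-s", ",".join(fields)]  # output format string
    if bpf:
        argv.append(bpf)                  # bpf-style capture filter goes last
    return argv

argv = httpry_daemon_argv(
    "bge0",
    "/var/log/httpry.log",   # hypothetical output path
    "richard",
    fields=["timestamp", "source-ip", "x-forwarded-for", "direction",
            "dest-ip", "method", "host", "request-uri", "user-agent",
            "referer", "status-code", "http-version", "reason-phrase"],
    bpf="tcp port 80",       # illustrative filter
)
print(shlex.join(argv))      # shell-quoted form, suitable for a startup script
```

Building the list first and quoting it with shlex.join keeps the filter expression intact as a single argument, whether the command is pasted into a shell or handed to subprocess.run.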
 
 
 
 
 
 