Here's a little something I wrote in C called "gethtml" that gets HTML
documents from a server. It's quite simple, actually - just connect to
port 80 on your victim's machine, then issue the "GET ..." command,
terminated by CR LF. For example, to get "/iwin/us/allwarnings.html" from
iwin.nws.noaa.gov (this is a page listing all the current watches and
warnings issued by NWS), you'd connect to port 80 on iwin.nws.noaa.gov
and say "GET /iwin/us/allwarnings.html". Then capture the HTML that comes
back until the server closes the connection.
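To make the request format concrete, here is a minimal sketch of building
the "GET ..." line. The helper name build_get_request is made up for this
example and is not part of gethtml; it only shows the bare "GET path"
form described above, followed by CR LF, with a length check so a long
document name can't overrun the buffer.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * build_get_request - hypothetical helper, not part of gethtml.
 * Writes "GET <path>\r\n" into buf and returns the request length,
 * or -1 if the request would not fit in bufsize bytes.
 */
int
build_get_request (char *buf, size_t bufsize, const char *path)
{
  int len = snprintf (buf, bufsize, "GET %s\r\n", path);

  /* snprintf reports the length it wanted; reject anything truncated */
  if (len < 0 || (size_t) len >= bufsize)
    return -1;
  return len;
}
```

You'd call it as build_get_request (cmd, sizeof (cmd), argv[2]) and then
write() the returned number of bytes to the socket.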
The implementation in VB (VERY easy!) is left as an exercise for the
student ;)
This could be used to, say, fetch pages from a server, parse each page for
<A HREF> references to other pages, then fetch those pages, etc. A
simple robot. Of course, you have to make sure you haven't fetched a
page before (or you'll get yourself into a fetch loop), you have to save
away and/or index the text, etc.
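A rough sketch of the two robot pieces mentioned above: pulling HREF
values out of a fetched page, and remembering what has already been
fetched. Everything here (next_href, seen_before, the table sizes) is
invented for illustration. It assumes double-quoted HREF="..." values
only, doesn't check that the HREF sits inside an <A> tag, and uses a
small fixed-size table where a real robot would want something bigger.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_SEEN 256		/* assumed limit on remembered pages */
#define MAX_URL  256		/* assumed limit on reference length */

static char seen[MAX_SEEN][MAX_URL];
static int nseen;

/* Return 1 if url was already recorded; otherwise record it and return 0.
   Also returns 1 when url can't be stored, so the caller skips it. */
int
seen_before (const char *url)
{
  int i;

  for (i = 0; i < nseen; i++)
    if (strcmp (seen[i], url) == 0)
      return 1;
  if (nseen == MAX_SEEN || strlen (url) >= MAX_URL)
    return 1;
  strcpy (seen[nseen++], url);
  return 0;
}

/* Case-insensitive "does s start with prefix" */
static int
starts_with_nocase (const char *s, const char *prefix)
{
  for (; *prefix != '\0'; s++, prefix++)
    if (tolower ((unsigned char) *s) != tolower ((unsigned char) *prefix))
      return 0;
  return 1;
}

/* Copy the next HREF="..." value at or after p into out.  Returns a
   pointer just past the closing quote, or NULL when there are no more. */
const char *
next_href (const char *p, char *out, size_t outsize)
{
  for (; *p != '\0'; p++)
    {
      if (starts_with_nocase (p, "href=\""))
	{
	  const char *start = p + 6;
	  const char *end = strchr (start, '"');
	  size_t len;

	  if (end == NULL)
	    return NULL;	/* unterminated quote: give up */
	  len = (size_t) (end - start);
	  if (len < outsize)
	    {
	      memcpy (out, start, len);
	      out[len] = '\0';
	      return end + 1;
	    }
	  p = end;		/* too long for out: skip this one */
	}
    }
  return NULL;
}
```

The robot loop would call next_href() repeatedly on each fetched page and
only fetch a reference when seen_before() returns 0.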
This code is for Linux, but I've used it on Solaris and other systems before.
----------------------------------- cut here -------------------------------
/*
* gethtml - get HTML document (specified in argv[2]) from port 80 at site argv[1]
* Copyright 1996, Ed Carp (ecarp@netcom.com). Commercial use prohibited
* without prior arrangement. Non-commercial use permitted.
*/
#include <stdio.h>
#include <stdlib.h>		/* exit */
#include <string.h>		/* strlen */
#include <unistd.h>		/* read, write, close */
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <errno.h>

int sockfd;
int saved_errno;
int len;
ssize_t readcnt;
char cmd[128];
struct sockaddr_in address;
struct hostent *hi;
struct in_addr *aptr;

int
main (int argc, char **argv)
{
  if (argc < 3)
    {
      printf ("usage: %s site document\n", argv[0]);
      exit (1);
    }

  /* look up the server's address */
  hi = gethostbyname (argv[1]);
  if (hi == NULL)
    {
      printf ("can't get host by name '%s' - skipped\n", argv[1]);
      exit (1);
    }

  sockfd = socket (AF_INET, SOCK_STREAM, 0);
  if (sockfd == -1)
    {
      saved_errno = errno;	/* perror may change errno */
      perror ("socket");
      printf ("can't do socket for '%s' - skipped - error code=%d\n", argv[1], saved_errno);
      exit (1);
    }

  /* connect to port 80 at the first address returned for the host */
  address.sin_family = AF_INET;
  address.sin_port = htons (80);
  aptr = (struct in_addr *) *(hi->h_addr_list);
  address.sin_addr = *aptr;
  if (connect (sockfd, (struct sockaddr *) &address, sizeof (address)) == -1)
    {
      saved_errno = errno;
      perror ("connect");
      printf ("can't do connect for '%s' - skipped - error code=%d\n", argv[1], saved_errno);
      exit (1);
    }

  /* build the request, refusing document names that won't fit in cmd */
  len = snprintf (cmd, sizeof (cmd), "GET %s\r\n", argv[2]);
  if (len < 0 || len >= (int) sizeof (cmd))
    {
      printf ("document name '%s' is too long\n", argv[2]);
      exit (1);
    }
  write (sockfd, cmd, strlen (cmd));

  /* copy the reply to stdout until the server closes the connection */
  while ((readcnt = read (sockfd, cmd, sizeof (cmd))) > 0)
    write (1, cmd, readcnt);
  close (sockfd);
  exit (0);
}
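gethostbyname() as used above only hands back IPv4 addresses, and the
program tries just the first one. On systems that have getaddrinfo(), the
lookup-and-connect step could look something like the sketch below. This
is a swapped-in alternative, not what gethtml does, and open_tcp is a
name made up for the example.

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/*
 * open_tcp - hypothetical helper, not part of gethtml.
 * Resolves host/port with getaddrinfo() and tries each returned address
 * in turn.  Returns a connected socket, or -1 if none could be reached.
 */
int
open_tcp (const char *host, const char *port)
{
  struct addrinfo hints, *res, *ai;
  int fd = -1;

  memset (&hints, 0, sizeof (hints));
  hints.ai_family = AF_UNSPEC;		/* IPv4 or IPv6 */
  hints.ai_socktype = SOCK_STREAM;	/* TCP */
  if (getaddrinfo (host, port, &hints, &res) != 0)
    return -1;

  for (ai = res; ai != NULL; ai = ai->ai_next)
    {
      fd = socket (ai->ai_family, ai->ai_socktype, ai->ai_protocol);
      if (fd == -1)
	continue;
      if (connect (fd, ai->ai_addr, ai->ai_addrlen) == 0)
	break;			/* connected */
      close (fd);
      fd = -1;
    }
  freeaddrinfo (res);
  return fd;
}
```

With that, the socket()/connect() part of main() would shrink to
sockfd = open_tcp (argv[1], "80") plus one error check.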
-- Ed Carp, N7EKG    Ed.Carp@linux.org, ecarp@netcom.com
214/993-3935 voicemail/digital pager
Finger ecarp@netcom.com for PGP 2.5 public key    an88744@anon.penet.fi
"Past the wounds of childhood, past the fallen dreams and the broken
families, through the hurt and the loss and the agony only the night ever
hears, is a waiting soul. Patient, permanent, abundant, it opens its
infinite heart and asks only one thing of you ... 'Remember who it is you
really are.'"
-- "Losing Your Mind", Karen Alexander and Rick Boyes
The mark of a good conspiracy theory is its untestability. -- Andrew Spring