Chapter 2
Let's start with a captured packet and analyze it (a Baidu login request):
POST /?login HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, application/x-ms-application, application/x-ms-xbap, application/vnd.ms-xpsdocument, application/xaml+xml, */*
Referer: https://passport.baidu.com/?login&tpl=mn
Accept-Language: zh-cn
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; QQDownload 663; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648; .NET CLR 3.5.21022)
Host: passport.baidu.com
Content-Length: 236
Connection: Keep-Alive
Cache-Control: no-cache
Cookie:
tpl_ok=&next_target=&tpl=mn&skip_ok=&aid=&need_pay=&need_coin=&pay_method=&u=http%3A%2F%2Fwww.baidu.com%2F&return_method=get&more_param=&return_type=&psp_tt=0&password=123465&safeflg=0&username=sunshinebean&verifycode=&mem_pass=on
I won't go over the structure of HTTP headers here; for details see:
http://hi.baidu.com/absky_cxb/blog/item/f28065017032760a738da5cb.html
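This chapter covers how the POST body is put together and the HTTP header fields that matter most.
1. The Referer header. Put simply, it is the page the request came from, so the target server can tell which page issued the HTTP request. Some servers check it, which makes this header important.
2. The Content-Type header. For ordinary form fields it is application/x-www-form-urlencoded; when the form uploads a file such as an image, it changes to multipart/form-data.
3. The parameters in the POST body are separated by &, and whatever follows the = sign is the parameter's value. Some parameters are fixed and some change from request to request. As a rough rule of thumb, about 80% of them can be found in the page source, about 10% in the cookies, and about 10% are built by JavaScript (the troublesome case, since you have to read the JS). In the capture above only the account and the password vary and everything else is fixed, although a conclusion like that should only be drawn after comparing several captures. Here, username=sunshinebean and password=123465 are the Baidu account name and password.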
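The rule in point 3 (name=value pairs joined with &, with special characters percent-encoded as in u=http%3A%2F%2Fwww.baidu.com%2F) can be sketched as a small helper. UrlEncode and BuildPostBody are names made up for this illustration, and the sketch assumes the server expects the system ANSI code page; a site that expects UTF-8 would need a different byte conversion.

```vb
' Percent-encode a string for an application/x-www-form-urlencoded body.
' Assumption: the server expects the system ANSI code page.
Private Function UrlEncode(ByVal Text As String) As String
    Dim Bytes() As Byte
    Dim i As Long
    Dim Result As String

    If Len(Text) = 0 Then Exit Function
    Bytes = StrConv(Text, vbFromUnicode)          ' VB Unicode string -> ANSI bytes
    For i = LBound(Bytes) To UBound(Bytes)
        Select Case Bytes(i)
            Case 48 To 57, 65 To 90, 97 To 122, 45, 46, 95   ' 0-9 A-Z a-z - . _
                Result = Result & Chr$(Bytes(i))
            Case 32                                ' space becomes +
                Result = Result & "+"
            Case Else                              ' everything else becomes %XX
                Result = Result & "%" & Right$("0" & Hex$(Bytes(i)), 2)
        End Select
    Next i
    UrlEncode = Result
End Function

' Join parallel arrays of names and values into name1=value1&name2=value2...
Private Function BuildPostBody(Names As Variant, Values As Variant) As String
    Dim i As Long
    Dim Body As String

    For i = LBound(Names) To UBound(Names)
        If Len(Body) > 0 Then Body = Body & "&"
        Body = Body & UrlEncode(CStr(Names(i))) & "=" & UrlEncode(CStr(Values(i)))
    Next i
    BuildPostBody = Body
End Function
```

Called as BuildPostBody(Array("username", "password"), Array(Account, Pass)), only the two values that vary need to be supplied at run time; the fixed parameters can be appended as a constant string copied from the capture.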
Chapter 3
1: Creating an XMLHTTP object in VB
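Private Function GetHtmlStr$(StrUrl$, switch%) ' Fetch the page source (switch% is not used in this listing)
    Dim XmlHttp As Object
    Dim stime As Date, ntime As Date
    Set XmlHttp = CreateObject("Microsoft.XMLHTTP")
    XmlHttp.Open "GET", StrUrl, True
    XmlHttp.send
    stime = Now ' Time the request was sent
    While XmlHttp.ReadyState <> 4 ' 4 = response fully received
        DoEvents
        ntime = Now ' Current time inside the loop
        If DateDiff("s", stime, ntime) > 3 Then GetHtmlStr = "": Exit Function ' Give up after more than 3 seconds
    Wend
    GetHtmlStr = StrConv(XmlHttp.ResponseBody, vbUnicode) ' ANSI bytes -> VB string
    Set XmlHttp = Nothing
End Function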
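This is a function I wrote myself; its job is to fetch the source of a given web page.
XmlHttp.Open "GET", StrUrl, True
"GET" is the request method. The other one is POST, for submitting data, written as "POST".
StrUrl is the address. For a GET it is the URL you want to fetch; for a POST it is the URL the data is submitted to.
True means asynchronous and False means synchronous. A synchronous call blocks until all the data has arrived, so the program freezes in the meantime; an asynchronous call does not, so asynchronous is the recommended choice.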
XmlHttp.setRequestHeader "Referer", RefUrl
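This sets the referring page. The previous chapter already explained why the Referer header matters; for RefUrl, use the Referer value from the captured packet.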
XmlHttp.setRequestHeader "CONTENT-TYPE", "application/x-www-form-urlencoded"
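This was also covered in the previous chapter: it is the content type. Normally you copy it from the captured packet; for an ordinary web form just write "application/x-www-form-urlencoded".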
XmlHttp.send "XXX"
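XXX here is the POST body. For a GET it is left empty; for a POST, put the body of the packet here.
The function then returns whatever the server sent back for the POST. Usually we pick out something specific from that response to decide whether the operation succeeded.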
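Putting the pieces above together, a POST version of the function might look like the following sketch. PostForm, TimeoutSeconds and the parameter names are illustrative and not taken from the original code, and whether a Referer set this way is actually sent can depend on the MSXML version installed, so check the outgoing request with your capture tool.

```vb
' Send a form-encoded POST and return the response text ("" on timeout).
' PostForm and TimeoutSeconds are illustrative names.
Private Function PostForm(ByVal StrUrl As String, ByVal RefUrl As String, _
                          ByVal PostData As String, _
                          Optional ByVal TimeoutSeconds As Long = 10) As String
    Dim XmlHttp As Object
    Dim StartTime As Date

    Set XmlHttp = CreateObject("Microsoft.XMLHTTP")
    XmlHttp.Open "POST", StrUrl, True                 ' True = asynchronous
    ' Request headers must be set after Open and before send
    XmlHttp.setRequestHeader "Referer", RefUrl
    XmlHttp.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    XmlHttp.send PostData

    StartTime = Now
    Do While XmlHttp.readyState <> 4                  ' 4 = response fully received
        DoEvents                                      ' keep the UI responsive
        If DateDiff("s", StartTime, Now) > TimeoutSeconds Then
            XmlHttp.abort                             ' cancel the pending request
            Exit Function                             ' returns ""
        End If
    Loop

    ' responseBody is a byte array; vbUnicode converts ANSI bytes to a VB string
    PostForm = StrConv(XmlHttp.responseBody, vbUnicode)
End Function
```

The caller would pass the target URL, the Referer and the body from the capture, then search the returned string with InStr for whatever text the site shows on success.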
2: WebBrowser
Look at this code first:
Private Sub Command1_Click()
Dim URL$, Flags&, TargetFrame$, PostData() As Byte, Headers$
URL = "https://passport.baidu.com/?login"
Flags = 0
TargetFrame = ""
PostData = "tpl_ok=&next_target=&tpl=mn&skip_ok=&aid=&need_pay=&need_coin=&pay_method=&u=http%3A%2F%2Fwww.baidu.com%2F&return_method=get&more_param=&return_type=&psp_tt=0&password=123456&safeflg=0&username=sunshinebean&verifycode="
PostData = StrConv(PostData, vbFromUnicode)
Headers = "Content-Type: application/x-www-form-urlencoded" & vbCrLf
WebBrowser1.Navigate URL, Flags, TargetFrame, PostData, Headers
End Sub
Private Sub Form_Load()
WebBrowser1.Navigate "http://baidu.com"
End Sub
The WebBrowser control has a Navigate method with these parameters:
object.Navigate URL [, Flags] [, TargetFrameName] [, PostData] [, Headers]
Looking it up in MSDN gives the following:
URL:
Required. A string expression that evaluates to the URL, full path, or Universal Naming Convention (UNC) location and name of the resource to display.
Flags:
Optional. A constant or value that specifies whether to add the resource to the history list, whether to read from or write to the cache, and whether to display the resource in a new window. It can be a combination of the following constants or values.
Constant             Value   Meaning
navOpenInNewWindow   1       Open the resource or file in a new window.
navNoHistory         2       Do not add the resource or file to the history list. The new page replaces the current page in the list.
navNoReadFromCache   4       Do not read from the disk cache for this navigation.
navNoWriteToCache    8       Do not write the results of this navigation to the disk cache.
TargetFrameName:
Optional. String expression that evaluates to the name of an HTML frame in URL to display in the browser window. The possible values for this parameter are:
Value           Meaning
_blank          Load the link into a new unnamed window.
_parent         Load the link into the immediate parent of the document the link is in.
_self           Load the link into the same window the link was clicked in.
_top            Load the link into the full body of the current window.
<window_name>   A named HTML frame. If no frame or window exists that matches the specified target name, a new window is opened for the specified link.
PostData:
Optional. Data to send to the server during the HTTP POST transaction. For example, the POST transaction is used to send data gathered by an HTML form to a program or script. If this parameter does not specify any post data, the Navigate method issues an HTTP GET transaction. This parameter is ignored if URL is not an HTTP URL.
Headers
Optional. A value that specifies additional HTTP headers to send to the server. These headers are added to the default Internet Explorer headers. The headers can specify things like the action required of the server, the type of data being passed to the server, or a status code. This parameter is ignored if URL is not an HTTP URL.
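In short:
URL: the page to navigate to.
Flags: whether to add the resource to the history list, whether to read from or write to the cache, whether to show it in a new window, or any combination of these.
TargetFrameName: the name of the frame to display the page in.
PostData: the data to send in the HTTP POST.
Headers: the additional HTTP headers to send.
Putting these together gives the code shown earlier. The POST data is built the same way as before; the one thing to watch is that the data must be converted with vbFromUnicode before it can be submitted.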
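As a sketch of how the parameters fit together, the conversion and the Navigate call can be wrapped in one routine. NavigatePost is a name made up for this example, WebBrowser1 is the control on the form, and the choice of flags is only an illustration of combining constants with Or.

```vb
' Post a form-encoded body through the WebBrowser control.
' NavigatePost is an illustrative name; WebBrowser1 is the control on the form.
Private Sub NavigatePost(ByVal URL As String, ByVal Body As String)
    Dim PostData() As Byte
    Dim Headers As String

    ' Navigate needs single-byte (ANSI) data, not a VB Unicode string
    PostData = StrConv(Body, vbFromUnicode)
    Headers = "Content-Type: application/x-www-form-urlencoded" & vbCrLf

    ' Flags combine with Or: keep this request out of the history list
    ' and do not read it from the disk cache
    WebBrowser1.Navigate URL, navNoHistory Or navNoReadFromCache, "", PostData, Headers
End Sub
```

Command1_Click could then shrink to a single call such as NavigatePost "https://passport.baidu.com/?login", Body, where Body holds the string copied from the capture.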
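3: The Inet control
You are probably familiar with this control already; a lot of programs written in VB use it. It wraps both the HTTP and FTP protocols and is convenient to use, which is why it is so popular. The code is short, too.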
Dim PostData As String
If Inet1.StillExecuting = True Then Exit Sub ' A previous request is still running
PostData = "tpl_ok=&next_target=&tpl=mn&skip_ok=&aid=&need_pay=&need_coin=&pay_method=&u=http%3A%2F%2Fwww.baidu.com%2F&return_method=get&more_param=&return_type=&psp_tt=0&password=123456&safeflg=0&username=sunshinebean&verifycode="
Inet1.Execute "https://passport.baidu.com/?login", "POST", PostData, "Referer: https://passport.baidu.com/?login&tpl=mn" & vbCrLf & "Content-Type: application/x-www-form-urlencoded"
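An advantage of Inet and XMLHTTP is that they let you send a Referer.
The Execute method of Inet takes four arguments:
url is the target address of the POST, operation is the method (GET or POST), data is the POST body, and requestHeaders carries the Referer and Content-Type headers.
Now for reading the response:
The data is read in the Inet control's StateChanged event, after checking that State is 12. The reason is in the list of state values: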
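Value   Constant               Meaning
0       icNone                 No state to report
1       icHostResolvingHost    The control is looking up the IP address of the specified host
2       icHostResolved         The control has found the IP address of the specified host
3       icConnecting           The control is connecting to the specified host
4       icConnected            The control has connected to the specified host
5       icRequesting           The control is sending a request to the host
6       icRequestSent          The control has sent the request to the host
7       icReceivingResponse    The control is receiving a response from the host
8       icResponseReceived     The control has received a response from the host
9       icDisconnecting        The control is disconnecting from the host
10      icDisconnected         The control has disconnected from the host
11      icError                An error occurred while communicating with the host
12      icResponseCompleted    The request has finished and the data has been received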
Dim strData$
Dim vtData As Variant ' One chunk of the response
Dim bDone As Boolean: bDone = False
vtData = Inet1.GetChunk(1024, icString)
Do While Not bDone
strData = strData & vtData
DoEvents
vtData = Inet1.GetChunk(1024, icString)
If Len(vtData) = 0 Then
bDone = True
End If
Loop
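To show where this loop lives, here is a minimal sketch of the complete event handler. Inet1 is the control on the form; Text1 is a hypothetical TextBox used only to display the result, and the icError branch is an addition for illustration.

```vb
' Collect the response once the Inet control reports completion.
' Text1 is a hypothetical TextBox used for display.
Private Sub Inet1_StateChanged(ByVal State As Integer)
    Dim Chunk As Variant
    Dim Response As String

    Select Case State
        Case icResponseCompleted                  ' 12: all data has been received
            Chunk = Inet1.GetChunk(1024, icString)
            Do While Len(Chunk) > 0               ' an empty chunk means no more data
                Response = Response & Chunk
                DoEvents
                Chunk = Inet1.GetChunk(1024, icString)
            Loop
            Text1.Text = Response
        Case icError                              ' 11: communication failed
            Text1.Text = "Error " & Inet1.ResponseCode & ": " & Inet1.ResponseInfo
    End Select
End Sub
```

All other state values are ignored, since no response data is available until the control reports state 12.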
This returns the page source as a String. Binary data or UTF-8 is even simpler:
just declare a Byte array.
Dim Buff() As Byte
Buff = Inet1.GetChunk(0, icByteArray)
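If what you downloaded is an image, save it to a file and load it with a picture control to display it. For UTF-8 page source, run the bytes through a decoding function to get rid of the garbled characters. A UTF-8 decoding function: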
Private Declare Function MultiByteToWideChar Lib "kernel32" (ByVal CodePage As Long, ByVal dwFlags As Long, ByVal lpMultiByteStr As Long, ByVal cchMultiByte As Long, ByVal lpWideCharStr As Long, ByVal cchWideChar As Long) As Long
Private Const CP_UTF8 = 65001
Function Utf8ToUnicode(ByRef Utf() As Byte) As String
Dim lRet As Long
Dim lLength As Long
Dim lBufferSize As Long
lLength = UBound(Utf) - LBound(Utf) + 1
If lLength <= 0 Then Exit Function
lBufferSize = lLength * 2
Utf8ToUnicode = String$(lBufferSize, Chr(0))
lRet = MultiByteToWideChar(CP_UTF8, 0, VarPtr(Utf(0)), lLength, StrPtr(Utf8ToUnicode), lBufferSize)
If lRet <> 0 Then
Utf8ToUnicode = Left(Utf8ToUnicode, lRet)
Else
Utf8ToUnicode = ""
End If
End Function
Then just call Utf8ToUnicode(Buff)!
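For the image case mentioned above, saving the byte array and loading it back might look like the sketch below. SaveBytes, ShowDownloadedImage, the file name verify.gif and the Picture1 control are all assumptions made for this example.

```vb
' Write a downloaded byte array to disk.
' SaveBytes is an illustrative helper name.
Private Sub SaveBytes(ByVal FilePath As String, Buff() As Byte)
    Dim FileNum As Integer

    ' Binary mode does not truncate an existing file, so delete it first
    If Len(Dir$(FilePath)) > 0 Then Kill FilePath
    FileNum = FreeFile
    Open FilePath For Binary Access Write As #FileNum
    Put #FileNum, , Buff                          ' writes the raw bytes only
    Close #FileNum
End Sub

' Read the response as bytes, save it and display it.
' Picture1 and the file name are assumptions for this sketch.
Private Sub ShowDownloadedImage()
    Dim Buff() As Byte
    Dim FilePath As String

    Buff = Inet1.GetChunk(0, icByteArray)
    FilePath = App.Path & "\verify.gif"
    SaveBytes FilePath, Buff
    Picture1.Picture = LoadPicture(FilePath)      ' LoadPicture handles BMP, GIF and JPEG
End Sub
```

ShowDownloadedImage would be called from the StateChanged event once State is 12, in place of the string loop.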