|
Hi. I am trying to get the source code of a web page in order to analyze it. Here is the code that retrieves it:
BOOL CCIFDoc::GetRemoteFile(CString sServer, CString sRemotePath, CString sParameter, CStringArray& saData)
{
    BOOL bRet = FALSE;
    CInternetSession ISession;
    CInternetFile* pIFile = NULL;
    try
    {
        // OpenURL() sends a plain GET request; no extra request headers are passed here.
        pIFile = (CInternetFile*)ISession.OpenURL(_T("http://") + sServer + sRemotePath + sParameter, 1, INTERNET_FLAG_TRANSFER_ASCII | INTERNET_FLAG_RELOAD);
        if(NULL != pIFile)
        {
            // Read the response line by line.
            CString sTemp;
            while(pIFile->ReadString(sTemp))
                saData.Add(sTemp + _T("\n"));
        }
        bRet = (saData.GetSize() > 0);
    }
    catch(CInternetException* pException)
    {
        pException->GetErrorMessage(m_sError.GetBuffer(_MAX_PATH), _MAX_PATH);
        m_sError.ReleaseBuffer();
        pException->Delete();
    }
    catch(CMemoryException* pMemException)
    {
        pMemException->GetErrorMessage(m_sError.GetBuffer(_MAX_PATH), _MAX_PATH);
        m_sError.ReleaseBuffer();
        pMemException->Delete();
    }
    // Release the file object returned by OpenURL().
    if(NULL != pIFile)
    {
        pIFile->Close();
        delete pIFile;
    }
    ISession.Close();
    return bRet;
}
Classic code.
But some web pages are retrieved without an HTML body, or with a different body from the one I see in my browser. For example, type www.bnr.ro in your browser and look at the page. When I retrieve the same address with the code above, this is the result:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<title>Eroare/ Error</title>
<style type="text/css">
* { margin: 0; padding: 0; border: 0; outline: none; font-size: 100%; }
body { font-family: Arial, sans-serif; }
#wrapper { text-align: center; margin: 3em auto;}
p { padding: 0.5em 0; }
a { color: #0039a6; text-decoration: none; }
a:hover, a:focus { color: #e9994a; }
</style>
</head>
<body>
<div id="wrapper">
<a href="http://www.bnr.ro"><img src="http://www.bnr.ro/images/logo.png"></a>
<p>Eroare neprevazută / Unexpected error.</p>
<p><a href="http://www.bnr.ro">Înapoi pe prima pagină / Back to the homepage</a>.</p>
</div>
</body>
</html>
How can I retrieve the source code just as I see it in my browser?
Thank you.
modified 14-May-18 7:34am.
|
|
|
|
|
That is an error message. The server was not able to answer the request.
If you enter that address in your browser, you will notice that it redirects to an ASPX page.
A possible reason is that the server script tries to access a request header that is not present, and the code does not handle that case. Candidates are (among others) User-Agent, Content-Type, and Accept. Ultimately, only the administrator of that web site can tell you.
|
|
|
|
|
OK, and if I know them (User-Agent, Content-Type, etc.), how can I pass them to the OpenURL method?
|
|
|
|
|
Use the fourth parameter of OpenURL().
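As a rough sketch of what that could look like: the fourth parameter of OpenURL() takes the additional request headers as one string, with each header terminated by CRLF. The helper below builds such a string in portable C++. BuildRequestHeaders is a hypothetical name and the User-Agent value is made up; the MFC call itself appears only in a comment and is untested here.

```cpp
#include <string>
#include <utility>
#include <vector>

// Joins name/value pairs into the "Name: value\r\n" form used for
// additional HTTP request headers.
std::string BuildRequestHeaders(
    const std::vector<std::pair<std::string, std::string>>& headers)
{
    std::string block;
    for (const auto& h : headers) {
        block += h.first + ": " + h.second + "\r\n";
    }
    return block;
}

// Hypothetical use inside GetRemoteFile() (MFC, not compiled here):
//   CString sHeaders(BuildRequestHeaders({
//       {"User-Agent", "Mozilla/5.0 (compatible; MyApp/1.0)"},
//       {"Accept", "text/html"}}).c_str());
//   pIFile = (CInternetFile*)ISession.OpenURL(sUrl, 1,
//       INTERNET_FLAG_TRANSFER_ASCII | INTERNET_FLAG_RELOAD,
//       sHeaders, sHeaders.GetLength());
```

Only request headers make sense here; headers that describe a response, such as Expires, are unlikely to change what the server returns.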
|
|
|
|
|
|
I have tried something like this:
_T("Cache-Control: no-store\r\nPragma: no-cache\r\nExpires: 0\r\nContent-Type: text/html; charset=utf-8\r\nX-UA-Compatible: IE=edge,chrome=1\r\nauthor: Ministerul Finantelor Publice\r\ndescription: \r\nkeywords: \r\n\r\n")
The same result … very strange …
|
|
|
|
|
And a similar problem, where I get only the head, without the body of the page source: http://mfinante.ro/infocodfiscal.html
When I tried to get the page source, here is the result:
<!DOCTYPE html>
<html><head>
<meta http-equiv="Pragma" content="no-cache"/>
<meta http-equiv="Expires" content="-1"/>
<meta http-equiv="CacheControl" content="no-cache"/>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<link rel="shortcut icon" href="data:;base64,iVBORw0KGgo="/>
<script>
(function(){
var securemsg;
var dosl7_common;
window.KBov=!!window.KBov;try{(function(){try{var ll,Ll,Ol=1,Sl=1,il=1,jl=1,Jl=1;for(var ZL=0;ZL<Ll;++ZL)Ol+=2,Sl+=2,il+=2,jl+=2,Jl+=3;ll=Ol+Sl+il+jl+Jl;window.JL===ll&&(window.JL=++ll)}catch(sL){window.JL=ll}var iL=!0;function lo(l){!l||document.visibilityState&&"visible"!==document.visibilityState||(iL=!1,document.cookie="brav=ad");return iL}function Lo(){}lo(window[Lo.name]===Lo);lo("function"!==typeof ie9rgb4);lo(/\x3c/.test(function(){return"\x3c"})&!/x3d/.test(function(){return"'x3'+'d';"}));
var Oo=window.attachEvent||/mobi/i.test(window["\x6e\x61vi\x67a\x74\x6f\x72"]["\x75\x73e\x72A\x67\x65\x6et"]),Io=+new Date+6E5,lO,LO,oO,zO=setTimeout,ZO=Oo?3E4:6E3;function SO(){if(!document.querySelector)return!0;var l=+new Date,z=l>Io;if(z)return lo(!1);z=LO&&!oO&&lO+ZO<l;z=lo(z);lO=l;LO||(LO=!0,zO(function(){LO=!1},1));return z}SO();
document.addEventListener&&document.addEventListener("visibilitychange",function(l){document.visibilityState&&("hidden"===document.visibilityState&&l.isTrusted?oO=!0:"visible"===document.visibilityState&&(lO=+new Date,oO=!1,SO()))});var iO=[17795081,27611931586,1558153217];function jO(l){l="string"===typeof l?l:l.toString(36);var z=window[l];if(!z.toString)return;var s=""+z;window[l]=function(l,s){LO=!1;return z(l,s)};window[l].toString=function(){return s}}for(var JO=0;JO<iO.length;++JO)jO(iO[JO]);
})();
</script>
<script type="text/javascript" src="/TSPD/08b919fd7aab200047a61aa409c2e6d600e069e74eb7044d6800f6e68db33d85b37ab015d70c1c5d?type=8"></script>
<script>
(function(){
var securemsg;
var dosl7_common;
window["blobfp"] = "1111111110112000003e825d0550f830000004a71d70c295b19a7f2005afaa33b00001c20eac9549cdc897b01acfc003937e6e0f02eceda17300000020http://re.security.f5aas.com/re/";
})();
</script>
<script type="text/javascript" src="/TSPD/08b919fd7aab200047a61aa409c2e6d600e069e74eb7044d6800f6e68db33d85b37ab015d70c1c5d?type=11"></script>
<noscript>Please enable JavaScript to view the page content.</noscript>
</head><body>
</body></html>
Only the head, without a body ...
|
|
|
|
|
It is a valid reply. The important parts are the script tags: web browsers execute them to produce the final content. If JavaScript is disabled in a browser, the content of the noscript tag is shown instead:
<noscript>Please enable JavaScript to view the page content.</noscript>
That is the problem with today's web sites: they don't use plain HTML anymore.
Even well-known tools like wget don't support JavaScript and therefore can't be used to download the "visible" content (which may vary with attributes like media type and screen resolution). You would have to use a client that can do everything a web browser does, apart from the final rendering.
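One way to at least detect this situation in the downloaded text is a small heuristic: if the page contains script tags but its body is empty, the visible content is most likely generated by JavaScript. The sketch below is only that, a heuristic under the assumption that the markup is well formed enough to find the body tags by plain string search; ExtractBody and LooksScriptGenerated are hypothetical names, not part of MFC or WinINet.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Lower-cases ASCII letters so that tag names can be matched case-insensitively.
static std::string ToLower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Returns the text between <body ...> and </body>, or an empty string
// when no complete body element is found.
std::string ExtractBody(const std::string& html)
{
    const std::string lower = ToLower(html);
    const std::size_t open = lower.find("<body");
    if (open == std::string::npos) return "";
    const std::size_t start = lower.find('>', open);
    if (start == std::string::npos) return "";
    const std::size_t close = lower.find("</body", start);
    if (close == std::string::npos) return "";
    return html.substr(start + 1, close - start - 1);
}

// True when the page has script tags but a body with only whitespace.
bool LooksScriptGenerated(const std::string& html)
{
    const std::string body = ExtractBody(html);
    const bool bodyIsBlank = std::all_of(body.begin(), body.end(),
        [](unsigned char c) { return std::isspace(c) != 0; });
    const bool hasScript = ToLower(html).find("<script") != std::string::npos;
    return bodyIsBlank && hasScript;
}
```

A check like this does not retrieve the content; it only tells the caller that a plain HTTP download is not enough for this page.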
|
|
|
|
|
Thank you Jochen. I tried to create an HTML client with CHtmlView, and here is the code:
void CTestHTMLView::OnInitialUpdate()
{
CHtmlView::OnInitialUpdate();
Navigate2(_T("http://www.mfinante.ro/pagina.html"), NULL, NULL);
}
The result is a blank page … is it impossible to solve this programmatically in this case?
|
|
|
|
|
Use the CHtmlView GetSource() method to inspect the source. It will probably look like what you have already got and should contain hints about what is missing.
Note also that the IE settings are used, and these might be too restrictive.
|
|
|
|
|
I tried CHtmlView::GetSource(), but the string was exactly like the one from the CInternetSession::OpenURL() method … no difference between them …
void CTestHTMLView::OnHelpGetsource()
{
CString s;
GetSource(s);
}
and the result is:
<!DOCTYPE html>
<html><head>
<meta http-equiv="Pragma" content="no-cache"/>
<meta http-equiv="Expires" content="-1"/>
<meta http-equiv="CacheControl" content="no-cache"/>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
<link rel="shortcut icon" href="data:;base64,iVBORw0KGgo="/>
<script>
(function(){
var securemsg;
var dosl7_common;
window["bobcmn"] = "1111101111101020000000220000000520000000025705570d200000096300000000300000000300000006/TSPD/300000008TSPD_101300000004http200000000200000000";
window.SqNs=!!window.SqNs;try{(function(){try{var ss,Ss,is=1,Js=1,Ls=1,zs=1,Zs=1;for(var jS=0;jS<Ss;++jS)is+=2,Js+=2,Ls+=2,zs+=2,Zs+=3;ss=is+Js+Ls+zs+Zs;window.lJ===ss&&(window.lJ=++ss)}catch(JS){window.lJ=ss}var oS=!0;function ZS(s){!s||document.visibilityState&&"visible"!==document.visibilityState||(oS=!1,document.cookie="brav=ad");return oS}function s_(){}ZS(window[s_.name]===s_);ZS("function"!==typeof ie9rgb4);ZS(/\x3c/.test(function(){return"\x3c"})&!/x3d/.test(function(){return"'x3'+'d';"}));
var __=window.attachEvent||/mobi/i.test(window["\x6e\x61vi\x67a\x74\x6f\x72"]["\x75\x73e\x72A\x67\x65\x6et"]),o_=+new Date+6E5,Z_,si,Si,ii=setTimeout,Ii=__?3E4:6E3;function ji(){if(!document.querySelector)return!0;var s=+new Date,I=s>o_;if(I)return ZS(!1);I=si&&!Si&&Z_+Ii<s;I=ZS(I);Z_=s;si||(si=!0,ii(function(){si=!1},1));return I}ji();
document.addEventListener&&document.addEventListener("visibilitychange",function(s){document.visibilityState&&("hidden"===document.visibilityState&&s.isTrusted?Si=!0:"visible"===document.visibilityState&&(Z_=+new Date,Si=!1,ji()))});var li=[17795081,27611931586,1558153217];function Li(s){s="string"===typeof s?s:s.toString(36);var I=window[s];if(!I.toString)return;var l=""+I;window[s]=function(s,l){si=!1;return I(s,l)};window[s].toString=function(){return l}}for(var oi=0;oi<li.length;++oi)Li(li[oi]);
ZS(!1!==window.SqNs);
(function(){var s=-1,s={o:++s,oL:"false"[s],S:++s,iI:"false"[s],jS:++s,_5:"[object Object]"[s],Ij:(s[s]+"")[s],jI:++s,ij:"true"[s],_S:++s,LS:++s,OL:"[object Object]"[s],L:++s,sS:++s,ljS:++s,JjS:++s};try{s._I=(s._I=s+"")[s.LS]+(s.Oi=s._I[s.S])+(s.LL=(s.oi+"")[s.S])+(!s+"")[s.jI]+(s.zi=s._I[s.L])+(s.oi="true"[s.S])+(s.Jj="true"[s.jS])+s._I[s.LS]+s.zi+s.Oi+s.oi,s.LL=s.oi+"true"[s.jI]+s.zi+s.Jj+s.oi+s.LL,s.oi=s.o[s._I][s._I],s.oi(s.oi(s.LL+'"\\'+s.S+s.LS+s.S+s.oL+"\\"+s._S+s.o+"("+s.zi+"\\"+s.S+s.sS+
s.S+"\\"+s.S+s.L+s.o+s.ij+s.Oi+s.oL+"\\"+s._S+s.o+"\\"+s.S+s.L+s.sS+"\\"+s.S+s.LS+s.S+"\\"+s.S+s.LS+s.L+s.Ij+s.Oi+"\\"+s.S+s.L+s.sS+"['\\"+s.S+s.L+s.o+s.iI+"\\"+s.S+s.sS+s.S+"false"[s.jS]+s.Oi+s.iI+s.Ij+"']\\"+s._S+s.o+"===\\"+s._S+s.o+"'\\"+s.S+s.L+s.jI+s.zi+"\\"+s.S+s.L+s.jS+"\\"+s.S+s.LS+s.S+"\\"+s.S+s.LS+s.L+"\\"+s.S+s._S+s.sS+"')\\"+s._S+s.o+"{\\"+s.S+s.jS+"\\"+s.S+s.S+"\\"+s.S+s.L+s.L+s.iI+"\\"+s.S+s.L+s.jS+"\\"+s._S+s.o+s.ij+s.Ij+"\\"+s.S+s.L+s.L+s.OL+"\\"+s.S+s.sS+s.S+s.Jj+"\\"+s.S+s.LS+s.jS+
"\\"+s.S+s.LS+s.jI+"\\"+s.S+s.L+s.o+"\\"+s._S+s.o+"=\\"+s._S+s.o+"\\"+s.S+s.L+s.sS+"\\"+s.S+s.LS+s.S+"\\"+s.S+s.LS+s.L+s.Ij+s.Oi+"\\"+s.S+s.L+s.sS+"['\\"+s.S+s.L+s.o+s.iI+"\\"+s.S+s.sS+s.S+"false"[s.jS]+s.Oi+s.iI+s.Ij+"'].\\"+s.S+s.L+s.jS+s.ij+"\\"+s.S+s.L+s.o+"false"[s.jS]+s.iI+s.OL+s.ij+"(/.{"+s.S+","+s._S+"}/\\"+s.S+s._S+s.sS+",\\"+s._S+s.o+s.oL+s.Jj+"\\"+s.S+s.LS+s.L+s.OL+s.zi+"\\"+s.S+s.LS+s.S+s.Oi+"\\"+s.S+s.LS+s.L+"\\"+s._S+s.o+"(\\"+s.S+s.sS+s.o+")\\"+s._S+s.o+"{\\"+s.S+s.jS+"\\"+s.S+s.S+
"\\"+s.S+s.S+"\\"+s.S+s.S+"\\"+s.S+s.L+s.jS+s.ij+s.zi+s.Jj+"\\"+s.S+s.L+s.jS+"\\"+s.S+s.LS+s.L+"\\"+s._S+s.o+"(\\"+s.S+s.sS+s.o+"\\"+s._S+s.o+"+\\"+s._S+s.o+"\\"+s.S+s.sS+s.o+").\\"+s.S+s.L+s.jI+s.Jj+s._5+"\\"+s.S+s.L+s.jI+s.zi+"\\"+s.S+s.L+s.jS+"("+s.jS+",\\"+s._S+s.o+s._S+")\\"+s.S+s.jS+"\\"+s.S+s.S+"\\"+s.S+s.S+"});\\"+s.S+s.jS+"}\\"+s.S+s.jS+'"')())()}catch(I){s%=5}})();var zi=30;window.So={jo:"080c70a28c017800df126fc5e763aaf9338270119b7950166cbb6e56b597e768e5e278db788dbc7975b1b4e7f01572f8c45dd88a0fc0064f2f58c95147c7ac3e2ec9a1d2d31bef21ada6d4c4457b7952a09b154f45faedb6b7c45fea179cdc7fbfbe2f459f2384aa10132a8b8cceb151037fa9827fa8bb61c6331f26872b1b47"};function S(s){return 752>s}
function _(s){var I=arguments.length,l=[];for(var O=1;O<I;++O)l.push(arguments[O]-s);return String.fromCharCode.apply(String,l)}function J(s,I){s+=I;return s.toString(36)}(function(s){s||setTimeout(function(){if(!ji())return;var s=setTimeout(function(){},250);for(var l=0;l<=s;++l)clearTimeout(l);ji()},500)})(oS);})();}catch(x){document.cookie='brav=oex'+x;}finally{ie9rgb4=void(0);};function ie9rgb4(a,b){return a>>b>>0};
})();
</script>
<script type="text/javascript" src="/TSPD/08b919fd7aab20003e4c389b6b9d3a377ba58d046d5860c6a5f51ac0cfcb4999230dafe4bf087322?type=8"></script>
<script>
(function(){
var securemsg;
var dosl7_common;
})();
</script>
<script type="text/javascript" src="/TSPD/08b919fd7aab20003e4c389b6b9d3a377ba58d046d5860c6a5f51ac0cfcb4999230dafe4bf087322?type=11"></script>
<noscript>Please enable JavaScript to view the page content.</noscript>
</head><body>
</body></html>
No body part … I really don't know how to overcome this …
modified 15-May-18 6:58am.
|
|
|
|
|
Like before: All output is done by linked JavaScript.
If that is not shown you have to change the IE settings which probably block the execution.
But I don't know if CHtmlView supports all IE features.
|
|
|
|
|
_Flaviu wrote: I really don't know how to overcome this
The javascript and/or webassembly needs to execute. Websites are not flat files anymore.
You will need to use a complete browser engine to parse the DOM. In other words most websites today are generating dynamic content via javascript. You need to think outside the box here... that javascript you see needs to execute in order to generate the page.
One quick way to do this would be to use a hidden Internet Explorer window as the backend; see the article "Creating a Web Browser-Style MFC Application". You could set the CHtmlView to load the site and dump the top document after JavaScript has modified the DOM.
Some of my tools use a custom WebKit as the backend to do this. You can also use Chromium Embedded. You could probably spend less than a day modifying cefsimple to load the website and dump the top document to a file after JavaScript has executed.
Good Luck.
Best Wishes,
-David Delaune
|
|
|
|
|
I've always wondered this: why can't you declare variables after a case label in a switch statement? In C++ you can declare variables pretty much anywhere (and declaring them close to first use is obviously a good thing), but the following still won't work.
|
|
|
|
|
You can; you just need to put them, and the code that uses them, inside curly braces, thus:
switch (number)
{
case 1:
    {
        int i;
    }
}
Variables not inside such a block have the scope of the entire switch block. If an initialised variable is declared under a single case label, a jump to a later case label could bypass its initialisation, which is why the compiler rejects it. To make a variable available to multiple case statements, it must be declared before the switch.
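To make the scoping rule concrete, here is a small sketch (Describe is a hypothetical function, not from the original question). Each case gets its own braces, so each label variable is initialised only when its case is entered, and no other case label can jump past that initialisation.

```cpp
#include <string>

// Returns a short description of 'number'. Each case declares its own
// local variable inside braces, which limits its scope to that case.
std::string Describe(int number)
{
    switch (number)
    {
    case 1:
    {
        std::string label = "one";   // visible only inside this block
        return label;
    }
    case 2:
    {
        std::string label = "two";   // a separate variable with the same name
        return label + "!";
    }
    default:
        return "other";
    }
}
```

Without the braces around the first case, the declaration of label would be in scope at case 2 and default, and the compiler would reject the jump over its initialisation.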
|
|
|
|
|
In the programs below I have used both ways, i.e. passing arguments by reference and by pointers.
1. Arguments by reference:
#include <iostream>
using namespace std;
template<class T>
void swap(T &x, T &y){
T temp = x;
x = y;
y = temp;
}
void fun(int m, int n, float a, float b){
cout << "m & n before swap: " << m << " " << n << endl;
swap(m, n);
cout << "m & n after swap: " << m << " " << n << endl;
cout << "i & j before swap: " << a << " " << b << endl;
swap(a, b);
cout << "i & j after swap: " << a << " " << b << endl;
}
int main(){
fun(100, 200, 11.22, 33.44);
return 0;
}
2. Arguments by pointers:
#include <iostream>
using namespace std;
template<class T>
void swap(T *x, T *y){
T temp = x;
x = y;
y = temp;
}
void fun(int m, int n, float a, float b){
cout << "m & n before swap: " << m << " " << n << endl;
swap(m, n);
cout << "m & n after swap: " << m << " " << n << endl;
cout << "i & j before swap: " << a << " " << b << endl;
swap(a, b);
cout << "i & j after swap: " << a << " " << b << endl;
}
int main(){
fun(100, 200, 11.22, 33.44);
return 0;
}
Also, if I pass the arguments by reference but make them const, it still works. Why?
#include <iostream>
using namespace std;
template<class T>
void swap(const T &x, const T &y){
T temp = x;
x = y;
y = temp;
}
void fun(int m, int n, float a, float b){
cout << "m & n before swap: " << m << " " << n << endl;
swap(m, n);
cout << "m & n after swap: " << m << " " << n << endl;
cout << "i & j before swap: " << a << " " << b << endl;
swap(a, b);
cout << "i & j after swap: " << a << " " << b << endl;
}
int main(){
fun(100, 200, 11.22, 33.44);
return 0;
}
Thank you.
|
|
|
|
|
You have a name clash with std::swap. Try:
#include <iostream>
using std::cout;
using std::endl;
template<class T>
void swap(T &x, T &y){
T temp = x;
x = y;
y = temp;
}
void fun(int m, int n, float a, float b){
cout << "m & n before swap: " << m << " " << n << endl;
swap(m, n);
cout << "m & n after swap: " << m << " " << n << endl;
cout << "i & j before swap: " << a << " " << b << endl;
swap(a, b);
cout << "i & j after swap: " << a << " " << b << endl;
}
int main()
{
fun(100, 200, 11.22, 33.44);
}
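Another way to sidestep the clash, sketched here under the assumption that you want to keep using namespace std;, is to put your own function in a namespace and call it by its qualified name. The names demo and SwapWithDemo are made up for this illustration.

```cpp
#include <utility>   // declares std::swap

using namespace std; // kept on purpose: the qualified call below is still unambiguous

namespace demo {
// Same body as the swap in the question, but inside its own namespace.
template <class T>
void swap(T& x, T& y)
{
    T temp = x;
    x = y;
    y = temp;
}
}

// Swaps two values through demo::swap only; std::swap is never considered
// because the call is qualified.
template <class T>
void SwapWithDemo(T& a, T& b)
{
    demo::swap(a, b);
}
```

Calling SwapWithDemo(m, n) with two int variables exchanges their values through the user-defined template, so there is no doubt about which swap is running.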
|
|
|
|
|
|
|
In the program below I have tried to pass the values of a 1D matrix to an object of class Vector by calling the constructor explicitly.
#include <iostream>
using namespace std;
const int size = 3;
class Vector {
int *v;
public:
Vector(){
v = new int[size];
for(int i=0; i<size; i++)
v[i] = 0;
}
Vector(int *a){
for(int i=0; i<size; i++){
v[i] = a[i] ;
}
}
int operator * (Vector &y){
int sum=0;
for(int i=0; i<size; i++)
sum += this->v[i] * y.v[i] ;
return sum;
}
void display(){
for(int i=0; i<size; i++)
cout << v[i] << " ";
cout << endl;
}
};
int main(){
int x[3] = {1, 2, 3};
int y[3] = {6, 3, 9};
Vector v1, v2;
v1 = y;
v2 = x;
cout << "v1 = ";
v1.display();
cout << "v2 = ";
v2.display();
cout << "v1 x v2 = " << v1 * v2 << endl ;
return 0;
}
cout << "v1 = ";
v1.display();
cout << "v2 = ";
v2.display();
The above portion of the code doesn't work as expected. It gives:
Output:
6 3 9
6 3 9
v1 = 6 3 9
v2 = 6 3 9
v1 x v2 = 126
Expected:
1 2 3
6 3 9
v1 = 1 2 3
v2 = 6 3 9
v1 x v2 = 39
Thank you
|
|
|
|
|
The program shouldn't run as is. In your second constructor, you aren't allocating the memory for the vector. (You also aren't deleting that memory in a destructor.)
Also note that you construct the v object in the default constructor. Since there is no assignment operator AND the second constructor is not marked explicit, it is creating a second instance of Vector and then member-wise copying that to the first, clobbering the allocation of v in the first instance. (Why your compiler is letting you get away with this is a mystery to me.)
So, add a destructor and then put "explicit" in front of the second constructor.
Then, remove the assignments and replace the declaration line with: v1(x), v2(y).
(It's still messy code, but in a debugger, you can now see how you are explicitly calling the second constructor and never the first.)
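A minimal sketch of those changes, assuming the fixed size of 3 from the question: the second constructor allocates its own memory and is marked explicit, and a destructor releases it. The copy constructor and copy assignment operator go beyond what is described above; they are added so that copying a Vector does not share one allocation. kSize and at() are illustrative names.

```cpp
#include <algorithm>
#include <cstddef>

const std::size_t kSize = 3;

class Vector {
    int* v;
public:
    // Default constructor: allocates and zero-initialises the elements.
    Vector() : v(new int[kSize]()) {}

    // Allocates before copying; 'explicit' forbids "v1 = y;" on a raw array.
    explicit Vector(const int* a) : v(new int[kSize])
    {
        std::copy(a, a + kSize, v);
    }

    // Deep copy, so two objects never own the same allocation.
    Vector(const Vector& other) : v(new int[kSize])
    {
        std::copy(other.v, other.v + kSize, v);
    }

    // Sizes are fixed, so assignment only copies the elements.
    Vector& operator=(const Vector& other)
    {
        if (this != &other)
            std::copy(other.v, other.v + kSize, v);
        return *this;
    }

    ~Vector() { delete[] v; }

    // Dot product of two vectors.
    int operator*(const Vector& y) const
    {
        int sum = 0;
        for (std::size_t i = 0; i < kSize; ++i)
            sum += v[i] * y.v[i];
        return sum;
    }

    int at(std::size_t i) const { return v[i]; }
};
```

With the arrays from the question, Vector v1(x), v2(y); followed by v1 * v2 yields 1*6 + 2*3 + 3*9 = 39.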
|
|
|
|
|
|
Since you know at compile time the array size, why don't you allocate it on the stack?
#include <iostream>
#include <array>
using namespace std;
constexpr size_t SIZE = 3;
class Vector
{
array<int,SIZE> x{};
public:
Vector (const array<int,SIZE> & a)
{
for (size_t n=0; n<SIZE; ++n)
x[n] = a[n];
}
int operator * (const Vector & v) const
{
int result = 0;
for (size_t n=0; n<SIZE; ++n)
result += x[n] * v.x[n];
return result;
}
friend ostream & operator << ( ostream & os, const Vector & v);
};
ostream & operator << ( ostream & os, const Vector & v)
{
for (const auto & i : v.x)
os << i << " ";
return os;
}
int main()
{
array<int, SIZE> x{1,2,3};
array<int, SIZE> y{6,3,9};
Vector v1{x};
Vector v2{y};
cout << "v1 " << v1 << endl;
cout << "v2 " << v2 << endl;
cout << "v1*v2 = " << (v1*v2) << endl;
}
|
|
|
|
|
I'm converting a project from VS2005 to VS2017 and I'm getting many errors.
Most of them say that items are not members of System:
Windows
Forms
Drawing
Panel
Button
etc....
Is there something missing from my old code or is this not available in VS2017 or is there something that wasn't included?
I think the app creates a form....
Any help to debug would be much appreciated.
Jim
|
|
|
|
|
Is it a managed C++ project type?
|
|
|
|
|