C / C++ / MFC

 
Question: how to write a display program or a reader program??
mr bard2, 15-Oct-09 1:50
Answer: Re: how to write a display program or a reader program??
Cedric Moonen, 15-Oct-09 1:54
General: Re: how to write a display program or a reader program??
mr bard2, 15-Oct-09 2:36
General: Re: how to write a display program or a reader program??
Cedric Moonen, 15-Oct-09 2:45
Answer: Re: how to write a display program or a reader program??
Richard MacCutchan, 15-Oct-09 2:09
Question: Re: how to write a display program or a reader program??
David Crow, 15-Oct-09 3:17
Answer: Re: how to write a display program or a reader program??
Rajesh R Subramanian, 15-Oct-09 3:43
Question: IHTMLDocument2 for Table Par [modified]
NaveenHS, 15-Oct-09 1:19
Hello everyone,

This is my second post about extracting text from web pages.

In reply to my previous post, David Crow suggested that I use the IHTMLDocument2 interface. In the Code Project repository I found this application:

Parsing HTML using MSHTML by Philip Patrick, which extracts all the links in a web page by reading the HREF attribute of each A tag.

Can I use the same application to extract the text from a table in a web page?

I searched for information on handling the table tag so that I could extract its text, but found nothing. Can anyone please tell me whether it is possible to extract the text inside the <table> tag with MSHTML through the IHTMLDocument2 interface?

Thanking you,
Naveen HS.



void CTestDlg::OnBgo() 
{
	UpdateData();
	CWaitCursor wait;
	if(m_csFilename.IsEmpty()){
		AfxMessageBox(_T("Please specify the file to parse"));
		return;
	}
	CFile f;

	//open the file and read it into a CString (any buffer would do)
	//note: reading raw bytes into a CString is only correct in a non-Unicode (MBCS) build
	if (f.Open(m_csFilename, CFile::modeRead|CFile::shareDenyNone)) {
		m_wndLinksList.ResetContent();
		CString csWholeFile;
		int nLength = (int)f.GetLength();
		UINT nRead = f.Read(csWholeFile.GetBuffer(nLength), nLength);
		csWholeFile.ReleaseBuffer((int)nRead);
		f.Close();

		//declare our MSHTML variables and create a document
		MSHTML::IHTMLDocument2Ptr pDoc;
		MSHTML::IHTMLDocument3Ptr pDoc3;
		MSHTML::IHTMLElementCollectionPtr pCollection;
		MSHTML::IHTMLElementPtr pElement;

		HRESULT hr = CoCreateInstance(CLSID_HTMLDocument, NULL, CLSCTX_INPROC_SERVER, 
			IID_IHTMLDocument2, (void**)&pDoc);
		if (FAILED(hr) || pDoc == NULL) {
			AfxMessageBox(_T("Could not create the HTML document object"));
			return;
		}
		
		//put the code into a SAFEARRAY and write it into the document
		SAFEARRAY* psa = SafeArrayCreateVector(VT_VARIANT, 0, 1);
		VARIANT *param = NULL;
		bstr_t bsData = (LPCTSTR)csWholeFile;
		hr = SafeArrayAccessData(psa, (LPVOID*)&param);
		if (SUCCEEDED(hr)) {
			param->vt = VT_BSTR;
			//SafeArrayDestroy frees the array's elements, so the array gets its own
			//copy of the string; sharing bsData's BSTR would free it twice
			param->bstrVal = bsData.copy();
			SafeArrayUnaccessData(psa);

			hr = pDoc->write(psa);
			hr = pDoc->close();
		}
		
		SafeArrayDestroy(psa);

		//I'll use IHTMLDocument3 to retrieve tags. Note it is available only in IE5+
		//If you don't want to use it, you can just run through all tags in the HTML
		//(IHTMLDocument2->all property)
		pDoc3 = pDoc;
		if (pDoc3 == NULL)
			return;
		
		//display the HREF attribute of every link (A tag) in the ListBox
		pCollection = pDoc3->getElementsByTagName(L"A");
		for(long i=0; i<pCollection->length; i++){
			pElement = pCollection->item(i, (long)0);
			if(pElement != NULL){
				//second parameter says that you want to get text inside attribute as is
				_variant_t vHref = pElement->getAttribute("href", 2);
				//skip anchors that have no href attribute
				if(vHref.vt == VT_BSTR && vHref.bstrVal != NULL)
					m_wndLinksList.AddString((LPCTSTR)bstr_t(vHref.bstrVal));
			}
		}
	}
}
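
The file-reading step in the code above notes that any buffer can be used in place of CString. As a minimal sketch of that idea, assuming the HTML file is small enough to hold in memory and should be read as raw bytes, the standard library can do the read without CFile. The helper name ReadWholeFile is hypothetical and is not part of the original application.

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the original application):
// reads an entire file into a std::string as raw, untranslated bytes.
std::string ReadWholeFile(const std::string& path)
{
    // Binary mode prevents newline translation, so the bytes arrive unchanged.
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open file: " + path);
    }

    // Stream the whole file buffer into a string stream in one step.
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}
```

The returned bytes could then be handed to bstr_t through c_str(), which should behave like the existing CString cast in a non-Unicode build.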


modified on Thursday, October 15, 2009 8:30 AM

Answer: Re: IHTMLDocument2 for Table Par
«_Superman_», 15-Oct-09 6:40
General: Re: IHTMLDocument2 for Table Par
NaveenHS, 16-Oct-09 1:06
General: Re: IHTMLDocument2 for Table Par
«_Superman_», 16-Oct-09 5:01
General: Re: IHTMLDocument2 for Table Par
NaveenHS, 19-Oct-09 1:02
General: Re: IHTMLDocument2 for Table Par
«_Superman_», 19-Oct-09 5:32
General: Re: IHTMLDocument2 for Table Par
NaveenHS, 20-Oct-09 2:21
Question: C++
john curtin, 14-Oct-09 23:46
Answer: Re: C++
Richard MacCutchan, 15-Oct-09 0:00
Answer: Re: C++
CPallini, 15-Oct-09 0:06
General: Re: C++
Richard MacCutchan, 15-Oct-09 0:35
General: Re: C++
Tim Craig, 15-Oct-09 19:06
Question: I/O files
programmer202, 14-Oct-09 23:26
Answer: Re: I/O files
Richard MacCutchan, 14-Oct-09 23:38
Question: Re: I/O files
David Crow, 15-Oct-09 3:20
Question: Maximum client count of TCP/IP
includeh10, 14-Oct-09 23:13
Answer: Re: Maximum client count of TCP/IP
Richard MacCutchan, 14-Oct-09 23:40
General: Re: Maximum client count of TCP/IP
includeh10, 14-Oct-09 23:44
