C / C++ / MFC

 
Answer: Re: Handles
nmap.terren, 17-Dec-06 14:19
Question: How to get the CEdit box to update
FredrickNorge, 17-Dec-06 5:32
Answer: Re: How to get the CEdit box to update
BlitzPackage, 17-Dec-06 6:52
General: Re: How to get the CEdit box to update
FredrickNorge, 17-Dec-06 7:05
Question: pointers to functions
emrah.a, 17-Dec-06 3:58
Answer: Re: pointers to functions
Chris Losinger, 17-Dec-06 4:43
Answer: Re: pointers to functions
prasad_som, 17-Dec-06 18:22
General: Need Article Co-Author
Jeffrey Walton, 17-Dec-06 3:09
Hi All,

I'm looking to team with someone who has written an HTML parser (or can write an HTML parser). My scanners/parsers are generally LALR (from my Compiler Theory days in college), which is a bit different from HTML.

The requirements are loose - I don't need a DOM. Something similar to below would work well (written in C/C++):
while( EOF != document )
{
    Element = GetNextElement( document );

    // do something with Element
}

Element should look as follows:
struct _E
{
    string element;              // "P", "H2", etc.
    vector<string> attributes;   // "size=0", etc.
    string value;                // Whatever is between < P > and < /P >
} Element;

I'm interested in the HTML 'primitives': the < P > tag, < H# > tags, < TITLE >, and < TABLE > (no need to break out the < TD >s and < TR >s). As I said, I am flexible. There is no need to return < HEAD > or < BODY > (hence the request for 'primitive' elements). To summarize, I want the 'leaves' of the tree (leaf nodes), not the stuff encountered on the way down (branches). For proof of concept, attributes can be empty (they may be required later).
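
A minimal sketch of what such a GetNextElement( ) could look like, offered only as a starting point. Everything beyond the three Element fields is an assumption: the signature (a string plus a position cursor), the helper names ToUpper and IsPrimitive, and the choice to upper-case tag names. It splits attributes on whitespace, expects an exact closing tag such as </P>, does not handle nesting or comments, and leaves entities like &nbsp; untouched. If a closing tag is missing, the value runs to the end of the document.

```cpp
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

struct Element
{
    std::string element;                  // "P", "H2", etc. (upper-cased here)
    std::vector<std::string> attributes;  // "size=0", etc.
    std::string value;                    // raw text between the open and close tags
};

// Upper-case an ASCII string so tag names compare case-insensitively.
static std::string ToUpper(std::string s)
{
    for (char& c : s)
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    return s;
}

// The 'primitive' elements named in the post: P, H1..H6, TITLE, TABLE.
static bool IsPrimitive(const std::string& name)
{
    if (name == "P" || name == "TITLE" || name == "TABLE")
        return true;
    return name.size() == 2 && name[0] == 'H' && name[1] >= '1' && name[1] <= '6';
}

// Hypothetical signature: scans doc from pos, fills out with the next
// primitive element, and advances pos past it. Returns false at end of input.
bool GetNextElement(const std::string& doc, std::size_t& pos, Element& out)
{
    while ((pos = doc.find('<', pos)) != std::string::npos)
    {
        const std::size_t close = doc.find('>', pos);
        if (close == std::string::npos)
            break;                          // unterminated tag: stop scanning

        // Split "name attr attr ..." on whitespace (naive: no quoted spaces).
        std::istringstream tag(doc.substr(pos + 1, close - pos - 1));
        std::string name, attr;
        tag >> name;
        name = ToUpper(name);

        if (!IsPrimitive(name))
        {
            pos = close + 1;                // branch or closing tag: skip it
            continue;
        }

        out = Element();
        out.element = name;
        while (tag >> attr)
            out.attributes.push_back(attr);

        // First exact closing tag after the opening tag, ignoring case.
        const std::string closing = "</" + name + ">";
        const std::size_t end = ToUpper(doc).find(closing, close + 1);
        if (end == std::string::npos)
        {
            out.value = doc.substr(close + 1);
            pos = doc.size();
        }
        else
        {
            out.value = doc.substr(close + 1, end - close - 1);
            pos = end + closing.size();
        }
        return true;
    }
    pos = doc.size();
    return false;
}
```

A caller would start with a position of 0 and loop while GetNextElement returns true, which mirrors the while loop in the post.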

There is no need to convert between entity codes and characters. For example, &nbsp; does not need to be converted to its corresponding white space (but it may in the future). Same with the C/C++ '\t': character 0x09 can stay that way (for now).

Additionally, the co-author will be responsible for file rotation. Think of it as a log file for this purpose. Assume there will be at least 8 files to rotate (first in, first out). The data to be read and written will be a vector< string >:
vector< string > ReadFile( /* some sort of time identifier */ );
void WriteFile( vector< string > );
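
One possible shape for the rotation, sketched under several assumptions that are not in the post: C++17's std::filesystem is available, files are named base.0 (newest) through base.7 (oldest), a slot index stands in for the "time identifier", and each string is stored as one line (so strings containing newlines would not round-trip). SlotPath and the extra base and maxFiles parameters are hypothetical, and maxFiles is assumed to be at least 1.

```cpp
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical naming scheme: <base>.0 is the newest file, <base>.7 the oldest.
static fs::path SlotPath(const fs::path& base, std::size_t slot)
{
    return fs::path(base.string() + "." + std::to_string(slot));
}

// Shift every file one slot older, dropping the oldest (first in, first out),
// then write the new lines to slot 0.
void WriteFile(const fs::path& base, const std::vector<std::string>& lines,
               std::size_t maxFiles = 8)
{
    fs::remove(SlotPath(base, maxFiles - 1));   // does nothing if the file is absent
    for (std::size_t slot = maxFiles - 1; slot > 0; --slot)
    {
        const fs::path from = SlotPath(base, slot - 1);
        if (fs::exists(from))
            fs::rename(from, SlotPath(base, slot));
    }
    std::ofstream out(SlotPath(base, 0));
    for (const std::string& line : lines)
        out << line << '\n';
}

// Read back the file in the given slot; returns an empty vector if it is missing.
std::vector<std::string> ReadFile(const fs::path& base, std::size_t slot)
{
    std::vector<std::string> lines;
    std::ifstream in(SlotPath(base, slot));
    for (std::string line; std::getline(in, line); )
        lines.push_back(line);
    return lines;
}
```

With the default of 8 files, the ninth call to WriteFile discards the data from the first call. A timestamp-based identifier could replace the slot index by mapping times to slots.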

The algorithm does need to be deterministic (duh) - run on the same document, it must produce the same results each time.

So, the co-author should:
* create GetNextElement( )
* file I/O
* file rotation
* document it well - the byte scanner and tokenizer should take at least 3 pages. Aho, Sethi, and Ullman managed to produce 350 pages on this portion of a front end in Compilers: Principles, Techniques, and Tools.

I will:
* add the usage code of Element
* remaster screen shots in Photoshop
* coordinate the publication

I'm anal about article write-ups. I toss out 5's for three-sentence articles with a pretty screenshot, but that is not what I expect of myself. Please see my articles to get a feel for what I expect.

I generally post to two sites: Code Project and Code Guru. It would be nice (but definitely not required) if the co-author had a Code Guru account.

Any takers? If there is more than one taker, I'll ask that you fight it out amongst yourselves, or take on an additional co-author. I don't want to have to choose. I'm actually more concerned that no one will step up to the plate, so don't be shy.

If successful, I want to move the Project to SourceForge. At that time the co-author can share in the Administrative responsibilities. The project will be called WebGrits. You'll understand later when the poetry is in motion.

BTW, my portion is complete ;) It is another Crypto++ project based on hashing.

Jeff
Question: Double buffering in a wmp Visualization
ceejeeb, 17-Dec-06 3:08
Question: Best way to interact with a C program
Luís Brás, 17-Dec-06 1:59
Answer: Re: Best way to interact with a C program
Mark Salsbery, 17-Dec-06 8:58
Question: Send enter key?
Larsson, 17-Dec-06 1:35
Answer: Re: Send enter key?
Daniel Kanev, 17-Dec-06 2:48
Question: DLL Coding Dilemma...
Shy Agam, 16-Dec-06 22:52
Answer: Re: DLL Coding Dilemma...
peterchen, 17-Dec-06 0:04
General: Re: DLL Coding Dilemma...
Shy Agam, 17-Dec-06 0:18
General: Re: DLL Coding Dilemma...
peterchen, 17-Dec-06 0:46
General: Re: DLL Coding Dilemma...
Shy Agam, 17-Dec-06 1:56
Question: How to save HBITMAP into *.png??
3141592653, 16-Dec-06 18:05
Answer: Re: How to save HBITMAP into *.png??
Hadi Dayvary, 16-Dec-06 18:41
Question: Rom loaders.
asp.netProgrammer, 16-Dec-06 17:26
Answer: Re: Rom loaders.
Jeffrey Walton, 17-Dec-06 3:31
Question: registry clearners
locoone, 16-Dec-06 16:21
Question: ON_WM_MOUSEMOVE in CFrameWnd
gokings, 16-Dec-06 16:08
Answer: Re: ON_WM_MOUSEMOVE in CFrameWnd
includeh10, 16-Dec-06 20:17
