Parsing Wikipedia XML Dump

10 Apr 2021 · CPOL · 8 min read
A parser for Wikipedia pages from an XML dump is presented. As an example, it is used to extract biographical data and categories together with their parent categories.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
United States
Programmer, developer, and researcher. Worked for several software companies, including Microsoft.
PhD in Theoretical Physics, D.Sc. in Solid State Physics.
Enjoys discovering patterns in nature, history, and society that contradict public opinion or are hidden from it.

Written By
Instructor / Trainer National Technical University of Ukraine "Kiev Pol
Ukraine
Professor Vladimir Shatalov works at the National Technical University of Ukraine 'Kyiv Polytechnic Institute', Slavutych branch, where he teaches Computer Science. His research interests include Data Mining, Artificial Intelligence, Theoretical Physics, and Biophysics.
His research also covers the mechanisms by which non-thermal electromagnetic and acoustic fields affect bio-liquids, and the effects of irradiation on the physical and chemical properties of water.
