bfaure/WikiClassify2.0

WikiClassify2.0

Instructions

Command Line User Interface

cd into the WikiClassify2.0 base directory and run python main.py to start the automated workflow, which performs the following steps in order:

  • Download latest Wikipedia data dump bz2 archive
  • Extract archive into .xml format
  • Compile C++ parser files
  • Parse .xml data, sending bursts to the remote server in 1000-article increments
  • Train word2vec and LDA models
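The extraction and batching steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's own code: the function names extract_bz2 and batches are hypothetical, and in the real workflow the parsing and upload are handled by the compiled C++ parser. The sketch shows how a bz2 archive can be stream-decompressed to .xml and how parsed articles could be grouped into bursts of 1000.

```python
import bz2
import shutil


def extract_bz2(archive_path, xml_path, chunk_size=1024 * 1024):
    """Stream-decompress a .bz2 archive to disk in fixed-size chunks.

    Hypothetical helper: the whole dump is never held in memory at once.
    """
    with bz2.BZ2File(archive_path, "rb") as src, open(xml_path, "wb") as dst:
        shutil.copyfileobj(src, dst, chunk_size)


def batches(items, size=1000):
    """Yield lists of at most `size` items (hypothetical helper).

    Mirrors the 1000-article bursts described in the workflow; the last
    batch may be shorter.
    """
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch
```

Streaming in chunks matters here because a full Wikipedia dump is far larger than typical available memory.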

Once a model is present in the working directory, subsequent calls to python main.py open the interface for interacting with the models (including the A* path search and A* convene functions). Running python main.py with the -c launch parameter cleans the working directory of models and downloaded data.
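The mode selection described above could look roughly like the following. This is a minimal sketch, not the actual contents of main.py: the function choose_mode, the returned mode names, and the ".model" file suffix used to detect a trained model are all assumptions made for illustration. Only the -c flag and the -g flag (introduced in the next section) come from this README.

```python
import argparse
import os

# Assumption: trained models are recognizable by this suffix.
MODEL_SUFFIX = ".model"


def choose_mode(argv, workdir="."):
    """Return which action a launch of main.py would take (hypothetical)."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("-c", dest="clean", action="store_true",
                        help="clean models and downloaded data")
    parser.add_argument("-g", dest="gui", action="store_true",
                        help="open the graphical main menu")
    args = parser.parse_args(argv)

    if args.clean:
        return "clean"
    if args.gui:
        return "gui"
    # No flags: open the model interface if a model exists,
    # otherwise run the automated workflow from the start.
    has_model = any(name.endswith(MODEL_SUFFIX)
                    for name in os.listdir(workdir))
    return "interface" if has_model else "workflow"
```

For example, choose_mode(["-c"]) returns "clean" regardless of what is in the working directory.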

Graphical User Interface

Run python main.py with a -g launch parameter to open the user interface main menu.

WikiServer

Enter server credentials.

View articles database.

Control database actions.

WikiParse

Configure parser launch parameters.

WikiLearn

Live A* Path Search
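For readers unfamiliar with A*, the general technique behind the path search can be sketched as follows. This README does not say how WikiLearn defines its nodes, edge costs, or heuristic, so everything here is an assumption: the graph is a plain dictionary, the function name a_star is hypothetical, and the heuristic is passed in by the caller. With a heuristic that always returns 0 the search reduces to Dijkstra's algorithm.

```python
import heapq


def a_star(graph, start, goal, heuristic):
    """Generic A* search (illustrative, not the project's implementation).

    graph:     dict mapping node -> dict of neighbor -> edge cost
    heuristic: function (node, goal) -> estimated remaining cost
    Returns the path as a list of nodes, or None if goal is unreachable.
    """
    # Heap entries: (estimated total cost, cost so far, node, path so far)
    frontier = [(heuristic(start, goal), 0.0, start, [start])]
    best_cost = {start: 0.0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if cost > best_cost.get(node, float("inf")):
            continue  # stale entry; a cheaper route was already found
        for neighbor, step in graph.get(node, {}).items():
            new_cost = cost + step
            if new_cost < best_cost.get(neighbor, float("inf")):
                best_cost[neighbor] = new_cost
                estimate = new_cost + heuristic(neighbor, goal)
                heapq.heappush(
                    frontier,
                    (estimate, new_cost, neighbor, path + [neighbor]))
    return None
```

The returned path is guaranteed to be cheapest only when the heuristic never overestimates the true remaining cost.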

Dependencies

Python

  • Python 2.7
  • numpy (pip install numpy)
  • g++
  • gensim (pip install gensim)
  • sklearn (pip install scikit-learn)
  • PyQt4 (apt-get install python-qt4)
  • psycopg2 (pip install psycopg2)
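A quick way to confirm that the Python packages listed above are importable is a small check script. This is a hedged sketch rather than part of the project: the function missing_modules and the REQUIRED list are hypothetical, and the check covers only the Python modules, not g++.

```python
import importlib

# Import names of the Python packages listed above.
REQUIRED = ["numpy", "gensim", "sklearn", "PyQt4", "psycopg2"]


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_modules(REQUIRED):
        print("missing dependency: " + name)
```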

C++

Both of the following packages can be installed from the command line using a package manager such as apt-get on Ubuntu.

  • libpq-dev (apt-get install libpq-dev)
  • libpqxx-dev (apt-get install libpqxx-dev)

Related Repositories

  • Chrome Extension
  • Project Website
  • Former Repo
