It lets you explore the English Wikipedia with a few added benefits
from our proprietary semantic and relations analysis method: you can
see similar pages (based on text content or links), see the most
relevant words for a page, and more.
Spark is used to process the English Wikipedia and to run the
computation. Three iterations of our method on the full matrix of
4.4M documents by 2.1M words take about 30 minutes on a smallish
cluster of 7 nodes, each with 4 cores and 32GB RAM.
Any feedback is welcome (except on the aesthetics; we already know
the UI is really bad).
Enjoy exploring Wikipedia in your spare time :)